This guide uses the Nextstrain command-line interface (CLI) tool [GitHub] to help you quickly get started running and viewing the pathogen builds you may see on nextstrain.org with a minimum of fuss. It assumes you are comfortable using the command line and installing software on your computer. If you need help when following this guide, please reach out by emailing us.

When you’re done following this guide, you will have built a local version of our example Zika analysis and viewed the results on your computer. You’ll have a basic understanding of how to run builds for other pathogens and a foundation for understanding the Nextstrain ecosystem in more depth.

Set up your computer

Before you can do anything else, you need to set up your computer to run the Nextstrain CLI.

Install Python 3

Python 3.5 or higher is required. There are many ways to install Python 3 on Windows, macOS, or Linux, including the official packages, Homebrew for macOS, and Conda. The details of each method are beyond the scope of this guide.

You may already have Python 3 installed, especially if you’re on Linux. Check by running python3 --version:

$ python3 --version
Python 3.6.5
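
If you would rather check the minimum version in a script than by eye, the comparison could be sketched as below. The function name version_at_least is hypothetical (it is not part of Python or the Nextstrain CLI), and the sketch assumes version strings of the form major.minor or major.minor.patch, comparing only the first two components.

```shell
# Hypothetical helper (not part of Python or the Nextstrain CLI).
# Succeeds when version $1 is at least version $2, comparing
# only the major and minor components.
version_at_least() {
    have_major=${1%%.*}          # "3.6.5" -> "3"
    have_rest=${1#*.}            # "3.6.5" -> "6.5"
    have_minor=${have_rest%%.*}  # "6.5"   -> "6"
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}

    if [ "$have_major" -gt "$want_major" ]; then
        return 0
    fi
    [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Example using the version string from the output shown above.
if version_at_least "3.6.5" "3.5"; then
    echo "Python version is new enough"
else
    echo "Python 3.5 or higher is required"
fi
```

To test your own interpreter, you could pass in the second word of the python3 --version output instead of the literal "3.6.5".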

Install the Nextstrain CLI

With Python 3 installed, you can use pip to install the nextstrain-cli package:

$ python3 -m pip install nextstrain-cli
Collecting nextstrain-cli
[…a lot of output…]
Successfully installed nextstrain-cli-1.4.1

After installation, make sure the nextstrain command works by running nextstrain version:

$ nextstrain version
nextstrain.cli 1.4.1

The version you get will probably differ from the one shown in the example above.

Install Docker Community Edition (CE)

The Nextstrain CLI tool also currently requires Docker, which is freely available. On Windows or a Mac you should download and install Docker Desktop (also known as “Docker for Mac” and “Docker for Windows”). On Linux, your package manager should include a Docker package.

After installing Docker, run nextstrain check-setup to ensure it works:

$ nextstrain check-setup
nextstrain-cli is up to date!

Testing your setup…

✔ docker is installed
✔ docker run works

All good!

If the final message doesn’t indicate success (as with “All good!” in the example above), something may be wrong with your Docker installation.
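
If the check fails, one quick thing to rule out is whether the docker command is on your PATH at all. The sketch below uses the standard command -v shell builtin; have_command is a hypothetical helper name. It only looks for the executables and is not a substitute for nextstrain check-setup, which also confirms that docker run works.

```shell
# Hypothetical helper: succeeds if the named command can be found by the shell.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Report where each tool used in this guide was found, if anywhere.
for tool in docker nextstrain; do
    if have_command "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```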

Download the Nextstrain environment image

The final part of the setup is to run nextstrain update, which downloads the latest Nextstrain environment image used by the CLI:

$ nextstrain update
nextstrain-cli is up to date!

Updating Docker image nextstrain/base…
[…a lot of output…]
Your images are up to date!

You can re-run this command in the future to get updates to the Nextstrain environment.

Optionally: Install git

git is a version control system used throughout the Nextstrain ecosystem. It is free to download and install.

While git is not required to use this guide, it is recommended and will be necessary for taking your next steps after this guide.

Download the nextstrain/zika-tutorial repository

Now that you’re set up, it’s time to download the example Zika pathogen repository you’re going to build.

If you have git installed, clone the repository we use to keep track of changes to our analysis:

$ git clone https://github.com/nextstrain/zika-tutorial
Cloning into 'zika-tutorial'...
[…more output…]

When it’s done, you’ll have a new directory called zika-tutorial/.

If you don’t have git installed and want to skip installing it for now, you can instead download a snapshot of the repository in a zip file. After unzipping the snapshot, you’ll need to rename the resulting zika-tutorial-master/ directory to just zika-tutorial/ to match the rest of this guide.
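
The rename step for the zip snapshot could be sketched as follows. It assumes you unzipped the snapshot in your current directory; rename_snapshot is a hypothetical helper name used only for this example.

```shell
# Hypothetical helper: rename the unzipped snapshot directory so it
# matches the zika-tutorial/ name used in the rest of this guide.
rename_snapshot() {
    if [ -d zika-tutorial ]; then
        echo "zika-tutorial/ already exists; nothing to do"
    elif [ -d zika-tutorial-master ]; then
        mv zika-tutorial-master zika-tutorial
        echo "renamed zika-tutorial-master/ to zika-tutorial/"
    else
        echo "zika-tutorial-master/ not found; unzip the snapshot here first" >&2
        return 1
    fi
}
```

Call rename_snapshot from the directory where you unzipped the file. It leaves an existing zika-tutorial/ directory untouched, so it is safe to run more than once.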

Run the build

Nextstrain builds use the augur bioinformatics toolkit to subsample data, align sequences, build a phylogeny, estimate phylogeographic patterns, and save the results in a format suitable for visualization with auspice.

Start the build with nextstrain build zika-tutorial/:

$ nextstrain build zika-tutorial/
Building DAG of jobs...
[…a lot of output…]

This should take just a few minutes to complete. To save time, the tutorial build uses an example data set that is much smaller than our live Zika analysis.

Output files will be in the directories zika-tutorial/data/, zika-tutorial/results/ and zika-tutorial/auspice/.
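
To confirm that the build produced the three output directories named above, you could use a small check like this one. The helper name check_outputs is hypothetical; it only tests that the directories exist and says nothing about whether their contents are complete.

```shell
# Hypothetical helper: report which of the expected output directories
# exist under the build directory given as $1.
check_outputs() {
    build_dir=${1%/}   # drop a trailing slash, if any
    missing=0
    for sub in data results auspice; do
        if [ -d "$build_dir/$sub" ]; then
            echo "found:   $build_dir/$sub/"
        else
            echo "missing: $build_dir/$sub/"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_outputs zika-tutorial/ succeeds only when all three directories are present.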

Visualize build results

Now you can run nextstrain view zika-tutorial/auspice/ to view the build results using Nextstrain’s visualizations:

$ nextstrain view zika-tutorial/auspice/
    The following datasets should be available in a moment:
[…more output…]

Open the link shown in your browser.

Screenshot of Zika example dataset viewed in Nextstrain
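
As a recap, the nextstrain commands from this guide can be chained into one short script. This is a sketch rather than an official workflow: the run and quickstart helpers and the DRY_RUN variable are conventions invented for this example, not features of the Nextstrain CLI. It assumes zika-tutorial/ already exists in the current directory. With DRY_RUN=1, each command is printed instead of executed.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run the steps from this guide in order, stopping at the first failure.
quickstart() {
    run nextstrain check-setup &&
    run nextstrain update &&
    run nextstrain build zika-tutorial/ &&
    run nextstrain view zika-tutorial/auspice/
}

# Preview the commands without running them.
DRY_RUN=1
quickstart
```

Set DRY_RUN=0 to run the commands for real. The final nextstrain view step keeps running so that you can open the link it shows in your browser.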

Next steps

All source code is freely available under the terms of the GNU Affero General Public License. Screenshots etc may be used as long as a link to nextstrain.org is provided.

This work is made possible by the open sharing of genetic data by research groups from all over the world. We gratefully acknowledge their contributions. Special thanks to Kristian Andersen, Allison Black, David Blazes, Peter Bogner, Matt Cotten, Ana Crisan, Gytis Dudas, Vivien Dugan, Karl Erlandson, Nuno Faria, Jennifer Gardy, Becky Garten, Dylan George, Ian Goodfellow, Nathan Grubaugh, Betz Halloran, Christian Happi, Jeff Joy, Paul Kellam, Philippe Lemey, Nick Loman, Sebastian Maurer-Stroh, Louise Moncla, Oliver Pybus, Andrew Rambaut, Colin Russell, Pardis Sabeti, Katherine Siddle, Kristof Theys, Dave Wentworth, Shirlee Wohl and Nathan Yozwiak for comments, suggestions and data sharing.


© 2015-2019 Trevor Bedford and Richard Neher