Here we chart the change in frequency of SARS-CoV-2 variants over time. We use this change in frequency to estimate the relative growth advantage, or evolutionary fitness, of different variants. We apply a Multinomial Logistic Regression (MLR) model to daily sequence counts to estimate frequencies and growth advantages. We apply this model independently to each country, and we partition SARS-CoV-2 variants by Nextstrain clade and, separately, by Pango lineage.
Further details on data preparation and analysis can be found in the forecasts-ncov GitHub repo, while further details on the MLR model implementation can be found in the evofr GitHub repo. Enabled by data from GISAID.
These analyses are the work of Marlin Figgins, Jover Lee, James Hadfield and Trevor Bedford.
Multinomial Logistic Regression is commonly used to model SARS-CoV-2 variant frequencies. However, please interpret these results with caution.
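For readers unfamiliar with the approach, the functional form of an MLR frequency model can be sketched in a few lines of Python. This is an illustrative sketch only and is not the evofr implementation: the function names, the parameter values and the 5-day generation time are assumptions made for this example, and a real analysis fits the intercepts and slopes to daily sequence counts rather than setting them by hand. Expressing the growth advantage as the exponential of the slope difference times a generation time is one common convention, assumed here.

```python
import math


def mlr_frequencies(intercepts, slopes, t):
    """Variant frequencies at day t: a softmax over logits that are linear in time."""
    logits = [a + b * t for a, b in zip(intercepts, slopes)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def growth_advantage(slope, pivot_slope, generation_time):
    """Growth advantage of a variant relative to a pivot variant (assumed convention)."""
    return math.exp((slope - pivot_slope) * generation_time)


# Hypothetical parameters: the pivot variant comes first, followed by one emerging variant.
intercepts = [0.0, -3.0]
slopes = [0.0, 0.1]  # per-day change in log-odds relative to the pivot

freqs_day0 = mlr_frequencies(intercepts, slopes, 0)
freqs_day30 = mlr_frequencies(intercepts, slopes, 30)

# The 5-day generation time is an assumption for illustration, not a fitted value.
advantage = growth_advantage(slopes[1], slopes[0], generation_time=5.0)
```

With these made-up values the emerging variant starts at a low frequency and reaches parity with the pivot variant by day 30, which is the kind of frequency trajectory the charts on this page display.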
We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequences and metadata made available through GISAID on which this research is based.
This work is made possible by the open sharing of genetic data by research groups from all over the world. We gratefully acknowledge their contributions. Special thanks to Kristian Andersen, Josh Batson, David Blazes, Jesse Bloom, Peter Bogner, Anderson Brito, Matt Cotten, Ana Crisan, Tulio de Oliveira, Gytis Dudas, Vivien Dugan, Karl Erlandson, Nuno Faria, Jennifer Gardy, Nate Grubaugh, Becky Kondor, Dylan George, Ian Goodfellow, Betz Halloran, Christian Happi, Jeff Joy, Paul Kellam, Philippe Lemey, Nick Loman, Duncan MacCannell, Erick Matsen, Sebastian Maurer-Stroh, Placide Mbala, Danny Park, Oliver Pybus, Andrew Rambaut, Colin Russell, Pardis Sabeti, Katherine Siddle, Kristof Theys, Dave Wentworth, Shirlee Wohl and Cecile Viboud for comments, suggestions and data sharing.