Estimating COVID-19's $R_t$ in Real-Time for India
Estimating $R_t$ for States as a Time Series
Based on Kevin Systrom's work, Estimating COVID-19's $R_t$ in Real-Time, which in turn builds on Bettencourt & Ribeiro 2008.
In any epidemic, $R_t$ is the measure known as the effective reproduction number. It's the number of people who become infected per infectious person at time $t$. The most well-known version of this number is the basic reproduction number: $R_0$ when $t=0$. However, $R_0$ is a single measure that does not adapt with changes in behavior and restrictions.
As a pandemic evolves, increasing restrictions (or potentially releasing them) change $R_t$. Knowing the current $R_t$ is essential. When $R_t>1$, the pandemic will spread through the entire population. If $R_t<1$, the pandemic will grow to some fixed number less than the population. The lower $R_t$, the more manageable the situation. The value of $R_t$ helps us (1) understand how effective our measures have been at controlling an outbreak and (2) decide whether we should increase or reduce restrictions based on our competing goals of economic prosperity and human safety. Well-respected epidemiologists argue that tracking $R_t$ is the only way to manage through this crisis.
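To make the threshold at 1 concrete, here is a minimal sketch of a generation-by-generation model in which every case infects exactly `r` others. It is an illustration under simplifying assumptions, not the estimation method used in this notebook: the function name, `initial_cases` and `generations` are made up for the example, and the model ignores the fact that a real population eventually runs out of susceptible people.

```python
def cumulative_cases(r, initial_cases=100, generations=50):
    """Total infections after `generations` rounds when each case infects `r` others."""
    total = 0.0
    current = float(initial_cases)
    for _ in range(generations):
        total += current   # count the cases in this generation
        current *= r       # each of them infects r people in the next one
    return total

# With r < 1 the total levels off near initial_cases / (1 - r), here 100 / 0.2 = 500.
shrinking = cumulative_cases(0.8)

# With r > 1 every generation is larger than the last, so the total keeps growing
# (in a real population it is capped by the number of susceptible people).
growing = cumulative_cases(1.2)
```

In this toy model, 100 initial cases with $R_t=0.8$ never add up to more than 500 infections in total, while $R_t=1.2$ passes a million within 50 generations.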
So far, I am not aware of similar work for India. This complements the work of Prof. Shamika Ravi, who tracks compounded daily growth to understand the pandemic response.
More importantly, it is not useful to understand $R_t$ at a national level.
Instead, to manage this crisis effectively, we need a local (state, district and/or city) level granularity of $R_t$.
FILTERED_REGIONS = ["India", "State Unassigned"]
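One plausible use of this list is to drop the national total and unassigned cases before running the per-state analysis. The sketch below assumes a list of records with a `region` field; the field names, the helper `drop_filtered` and the sample numbers are hypothetical and not taken from the actual data source.

```python
FILTERED_REGIONS = ["India", "State Unassigned"]

# Made-up sample records; the real data layout may differ.
records = [
    {"region": "India", "new_cases": 1200},
    {"region": "Kerala", "new_cases": 30},
    {"region": "State Unassigned", "new_cases": 15},
    {"region": "Maharashtra", "new_cases": 400},
]

def drop_filtered(rows, excluded=FILTERED_REGIONS):
    """Keep only rows whose region is not in the excluded list."""
    return [row for row in rows if row["region"] not in excluded]

state_records = drop_filtered(records)
```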
Taking a look at a state, we need to start the analysis when there is a consistent number of cases each day. Find the last day with zero new cases and start on the day after that.
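The rule above can be sketched in a few lines. This is an illustrative version that works on a plain list of daily new-case counts; the function name and the sample series are assumptions for the example.

```python
def trim_to_consistent_cases(new_cases):
    """Return the part of the series after the last day with zero new cases."""
    last_zero = -1  # -1 means no zero day was found, so nothing is trimmed
    for i, count in enumerate(new_cases):
        if count == 0:
            last_zero = i
    return new_cases[last_zero + 1:]

# Made-up daily counts: the last zero is on the fourth day.
daily = [0, 2, 0, 0, 5, 8, 13, 21]
trimmed = trim_to_consistent_cases(daily)
```

Note that a series ending in a zero day is trimmed to an empty list, so a state whose most recent day had no new cases would need separate handling.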
Also, case reporting is very erratic because of testing backlogs, etc. To get the best view of the 'true' data we can, I've applied a Gaussian filter to the time series. This is obviously an arbitrary choice, but you'd imagine the real-world process is not nearly as stochastic as the actual reporting.
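As a rough illustration of what such a filter does, the sketch below computes a centered, Gaussian-weighted moving average in plain Python. The window of 7 days and standard deviation of 2 are assumed values chosen for the example, not necessarily the ones used in this analysis, and the function name and sample counts are made up.

```python
import math

def gaussian_smooth(values, window=7, std=2.0):
    """Centered Gaussian-weighted moving average, renormalised near the edges."""
    half = window // 2
    offsets = range(-half, half + 1)
    weights = [math.exp(-0.5 * (k / std) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(values)):
        total = 0.0
        weight_sum = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(values):  # skip neighbours that fall outside the series
                total += w * values[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed

# Made-up erratic daily counts, e.g. caused by a testing backlog.
reported = [10, 0, 35, 12, 40, 5, 60, 20, 75]
smooth = gaussian_smooth(reported)
```

Each smoothed value is a weighted average of its neighbours, so spikes and dips caused by reporting delays are spread over adjacent days while the overall trend is preserved.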
Since our results include uncertainty, we'd like to be able to view the most likely value of $R_t$ along with its highest-density interval.
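For a posterior evaluated on a grid of candidate $R_t$ values, both quantities can be read off directly. The sketch below is one simple way to do it, assuming a single-peaked discrete posterior; the function names and the posterior itself are made up for illustration and are not output of the model.

```python
def most_likely(values, probs):
    """Grid value with the highest posterior probability."""
    return values[max(range(len(values)), key=lambda i: probs[i])]

def highest_density_interval(values, probs, mass=0.95):
    """Narrowest contiguous range of grid values holding at least `mass` probability."""
    total = sum(probs)
    probs = [p / total for p in probs]  # normalise so the probabilities sum to 1
    best = None
    for lo in range(len(values)):
        cumulative = 0.0
        for hi in range(lo, len(values)):
            cumulative += probs[hi]
            if cumulative >= mass:
                # Shortest window starting at `lo`; keep it if it is the narrowest so far.
                if best is None or values[hi] - values[lo] < best[1] - best[0]:
                    best = (values[lo], values[hi])
                break
    return best

# Hypothetical posterior over a coarse grid of R_t values.
rt_grid = [0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
posterior = [0.01, 0.05, 0.20, 0.40, 0.25, 0.07, 0.02]

peak = most_likely(rt_grid, posterior)                    # 1.4
interval = highest_density_interval(rt_grid, posterior)   # (1.0, 1.8)
```

With this made-up posterior, the most likely value is 1.4 and the 95% highest-density interval runs from 1.0 to 1.8.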
What is the plot saying?
- The red dots represent the most likely value of $R_t$. The shaded gray area is the 95% highest-density interval; the true $R_t$ is likely to lie within the gray region.
- When $R_t>1$, the pandemic will spread exponentially through the entire population. If $R_t<1$, the pandemic will grow to some fixed number less than the population. The lower $R_t$, the more manageable the situation.
Data
Input: We use APIs from @amodm, which in turn use covid19india.org data.
$R_t$ curve data: The state-wise $R_t$ data is available as a CSV for easy download: https://github.com/NirantK/CovidSeer/blob/master/_data/rt.csv