I recently read an interesting pre-print on the transmission heterogeneity of COVID-19, which basically states that people infected with SARS-CoV-2 transmit the infection with different efficiencies: some infected individuals seem to spread the disease more efficiently than others. This trend was also observed in the previous SARS outbreak, where the phenomenon of the so-called SARS superspreaders was widely known and described. To date, one thing that has kept niggling away at me is that most mathematical models have consistently over-estimated the number of cases likely to result from COVID-19. While part of this may be ascribed to the large iceberg effect, with a certain proportion of infected people being asymptomatic or presymptomatic and therefore never counted, I feel that this heterogeneity of infectiousness could also be responsible for the gap.
Another thing which could be an indirect inference from this concept of heterogeneity of transmission is that because there is a proportion of infected individuals who have a low probability of transmitting COVID-19, we may not have to wait for 60-70% of the populace to get infected before herd immunity is reached. This, again, is conjecture, but given that most cities where seropositivity rates have been looked at have tended to have seropositivities in the region of 15-30%, this could certainly be a plausible hypothesis which could be tested out using modeling approaches.
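For context, the 60-70% figure is what the classical herd immunity threshold gives when everyone is assumed to mix and transmit identically. The worked example below uses that textbook formula; the values of the basic reproduction number are illustrative choices that reproduce the 60-70% range, not estimates from the pre-print.

```latex
% Classical herd immunity threshold under homogeneous mixing
% (R_0 values below are illustrative, chosen to match the 60-70% range)
\[
  H = 1 - \frac{1}{R_0}
\]
\[
  R_0 = 2.5 \;\Rightarrow\; H = 1 - \frac{1}{2.5} = 0.60,
  \qquad
  R_0 = \tfrac{10}{3} \;\Rightarrow\; H = 1 - \frac{3}{10} = 0.70
\]
```

The formula has no term for person-to-person variation in infectiousness, which is exactly why the conjecture above would need a heterogeneous model to be tested.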
The modeling work I read was built on the contact tracing data collected in Punjab during the lockdown period. The authors highlighted four key findings.
First, they estimated that 75% of COVID-19 positive individuals do not transmit the infection to their contacts. One needs to remember that this is based on data generated using the commonly accepted method of detecting COVID-19 cases, RT-PCR, which is known to have a sensitivity in the region of 70-75% and can thus miss cases.
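To get a feel for how much test sensitivity could matter here, the sketch below computes the chance that an index case who did infect some contacts would still look like a non-transmitter because every infected contact tested negative. It assumes one independent test per contact, which is my simplification and not something described in the pre-print; the function name is hypothetical.

```python
def prob_all_infected_contacts_missed(n_infected_contacts: int, sensitivity: float) -> float:
    """Chance that every truly infected contact tests negative.

    Assumes one test per contact and independent test results
    (an illustrative simplification, not the pre-print's method).
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie between 0 and 1")
    if n_infected_contacts < 0:
        raise ValueError("n_infected_contacts must be non-negative")
    # Each infected contact is missed with probability (1 - sensitivity).
    return (1.0 - sensitivity) ** n_infected_contacts


if __name__ == "__main__":
    # Sensitivity of 0.70 is the lower end of the range quoted in the text.
    for k in (1, 2, 3):
        p = prob_all_infected_contacts_missed(k, 0.70)
        print(f"{k} infected contact(s): P(all missed) = {p:.3f}")
```

Under these assumptions, misclassification matters most for index cases with a single infected contact and shrinks quickly as the number of infected contacts grows.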
Second, they defined a metric called per contact infectiousness (PCI), which is a measure of how efficiently people can spread COVID-19 once they become infected. Despite the wide variation they noted in the infectiousness measures, the secondary cases were mainly driven by individuals who had a high PCI and a large number of contacts. This, of course, makes perfect intuitive sense. However, the authors do concede that COVID-19 affected people "who have high numbers of contacts, but not high PCI, and vice-versa also explain a large number of cases". Without going into the details of this, it seems that both the number of contacts and the per-contact transmissibility contribute, to varying degrees, to onward spread, which argues in favor of maintaining measures for physical distancing and reducing crowding in public places.
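The interplay between PCI and number of contacts can be sketched with simple arithmetic. The example below treats PCI as the probability of infecting any one contact, so the expected number of secondary cases is PCI times contacts. That reading of PCI, the function name, and all the numbers are my own illustrative assumptions, not values from the pre-print.

```python
def expected_secondary_cases(pci: float, n_contacts: int) -> float:
    """Expected secondary cases if each contact is infected independently
    with probability `pci` (an assumed interpretation of PCI)."""
    if not 0.0 <= pci <= 1.0:
        raise ValueError("pci must lie between 0 and 1")
    if n_contacts < 0:
        raise ValueError("n_contacts must be non-negative")
    return pci * n_contacts


# Hypothetical index-case profiles: (PCI, number of contacts).
profiles = {
    "high PCI, many contacts": (0.20, 30),
    "high PCI, few contacts": (0.20, 5),
    "low PCI, many contacts": (0.03, 30),
    "low PCI, few contacts": (0.03, 5),
}

if __name__ == "__main__":
    for label, (pci, contacts) in profiles.items():
        cases = expected_secondary_cases(pci, contacts)
        print(f"{label}: {cases:.2f} expected secondary cases")
```

With these made-up numbers, the two mixed profiles still produce a similar, non-trivial number of expected cases, which is consistent with the authors' concession quoted above.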
Third, the authors admit that there is little certainty, if any, in several of the results they computed, given that they had a small sample size.
Fourth, they present a very smart and interesting way to approach contact tracing - something our group in India had been discussing (albeit only with theoretical interest) for a bit. The authors state: "The insight is that when a large fraction of infected individuals do not infect further, a sample of contacts can be tested to ascertain the infectiousness of the sick patient and further contact tracing can be eliminated (or reduced) if that infectiousness is zero. The simulation results suggest that a simple two-step strategy of first testing family members and then testing other contacts only if at least one family member is found to be positive can substantially decrease the requirements while still finding most infections. Testing first only five contacts, for example, identified roughly 75% of infections but reduced costs by 2/3. Tracing fifteen contacts identified, on average, over 95% of infections and reduce the number of individuals traced by around 1/3."
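The two-step idea in the quote can be explored with a toy simulation. The sketch below is not the authors' simulation: it assumes a perfect test, a fixed number of family and other contacts per index case, and a single PCI value for everyone who transmits at all. Only the 75% share of non-transmitters comes from the text; every other parameter and name is hypothetical.

```python
import random


def simulate_two_step_tracing(
    n_index_cases: int = 10_000,
    n_family: int = 4,             # hypothetical family contacts per index case
    n_other: int = 10,             # hypothetical non-family contacts per index case
    share_non_transmitters: float = 0.75,  # estimate quoted in the text
    pci_transmitter: float = 0.30,         # hypothetical PCI for transmitters
    seed: int = 0,
) -> dict:
    """Compare testing all contacts with a family-first, two-step strategy.

    Assumes a perfect test and independent infection of each contact.
    """
    rng = random.Random(seed)
    full_tests = full_found = 0
    two_step_tests = two_step_found = 0

    for _ in range(n_index_cases):
        # Non-transmitters have zero per-contact infectiousness.
        pci = 0.0 if rng.random() < share_non_transmitters else pci_transmitter
        family = [rng.random() < pci for _ in range(n_family)]
        others = [rng.random() < pci for _ in range(n_other)]

        # Strategy A: test every contact.
        full_tests += n_family + n_other
        full_found += sum(family) + sum(others)

        # Strategy B: test family first, widen only if a family member is positive.
        two_step_tests += n_family
        two_step_found += sum(family)
        if any(family):
            two_step_tests += n_other
            two_step_found += sum(others)

    return {
        "full_tests": full_tests,
        "full_found": full_found,
        "two_step_tests": two_step_tests,
        "two_step_found": two_step_found,
    }


if __name__ == "__main__":
    result = simulate_two_step_tracing()
    print(result)
    print("share of tests used:", result["two_step_tests"] / result["full_tests"])
    print("share of infections found:", result["two_step_found"] / result["full_found"])
```

The savings in this toy model come almost entirely from the large share of index cases with zero infectiousness; changing `share_non_transmitters` or `pci_transmitter` shows how sensitive the trade-off is to those assumptions.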
The authors have been very careful in their discourse around why this heterogeneity of transmission is observed. They find an association between PCI and gender, but not age. While it is interesting to note that more and more discussions are happening around the need to test infants and neonates born to, or in contact with COVID-19 positive mothers/parents, this analysis does not find any relationship with age. The gendered association could well be reflective of the usual sociocultural conditions prevalent in the part of India from where the data was culled.
Since this paper is still at the pre-publication stage and has not been openly distributed, I have not discussed much of the data or the other implications which the authors have highlighted. I decided to write about it since I found it very interesting that these trends are emerging. This is also, interestingly, in line with some previous work which looked at Cycle Threshold (CT) values and the secondary attack rate of COVID-19. As expected, individuals with higher CT values were seen to infect a smaller proportion of their secondary and tertiary contacts than those with lower CT values. This is ostensibly linked to the higher viral loads expected in individuals with lower CT values. However, there are multiple confounders, such as intensity of contact tracing, extent and appropriateness of PPE use, and risk profiles of secondary contacts, which need to be matched between the two groups.
Hopefully, once the paper is out, I can revisit this issue. For now, I just find it interesting that the conversations and conjectures we were having about COVID-19 until a few weeks ago are coming out in scientifically rigorous models!