A Request For More Detailed Information About The Spread Of The Coronavirus Epidemic

In this brief, Charles Simkins argues for the release of more detailed information about the coronavirus epidemic.

Introduction

We have been watching the updates on COVID-19 released by the Minister of Health since 17 March 2020. Apart from provincial breakdowns of infections, the data contained in them can be reported in a single page (see the Annexure). We think that the level of reporting has become inadequate for an assessment of what we do and do not know about the extent and spread of the epidemic. Critical policy choices depend on such an assessment. This brief makes the case for a more extensive reporting system – a database which can be updated each time new information becomes available[1].

Does the public need to know?

But, it may be objected, does the public really need more information and is the cost of providing it justified? Can its assessment not be left to the National Command Council, advised by the COVID-19 ministerial advisory committee? Ought the risk of misinterpretation by an untutored public be avoided? The answers in a democracy are yes, no and no. Public policy must be able to withstand critical scrutiny, and scrutiny is not possible if information is hidden. Open debate should be capable of winnowing out poor judgement. And justification engenders consent, in ways the police and the army never can.

The interpretation of data

Drawing inferences from data depends on the methods of data collection. Three are relevant here.

1. Most of the data in the Annexure has been collected from self-chosen presentation of individuals to health care providers – general practitioners, private hospitals, government clinics, public hospitals. Individuals may present with symptoms sufficiently serious to require medical attention, or because they have been identified as at risk through contact tracing, or because they are anxious that they may have been exposed to infection. The decision to test will reflect norms of good practice and, particularly in the public sector, specific testing guidelines. The number of infections identified in this way will always be an underestimate of all infections. Some of the infected will not have symptoms severe enough for them to seek medical help. Some will find the health care system impossible to access. And good practice norms and testing guidelines do not mean that all those infected will be tested.

It would be useful to have information on tests conducted and test results for tests initiated in the private sector and the public sector separately. It may be that the proportions of positive tests differ between the sectors, and that these proportions are evolving differently. Since people obtaining services from the public sector are likely to be poorer than those using the private sector, the disaggregated data would be evidence about the impact of income on the progression of the epidemic. Moreover, the existence of differential proportions means that a changing mix of testing between the public and private sectors would have an impact on the discovery of infections.

2. Pro-active mass screening and testing of populations believed to be particularly vulnerable has begun. There is no information about the selection of sites where this testing is taking place or will take place. There is also no information about screening protocols, criteria for whether to refer screened persons to health providers, or mechanisms for maximizing the proportion of referred persons reporting to providers. We do not know what information will be reported by screeners and providers, or how this information will be assembled, processed and reported. Mass screening and testing are intended to identify infection hot spots, but we do not know how such areas will be identified and delineated. All these are programme design issues which will affect the interpretation of results from the programme.

3. In addition to the effort to find hot spots, information will be needed to determine which restrictions on economic activity should be lifted. How to do this is the subject of debate globally, but each country will have to find a strategy which suits its circumstances. Priority will have to be given to those activities most essential for economic and social functioning. The information base will be a combination of general epidemiological data, including information from possible new sentinel sites, and work place measures to ensure the safety of workers. Consideration will need to be given to whether and how information from work places should be reported.
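
The point made under the first method, that a changing mix of public and private testing affects the discovery of infections, can be made concrete with a small calculation. The Python sketch below uses purely hypothetical positivity rates (5% for public sector tests and 2% for private sector tests), since sector-level results have not been published. The function name and the rates are assumptions for illustration; only the arithmetic is the point.

```python
def overall_positivity(share_public, positivity_public, positivity_private):
    """Proportion of all tests that are positive, as a weighted average
    of the two sector-level proportions."""
    if not 0.0 <= share_public <= 1.0:
        raise ValueError("share_public must lie between 0 and 1")
    return (share_public * positivity_public
            + (1.0 - share_public) * positivity_private)


# HYPOTHETICAL sector-level positivity rates, for illustration only.
P_PUBLIC = 0.05
P_PRIVATE = 0.02

for share in (0.2, 0.4, 0.6):
    rate = overall_positivity(share, P_PUBLIC, P_PRIVATE)
    print(f"public share of tests {share:.0%}: overall positivity {rate:.2%}")
```

Under these assumed rates, the overall proportion of positive tests rises as the public share of testing grows, even though neither sector's own proportion changes. The aggregate series alone cannot distinguish this from a genuine rise in infections.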

A revised geographical grid

At present, information is being reported on a provincial basis. This grid will prove too coarse. Metros should be reported on separately and a way found to divide the rest of the country up, possibly according to district municipalities.

The table below indicates the distribution of reported infections across provinces on 15 April. The question arises: is this a reasonable reflection of the spread of the epidemic, or are large swathes of the country terra incognita? A more refined geographical grid would throw more light on the situation.

Provincial distribution of infections (15-Apr-20)

Province	Infections	Population (thousands)	Infections (per million)
Western Cape	657	6844	96
Gauteng	930	15176	61
KwaZulu-Natal	519	11289	46
Free State	97	2887	34
Eastern Cape	199	6712	30
Northern Cape	16	1264	13
North West	23	4027	6
Mpumalanga	22	4592	5
Limpopo	25	5983	4
Unallocated	18
Total	2506	58774	43
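
The per-million column can be reproduced from the two columns before it. The following is a minimal Python sketch using the figures in the table above; the function name `per_million` and the dictionary layout are illustrative choices, not part of any official reporting format.

```python
# Reported infections and population (thousands) by province, 15 April 2020,
# copied from the table above. The 18 unallocated infections are excluded here.
PROVINCES = {
    "Western Cape": (657, 6844),
    "Gauteng": (930, 15176),
    "KwaZulu-Natal": (519, 11289),
    "Free State": (97, 2887),
    "Eastern Cape": (199, 6712),
    "Northern Cape": (16, 1264),
    "North West": (23, 4027),
    "Mpumalanga": (22, 4592),
    "Limpopo": (25, 5983),
}


def per_million(infections, population_thousands):
    """Infections per million residents, rounded to the nearest whole number."""
    return round(infections / population_thousands * 1000)


rates = {name: per_million(i, p) for name, (i, p) in PROVINCES.items()}

# The national figure uses all 2506 infections, including the unallocated ones.
national_rate = per_million(2506, sum(p for _, p in PROVINCES.values()))

for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name:15s} {rate:4d}")
print(f"{'Total':15s} {national_rate:4d}")
```

The same function could be applied unchanged to metros or district municipalities if infections and population were released at that level.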

Conclusion: detective work, hypothesis testing and model confrontation

Formal statistical hypothesis testing can only be used on data from a properly constructed sample, in which design probabilities are known. However, the information we have, and may have in the short run, is not collected on this basis. So, for now, assessment must be based on detective work and relevant epidemiological understanding.

Models of the epidemic have been built round the world and in South Africa. There are limits to models: experience with COVID-19 is limited, models of country epidemics do not always say the same thing, and even short-term projections can prove way off the mark. Models are best used to bring data into a rational relationship, in the process exposing gaps in knowledge. They should be confronted with one another as part of a critical and constructive process to strengthen insights. In particular, the models being used by government should be exposed to scrutiny.

The government is almost certainly right that the worst of the epidemic is yet to come, and it is probable that the lockdown has flattened the curve somewhat. But the question on everyone’s mind is how the epidemic will evolve and how developments will shape the policy options we have. Data dissemination is essential for informed policy debate.

Charles Simkins
Head of Research
charles@hsf.org.za

Annexure

Publicly released COVID-19 data

Month	Day	Cumulative infections detected	Deaths (new)	Deaths (cumulative)	Recoveries	Tests (total)	Tests (private laboratories)	Tests (public laboratories)
March	17	85
	18	116
	19	150
	20	202
	21	240
	22	274
	23	402				12815	10803	2012
	24	554
	25	709
	26	927
	27	1170				28537
	28	1187
	29	1280	1	2
	30	1307
	31	1353	2	5
April	1	1380
	2	1462		5
	3	1505	2	7
	4	1585		9		53937
	5	1655	2	11		56783
	6	1686	1	12		58098
	7	1749
	8	1845				63776
	9	1934
	10	2003		24	410	73028
	11	2028	1	25		75053
	12	2173				80085	*	*
	13	2272
	14	2415				87022
	15	2506	7			90515

* On 12 April, it was reported that, of 5032 tests conducted in the past day, 3192 were done in public laboratories.
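
The cumulative series above can be turned into daily figures by differencing. A minimal Python sketch follows, using only values from the table and the note above; the variable and function names are illustrative.

```python
# Cumulative infections detected, 10 to 15 April 2020 (from the table above).
cumulative_infections = {10: 2003, 11: 2028, 12: 2173, 13: 2272, 14: 2415, 15: 2506}


def daily_increments(cumulative):
    """Difference a cumulative series keyed by consecutive days."""
    days = sorted(cumulative)
    return {day: cumulative[day] - cumulative[prev]
            for prev, day in zip(days, days[1:])}


new_infections = daily_increments(cumulative_infections)

# Cumulative tests are reported for both 11 and 12 April,
# so the number of tests in the day to 12 April follows by subtraction.
tests_on_12_april = 80085 - 75053

# Public laboratories' share of tests: cumulative to 23 March,
# and for the single day to 12 April.
public_share_23_march = 2012 / 12815
public_share_12_april = 3192 / tests_on_12_april

print(new_infections)
print(tests_on_12_april)
print(f"{public_share_23_march:.1%} {public_share_12_april:.1%}")
```

On these figures, public laboratories account for about 16% of all tests up to 23 March and about 63% of the tests reported for the day to 12 April. The two shares are not strictly comparable, since one is cumulative and the other covers a single day, but they suggest that the mix of testing has been shifting.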

[1] An example of such a database is the Johns Hopkins University COVID-19 dashboard, widely used to follow the spread of the epidemic across the world.