How Coherent Are The 2020 Production And Labour Market Data? I - The Basis For Assessment

The preceding briefs in this series have drawn on a number of sources to assess the evolution of the economy during 2020. These data sources have been compiled for different purposes and to a great extent independently of one another. Yet they overlap and their findings can be compared. The questions arise: How coherent are the data from these sources? What judgements can be made after all the sources have been considered? What uncertainties and puzzles remain?

Introduction

This brief is the tenth in a series discussing developments in production, the labour market and macroeconomic policy since the beginning of 2020[1]. The preceding briefs have drawn on a number of sources to assess the evolution of the economy during 2020: the national accounts, monthly production statistics, the Quarterly Labour Force Survey (“QLFS”), the Quarterly Employment Statistics (“QES”) and the National Income Dynamics Study’s Coronavirus Rapid Mobile Survey (“NIDS-CRAM”).

This brief will discuss the considerations relevant to assessing coherence. A companion brief will apply them to a discussion of convergence and divergence between data sources on production and employment in 2020.

Comparability issues

When comparisons are made between data sources, attention must be paid to three issues:

1. Differences in definitions. These can be found both in the spheres of production and employment.

Production. National accounts are built up on the basis of value added, which avoids double counting, and of stages of price formation. Enterprises buy inputs in the form of raw materials or intermediate goods produced by other enterprises and add value to them before selling them. The value added is appropriated in two ways: by employees in the form of remuneration and by owners in the form of gross operating surplus. Output and value added are different concepts, and making a comparison between them requires an assumption. The assumption made in this series of briefs is that value added is proportional to output in the short run, during which technological change is limited, so that intermediate inputs per unit of output do not change.
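
The proportionality assumption can be illustrated with a small Python sketch. The figures, the input share of 0.6 and the function names are hypothetical, chosen only for illustration and not drawn from any of the data sources. When intermediate inputs are a fixed share of output, the percentage change in value added equals the percentage change in output. When the share shifts, the two diverge.

```python
def value_added(output, input_share):
    """Value added when intermediate inputs are a given share of output."""
    return output * (1.0 - input_share)


def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Hypothetical figures, for illustration only.
output_q1, output_q2 = 200.0, 170.0

# Case 1: the input share is constant, as the assumption requires.
va_q1 = value_added(output_q1, 0.60)
va_q2_fixed = value_added(output_q2, 0.60)

# Case 2: the input share rises in the second period, violating the assumption.
va_q2_shifted = value_added(output_q2, 0.65)

output_change = pct_change(output_q2, output_q1)
va_change_fixed = pct_change(va_q2_fixed, va_q1)
va_change_shifted = pct_change(va_q2_shifted, va_q1)

print(f"Output change:                   {output_change:.1f}%")
print(f"Value added change (fixed share): {va_change_fixed:.1f}%")
print(f"Value added change (share shift): {va_change_shifted:.1f}%")
```

In the first case output and value added both fall by 15 per cent. In the second, value added falls by about 25.6 per cent, so an output index would understate the decline in value added.
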
Employment. The definition of employment varies across sources. In the QLFS, a person is employed if they worked for an hour or more in a reference week (the week before the interview) or had a job or business from which they were temporarily absent. QLFS data are collected throughout the relevant quarter. In NIDS-CRAM, the reference period is a month, and data are collected for specific months. Only in the NIDS-CRAM data is there a recorded distinction between paid and unpaid temporary absence from work. By contrast, employed persons in the QES are those on the payroll at the end of the relevant quarter. Persons who are temporarily absent from their jobs and who are not being paid are not counted as employed in the QES.

Variations in the classification of the employed into those employed in the formal and informal sectors are even more complex. The QLFS defines informal employment as precarious employment, irrespective of whether the entity for which the person works is in the formal or informal sector. Persons in informal employment comprise all persons in the informal sector, employees in the formal sector, and persons working in private households who are not entitled to or receive basic benefits such as pension or medical aid contributions from their employer, and who do not have a written contract of employment. The informal sector has the following two components: employees working in establishments that employ fewer than five employees, who do not deduct income tax from their salaries/wages, and employers, own-account workers and persons helping unpaid in their household business who are not registered for either income tax or value-added tax. This definition of informal employment is elaborate, and classification depends on the answers to several questions.
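
The way the QLFS classification depends on several answers can be sketched as a decision rule in Python. This is a reading of the definition as worded above, not Statistics South Africa's own coding: the field names and the Worker record are hypothetical, and the treatment of private households as lying outside the informal sector is an assumption based on their being listed separately in the definition.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical record; field names are illustrative, not QLFS variable names.
    status: str  # "employee", "employer", "own_account" or "unpaid_household_helper"
    in_private_household: bool = False
    establishment_size: int = 0              # employees at the place of work
    income_tax_deducted: bool = False        # tax deducted from salary/wages
    registered_for_tax_or_vat: bool = False  # business registered for income tax or VAT
    has_pension_or_medical_aid: bool = False
    has_written_contract: bool = False


def in_informal_sector(w: Worker) -> bool:
    """Informal sector, following the two components described in the text."""
    if w.in_private_household:
        return False  # assumption: private households are a separate category
    if w.status == "employee":
        return w.establishment_size < 5 and not w.income_tax_deducted
    return not w.registered_for_tax_or_vat


def in_informal_employment(w: Worker) -> bool:
    """Informal employment: the informal sector plus unprotected employees elsewhere."""
    if in_informal_sector(w):
        return True
    if w.status == "employee":
        return not w.has_pension_or_medical_aid and not w.has_written_contract
    return False


# An employee of a large, tax-deducting firm with no contract and no benefits:
# outside the informal sector, but in informal employment.
casual = Worker("employee", establishment_size=200, income_tax_deducted=True)
print(in_informal_sector(casual), in_informal_employment(casual))
```

The example shows why informal employment is broader than informal sector employment: the same worker falls on different sides of the line depending on which of the two concepts is applied.
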

By contrast, one can make the formal/informal distinction in the NIDS-CRAM data only on the basis of the presence or absence of a written contract in the case of the employed, and on registration for VAT and/or income tax in the case of businesses. The QES is different again. Whether employment is counted depends entirely on VAT registration of the business worked for. While businesses must register for VAT if their expected annual turnover is more than R 1 million, they may do so if their turnover was more than R 50 000 in the preceding year. The QES sample includes only VAT-registered businesses with a turnover of at least R 300 000. Employment as measured by the QES is most closely related to formal sector employment outside agriculture, but some QES employment may be regarded as informal by the QLFS, and the QES may miss some formal employment.

2. Sampling error. Statisticians make a distinction between populations and samples. If one enumerates an entire population accurately, relevant characteristics are known precisely. If a population is sampled, one obtains estimates of relevant characteristics which approximate the values of the population characteristics. A different sample would produce a different estimate. Statistical theory can be used to assess the variation in estimates associated with sampling, and this variation is measured by standard errors. The general rule is that sampling errors decrease as sample size increases, though sampling error is also affected by survey design. Sample sizes vary considerably between sources of information. The NIDS-CRAM sample is the smallest. The QES and QLFS samples are considerably larger. Statistics South Africa publishes sampling errors of some of its estimates from both the QES and the QLFS.

Sampling error can show up in unexpected places. One does not generally associate it with the national accounts but, insofar as their estimates are based on sample surveys, sampling error is present. Monthly production statistics are also obtained from surveys.
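
The relationship between sample size, survey design and sampling error described above can be sketched in Python. The sketch uses the textbook standard error of a proportion under simple random sampling, scaled by a design effect. The sample sizes and the 40 per cent proportion are hypothetical round numbers and are not the actual sizes or estimates of NIDS-CRAM, the QES or the QLFS.

```python
import math


def standard_error_of_proportion(p, n, design_effect=1.0):
    """Standard error of an estimated proportion p from a sample of size n.

    design_effect inflates the simple-random-sampling variance to allow
    for clustering and other features of a complex survey design.
    """
    return math.sqrt(design_effect * p * (1.0 - p) / n)


p = 0.40  # hypothetical estimated proportion, for example an employment rate

for n in (1_000, 4_000, 16_000):  # hypothetical sample sizes
    se = standard_error_of_proportion(p, n)
    print(f"n = {n:>6}: standard error = {100 * se:.2f} percentage points")

# A clustered design with a design effect of 2 (an assumed value) at n = 4 000.
se_clustered = standard_error_of_proportion(p, 4_000, design_effect=2.0)
print(f"n = 4 000, design effect 2: {100 * se_clustered:.2f} percentage points")
```

Quadrupling the sample size halves the standard error, while a design effect above one raises it for any given sample size.
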

3. Non-sampling error. This is the hardest to assess but crucial, because it may introduce bias into results. Sample design generally avoids bias. That is to say that, although estimates have a degree of uncertainty, the upside and downside risks of deviations from population values are the same. Bias means that estimates are systematically higher or lower than population values. Three factors may introduce bias. The first is a low response rate, the response rate being the proportion of selected units which actually provide information. Bias enters the picture when responsiveness is correlated with the characteristics being measured. The NIDS-CRAM Survey had the lowest response rate. There was also a marked drop in the response rate in the QLFS in the second quarter and a much smaller one in the QES. The second is a poorly designed questionnaire, especially when it comes to sensitive questions, leading to concealment of the truth by respondents. The third is poor quality field work, which may arise from poor training of enumerators or inadequate control, with the result that enumerators fail to find the respondent units identified by the sample design, or simply fabricate information instead of gathering it.

Moreover, missing data blur the picture, since one cannot be sure about the extent to which gaps are correlated with characteristics of interest.
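
The first source of bias described above, a response rate that is correlated with the characteristic being measured, can be made concrete with a short Python calculation. The employment share and the response rates are hypothetical, and the sketch ignores any reweighting for non-response that a survey might apply.

```python
def respondent_share(true_share, response_rate_with, response_rate_without):
    """Share of respondents who have a characteristic, before any reweighting.

    true_share: population share with the characteristic (e.g. employed)
    response_rate_with: response rate among those with the characteristic
    response_rate_without: response rate among those without it
    """
    responding_with = true_share * response_rate_with
    responding_without = (1.0 - true_share) * response_rate_without
    return responding_with / (responding_with + responding_without)


true_employment_share = 0.40  # hypothetical population value

# Same response rate in both groups: low response, but no bias.
unbiased = respondent_share(true_employment_share, 0.60, 0.60)

# The employed respond less often than others: the estimate is biased downward.
biased = respondent_share(true_employment_share, 0.50, 0.70)

print(f"Equal response rates:   {100 * unbiased:.1f}% employed among respondents")
print(f"Unequal response rates: {100 * biased:.1f}% employed among respondents")
```

With equal response rates the respondents reproduce the population share of 40 per cent. With unequal rates the share among respondents falls to about 32 per cent, although the overall response rate is much the same.
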

Conclusion

Circumspection in making comparisons is called for. And, at the end of the day, judgement cannot be based on statistical technicalities alone. The companion brief will bring the various considerations to bear on 2020 production and employment measurement issues.

Charles Simkins
Head of Research
charles@hsf.org.za

[1] The nine preceding briefs are Charles Simkins, (1) Decision making in a time of uncertainty, 11 June, (2) The Adjustment Budget and beyond, 30 June, (3) Has the Supplementary Budget betrayed the promise of a R 500 billion stimulus package? 15 July, (4) Austerity and a permanent income shock, 15 July, (5) The implications of the second quarter Gross Domestic Product data, 11 September, (6) (with Charles Collocott) July production statistics: an indication of a V-shaped recovery? 28 September, (7) The April to June Quarterly Labour Force Survey: a cautionary note, 30 September, (8) The National Income Dynamics Study’s Coronavirus Rapid Mobile Survey: the labour market in the first and second quarters of 2020, 14 October, and (9) August production estimates and April to June Quarterly Employment Statistics.