Challenges in Constructing Large-Scale Economic Microdata: The Historical Income Panel of the Netherlands, 1850-1920

Auke Rijpma, Utrecht University
Eva van der Heijden, Utrecht University
Rick Schouten, Utrecht University
Paul Puschmann, Radboud University Nijmegen

This paper discusses the Historical Income Panel for the Netherlands (HIP-NL), a project that aims to create an income panel for the Netherlands for the period 1850-1920. Using municipal tax records, we estimate incomes for hundreds of thousands of individuals from a sample of 10% of Dutch municipalities and trace them over time. Moreover, the project aims to link these income estimates to individual records in databases of civil and population registries. In doing so, the project will create a valuable resource for scholars working on topics such as living standards, inequality, and intergenerational mobility. Besides presenting the general aims of the project, this paper focuses on the infrastructural challenges that HIP-NL faces. We discuss the modern workflow from archive to database, including efficient optical character recognition (OCR) for printed versions of the tax registers and handwritten text recognition (HTR) as a support tool for handwritten records. Next, we discuss two challenges that arise from the heterogeneous nature of the source material. First, because every municipality had its own income tax system, harmonising the taxes and income estimates is crucial but difficult. Second, we discuss automated record linkage in this setting, where the features available for linkage vary strongly from one subset of the data to the next.

Presented in Session 144: New Historical Data Infrastructure I