Current historical studies of career mobility often focus on the linkage of personal records, such as baptism records. More qualitative sources, such as biographies, contain vital information as well, but are labour-intensive to process. We propose a combination of Robust Semantic Parsing and Linked Data conversion tools to automatically derive career patterns from 35,000 biographies in the Biography Portal covering the period 1815-1940. Substantively, we ask what career patterns looked like and how they changed over the long nineteenth century. Methodologically, we evaluate to what extent current CLARIAH tools are capable of automating this process. We will advance the semantic parsing tools by improving the set of linguistic expressions related to HISCO, adding an OCR cleaning step to the pipeline, and experimenting with alternative CLARIAH tools for Dutch. This will result in a detailed report on the performance of CLARIAH tools on this data.
Update 2020-03-20: The code for the simple tagger tool is available via: https://github.com/cltl/SimpleTagger
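To make the pipeline described above more concrete, the following is a minimal sketch of the general idea: clean OCR artefacts, look up occupation terms in a lexicon, and attach the first year mentioned in the same sentence to form a dated career sequence. It is not the SimpleTagger code and not the project's actual method. The lexicon, the placeholder labels (which are not real HISCO codes), the function names and the example biography are all assumptions made for illustration.

```python
import re

# Hypothetical mini-lexicon: Dutch occupation terms mapped to placeholder labels.
# These labels are illustrative only and are NOT real HISCO codes.
LEXICON = {
    "onderwijzer": "OCC_TEACHER",
    "predikant": "OCC_MINISTER",
    "burgemeester": "OCC_MAYOR",
}

# Four-digit years between 1600 and 1999.
YEAR = re.compile(r"\b(1[6-9]\d{2})\b")


def clean_ocr(text):
    """Rejoin words hyphenated across line breaks, then collapse whitespace."""
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    return re.sub(r"\s+", " ", text).strip()


def career_events(text, start=1815, end=1940):
    """Return sorted (year, term, label) tuples found in a biography text.

    A term is paired with the first in-range year in the same sentence;
    sentences without such a year are skipped.
    """
    events = []
    # Naive sentence split on whitespace following a full stop or semicolon.
    for sentence in re.split(r"(?<=[.;])\s+", clean_ocr(text)):
        years = [int(y) for y in YEAR.findall(sentence) if start <= int(y) <= end]
        if not years:
            continue
        lowered = sentence.lower()
        for term, label in LEXICON.items():
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                events.append((years[0], term, label))
    return sorted(events)


# Invented example biography, including an OCR-style hyphenated line break.
bio = (
    "Hij werd in 1842 onder-\nwijzer te Leiden. "
    "In 1850 werd hij predikant te Delft; in 1861 burgemeester van Gouda."
)
for event in career_events(bio):
    print(event)
```

A lexicon lookup of this kind misses paraphrases, spelling variants and negations, which is one reason a semantic parsing step and a richer expression set are of interest for this task.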