Aurelius Noble, organiser
The seminars offered a basic introduction to the application of data science methods to economic history. In particular, they focussed on automated data collection, transcription, and labelling. Advances in this field have made it possible for researchers to rapidly transcribe and annotate millions of documents. The seminars provided a broad introduction to the field: web-scraping, automated transcription using machine learning, and natural language processing. The main focus was on transcription, namely how to use computer vision to transcribe a variety of historical documents, from printed directories to tables to handwritten documents. The seminars gave a theoretical overview of the state of the field, central concepts, pipelines, and tools. They also incorporated a brief workshop demonstrating some basic implementations of these tools in Python.
The sessions were as follows:
• Transkribus, with Sara Mansutti (Transkribus), 5th March (Tuesday), 12:00-13:30, Online
• Automatic Transcription in Python I, Printed Documents, with Aurelius Noble (LSE), 8th March (Friday), 14:00-17:00
• Automatic Transcription in Python II, Handwritten Documents, with Aurelius Noble (LSE), 10th May (Friday), 14:00-17:00