│By Sarah L Ketchley, Senior Digital Humanities Specialist│
An integral part of the workflow of any digital humanities project involving text generated automatically by Optical Character Recognition (OCR) is the correction of so-called OCR errors. The process is also called ‘data cleaning’. This post will explore some of the considerations researchers should be aware of before starting to clean their data in Gale Digital Scholar Lab, using the built-in text cleaning tool. It will also offer additional resources for working with data in other formats outside of the Lab.