Text extraction in business
Think of a child: they come into the world not knowing how to speak or read, but give the child enough examples of how an ‘a’ looks, and voilà!
Long before computers or the internet, doctors were treating patients, students were taking notes in class, lovers were exchanging letters, and authors were publishing their books. You get the drift. The art of writing has been around for centuries. Digitization came with the extra burden of storing handwritten scans as images, or typing out every word of a manuscript to convert it to a softcopy. Then optical character recognition (OCR) came along, but it was still not perfect.
Access to handwritten text data and cheap processing power has since made it possible to train models that ‘learn’ to ‘read’ text much as a human being does. The concept of learning comes from how the human brain works. Think of a child: they come into the world not knowing how to speak or read, but give the child enough examples of how an ‘a’ looks, and identifying the letter becomes second nature. It’s hard not to read text when you see it. Your brain reads any text it can identify almost subconsciously.
Text extraction using AI models that work on handwritten text or print scans saves the time needed to convert manuscripts to softcopies. However, the process faces a few challenges:
- Some handwriting is hard for both humans and machines to read. Doctors are known for illegible handwriting. Even after running a model over their notes, you would still need a human to correct the extracted text, or a specialized model that can autocorrect it according to medical context.
- Lighting can be the difference between 99% and 40% accuracy. Photos taken in poor lighting conditions lead to poor text extraction.
- Folds and creases on the hardcopy can affect the legibility of words. If a fold falls in a crucial place, the extracted word may not make sense, hence the need for semantic checks.
- Speaking of semantics, the Shakespearean era used a lot of ‘thou’, ‘thy’, and ‘thee’. Current general-purpose models are not trained on much Shakespearean writing, so even with semantic checks in place, autocorrecting manuscripts from this era would still need some form of human intervention.
Regardless of these challenges, the benefits of this technology outweigh the drawbacks:
- Teachers can automate the marking of exams by combining text extraction with models that understand language.
- The doctor can digitize all their patient records and even search them by keyword. For example, they can search for all the patients whose records show a headache as a symptom.
- Ancient writings can be preserved digitally in a format anyone can read.
- The student can now turn a classmate’s notes into a neat PDF without struggling to read them or having to battle lighting conditions.
- Agreements over land and other assets now only require a signature.
- Having a softcopy also opens the door to other uses of AI, such as text-to-speech and translation. The grandmother, who cannot read but speaks Spanish, can now listen to a play by Shakespeare in Spanish.
These are just a few of the interesting use cases.
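To make the doctor's example a little more concrete, here is a minimal Python sketch rather than a real implementation. It assumes the handwriting has already been extracted to plain text, and it uses the standard library's `difflib` as a very simple stand-in for the context-aware autocorrect mentioned among the challenges. The vocabulary, the patient records, and the function names `autocorrect` and `search_records` are all made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical domain vocabulary; a real system would need a far larger medical lexicon.
MEDICAL_TERMS = ["headache", "fever", "nausea", "cough", "fatigue"]


def autocorrect(text, vocabulary=MEDICAL_TERMS, cutoff=0.8):
    """Replace each word with its closest vocabulary term, if one is similar enough.

    Words with no sufficiently close term are left unchanged. Punctuation is
    not handled, to keep the sketch short.
    """
    corrected = []
    for word in text.split():
        match = get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


def search_records(records, keyword):
    """Return the names of patients whose corrected notes contain the keyword."""
    keyword = keyword.lower()
    return [
        name
        for name, notes in records.items()
        if keyword in autocorrect(notes).lower().split()
    ]


# Made-up extraction output containing typical misread characters.
records = {
    "patient_a": "severe hcadache and fever",
    "patient_b": "dry cough for two weeks",
    "patient_c": "mild heodache after exercise",
}

print(autocorrect(records["patient_a"]))    # severe headache and fever
print(search_records(records, "headache"))  # ['patient_a', 'patient_c']
```

The similarity cutoff is a judgment call: set it too low and ordinary words start being "corrected" into medical terms, which is one reason a human check is still useful.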