Handwritten Text Recognition

Many everyday documents are handwritten: postal addresses, bank cheques, medical prescriptions, a large number of historical documents, much of the information gathered through forms, and so on. In many cases it would be useful to have these documents in digital form rather than on paper, since this opens up new ways of indexing, consulting and working with them.

Handwritten text recognition (HTR) can be defined as the ability of a computer to transform handwritten input, represented in its spatial form as graphical marks, into an equivalent symbolic representation such as ASCII text. This handwritten input usually comes from sources such as paper documents, photographs, electronic pens or touch-screens.

HTR should not be confused with OCR (Optical Character Recognition), because in HTR it is generally impossible to reliably isolate the characters, or even the words, that make up a handwritten text. HTR is a very difficult task, especially for historical documents. To some extent it is comparable to recognizing continuous speech in a significantly degraded audio file. In fact, the technology now prevalent for HTR borrows concepts and methods from the field of Automatic Speech Recognition.
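
To make the idea of recognition without prior character isolation more concrete, here is a minimal sketch. The text above does not name a specific method, so the choice is an assumption: the sketch uses a hidden Markov model with Viterbi decoding, one classic technique from speech recognition. The two-character alphabet, the frame symbols "x" and "y", all the probabilities, and the function names viterbi and collapse are hypothetical and purely illustrative. Real systems work on image features and much larger models.

```python
import math

def viterbi(frames, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a sequence of frame symbols."""
    # best[t][s]: best log-probability of any path that ends in state s at frame t
    best = [{s: log_start[s] + log_emit[s][frames[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at frame t + 1
    for obs in frames[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def collapse(path):
    """Merge runs of the same state into one character (toy rule: no doubled letters)."""
    out = []
    for s in path:
        if not out or out[-1] != s:
            out.append(s)
    return "".join(out)

def logs(table):
    """Convert a dict of probabilities into a dict of log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: two characters, two possible frame symbols.
STATES = ["a", "b"]
LOG_START = logs({"a": 0.5, "b": 0.5})
LOG_TRANS = {"a": logs({"a": 0.7, "b": 0.3}), "b": logs({"a": 0.3, "b": 0.7})}
LOG_EMIT = {"a": logs({"x": 0.9, "y": 0.1}), "b": logs({"x": 0.1, "y": 0.9})}

# Frames scanned left to right across a text line; no character boundaries are given.
frames = ["x", "x", "x", "y", "y"]
path = viterbi(frames, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
text = collapse(path)
```

The point of the sketch is that the boundary between the two characters is never supplied as input. It falls out of the decoding, which considers all possible alignments of frames to characters and keeps the most probable one.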