Most OCR engines will handle this situation quite well. In fact, OCR engines are less likely to get confused when there is only one font to recognise on a page. If an OCR engine can read your font in the first place, I would just use it and not worry about choosing the font.

There are better options to set if you want to improve recognition. Many OCR engines let you set recognition parameters such as fixed-width or proportional, serif or sans-serif, and machine print or hand print. You can also restrict recognition to a subset of characters, such as uppercase only or numeric only, which can improve results considerably. For example, if you only have numeric characters, then the 0 (zero) character can never be confused with an 'O', 'o' or 'Ø'. You will find these hints more effective than being able to choose the exact font to OCR.

Other engines will allow you to train them to deal with new fonts, and this will help considerably if you have a strange font.

If your image quality is good and your fonts are clean and of a decent size, then I would recommend using Tesseract OCR from Google, or OCRopus as suggested by Michael Mior. Tesseract is free and works well on clean and clear text. If the text is a little difficult, then there are definitely better OCR engines out there, such as ABBYY, Prime Recognition, OmniPage and many others, although they will cost money.
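
As a rough illustration of the character-subset hint, here is a minimal Python sketch that builds a Tesseract command line restricted to digits. It assumes the `tesseract` executable is installed and on your PATH. The helper names `build_tesseract_command` and `ocr_digits` and the file name `meter.png` are made up for this example. The `tessedit_char_whitelist` variable and the `--psm` option are Tesseract settings, but how well the whitelist is honoured varies between Tesseract versions and engine modes, so check it against your own installation.

```python
import shutil
import subprocess

DIGITS = "0123456789"


def build_tesseract_command(image_path, whitelist=DIGITS, psm=7):
    """Build a Tesseract CLI call limited to the characters in `whitelist`.

    psm=7 asks Tesseract to treat the image as a single line of text,
    which is an assumption about the input, not a requirement.
    """
    return [
        "tesseract", image_path, "stdout",
        "--psm", str(psm),
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def ocr_digits(image_path):
    """Run Tesseract on `image_path` and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only print the command here; call ocr_digits("meter.png") to run it.
    print(" ".join(build_tesseract_command("meter.png")))
```

With only digits allowed, the engine has no 'O', 'o' or 'Ø' to choose from, which is the effect described above. Passing a different `whitelist`, such as the uppercase letters, gives the uppercase-only variant.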