FreeOCR.net: free optical character recognition program converts images to text in multiple languages

Have you ever faced a situation where you needed to obtain editable text out of an image or a PDF file created from a scanned document? What you need in this case is “Optical Character Recognition” (OCR) software that will literally “read” the document and try to identify characters and words visually, and FreeOCR.net is just such a program.

FreeOCR.net performs optical character recognition on images or on PDF files created from scans. It can process PDF, TIF, BMP, JPG, and PNG files and provides an acquire function for running documents through a scanner. The simple user interface allows you to exclude non-text elements (such as images or tables), although this has to be done manually.

For documents with multiple pages, you have to process each page separately, although FreeOCR will “pool” the output into a single text. FreeOCR.net is based on the open source Tesseract OCR engine and comes with English support pre-installed, although many other languages can be downloaded and added (including non-Latin character based languages such as Japanese, Korean, Indonesian, etc.).
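For readers who script their own OCR, the page-by-page "pooling" described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how FreeOCR.net itself is implemented: the function name `pool_pages` and the `recognize` callback are hypothetical, and the callback stands in for whatever OCR call you actually use.

```python
from typing import Callable, Iterable, List


def pool_pages(
    pages: Iterable[str],
    recognize: Callable[[str], str],
    separator: str = "\n\n",
) -> str:
    """Run `recognize` on each page in order and join the results.

    `pages` is any sequence of page identifiers (for example image paths).
    `recognize` is a hypothetical callback that returns the text of one page.
    """
    texts: List[str] = []
    for page in pages:
        text = recognize(page).strip()
        if text:  # skip pages where nothing was recognized
            texts.append(text)
    return separator.join(texts)
```

With the Tesseract engine, one option would be to pass a small wrapper around the `pytesseract` package's `image_to_string` function as `recognize`, assuming that package and Tesseract are installed.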


This is an excellent basic OCR app that can get the job done. It works really well for use on the occasional document, or at least short documents. It is possible to process long documents (ebooks, etc), but in this case you would be better off with some of the more professional (and paid) apps that are out there.

PROS:

  • Powerful engine: produces excellent results in general, at least for English, which is the only language I tested. Note that scanning images at 200 dpi or higher is recommended.
  • Supported formats: processes PDF and most image filetypes (and will not restrict you to TIF as some others do).
  • Supports a wide range of languages: English comes pre-installed, but other languages can be installed separately (see here). Languages include French, Italian, German/Fraktur, Spanish, Dutch, Vietnamese, Bangla, Czech, Catalan, Polish, Lithuanian, Latvian, Bulgarian, Russian, Greek, Korean, Slovakian, Ukrainian, Japanese, Indonesian, Norwegian, Hungarian, Serbian, Turkish, Tagalog, Romanian, Chinese (traditional & simplified), and Swedish.
  • Simple interface: allows for selecting chunks of text to process, such as to circumvent pictures and other elements.
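If you are unsure whether an existing scan meets the 200 dpi recommendation mentioned above, the resolution can be estimated from the image's pixel dimensions and the physical size of the page. The following is a minimal sketch of that arithmetic; the function names are made up for illustration, and the 200 dpi default simply mirrors the recommendation in this review.

```python
def effective_dpi(
    pixel_width: int,
    pixel_height: int,
    page_width_in: float,
    page_height_in: float,
) -> float:
    """Estimate scan resolution as pixels per inch of the original page.

    The lower of the horizontal and vertical values is returned, since the
    weaker axis limits how well small characters are resolved.
    """
    if min(pixel_width, pixel_height) <= 0:
        raise ValueError("pixel dimensions must be positive")
    if min(page_width_in, page_height_in) <= 0:
        raise ValueError("page dimensions must be positive")
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def meets_recommendation(
    pixel_width: int,
    pixel_height: int,
    page_width_in: float,
    page_height_in: float,
    minimum_dpi: float = 200.0,  # taken from the recommendation in the review
) -> bool:
    """Return True if the scan reaches the recommended minimum resolution."""
    dpi = effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in)
    return dpi >= minimum_dpi
```

For example, a US Letter page (8.5 x 11 inches) scanned to 1700 x 2200 pixels works out to 200 dpi, while the same page at 1275 x 1650 pixels is only 150 dpi and would fall short.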

CONS:

  • Does not process pages in batch: it is designed to do one page at a time, which limits its usefulness for large documents.
  • No post-OCR processing: spellchecking, for example.
  • No user-assisted “learning”: such as that employed by some commercial OCR packages.

The verdict: an excellent free OCR solution. If you need to convert the occasional scanned document to editable text, this will do the job. If you need to process hundreds of pages, it can do the job in theory but is likely to be too labor intensive (much less labor intensive than re-typing though!).

Although I only tested English, the multi-language support is quite noteworthy. If you do use it for another language (esp. a non-Latin one), please post about your experience in the comments section. Thanks.

Version Tested: 3.0

Compatibility: Windows 2000, 2003, XP, Vista, Windows 7.

Go to the program home page to download the latest version (approx 156K).


 
 
 
Jan 4, 2011
Samer Kurdi
  • Somebody

    Indonesian? Were you maybe thinking of Thai? Fact checking publishing is so easy. No harm done other than a bit of credibility.

    • Samer

      @ Somebody: see the attached image below (from the Google Code page where you go to download language support). Whoever created the “Indonesian” support called it that. I am not a language expert nor do I claim to be. But I do admit I didn’t know they spoke Thai in Indonesia.

      http://www.freewaregenius.com/wp-content/uploads/2011/01/Indonesian.jpg

  • jfjb

    I’ve been using this program for several years with Xtreme satisfaction.
    For sure, I do not need fancy features, simple straight text rendering for my writing, research and various documentation from scans, web images or PDF files.
    It is a very handy application taking 3.05 MB disk space, and 35 MB in RAM — but minimizes to 1.1 MB and restores to 6-7 MB — it understands my French Spanish and German as well.
    In four words, I second your opinion.

    P.S. Hello Samer. Off the record: Bonne Année et Bonne Santé — as we say in French for Happy Year and Good Health — to you, your own personal family and the Says family.

    • Samer

      @ jfjb: happy new year (and happy health) to you too ;)

  • Kyle

    I don’t see a Korean language pack?

  • http://gnrsu.com G.N.R.S.U

    really cool use and easy to use !!!

  • Stanislavf

    Yes, “Somebody@” needs to pull his head out from the dark, damp, odiferous areas:

    Indonesian (Bahasa Indonesia) is the official language of Indonesia. Indonesian is a normative form of the Riau Islands dialect of Malay, an Austronesian language which has been used as a lingua franca in the Indonesian archipelago for centuries. (Wikipedia and other sources)

    Side note, I found a Czech language pack and it does not seem to work at all. Still trying to investigate what is the matter, but when I do an OCR with Czech selected, I get a blank page response. Same with “old german” … which makes me suspect I am doing something wrong.

    English works fine…

    I don’t mind the one page at a time, although as I am trying to translate some of my grandmother’s writing it is a bit of a pain.

  • matt

    I do not understand why i cannot use it – it is not reacting when proceeding with ocr – even with that sample text….

  • lin jie

    Hi
    How can I download the Simplified Chinese language pack? I’ve gone to http://www.freewaregenius.com/wp-content/uploads/, as someone above suggested, but I can’t get access.
    thanks

  • Jay

    Hi, I am looking for the Traditional Chinese language pack that the article says is supported, but I can’t seem to find it. Could anyone point me in the right direction?

    thx

  • pako

    “Languages currently available are:
    Portuguese(Brazilian), Fraktur(Old German), Dutch, Spanish, German, Italian, Vietnamese, French & English.”, not all you listed..

  • Mem

    be nice to have a ****ING link to the program instead of your crapppy adverts.

  • Cezary

    why do you brag about support for many languages if in fact you offer only a few? False advertising.

    • Samer Kurdi

      @ Cezary: this is a review of a free product. It is not an advertisement, we are not connected to the developers in any way, and nobody is trying to sell you anything. It seems that you are unable to set up support for the language you need; if I were you I would have asked nicely for help instead of pointing fingers, but I suggest you find what you are looking for somewhere else.

  • ran76

    The Japanese language pack doesn’t seem to exist. Any suggestions?

  • Charles

    Hey, you still around Sir?