Text Mining Tool

Submitted by Samer on October 6, 2007 – 9:41 pm | 14 Comments

Rating: 4 stars

Version tested: 1.1.42

Text Mining Tool is a free program that extracts text from PDF, DOC, RTF, CHM and HTML files without requiring any of the programs that normally open them, such as MS Word or a PDF reader.

If you ever need to extract text from any of the formats above, Text Mining Tool does so quickly and easily. Load the source file and the program extracts the text and displays it, after which you can either save it as a text file or copy it to the clipboard.

Of course, you can always use a free program such as OpenOffice to open and edit a number of these formats, and many PDF readers let you select text in a PDF file and copy it to the clipboard. However, if you would rather not install those programs, or open them and grab the text manually, Text Mining Tool does it all in a single click.

The only issue I have encountered with this program is that when I tried to extract text from a .CHM file it did not work, and I got an error message instead. I am assuming that this error will be fixed in future versions of this program. Here are some other notes:

  • Hotkeys: all functions can be performed using hotkeys (open/save/copy to clipboard).
  • Portable: no install necessary, simply extract and run.
  • CLI tool: a command-line utility called minetext is included.
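
For anyone who wants to script minetext over a whole folder, here is a minimal Python sketch. It rests on assumptions rather than documentation: that minetext.exe takes an input path followed by an output path (the form used in a command line quoted by a commenter below), and that it signals failure through a non-zero exit code. The install location and the function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Hypothetical install location; adjust to wherever you extracted the tool.
MINETEXT = Path(r"C:\tools\minetext\minetext.exe")

# The formats the review lists as supported.
SUPPORTED = {".pdf", ".doc", ".rtf", ".chm", ".html"}


def build_command(source: Path, out_dir: Path, exe: Path = MINETEXT) -> list[str]:
    """Return the argument list for converting one file to <name>.txt in out_dir.

    Assumption: minetext is invoked as `minetext.exe <input> <output>`.
    No other flags are assumed to exist.
    """
    target = out_dir / (source.stem + ".txt")
    return [str(exe), str(source), str(target)]


def convert_folder(in_dir: Path, out_dir: Path, exe: Path = MINETEXT) -> list[Path]:
    """Run minetext on every supported file in in_dir; return the files that failed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for source in sorted(in_dir.iterdir()):
        if not source.is_file() or source.suffix.lower() not in SUPPORTED:
            continue
        # Passing a list (not a shell string) avoids quoting problems with spaces.
        result = subprocess.run(build_command(source, out_dir, exe))
        if result.returncode != 0:  # assumed failure signal
            failed.append(source)
    return failed
```

Calling `convert_folder(Path("in"), Path("out"))` would then write one .txt file per supported document and return the list of files minetext could not handle, which makes it easy to spot problem files such as the .CHM case mentioned above.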

Text Mining Tool is a nice, simple tool that supports a good range of file types and is 100% free. If you need to extract text from files, you will really appreciate this one.

Compatibility: WinAll. Requires Microsoft .NET Framework 2.0.

Go to the program page to get the latest version (approx 8 megs).

14 Comments »

  • [...] Text Mining Tool Review & Download Link | freewaregenius.com (tags: freeware text tool) [...]

  • Jespard says:

    Where is the link to the download page? I can’t find this on the usual freeware sites (Snapfiles, Betanews, Download, etc.)

  • Jespard says:

    Never mind. If you want to download this, search the page for “Go to the program page”. (I always forget this!!!) It is kinda hard to see. Regardless, I love this site. Jespard

  • Joe says:

    It’s great that it has a CLI tool. It is one thing that truly makes it more useful.

  • [...] Where can you learn more? There is not much information on the program’s website, but you can read the FreewareGenius review. [...]

  • David says:

    I would like to add my own experience with the tool as a review.

    Pros:
    - works very well as pdf2text, doc2text utilities
    - gets text from RTF, CHM, HTML files
    - special console utility included for use from the command line interface
    - no installation is needed, just extract the tool and use it
    - hotkeys are really comfortable
    - 100% free

    Cons:
    - text extracted from HTML files loses some formatting (though JavaScript is stripped out)
    - .NET Framework 2.0 is required

  • Jeff Winterburn says:

    Console utility minetext.exe really impresses! It performs many conversions from PDF to text with one click (or key press :) ).

  • [...] Text: the text was generally handled well. In the ‘create text boxes’ scenario the converted document contained dozens of text boxes placed all around the document, which more faithfully approximated the layout and structure of the original PDF (and was more manageable when loaded into Word). Converting while the ‘text boxes’ option was unchecked, however, resulted in a single block of text that did not conform as well to the text layout of the original document. (Note: if it’s just simple text extraction that you want, you might want to try Text Mining Tool). [...]

  • Asus47 says:

    I just can’t convert a PDF into text. It tells me it can’t find the file. I’ve already checked that the PDF file exists. Any ideas? Here is my command line: E:\ltsaua0\Archivage\minetext\minetext.exe "E:\ltsaua0\Archivage\1.pdf" "E:\ltsaua0\Archivage\mine.txt"
    Thanks

  • os says:

    Downloaded the program twice, but I get a runtime error when I click on the exe; it doesn’t initialise. How do you use it then? Thanks.

  • Olaf says:

    Runs fine for me but really does not like my PDFs. Bombs with this error message:

    Unhandled Exception: System.Exception: Error during text extraction from Pdf-file: ..\rates.pdf
    Error getting pdf version: java.lang.NumberFormatException: For input string: "TYP"
    at TextMiningTool.Readers.PdfFileReader.GetText(String fileName)
    at TextMiningTool.FileMaster.GetText(String inputFile)
    at MineText.Program.Main(String[] args)

  • [...] get the latest version from product page click here [via freewaregenius] [...]
