Text Mining Tool

Text Mining Tool is a free program that extracts text from PDF, DOC, RTF, CHM and HTML files without requiring any other program that opens these formats, such as MS Word or a PDF reader, to be installed.

If you ever need to extract text from any of the above-mentioned file formats, you can use Text Mining Tool to do so quickly and easily. All you have to do is load the source file; the program extracts the text and displays it, after which you can either save it as a text file or copy it to the clipboard.

Of course, you can always use a free program such as OpenOffice to open and edit a number of these formats, and many PDF readers allow you to select text from the PDF file and copy it to the clipboard. However, if you don’t want to install such programs, or to open them and grab the text manually, Text Mining Tool can do it all for you in a single click.

The only issue I have encountered with this program is that when I tried to extract text from a .CHM file it did not work, and I got an error message instead. I am assuming that this error will be fixed in future versions of this program. Here are some other notes:

  • Hotkeys: all functions can be performed using hotkeys (open/save/copy to clipboard).
  • Portable: no install necessary, simply extract and run.
  • CLI tool: a command-line utility named minetext is included.
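
As a rough illustration of the command-line route, here is a minimal Python sketch that runs minetext over every supported file in a folder. It assumes that minetext.exe takes an input path followed by an output path (the form one reader uses in the comments on this review) and that a non-zero exit code means the extraction failed; neither is confirmed by the program's own documentation. The MINETEXT path and both function names are made up for this example.

```python
import subprocess
from pathlib import Path

# Hypothetical location of the extracted tool; adjust to your own folder.
MINETEXT = Path("minetext.exe")

# File types the review lists as supported.
SUPPORTED = {".pdf", ".doc", ".rtf", ".chm", ".html"}


def build_command(source: Path, exe: Path = MINETEXT) -> list:
    """Build the argument list for one file, writing a .txt next to it.

    Assumption: minetext is called as `minetext.exe <input> <output>`.
    """
    target = source.with_suffix(".txt")
    return [str(exe), str(source), str(target)]


def convert_folder(folder: Path, exe: Path = MINETEXT) -> list:
    """Run minetext on each supported file in folder; return the files that failed."""
    failed = []
    for source in sorted(folder.iterdir()):
        if source.suffix.lower() not in SUPPORTED:
            continue
        result = subprocess.run(build_command(source, exe),
                                capture_output=True, text=True)
        # Assumption: a non-zero exit code signals a failed extraction.
        if result.returncode != 0:
            failed.append(source)
    return failed
```

Calling `convert_folder(Path("some_folder"))` would then return the list of files that minetext could not process, which is handy given that some formats (CHM in my test) may fail.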

Text Mining Tool is a nice, simple tool that supports a good range of file types and is 100% free. If you need to extract text from files, you will really appreciate this one.

Version tested: 1.1.42

Compatibility: WinAll. Requires Microsoft .NET Framework 2.0.

Go to the program page to get the latest version (approx 8 megs).


 
 
 
Samer Kurdi

Has been reviewing software since 2006 when he started Freewaregenius.com
October 6, 2007
  • Jespard

    Where is the link to the download page? I can’t find this on the usual freeware sites (Snapfiles, Betanews, Download, etc.).

  • Jespard

    Never mind. If you want to download this, search the page for “Go to the program page”. (I always forget this!!!) It is kinda hard to see. Regardless, I love this site. Jespard

  • Joe

    It’s great that it has a CLI tool. It is one thing that truly makes it more useful.

  • David

    I would like to add the results of my own usage as a review.

    Pros:
    - works very well as a pdf2text or doc2text utility
    - gets text from RTF, CHM, HTML files
    - special console utility included for use from the command line interface
    - no installation is needed, just extract the tool and use it
    - hotkeys are really comfortable
    - 100% free

    Cons:
    - text extracted from HTML files loses some formatting (though JavaScript is stripped out of it)
    - .NET Framework 2.0 is required

  • Jeff Winterburn

    The console utility minetext.exe really impresses! It performs many conversions from PDF to text with one click (or key press :) ).

  • Asus47

    I just can’t convert a PDF into text. It tells me it can’t find the file. I’ve already checked that the PDF file exists. Any ideas? Here is my command line: E:\ltsaua0\Archivage\minetext\minetext.exe "E:\ltsaua0\Archivage\1.pdf" "E:\ltsaua0\Archivage\mine.txt"
    Thanks

  • os

    Downloaded the program twice, but I get a runtime error when I click on the exe; it doesn’t initialise. How do you use it, then? Thanks.

  • Olaf

    Runs fine for me but really does not like my PDFs. Bombs with this error message:

    Unhandled Exception: System.Exception: Error during text extraction from Pdf-file: ..\rates.pdf
    Error getting pdf version: java.lang.NumberFormatException: For input string: "TYP"
    at TextMiningTool.Readers.PdfFileReader.GetText(String fileName)
    at TextMiningTool.FileMaster.GetText(String inputFile)
    at MineText.Program.Main(String[] args)
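
    The trace above suggests the reader tripped while parsing the version number in the file's header. A PDF normally starts with a header such as %PDF-1.4, so one way to check whether a file is the likely culprit is to look at its first bytes. The Python sketch below is only a diagnostic aid under that assumption, not a description of how Text Mining Tool parses PDFs; the function names are invented for this example.

```python
import re

# A PDF header normally looks like "%PDF-1.4" near the very start of the file.
HEADER = re.compile(rb"%PDF-(\d+\.\d+)")


def pdf_version(first_bytes: bytes):
    """Return the version from a PDF header, or None if it is missing or malformed."""
    match = HEADER.search(first_bytes[:1024])
    return match.group(1).decode("ascii") if match else None


def read_pdf_version(path: str):
    """Read the first kilobyte of a file and report its PDF header version."""
    with open(path, "rb") as handle:
        return pdf_version(handle.read(1024))
```

    If `read_pdf_version` returns None for a file, its header is missing or non-numeric, which would be consistent with an error like the one above.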
