

Thursday, March 05, 2009

OCR Terminal Converts PDF and Images to Searchable Text

How many times have you stared at copy on a PDF file or screenshot and wished you could simply copy and paste the text?

It's always frustrating to type out all that material, word for word, only because there's no alternative.
Well, here's some good news:

With OCR Terminal, you can convert PDFs and the words in images to .txt or .rtf format. Free membership is available at the OCR Terminal homepage. Once you've completed the registration process, you may upload, scan, and convert up to 30 pages in popular document image formats: PDF, TIFF, JPG, and screenshots. If you need to convert more than 30 pages, you can contact OCR Terminal.

Currently, OCR Terminal converts only English text, but the developers are planning to add more languages. Other projects include a desktop client for uploading multiple files and storage for all OCR'd documents. Product updates are posted on the company blog and Twitter page.

I wanted to test the conversion tool earlier this week, but due to a mass influx of traffic, the site was down for maintenance. Fortunately, it was up and running this morning, and I uploaded a PDF file (a copy of my profile on LinkedIn). Converting the file was a simple, four-step process.

  1. Browse and select the PDF file or image you want to convert.
  2. Click Upload.
  3. Click Yes to begin processing.
  4. Choose and click the text format you wish to convert to: .txt, .doc, .rtf, or .pdf.

My LinkedIn profile converted smoothly to both Notepad and Microsoft Word, so I decided to try a JPG.

I chose our company logo, Labitat, to convert to Word. OCR Terminal successfully scanned and converted it to Word in 8-point Arial, without the color scheme. I wasn't sure if that was the default format, so I uploaded a 336x40 black-and-red MarketingShift logo. I was surprised when the LARGER of the two images converted to a SMALLER font size (Arial 5), and the color scheme remained intact.

Converting to .txt resulted in an inaccurate reading of the text. The "in" in "marketing" was interpreted as "M", so the .txt version read "IMARKETMG SHIFT." To be fair, OCR Terminal is still in beta, and those small errors are simple for the user to correct.

As OCR Terminal is tweaked and the news spreads, demand will continue to grow for this valuable service. I expect the tool to become a major asset for anyone in academia or business.

By Matt O'Hern at 11:28 AM | Comments (1)

(1) Thoughts on OCR Terminal Converts PDF and Images to Searchable Text

Would be nice to see the original document and the resulting text. You know, a before and after.

Comments by Jason : Friday, March 06, 2009 at 08:09 AM
