Tesseract OCR
Download & Install Tesseract
Visit the Tesseract at UB Mannheim page
Download the 64-bit installer, tesseract-ocr-w64-setup-v5.3.x.exe
Once downloaded, open the executable file and follow the installation prompts
Make sure the 64-bit version of Tesseract is installed in C:\Program Files\Tesseract-OCR
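To confirm the install location from a script, a minimal sketch like the following may help. It assumes the executable is named tesseract.exe inside the install directory; the helper name find_tesseract is hypothetical and not part of Tesseract.

```python
import shutil
from pathlib import Path
from typing import Optional

# Default install location named in the steps above.
DEFAULT_INSTALL_DIR = Path(r"C:\Program Files\Tesseract-OCR")


def find_tesseract(install_dir: Path = DEFAULT_INSTALL_DIR) -> Optional[Path]:
    """Return the path to the Tesseract executable, or None if not found.

    Hypothetical helper: assumes the executable is called tesseract.exe.
    """
    candidate = install_dir / "tesseract.exe"
    if candidate.is_file():
        return candidate
    # Fall back to whatever "tesseract" resolves to on PATH, if anything.
    on_path = shutil.which("tesseract")
    return Path(on_path) if on_path else None


if __name__ == "__main__":
    exe = find_tesseract()
    print(f"Found Tesseract at {exe}" if exe else "Tesseract not found")
```

If this reports that Tesseract was not found, re-run the installer and check the destination folder it offers.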
Trained Data Files (Languages)
Download the .traineddata file for the language you need and place it in the tessdata directory of your Tesseract installation: C:\Program Files\Tesseract-OCR\tessdata
(this should be the same tessdata directory that the installer created)
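To check which language files are already in place, something along these lines could be used. It is only a sketch: it assumes the default tessdata path given above, and the function name installed_languages is made up for illustration.

```python
from pathlib import Path
from typing import List

# Default tessdata location from the instructions above.
TESSDATA_DIR = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def installed_languages(tessdata_dir: Path = TESSDATA_DIR) -> List[str]:
    """Return the sorted language codes that have a .traineddata file.

    Hypothetical helper: it only looks at file names and does not
    verify that the files are valid models.
    """
    if not tessdata_dir.is_dir():
        return []
    return sorted(p.stem for p in tessdata_dir.glob("*.traineddata"))


if __name__ == "__main__":
    print(installed_languages())
```

Running tesseract --list-langs from a command prompt should report the same languages as seen by Tesseract itself.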
tessdata
https://github.com/tesseract-ocr/tessdata
Speed: Faster than tessdata-best
Accuracy: Slightly less accurate than tessdata-best

tessdata-best (Recommended for video games)
https://github.com/tesseract-ocr/tessdata_best
Speed: Slowest
Accuracy: Most accurate

tessdata-fast
https://github.com/tesseract-ocr/tessdata_fast
Speed: Fastest
Accuracy: Least accurate
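Fetching a language file from one of these repositories could be scripted roughly as follows. Treat it as a sketch under assumptions: the branch name "main" and the /raw/ URL layout are not stated in this guide and may differ, and the names traineddata_url and download_traineddata are hypothetical.

```python
import urllib.request
from pathlib import Path

# Repository names taken from the URLs listed above.
REPOS = {
    "standard": "tessdata",
    "best": "tessdata_best",
    "fast": "tessdata_fast",
}


def traineddata_url(lang: str, variant: str = "best", branch: str = "main") -> str:
    """Build a download URL for <lang>.traineddata.

    Assumption: files sit at the repository root on the given branch
    and are reachable through GitHub's /raw/ path.
    """
    repo = REPOS[variant]
    return f"https://github.com/tesseract-ocr/{repo}/raw/{branch}/{lang}.traineddata"


def download_traineddata(lang: str, dest_dir: str, variant: str = "best") -> Path:
    """Download the language file into dest_dir and return its path."""
    dest = Path(dest_dir) / f"{lang}.traineddata"
    urllib.request.urlretrieve(traineddata_url(lang, variant), dest)
    return dest
```

Writing directly into C:\Program Files usually requires administrator rights, so it may be easier to download to a user folder and copy the file into tessdata afterwards.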
Page Segmentation Modes
The page segmentation mode (PSM) tells Tesseract how to split an image into text regions before recognition. Choose the mode that best matches the layout of your image and the conditions in which it was captured
0   Orientation and script detection (OSD) only.
1   Automatic page segmentation with OSD.
2   Automatic page segmentation, but no OSD, or OCR. (not implemented)
3   Fully automatic page segmentation, but no OSD. (Default)
4   Assume a single column of text of variable sizes.
5   Assume a single uniform block of vertically aligned text.
6   Assume a single uniform block of text.
7   Treat the image as a single text line.
8   Treat the image as a single word.
9   Treat the image as a single word in a circle.
10  Treat the image as a single character.
11  Sparse text. Find as much text as possible in no particular order.
12  Sparse text with OSD.
13  Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
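On the command line the mode is passed with the --psm option, whose values run from 0 to 13. The sketch below shows one way to call Tesseract with a chosen mode from Python. It assumes tesseract is on PATH; build_command, run_ocr and the file name screenshot.png are placeholders for illustration.

```python
import subprocess
from typing import List


def build_command(image_path: str, psm: int, lang: str = "eng",
                  exe: str = "tesseract") -> List[str]:
    """Build a Tesseract command line that prints recognized text to stdout."""
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    # "stdout" as the output base makes Tesseract write text to standard output.
    return [exe, image_path, "stdout", "--psm", str(psm), "-l", lang]


def run_ocr(image_path: str, psm: int = 3, lang: str = "eng") -> str:
    """Run Tesseract on an image and return the recognized text."""
    result = subprocess.run(
        build_command(image_path, psm, lang),
        capture_output=True, encoding="utf-8", check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical input file; mode 7 treats the image as a single text line.
    print(build_command("screenshot.png", 7))
```

For short on-screen text such as a single label, modes 7 or 8 are often worth trying first, while mode 11 may suit text scattered across the image.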