# Tesseract OCR

{% embed url="<https://www.youtube.com/watch?v=5Uzl211TxjY>" %}

### Download & Install Tesseract

* Visit the [Tesseract at UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki) wiki page
* Download the 64-bit installer, **tesseract-ocr-w64-setup-v5.3.x.exe**
* Once the download finishes, run the installer and follow the installation prompts

{% hint style="info" %}
Make sure the 64-bit version of Tesseract is installed in `C:\Program Files\Tesseract-OCR`
{% endhint %}
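
To confirm the installation landed where the hint expects, a small script can look for the executable and ask it for its version. This is a minimal sketch, not part of VNTranslator: the helper names `find_tesseract` and `tesseract_version` are hypothetical, and it assumes the installer placed `tesseract.exe` directly inside the folder named above.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed location of the executable inside the install folder from the hint.
DEFAULT_TESSERACT = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def find_tesseract(candidates=(DEFAULT_TESSERACT,)):
    """Return the first candidate that exists, else whatever is on PATH, else None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return Path(candidate)
    on_path = shutil.which("tesseract")
    return Path(on_path) if on_path else None


def tesseract_version(executable):
    """Run `tesseract --version` and return the first line of its output."""
    result = subprocess.run(
        [str(executable), "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some builds print the version banner to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0]
```

Calling `find_tesseract()` with no arguments checks the default folder first; if it returns `None`, the installer most likely used a different directory.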

### Trained Data Files (Languages)

Download the `.traineddata` file for the language you need and place it in the `tessdata` folder of your Tesseract installation, which is `C:\Program Files\Tesseract-OCR\tessdata` for the default install location.

> **tessdata** <https://github.com/tesseract-ocr/tessdata> \
> Speed : Faster than tessdata-best \
> Accuracy : Slightly less accurate than tessdata-best

> **tessdata-best** `(Recommended for video games)` <https://github.com/tesseract-ocr/tessdata_best> \
> Speed : Slowest \
> Accuracy : Most accurate

> **tessdata-fast** <https://github.com/tesseract-ocr/tessdata_fast> \
> Speed : Fastest \
> Accuracy : Least accurate
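
After copying a `.traineddata` file, it can help to check which languages Tesseract will actually find. The sketch below simply lists the files in the `tessdata` folder; the function name `installed_languages` is hypothetical, and the path assumes the default install location.

```python
from pathlib import Path

# Assumed default tessdata folder; adjust if Tesseract is installed elsewhere.
TESSDATA_DIR = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def installed_languages(tessdata_dir=TESSDATA_DIR):
    """Return the sorted language codes that have a .traineddata file in the folder."""
    tessdata_dir = Path(tessdata_dir)
    if not tessdata_dir.is_dir():
        return []
    # "jpn.traineddata" -> "jpn"; other files in the folder are ignored.
    return sorted(p.stem for p in tessdata_dir.glob("*.traineddata"))
```

Tesseract can report the same information itself with `tesseract --list-langs`.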

### Page Segmentation Modes

The page segmentation mode (PSM) tells Tesseract how to split an image into text regions before recognition. Choose the mode that best matches the layout of the text in your image and the conditions under which it was captured.

<table><thead><tr><th width="69" align="center">PSM</th><th>Page segmentation modes</th></tr></thead><tbody><tr><td align="center">0</td><td>Orientation and script detection (OSD) only.</td></tr><tr><td align="center">1</td><td>Automatic page segmentation with OSD.</td></tr><tr><td align="center">2</td><td>Automatic page segmentation, but no OSD, or OCR. (not implemented)</td></tr><tr><td align="center">3</td><td>Fully automatic page segmentation, but no OSD. (Default)</td></tr><tr><td align="center">4</td><td>Assume a single column of text of variable sizes.</td></tr><tr><td align="center">5</td><td>Assume a single uniform block of vertically aligned text.</td></tr><tr><td align="center">6</td><td>Assume a single uniform block of text.</td></tr><tr><td align="center">7</td><td>Treat the image as a single text line.</td></tr><tr><td align="center">8</td><td>Treat the image as a single word.</td></tr><tr><td align="center">9</td><td>Treat the image as a single word in a circle.</td></tr><tr><td align="center">10</td><td>Treat the image as a single character.</td></tr><tr><td align="center">11</td><td>Sparse text. Find as much text as possible in no particular order.</td></tr><tr><td align="center">12</td><td>Sparse text with OSD.</td></tr><tr><td align="center">13</td><td>Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.</td></tr></tbody></table>
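
To compare modes on a captured image outside the application, Tesseract can be called directly. Its command line takes the mode as a number through `--psm`, counting from 0 (OSD only) to 13 (raw line), with 3 as the default. The following is a rough sketch under those assumptions: `build_command` and `run_ocr` are hypothetical helper names, `line.png` is a placeholder file name, and `tesseract` is assumed to be on `PATH`.

```python
import subprocess

PSM_MIN, PSM_MAX = 0, 13  # range accepted by Tesseract's --psm option


def build_command(image_path, lang="eng", psm=3, executable="tesseract"):
    """Build a Tesseract command line that prints the recognized text to stdout."""
    if not PSM_MIN <= psm <= PSM_MAX:
        raise ValueError(f"psm must be between {PSM_MIN} and {PSM_MAX}, got {psm}")
    # Passing "stdout" as the output base prints the text instead of writing a file.
    return [executable, str(image_path), "stdout", "-l", lang, "--psm", str(psm)]


def run_ocr(image_path, **options):
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For a screenshot cropped to one line of Japanese dialogue, `run_ocr("line.png", lang="jpn", psm=7)` would treat the image as a single text line, provided `jpn.traineddata` is installed.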

