# Understanding OCR and Improving Accuracy

This guide explains how OCR works in VNTranslator and provides practical tips to improve text recognition accuracy.

**Note:** This guide primarily focuses on traditional OCR engines (Tesseract OCR and Windows OCR). If you're using a modern engine such as Fast OCR, an LLM-based engine (Qwen 2.5 VL, GPT-4 Vision, Claude Vision), or a cloud-based engine (Google Cloud Vision, Azure Cloud Vision), you can skip most pre-processing adjustments because these engines handle complex backgrounds and colored text automatically.

## How OCR Works in VNTranslator

### **1. Screen Capture**

<figure><img src="https://4121582948-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FKz66WcKqTPRwdFrHi4mM%2Fuploads%2FuspKXK4f23Z5oBj3Lea7%2Focr-screen-new.png?alt=media&#x26;token=d9dc2535-de68-4bcf-a9f2-5469e7717e66" alt=""><figcaption></figcaption></figure>

The first step in the OCR process is capturing an image from the screen. The quality of the captured image significantly impacts the OCR engine's ability to recognize text accurately.

### **2. Pre-processing (Image Processing)**

> **For Traditional OCR Engines Only.**
>
> Pre-processing is primarily needed when using **Tesseract OCR** or **Windows OCR**. Modern OCR engines like **Fast OCR**, **LLM-based engines**, and **cloud-based engines** can handle various text conditions without pre-processing adjustments.

<figure><img src="https://4121582948-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FKz66WcKqTPRwdFrHi4mM%2Fuploads%2Fw2eEKNluzEdP912QBGdC%2Focr-preprocessing-sample.jpg?alt=media&#x26;token=7a41f28a-4505-468f-ae28-96edc86a266c" alt=""><figcaption></figcaption></figure>

During pre-processing, the image is adjusted to display black text on a white background. This contrast makes it easier for traditional OCR engines to recognize the text.

**When to use pre-processing:**

* You are using Tesseract OCR or Windows OCR
* The game text sits on a colored background
* The contrast between text and background is low
* You need better recognition accuracy from a traditional engine

**When pre-processing is NOT needed:**

* Using Fast OCR or another modern OCR engine (such as EasyOCR)
* Using LLM-based engines (Qwen 2.5 VL, GPT-4 Vision, Claude Vision)
* Using cloud-based engines (Google Cloud Vision, Azure Cloud Vision)
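
As a rough illustration of the black-text-on-white adjustment described in this step, the sketch below binarizes a grayscale image in plain Python. It is not VNTranslator's implementation: the `binarize` function, its `threshold` and `invert` parameters, and the sample pixel values are all assumptions made for the example.

```python
def binarize(gray, threshold=128, invert=False):
    """Map a grayscale image (rows of 0-255 ints) to pure black (0) or white (255).

    Pixels brighter than `threshold` become white, all others black.
    Pass invert=True for light text on a dark background so that the
    text ends up black on white.
    """
    out = []
    for row in gray:
        new_row = []
        for px in row:
            white = px > threshold
            if invert:
                white = not white
            new_row.append(255 if white else 0)
        out.append(new_row)
    return out


# Hypothetical sample: light text (220) on a dark dialogue box (40).
sample = [
    [40, 40, 220, 40],
    [40, 220, 220, 40],
]
result = binarize(sample, threshold=128, invert=True)
# Text pixels are now 0 (black) and background pixels 255 (white).
```

A real capture would come from an image library rather than nested lists, but the per-pixel decision is the same idea.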

### **3. Selecting the OCR Engine**

Text recognition accuracy depends heavily on the OCR engine you choose. VNTranslator supports three categories of OCR engines:

**Traditional OCR Engines** ⭐

* **Examples:** Tesseract OCR, Windows OCR
* **Best for:** Simple black text on a white background
* **Limitations:** May struggle with colored text or complex backgrounds
* **Requires:** Pre-processing adjustments for better accuracy

**Modern OCR Engines** ⭐⭐⭐

* **Examples:** Fast OCR, EasyOCR
* **Best for:** Moderate background noise and multi-colored text
* **Advantages:** Better handling of various text conditions without pre-processing
* **Requires:** Minimal to no pre-processing

**AI-based OCR Engines** ⭐⭐⭐⭐⭐

* **Examples:** Google Cloud Vision, Azure Cloud Vision, Qwen 2.5 VL, GPT-4 Vision, Claude Vision
* **Best for:** Complex backgrounds, rotated text, and colored text
* **Advantages:** High accuracy without pre-processing, handles various text conditions automatically
* **Requires:** No pre-processing needed

For a complete comparison of OCR engines, see [OCR Engines](https://docs.vntranslator.com/user-guide/ocr/ocr-engines).

### **4. Post-processing**

After the OCR engine processes the image, the recognized text is displayed. If the result is inaccurate, you can correct it during post-processing with regular expressions (RegExp).

Post-processing is useful with every OCR engine type. Use it to:

* Remove unwanted characters
* Fix common recognition errors
* Format the output text
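
To make the idea concrete, here is a minimal sketch of regex-based cleanup written with Python's `re` module. The rules, the `clean_ocr_text` helper, and the sample string are hypothetical, and the pattern syntax accepted by VNTranslator's post-processing settings may differ from Python's.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs applied in order.
RULES = [
    (re.compile(r"[|_~]+"), ""),              # remove stray border/underline characters
    (re.compile(r"(?<=\d)[Oo](?=\d)"), "0"),  # fix a common error: letter O between digits
    (re.compile(r"\s+"), " "),                # format: collapse runs of whitespace
]


def clean_ocr_text(text):
    """Apply each rule in sequence, then trim leading/trailing spaces."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


cleaned = clean_ocr_text("| You  found 1O0 gold__ coins! ~")
print(cleaned)  # You found 100 gold coins!
```

Rule order matters here: removing stray characters first can leave extra spaces behind, so the whitespace rule runs last.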

***

## Tips for Improving OCR Accuracy

**For Traditional OCR Engines (Tesseract, Windows OCR)**

1. **Ensure high-quality image captures:** The better the quality of the screen capture, the higher the accuracy of OCR. Avoid blurry or low-resolution images.
2. **Use effective pre-processing:** Adjust the image to have high contrast (black text on white background) to make text recognition easier for the OCR engine.
3. **Select appropriate threshold settings:** Experiment with threshold values in the pre-processing options to find the best setting for your game.
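
If you are unsure where to start with the threshold, one common way to pick a value automatically is Otsu's method, which chooses the cut-off that best separates dark pixels from light ones. The sketch below is an illustration only: this guide does not state that VNTranslator uses Otsu's method, and `otsu_threshold` and the sample data are hypothetical.

```python
def otsu_threshold(gray):
    """Return the threshold t (0-255) that maximizes between-class variance,
    where pixels <= t form one class and pixels > t form the other."""
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0    # number of pixels at or below the candidate threshold
    sum_low = 0  # summed intensity of those pixels
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue  # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break     # every pixel is in the lower class; nothing left to split
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical sample: dark dialogue box (40) with light text (220).
sample = [
    [40, 40, 220, 40],
    [40, 220, 220, 40],
]
t = otsu_threshold(sample)
# Any pixel with value > t is treated as "light", the rest as "dark".
```

A value found this way is only a starting point; fine-tune it by eye in the pre-processing options for your game.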

**For Modern and AI-based OCR Engines**

1. **Ensure high-quality image captures:** Good capture quality still helps, but these engines are more forgiving of imperfect images.
2. **Skip pre-processing:** Modern and AI-based OCR engines work best with the original image without pre-processing adjustments.
3. **Choose the right engine for your needs:**
   * Use **Fast OCR** for fast, offline recognition with moderate accuracy
   * Use **cloud-based engines** for the highest accuracy with complex text
   * Use **LLM-based engines** for maximum flexibility and accuracy

**For All OCR Engine Types**

1. **Utilize post-processing:** If text recognition is incorrect or you want to remove specific characters, use RegExp during post-processing to refine the output.
2. **Position the capture area correctly:** Make sure the capture area covers only the dialogue text box so that unnecessary elements are not captured.
3. **Test different engines:** Try different OCR engines to find which works best for your specific game or visual novel.

