# Understanding OCR and Improving Accuracy

This guide explains how OCR works in VNTranslator and provides practical tips to improve text recognition accuracy.

**Note:** This guide focuses mainly on traditional OCR engines (Tesseract OCR and Windows OCR). If you use a modern engine such as Fast OCR, an LLM-based engine (Qwen 2.5 VL, GPT-4 Vision, Claude Vision), or a cloud-based engine (Google Cloud Vision, Azure Cloud Vision), you can skip most pre-processing adjustments, because these engines handle complex backgrounds and colored text on their own.

## How OCR Works in VNTranslator

### **1. Screen Capture**

<figure><img src="https://4121582948-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FKz66WcKqTPRwdFrHi4mM%2Fuploads%2FuspKXK4f23Z5oBj3Lea7%2Focr-screen-new.png?alt=media&#x26;token=d9dc2535-de68-4bcf-a9f2-5469e7717e66" alt=""><figcaption></figcaption></figure>

The first step in the OCR process is capturing an image from the screen. The quality of the captured image significantly impacts the OCR engine's ability to recognize text accurately.

### **2. Pre-processing (Image Processing)**

> **For Traditional OCR Engines Only.**
>
> Pre-processing is primarily needed when using **Tesseract OCR** or **Windows OCR**. Modern OCR engines like **Fast OCR**, **LLM-based engines**, and **cloud-based engines** can handle various text conditions without pre-processing adjustments.

<figure><img src="https://4121582948-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FKz66WcKqTPRwdFrHi4mM%2Fuploads%2Fw2eEKNluzEdP912QBGdC%2Focr-preprocessing-sample.jpg?alt=media&#x26;token=7a41f28a-4505-468f-ae28-96edc86a266c" alt=""><figcaption></figcaption></figure>

During pre-processing, the captured image is converted so that it shows black text on a white background. This high contrast makes the text easier for traditional OCR engines to recognize.
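The black-on-white conversion described here can be illustrated with a minimal thresholding sketch. It is not VNTranslator's actual implementation: the function name `to_black_on_white`, the default threshold of 128, and the "invert when most pixels are dark" heuristic are assumptions chosen for illustration, and the image is modeled as plain rows of 0-255 grayscale values.

```python
def to_black_on_white(gray, threshold=128):
    """Binarize a grayscale image given as rows of 0-255 integers.

    Returns rows containing only 0 (black) and 255 (white), with the
    text rendered black on a white background.
    """
    # Split pixels into dark (0) and light (255) around the threshold.
    binary = [[255 if px >= threshold else 0 for px in row] for row in gray]

    # Assumption: text covers fewer pixels than its background. If most
    # pixels came out black, treat the capture as light text on a dark
    # background and invert it.
    total = sum(len(row) for row in binary)
    dark = sum(px == 0 for row in binary for px in row)
    if dark > total / 2:
        binary = [[255 - px for px in row] for row in binary]
    return binary


# Hypothetical 3x6 capture: light text (230) on a dark dialogue box (40).
capture = [
    [40, 40, 40, 40, 40, 40],
    [40, 230, 40, 230, 230, 40],
    [40, 40, 40, 40, 40, 40],
]
result = to_black_on_white(capture)
print(result[1])  # [255, 0, 255, 0, 0, 255]
```

After the conversion, the former light text pixels are black (0) and the dark box is white (255), which is the form traditional engines tend to handle best.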

**When to use pre-processing:**

* You are using Tesseract OCR or Windows OCR
* The game text sits on a colored background
* Contrast between the text and the background is low
* You need better recognition accuracy from a traditional engine

**When pre-processing is NOT needed:**

* Using Fast OCR or another modern OCR engine
* Using LLM-based engines (Qwen 2.5 VL, GPT-4 Vision, Claude Vision)
* Using cloud-based engines (Google Cloud Vision, Azure Cloud Vision)

### **3. Selecting the OCR Engine**

Text recognition accuracy depends heavily on the OCR engine you choose. VNTranslator supports three categories of OCR engines:

**Traditional OCR Engines** ⭐

* **Examples:** Tesseract OCR, Windows OCR
* **Best for:** Simple, high-contrast text (black text on a white background)
* **Limitations:** May struggle with colored text or complex backgrounds
* **Requires:** Pre-processing adjustments for better accuracy

**Modern OCR Engines** ⭐⭐⭐

* **Examples:** Fast OCR, EasyOCR
* **Best for:** Moderate background noise and multi-colored text
* **Advantages:** Better handling of various text conditions without pre-processing
* **Requires:** Minimal to no pre-processing

**AI-based OCR Engines** ⭐⭐⭐⭐⭐

* **Examples:** Google Cloud Vision, Azure Cloud Vision, Qwen 2.5 VL, GPT-4 Vision, Claude Vision
* **Best for:** Complex backgrounds, rotated text, and colored text
* **Advantages:** High accuracy without pre-processing, handles various text conditions automatically
* **Requires:** No pre-processing

For a complete comparison of OCR engines, see [OCR Engines](https://docs.vntranslator.com/user-guide/ocr/ocr-engines).

### **4. Post-processing**

After the OCR engine processes the image, the recognized text is displayed. If the result is inaccurate, you can correct it during post-processing with regular expressions (RegExp).

Post-processing is useful with every OCR engine type. Use it to:

* Remove unwanted characters
* Fix common recognition errors
* Format the output text

***

## Tips for Improving OCR Accuracy

**For Traditional OCR Engines (Tesseract, Windows OCR)**

1. **Ensure high-quality image captures:** The better the quality of the screen capture, the higher the accuracy of OCR. Avoid blurry or low-resolution images.
2. **Use effective pre-processing:** Adjust the image to have high contrast (black text on white background) to make text recognition easier for the OCR engine.
3. **Select appropriate threshold settings:** Experiment with threshold values in the pre-processing options to find the best setting for your game.
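When experimenting with threshold values, it can help to start from a computed estimate instead of a guess. The sketch below uses Otsu's method, a standard technique that picks the threshold separating dark and light pixels with the largest between-class variance. It is not assumed to be what VNTranslator does internally, and the 0-255 scale and the function name `otsu_threshold` are assumptions for illustration.

```python
def otsu_threshold(pixels):
    """Return a threshold t in 0-255 that maximizes between-class variance.

    Pixels below t count as dark, pixels at or above t count as light.
    """
    # Build a histogram of the 256 possible gray levels.
    hist = [0] * 256
    for px in pixels:
        hist[px] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0    # number of pixels below the candidate threshold
    sum_dark = 0  # sum of their gray levels
    for t in range(1, 256):
        n_dark += hist[t - 1]
        sum_dark += (t - 1) * hist[t - 1]
        n_light = total - n_dark
        if n_dark == 0 or n_light == 0:
            continue  # one class is empty, so this split is meaningless
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        var_between = n_dark * n_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical capture: 15 dark background pixels and 3 light text pixels.
sample = [40] * 15 + [230] * 3
suggested = otsu_threshold(sample)
print(suggested)
```

For this sample the suggested value falls between the two gray levels, so it separates text from background. Treat the result as a starting point and adjust it by hand while watching the recognition output.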

**For Modern and AI-based OCR Engines**

1. **Ensure high-quality image captures:** Good capture quality still helps, but these engines are more forgiving of blurry or low-resolution images.
2. **Skip pre-processing:** Modern and AI-based OCR engines work best with the original image without pre-processing adjustments.
3. **Choose the right engine for your needs:**
   * Use **Fast OCR** for fast, offline recognition with moderate accuracy
   * Use **cloud-based engines** for the highest accuracy on complex text
   * Use **LLM-based engines** for maximum flexibility and accuracy

**For All OCR Engine Types**

1. **Use post-processing:** If the recognized text is wrong or contains characters you want to remove, use RegExp during post-processing to refine the output.
2. **Position the capture area correctly:** Make sure the capture area covers only the dialogue text box, so that no unrelated screen elements are captured.
3. **Test different engines:** Try different OCR engines to find which works best for your specific game or visual novel.
