Understanding OCR and Improving Accuracy
This guide explains how OCR works in VNTranslator and provides practical tips to improve text recognition accuracy.
Note: This guide primarily focuses on traditional OCR engines (Tesseract OCR and Windows OCR). If you're using modern OCR engines like Fast OCR, LLM-based engines (Qwen 2.5 VL, GPT-4 Vision, Claude Vision), or cloud-based engines (Google Cloud Vision, Azure Cloud Vision), you can skip most pre-processing adjustments as these engines handle complex backgrounds and colored text automatically.
How OCR Works in VNTranslator
1. Screen Capture

The first step in the OCR process is capturing an image from the screen. The quality of the captured image significantly impacts the OCR engine's ability to recognize text accurately.
2. Pre-processing (Image Processing)
For Traditional OCR Engines Only.
Pre-processing is primarily needed when using Tesseract OCR or Windows OCR. Modern OCR engines like Fast OCR, LLM-based engines, and cloud-based engines can handle various text conditions without pre-processing adjustments.

During pre-processing, the captured image is converted so that the text appears black on a white background. This high contrast makes it easier for traditional OCR engines to recognize the text.
When to use pre-processing:
Using Tesseract OCR or Windows OCR
Game text has colored backgrounds
Low contrast between text and background
Need to improve recognition accuracy for traditional engines
When pre-processing is NOT needed:
Using Fast OCR or another modern OCR engine
Using LLM-based engines (Qwen 2.5 VL, GPT-4 Vision, Claude Vision)
Using cloud-based engines (Google Cloud Vision, Azure Cloud Vision)
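To make the "black text on a white background" idea concrete, here is a minimal sketch of threshold-based binarization in plain Python. It is not VNTranslator's actual implementation; the function names (`luminance`, `binarize`), the `light_text` flag, and the default threshold of 128 are assumptions chosen for illustration only.

```python
def luminance(r, g, b):
    """Approximate perceived brightness (ITU-R BT.601 weights), 0-255."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(pixels, threshold=128, light_text=False):
    """Return a black/white image: 0 = black (text), 255 = white (background).

    pixels:     rows of (r, g, b) tuples.
    threshold:  brightness above this value counts as "light".
    light_text: set True when the game shows light text on a dark
                background, so the result still ends up black on white.
    """
    result = []
    for row in pixels:
        new_row = []
        for r, g, b in row:
            is_light = luminance(r, g, b) > threshold
            # Decide which side of the threshold is the text.
            is_text = is_light if light_text else not is_light
            new_row.append(0 if is_text else 255)
        result.append(new_row)
    return result


# White text on a dark blue dialogue box (hypothetical sample pixels).
sample = [
    [(20, 20, 80), (255, 255, 255)],
    [(255, 255, 255), (20, 20, 80)],
]
print(binarize(sample, threshold=128, light_text=True))  # [[255, 0], [0, 255]]
```

The threshold is the value you would tune per game: too low and dark background details survive as "text", too high and thin strokes of the text disappear.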
3. Selecting the OCR Engine
Text recognition accuracy depends heavily on the OCR engine you choose. VNTranslator supports three categories of OCR engines:
Traditional OCR Engines ⭐
Examples: Tesseract OCR, Windows OCR
Best for: Simple black text on a white background
Limitations: May struggle with colored text or complex backgrounds
Requires: Pre-processing adjustments for better accuracy
Modern OCR Engines ⭐⭐⭐
Examples: Fast OCR, EasyOCR
Best for: Moderate background noise and multi-colored text
Advantages: Better handling of various text conditions without pre-processing
Requires: Little to no pre-processing
AI-based OCR Engines ⭐⭐⭐⭐⭐
Examples: Google Cloud Vision, Azure Cloud Vision, Qwen 2.5 VL, GPT-4 Vision, Claude Vision
Best for: Complex backgrounds, rotated text, and colored text
Advantages: High accuracy without pre-processing, handles various text conditions automatically
Requires: No pre-processing
For a complete comparison of OCR engines, see OCR Engines.
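The three categories above can be summarized as a small lookup table. The following is only an illustrative sketch: the `EngineCategory` class and the helper functions are hypothetical and not part of VNTranslator; the engine names and the pre-processing requirement are taken from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineCategory:
    name: str
    examples: tuple
    needs_preprocessing: bool


CATEGORIES = (
    EngineCategory("traditional", ("Tesseract OCR", "Windows OCR"), True),
    EngineCategory("modern", ("Fast OCR", "EasyOCR"), False),
    EngineCategory(
        "ai-based",
        ("Google Cloud Vision", "Azure Cloud Vision",
         "Qwen 2.5 VL", "GPT-4 Vision", "Claude Vision"),
        False,
    ),
)


def category_of(engine):
    """Return the category that lists this engine, or None if unknown."""
    for category in CATEGORIES:
        if engine in category.examples:
            return category
    return None


def needs_preprocessing(engine):
    """True if the guide recommends pre-processing for this engine."""
    category = category_of(engine)
    if category is None:
        raise ValueError(f"Unknown OCR engine: {engine!r}")
    return category.needs_preprocessing


print(needs_preprocessing("Tesseract OCR"))  # True
print(needs_preprocessing("Fast OCR"))       # False
```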
4. Post-processing
After the OCR engine processes the image, the recognized text is displayed. If the recognition is inaccurate, you can correct it during post-processing by using Regular Expressions (RegExp) to refine the result.
Post-processing is useful for all OCR engine types to:
Remove unwanted characters
Fix common recognition errors
Format the output text
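The three uses listed above can be sketched with a few regular expressions. This example uses Python's `re` module purely for illustration; the rules, the `RULES` list, and the `postprocess` function are hypothetical, and the pattern syntax accepted by VNTranslator's own RegExp settings may differ.

```python
import re

# Hypothetical rules, applied in order: (pattern, replacement).
RULES = [
    # Remove unwanted characters: stray symbols often produced by noise.
    (re.compile(r"[|_~^]+"), ""),
    # Fix a common recognition error: digit 0 between letters -> letter o.
    (re.compile(r"(?<=[A-Za-z])0(?=[A-Za-z])"), "o"),
    # Format the output: collapse line breaks and repeated spaces.
    (re.compile(r"\s+"), " "),
]


def postprocess(text):
    """Apply every rule in order and trim the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


raw = "G0od m0rning|\n  How are  you~"
print(postprocess(raw))  # Good morning How are you
```

Rule order matters: noise characters are removed first so that they do not block the later corrections, and whitespace is normalized last.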
Tips for Improving OCR Accuracy
For Traditional OCR Engines (Tesseract, Windows OCR)
Ensure high-quality image captures: A sharper, higher-resolution screen capture yields more accurate OCR. Avoid blurry or low-resolution images.
Use effective pre-processing: Adjust the image to have high contrast (black text on white background) to make text recognition easier for the OCR engine.
Select appropriate threshold settings: Experiment with threshold values in the pre-processing options to find the best setting for your game.
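If you would rather start from a computed value than from pure trial and error, Otsu's method is one widely used way to estimate a threshold from the image's brightness histogram. The sketch below is a plain-Python illustration of that technique, not something VNTranslator is known to do; the function name `otsu_threshold` and the flat list of 0-255 grayscale values are assumptions.

```python
def otsu_threshold(gray_values):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the dark class, pixels > t the light class.
    """
    histogram = [0] * 256
    for value in gray_values:
        histogram[value] += 1

    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_variance = 0, -1.0
    dark_count = 0  # number of pixels in the dark class so far
    dark_sum = 0    # sum of their brightness values
    for t in range(256):
        dark_count += histogram[t]
        dark_sum += t * histogram[t]
        if dark_count == 0:
            continue  # no dark pixels yet
        light_count = total - dark_count
        if light_count == 0:
            break     # every pixel is already in the dark class
        dark_mean = dark_sum / dark_count
        light_mean = (sum_all - dark_sum) / light_count
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t


# Hypothetical capture: 60 dark background pixels, 40 bright text pixels.
values = [30] * 60 + [220] * 40
t = otsu_threshold(values)
print(t, all(v <= t for v in values[:60]), all(v > t for v in values[60:]))
```

Treat the returned value as a starting point and fine-tune it by eye, since text with outlines, shadows, or gradients rarely splits into two clean brightness groups.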
For Modern and AI-based OCR Engines
Ensure high-quality image captures: Good capture quality still helps, although these engines are more forgiving of lower-quality images.
Skip pre-processing: Modern and AI-based OCR engines work best with the original image without pre-processing adjustments.
Choose the right engine for your needs:
Use Fast OCR for offline, fast recognition with moderate accuracy
Use cloud-based engines for the highest accuracy with complex text
Use LLM-based engines for maximum flexibility and accuracy
For All OCR Engine Types
Utilize post-processing: If text recognition is incorrect or you want to remove specific characters, use RegExp during post-processing to refine the output.
Position the capture area correctly: Make sure the capture area covers only the dialogue text box so that unrelated screen elements are not captured.
Test different engines: Try different OCR engines to find which works best for your specific game or visual novel.