OCR (Optical Character Recognition) enables the Document Converter to extract text from images, scanned documents, and PDFs that contain non-selectable text.
OCR Providers
The Document Converter supports multiple OCR providers, each with different strengths:
PaddleOCR
Recommended for production
- 80+ languages supported
- High accuracy
- Free and open source
- Works offline
- GPU acceleration support
EasyOCR
Easy to set up
- 80+ languages supported
- Simple installation
- Good performance
- PyTorch-based
- Active community
Mistral AI
AI-powered
- Context-aware processing
- Excellent accuracy
- Multi-modal understanding
- Requires API key
- Cloud-based
Configuration
- PaddleOCR
- EasyOCR
- Mistral AI
Supported Languages:
- English (en)
- Chinese Simplified (ch)
- Chinese Traditional (chinese_cht)
- French (french)
- German (german)
- Japanese (japan)
- Korean (korean)
- Spanish (spanish)
- And 70+ more languages
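Provider and language choices like the ones listed here are typically supplied through environment variables. The sketch below shows one way such settings could be read and validated. The variable names `OCR_PROVIDER`, `OCR_LANGUAGES`, and `OCR_USE_GPU`, the `OcrSettings` class, and the provider identifiers are all illustrative assumptions, not the Document Converter's documented settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

# Assumption: provider identifiers are hypothetical, chosen for this example.
SUPPORTED_PROVIDERS = {"paddleocr", "easyocr", "mistral"}


@dataclass(frozen=True)
class OcrSettings:
    provider: str
    languages: Tuple[str, ...]
    use_gpu: bool


def load_ocr_settings(env: Optional[Mapping[str, str]] = None) -> OcrSettings:
    """Read OCR settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env

    # Hypothetical variable name; defaults to the provider recommended for production.
    provider = env.get("OCR_PROVIDER", "paddleocr").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unknown OCR provider: {provider!r}")

    # Comma-separated language codes, for example "en,ch".
    raw_languages = env.get("OCR_LANGUAGES", "en")
    languages = tuple(code.strip() for code in raw_languages.split(",") if code.strip())
    if not languages:
        raise ValueError("OCR_LANGUAGES must name at least one language code")

    use_gpu = env.get("OCR_USE_GPU", "false").strip().lower() in {"1", "true", "yes"}
    return OcrSettings(provider, languages, use_gpu)
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests.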
When to Use OCR
Required for:
- Scanned documents (PDF, images)
- Screenshots of text
- Photos of documents
- Handwritten text (limited support)
- PDFs with image-based text
- Old or legacy documents
Not needed for:
- Modern PDF files with selectable text
- Word documents (.docx)
- PowerPoint presentations (.pptx)
- Excel spreadsheets (.xlsx)
- Plain text files (.txt)
- Web pages (HTML)
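The two lists above amount to a simple routing rule, sketched here as a helper that decides from the file extension whether OCR is worth running. The function name `needs_ocr`, the extension sets, and the `pdf_has_text_layer` flag are assumptions for illustration; detecting whether a PDF actually contains selectable text is left to the caller.

```python
from pathlib import Path

# Assumption: extension sets are illustrative and not exhaustive.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}
TEXT_EXTENSIONS = {".docx", ".pptx", ".xlsx", ".txt", ".html", ".htm"}


def needs_ocr(filename: str, pdf_has_text_layer: bool = False) -> bool:
    """Return True when the file most likely needs OCR to yield text."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        # Scans, screenshots, and photos carry no text layer.
        return True
    if suffix == ".pdf":
        # Image-based PDFs need OCR; PDFs with selectable text do not.
        return not pdf_has_text_layer
    if suffix in TEXT_EXTENSIONS:
        return False
    raise ValueError(f"unrecognized file type: {suffix or filename!r}")
```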
OCR Quality Factors
Image Quality
Resolution:
- Minimum: 150 DPI
- Recommended: 300+ DPI
- Higher resolution generally improves accuracy
Contrast:
- High contrast between text and background
- Black text on white background is ideal
- Avoid low-contrast color combinations
Clarity:
- Sharp, focused images
- Avoid blurry or pixelated text
- Good lighting conditions
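When an image file carries no reliable DPI metadata, the resolution guidance above can be checked from the pixel dimensions and the physical page size. This is a minimal sketch; the function names and the three rating labels are assumptions, while the 150 and 300 DPI thresholds come from the list above.

```python
def effective_dpi(width_px: int, height_px: int, width_in: float, height_in: float) -> float:
    """Estimate scan resolution as the lower of the horizontal and vertical DPI."""
    return min(width_px / width_in, height_px / height_in)


def rate_resolution(dpi: float) -> str:
    """Map a DPI value onto the thresholds listed above."""
    if dpi < 150:
        return "too low"
    if dpi < 300:
        return "acceptable"
    return "recommended"


# A US Letter page (8.5 x 11 in) scanned to 2550 x 3300 pixels.
dpi = effective_dpi(2550, 3300, 8.5, 11.0)
rating = rate_resolution(dpi)
```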
Text Characteristics
Font Size:
- Minimum: 10pt font size
- Recommended: 12pt or larger
- Very small text may be missed
Font Type:
- Sans-serif fonts work better
- Avoid decorative or script fonts
- Standard fonts (Arial, Times) are ideal
Orientation:
- Horizontal text works best
- Vertical text supported but less accurate
- Avoid skewed or rotated text
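Font size and scan resolution interact: the same 10pt text covers fewer pixels at 150 DPI than at 300 DPI. The sketch below converts a point size to an approximate pixel height using the standard 72 points per inch. The function name is an assumption, and the result is the nominal font size in pixels, not the height of any individual glyph.

```python
POINTS_PER_INCH = 72


def font_height_px(font_size_pt: float, dpi: float) -> float:
    """Approximate pixel height of a font at a given scan resolution."""
    return font_size_pt * dpi / POINTS_PER_INCH


minimum_case = font_height_px(10, 150)      # about 20.8 pixels
recommended_case = font_height_px(12, 300)  # 50 pixels
```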
Layout Considerations
Structure:
- Clear column separation
- Consistent spacing
- Avoid overlapping text
Background:
- Clean, uniform background
- Avoid patterns or textures
- Remove noise and artifacts
Margins:
- Adequate white space around text
- Clear boundaries between sections
- Avoid text near edges
Performance Comparison
| Provider | Speed | Accuracy | Languages | Offline | GPU Support |
|---|---|---|---|---|---|
| PaddleOCR | Fast | High | 80+ | ✅ | ✅ |
| EasyOCR | Medium | Good | 80+ | ✅ | ✅ |
| Mistral AI | Slow | Very High | All | ❌ | N/A |
Best Practices
Choose the Right Provider
- PaddleOCR: Production workloads, batch processing
- EasyOCR: Development, simple setups
- Mistral AI: Complex documents, maximum accuracy needed
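One way to turn the guidance above into code is a small selection helper. This is only one reading of the recommendations; the function name, its parameters, and the returned identifiers are assumptions for illustration.

```python
def choose_provider(needs_offline: bool, complex_documents: bool, production: bool) -> str:
    """Pick an OCR provider from coarse requirements."""
    # Mistral AI is cloud-based, so it is only an option when offline use is not required.
    if complex_documents and not needs_offline:
        return "mistral"
    # PaddleOCR is the recommendation for production and batch workloads.
    if production:
        return "paddleocr"
    # EasyOCR suits development and simple setups.
    return "easyocr"
```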
Optimize Images
- Scan at 300+ DPI resolution
- Use high contrast settings
- Ensure proper lighting
- Crop to relevant areas
Configure Languages
- Specify exact languages for better accuracy
- Avoid unnecessary languages (they slow processing)
- Use multiple languages only when needed
Advanced Configuration
Troubleshooting
PaddleOCR Issues
Common Problems:
- Model download failures
- CUDA compatibility issues
- Memory errors with large images
EasyOCR Issues
Common Problems:
- PyTorch installation issues
- CUDA version mismatches
- Model loading failures
Mistral AI Issues
Common Problems:
- API key authentication failures
- Rate limiting errors
- Network connectivity issues
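Rate limiting and transient network failures are usually handled by retrying with exponential backoff, while authentication failures should surface immediately because retrying will not fix a bad key. The sketch below shows the retry pattern in isolation. `TransientOcrError` and `call_with_backoff` are hypothetical names; a real integration would raise or map the errors of whichever HTTP client it uses.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientOcrError(Exception):
    """Hypothetical error standing in for rate-limit or network failures."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientOcrError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            delay = base_delay * (2 ** attempt)
            # Jitter spreads retries out so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the waiting behavior can be replaced in tests.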
General OCR Issues
Poor Accuracy:
- Improve image quality
- Use correct language settings
- Try different OCR providers
- Preprocess images (noise reduction, contrast enhancement)
Slow Processing:
- Use GPU acceleration
- Reduce image resolution
- Process in batches
- Consider hardware upgrades
Memory Issues:
- Reduce image size
- Process images sequentially
- Increase system memory
- Use image compression
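Reducing image size trades accuracy for memory, so it helps to shrink only as far as needed. The sketch below computes target dimensions that fit a pixel budget while preserving the aspect ratio. The function name and the example budget are assumptions; the actual resize would be done with an imaging library of your choice.

```python
import math
from typing import Tuple


def fit_within_pixel_budget(width: int, height: int, max_pixels: int) -> Tuple[int, int]:
    """Return dimensions with the same aspect ratio and at most max_pixels pixels."""
    pixels = width * height
    if pixels <= max_pixels:
        return width, height  # already small enough, leave untouched
    # Scaling both sides by sqrt(ratio) scales the area by ratio.
    scale = math.sqrt(max_pixels / pixels)
    return max(1, int(width * scale)), max(1, int(height * scale))


# A 12-megapixel photo reduced to a 3-megapixel budget.
new_size = fit_within_pixel_budget(4000, 3000, 3_000_000)
```

Check the result against the resolution guidance before downscaling aggressively, since text that falls below the minimum size may no longer be recognized.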
API Examples
Monitoring OCR Performance
Quality Metrics
- Character accuracy rate
- Word accuracy rate
- Processing time per page
- Error rate by document type
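Character and word accuracy are commonly computed from the edit distance between the OCR output and a hand-checked reference transcript. The sketch below uses Levenshtein distance for both; the function names are assumptions, and this definition of accuracy (one minus errors divided by reference length, floored at zero) is one common convention, not necessarily the one used by the Document Converter.

```python
from typing import Sequence


def edit_distance(reference: Sequence, recognized: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(recognized) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, rec_item in enumerate(recognized, start=1):
            cost = 0 if ref_item == rec_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def accuracy_rate(reference: Sequence, recognized: Sequence) -> float:
    """Accuracy as 1 - errors / reference length, floored at 0."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))


def character_accuracy(reference_text: str, ocr_text: str) -> float:
    return accuracy_rate(reference_text, ocr_text)


def word_accuracy(reference_text: str, ocr_text: str) -> float:
    return accuracy_rate(reference_text.split(), ocr_text.split())
```

For the reference "invoice total" and the OCR output "invoice tota1", one of 13 characters is wrong, so character accuracy is about 0.92, while word accuracy is 0.5 because one of the two words differs.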
System Metrics
- CPU usage during OCR
- Memory consumption
- GPU utilization (if enabled)
- Network usage (for Mistral AI)
Next Steps
Output Formats
Learn about structured JSON and Markdown outputs
Webhooks
Set up real-time notifications for OCR jobs
Production Deployment
Deploy OCR processing at scale
Examples
See complete OCR integration examples