---
title: DeepSeek-OCR Studio
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---
# DeepSeek-OCR Studio

An OCR system built on [DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR) that parses documents, tables, charts, and technical drawings.
## ✨ Features

- **Multi-language OCR**: Support for Chinese, English, and many other languages
- **Table Extraction**: Intelligent table recognition and markdown conversion
- **Chart Analysis**: Extract data from charts and graphs
- **Professional Drawings**: Semantic recognition of CAD drawings, flowcharts, etc.
- **Layout Analysis**: Preserve document structure and formatting
- **PDF Support**: Process PDF documents page by page
## Quick Start

1. **Upload an Image or PDF**: Click to upload your document
2. **Optional Prompt**: Customize the OCR task (e.g., "Extract tables", "Analyze chart")
3. **Extract Text**: Click the button and wait for results
## Prompt Examples

### Basic OCR

```
Free OCR.
```

### Table Extraction

```
Extract all tables and convert to markdown format.
```

### Chart Analysis

```
Analyze this chart and extract data in table format.
```

### Multi-language Documents

```
Extract all text in multiple languages.
```

### Technical Drawings

```
Analyze this CAD drawing and describe its components.
```
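
If you script your requests, the prompts above can be kept in a small lookup table. This is a minimal sketch: the prompt strings are copied from the examples above, but the task keys (`"basic"`, `"table"`, and so on) and the `pick_prompt` helper are hypothetical names used for illustration, not part of the app.

```python
# Prompt presets copied verbatim from the prompt examples in this README.
# The dictionary keys are hypothetical labels chosen for this sketch.
PROMPTS = {
    "basic": "Free OCR.",
    "table": "Extract all tables and convert to markdown format.",
    "chart": "Analyze this chart and extract data in table format.",
    "multilingual": "Extract all text in multiple languages.",
    "drawing": "Analyze this CAD drawing and describe its components.",
}


def pick_prompt(task: str = "basic") -> str:
    """Return the preset prompt for a task, falling back to basic OCR."""
    return PROMPTS.get(task, PROMPTS["basic"])


if __name__ == "__main__":
    print(pick_prompt("table"))
```

Falling back to the basic prompt for unknown keys is a design choice of this sketch. It keeps a typo in the task name from producing an empty prompt.
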
## Deployment Information

- **Platform**: Hugging Face Spaces with ZeroGPU
- **Model**: [deepseek-ai/DeepSeek-OCR](https://huggingface.co/deepseek-ai/DeepSeek-OCR)
- **Processing Time**: 30-120 seconds per image/page
- **PDF Limitation**: First 3 pages only (ZeroGPU constraint)
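
The page cap and the per-page timing above give a rough idea of how long a PDF will take on the Space. The sketch below only does that arithmetic. The function names are hypothetical, and it assumes pages are processed one after another, which this README does not state.

```python
MAX_PAGES = 3                  # PDF page cap stated in this README
SECONDS_PER_PAGE = (30, 120)   # per-page processing range stated in this README


def pages_to_process(total_pages: int, max_pages: int = MAX_PAGES) -> range:
    """Zero-based indices of the pages that fall within the page cap."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    return range(min(total_pages, max_pages))


def estimated_seconds(total_pages: int) -> tuple[int, int]:
    """Best-case and worst-case total time, assuming sequential pages."""
    count = len(pages_to_process(total_pages))
    fastest, slowest = SECONDS_PER_PAGE
    return count * fastest, count * slowest


if __name__ == "__main__":
    low, high = estimated_seconds(10)
    print(f"A 10-page PDF: first {MAX_PAGES} pages, roughly {low}-{high} s")
```
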
## Local Deployment

For full functionality with unlimited pages and faster processing:

```bash
# Clone the repository
git clone https://github.com/fufankeji/DeepSeek-OCR-Web.git
cd DeepSeek-OCR-Web

# Install dependencies
bash install.sh

# Start services
bash start.sh
```

**Requirements**:

- Linux OS
- GPU with ≥7 GB VRAM (16-24 GB recommended)
- Python 3.10-3.12
- CUDA 11.8 or 12.1/12.2
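
Before running the install script, it can help to confirm that the host meets the OS and Python requirements listed above. The following is a minimal sketch using only the standard library. The function names are hypothetical, and it deliberately skips the GPU and CUDA checks, which need vendor tools or third-party packages.

```python
import platform
import sys

# Inclusive Python version range taken from the requirements list.
SUPPORTED_PYTHON = ((3, 10), (3, 12))


def python_supported(version=None) -> bool:
    """True if the (major, minor) version is within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    lowest, highest = SUPPORTED_PYTHON
    return lowest <= major_minor <= highest


def os_supported(system=None) -> bool:
    """True on Linux, the only OS named in the requirements."""
    return (system or platform.system()) == "Linux"


if __name__ == "__main__":
    print("Python version OK:", python_supported())
    print("Operating system OK:", os_supported())
```
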
## Documentation

- [Official DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR)
- [Web Interface Repository](https://github.com/fufankeji/DeepSeek-OCR-Web)
- [Model on Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-OCR)
## Acknowledgments

- **DeepSeek AI**: For the amazing OCR model
- **Hugging Face**: For providing ZeroGPU infrastructure
- Original project: [DeepSeek-OCR-Web](https://github.com/fufankeji/DeepSeek-OCR-Web)
## License

MIT License
## Known Limitations on Spaces

- ZeroGPU limits each request to 120 seconds
- PDF processing is limited to the first 3 pages
- The first request takes longer because the model has to load
- Large images may time out

For production use, deploy locally with a dedicated GPU.
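
Because large images may time out on the Space, one option is to downscale them before uploading. The sketch below only computes the target dimensions. The `fit_within` name and the 2048-pixel default are assumptions made for illustration; this README does not document a size limit.

```python
def fit_within(width: int, height: int, max_side: int = 2048) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is preserved and images are never upscaled.
    The 2048 default is an arbitrary assumption, not a documented limit.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4096, 2048))
```

The returned size can be passed to an image library's resize call, for example Pillow's `Image.resize`, before saving the file for upload.
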