	---
	title: DeepSeek-OCR Studio
	emoji: 🔍
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	license: mit
	---

	# 🔍 DeepSeek-OCR Studio

	A Gradio web interface for [DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR) that parses images and PDF documents into text, tables, and structured markdown.

	## ✨ Features

	- Multi-language OCR: Support for Chinese, English, and many other languages
	- Table Extraction: Intelligent table recognition and markdown conversion
	- Chart Analysis: Extract data from charts and graphs
	- Professional Drawings: Semantic recognition of CAD drawings, flowcharts, etc.
	- Layout Analysis: Preserve document structure and formatting
	- PDF Support: Process PDF documents page by page

	## 🚀 Quick Start

	1. Upload an Image or PDF: Click to upload your document
	2. Optional Prompt: Customize the OCR task (e.g., "Extract tables", "Analyze chart")
	3. Extract Text: Click the button and wait for results

	## 📝 Prompt Examples

	### Basic OCR
	```
	Free OCR.
	```

	### Table Extraction
	```
	Extract all tables and convert to markdown format.
	```

	### Chart Analysis
	```
	Analyze this chart and extract data in table format.
	```

	### Multi-language Documents
	```
	Extract all text in multiple languages.
	```

	### Technical Drawings
	```
	Analyze this CAD drawing and describe its components.
	```
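
When scripting against the model, the example prompts above can be kept in a small lookup table. This is a minimal sketch: the names `PROMPTS` and `get_prompt`, the task keys, and the fallback to basic OCR are illustrative assumptions and are not part of this Space's code.

```python
# Hypothetical helper: maps a task name to one of the README's example prompts.
PROMPTS = {
    "ocr": "Free OCR.",
    "tables": "Extract all tables and convert to markdown format.",
    "chart": "Analyze this chart and extract data in table format.",
    "multilingual": "Extract all text in multiple languages.",
    "drawing": "Analyze this CAD drawing and describe its components.",
}


def get_prompt(task: str = "ocr") -> str:
    """Return the example prompt for a task, falling back to basic OCR."""
    # Normalize the key so " Chart " and "chart" resolve to the same prompt.
    return PROMPTS.get(task.strip().lower(), PROMPTS["ocr"])
```

The returned string is what would be typed into the optional prompt field in step 2 of the Quick Start.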

	## ⚙️ Deployment Information

	- Platform: Hugging Face Spaces with ZeroGPU
	- Model: [deepseek-ai/DeepSeek-OCR](https://huggingface.co/deepseek-ai/DeepSeek-OCR)
	- Processing Time: 30-120 seconds per image/page
	- PDF Limitation: First 3 pages only (ZeroGPU constraint)
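
The figures above allow a rough estimate of what a given PDF will cost on Spaces. The sketch below assumes pages are processed one after another, which this README does not confirm; `estimate_processing` and the constant names are hypothetical.

```python
PAGE_LIMIT = 3                 # from this README: first 3 pages only on Spaces
SECONDS_PER_PAGE = (30, 120)   # from this README: 30-120 seconds per image/page


def estimate_processing(total_pages: int) -> dict:
    """Estimate how many pages are handled and a rough time range in seconds.

    Assumes sequential page processing (an assumption, not documented behavior).
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    processed = min(total_pages, PAGE_LIMIT)
    low, high = SECONDS_PER_PAGE
    return {
        "processed": processed,
        "skipped": total_pages - processed,
        "min_seconds": processed * low,
        "max_seconds": processed * high,
    }
```

For a 10-page PDF this reports 3 pages processed, 7 skipped, and an estimated 90 to 360 seconds under the sequential assumption.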

	## 🔧 Local Deployment

	For full functionality, including unlimited pages and faster processing, deploy locally:

	```bash
	# Clone the repository
	git clone https://github.com/fufankeji/DeepSeek-OCR-Web.git
	cd DeepSeek-OCR-Web

	# Install dependencies
	bash install.sh

	# Start services
	bash start.sh
	```

	Requirements:
	- Linux OS
	- GPU with ≥7GB VRAM (16-24GB recommended)
	- Python 3.10-3.12
	- CUDA 11.8 or 12.1/12.2
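
Before running `install.sh`, the requirements above can be checked with a short script. This is a sketch, not part of the upstream repository: `python_supported` and `check_environment` are hypothetical names, and querying the GPU through PyTorch is an assumption that only applies if `torch` is already installed.

```python
import platform
import sys


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) version is within 3.10-3.12."""
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and 10 <= minor <= 12


def check_environment() -> dict:
    """Report OS, Python, and (if PyTorch is present) total GPU memory in GB."""
    results = {
        "linux": platform.system() == "Linux",
        "python": python_supported(),
    }
    try:
        import torch  # optional; assumed here as the way to query the GPU
    except ImportError:
        results["gpu_vram_gb"] = None  # unknown without PyTorch
        return results
    if torch.cuda.is_available():
        total = torch.cuda.get_device_properties(0).total_memory
        results["gpu_vram_gb"] = round(total / 1024**3, 1)
    else:
        results["gpu_vram_gb"] = 0.0
    return results


if __name__ == "__main__":
    print(check_environment())
```

A `gpu_vram_gb` value below 7 means the GPU falls short of the minimum listed above; the CUDA version is not checked here.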

	## 📚 Documentation

	- [Official DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR)
	- [Web Interface Repository](https://github.com/fufankeji/DeepSeek-OCR-Web)
	- [Model on Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-OCR)

	## 🙏 Acknowledgments

	- DeepSeek AI: For the amazing OCR model
	- Hugging Face: For providing ZeroGPU infrastructure
	- Original project: [DeepSeek-OCR-Web](https://github.com/fufankeji/DeepSeek-OCR-Web)

	## 📄 License

	MIT License

	## 🐛 Known Limitations on Spaces

	- ZeroGPU has time limits (120 seconds per request)
	- PDF processing limited to first 3 pages
	- First request takes longer (model loading)
	- Large images may time out
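
One way to reduce the risk of timeouts on large images is to downscale them before upload. The sketch below only computes the target size; `fit_within` is a hypothetical helper, and the `max_side=2048` default is an illustrative value, not a limit documented for this Space or the model.

```python
def fit_within(width: int, height: int, max_side: int = 2048) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is preserved; images already small enough are unchanged.
    max_side=2048 is an illustrative default, not a documented limit.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Clamp to at least 1 pixel so extreme aspect ratios stay valid.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting tuple can be passed to an image library's resize call, for example Pillow's `Image.resize`.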

	For production use, please deploy locally with a dedicated GPU.