---
license: apache-2.0
base_model:
- PaddlePaddle/PaddleOCR-VL
base_model_relation: quantized
---
# deepseek-ocr.rs

Rust implementation of the DeepSeek-OCR inference stack with a fast CLI and an OpenAI-compatible HTTP server. The workspace packages multiple OCR backends, prompt tooling, and a serving layer so you can build document-understanding pipelines that run locally on CPU, Apple Metal, or (alpha) NVIDIA CUDA GPUs.

> For the Chinese documentation, see [README_CN.md](README_CN.md).

> Want ready-made binaries? The latest macOS (Metal-enabled) and Windows bundles live in the [build-binaries workflow artifacts](https://github.com/TimmyOVO/deepseek-ocr.rs/actions/workflows/build-binaries.yml). Grab them from the newest green run.

## Choosing a Model

| Model | Memory footprint* | Best on | When to pick it |
| --- | --- | --- | --- |
| **DeepSeek-OCR** | **≈6.3 GB** FP16 weights, **≈13 GB** RAM/VRAM with cache & activations (512-token budget) | Apple Silicon + Metal (FP16), high-VRAM NVIDIA GPUs, 32 GB+ RAM desktops | Highest accuracy, SAM+CLIP global/local context, MoE DeepSeek-V2 decoder (3B params, ~570M active per token). Use when latency is secondary to quality. |
| **PaddleOCR-VL** | **≈4.7 GB** FP16 weights, **≈9 GB** RAM/VRAM with cache & activations | 16 GB laptops, CPU-only boxes, mid-range GPUs | Dense 0.9B Ernie decoder with SigLIP vision tower. Faster startup, lower memory, great for batch jobs or lightweight deployments. |

\*Measured from the default FP16 safetensors. Runtime footprint varies with sequence length.

Guidance:

- **Need maximum fidelity, multi-region reasoning, or already have 16-24 GB VRAM?** Use **DeepSeek-OCR**. The hybrid SAM+CLIP tower plus DeepSeek-V2 MoE decoder handles complex layouts best, but expect higher memory/latency.
- **Deploying to CPU-only nodes, 16 GB laptops, or latency-sensitive services?** Choose **PaddleOCR-VL**. Its dense Ernie decoder (18 layers, hidden size 1024) activates fewer parameters per token and keeps memory under 10 GB while staying close in quality on most docs.

## Why Rust?

The original DeepSeek-OCR ships as a Python + Transformers stack: powerful, but hefty to deploy and awkward to embed. Rewriting the pipeline in Rust gives us:

- Smaller deployable artifacts with zero Python runtime or conda baggage.
- Memory-safe, thread-friendly infrastructure that blends into native Rust backends.
- Unified tooling (CLI + server) running on Candle + Rocket without the Python GIL overhead.
- Drop-in compatibility with OpenAI-style clients while tuned for single-turn OCR prompts.

## Technical Stack

- **Candle** for tensor compute, with Metal and CUDA backends and FlashAttention support.
- **Rocket** + async streaming for OpenAI-compatible `/v1/responses` and `/v1/chat/completions`.
- **tokenizers** (upstream DeepSeek release) wrapped by `crates/assets` for deterministic caching via Hugging Face and ModelScope mirrors.
- **Pure Rust vision/prompt pipeline** shared by CLI and server to avoid duplicated logic.

## Advantages over the Python Release

- Faster cold start on Apple Silicon, lower RSS, and native binary distribution.
- Deterministic dual-source (Hugging Face + ModelScope) asset download and verification built into the workspace.
- Automatic single-turn chat compaction so OCR outputs stay stable even when clients send history.
- Ready-to-use OpenAI compatibility for tools like Open WebUI without adapters.

## Highlights

- **One repo, two entrypoints** – a batteries-included CLI for batch jobs and a Rocket-based server that speaks `/v1/responses` and `/v1/chat/completions`.
- **Works out of the box** – pulls model weights, configs, and tokenizer from whichever of Hugging Face or ModelScope responds fastest on first run.
- **Optimised for Apple Silicon** – optional Metal backend with FP16 execution for real-time OCR on laptops.
- **CUDA (alpha)** – experimental support via `--features cuda` + `--device cuda --dtype f16`; expect rough edges while we finish kernel coverage.
- **Intel MKL (preview)** – faster BLAS on x86 via `--features mkl` (install Intel oneMKL beforehand).
- **OpenAI client compatibility** – drop-in replacement for popular SDKs; the server automatically collapses chat history to the latest user turn for OCR-friendly prompts.

## Quick Start

### Prerequisites

- Rust 1.78+ (edition 2024 support)
- Git
- Optional: Apple Silicon running macOS 13+ for Metal acceleration
- Optional: CUDA 12.2+ toolkit and driver for experimental NVIDIA GPU acceleration on Linux/Windows
- Optional: Intel oneAPI MKL for preview x86 acceleration (see below)
- Recommended: a Hugging Face account with `HF_TOKEN` when pulling from the `deepseek-ai/DeepSeek-OCR` repo (ModelScope is used automatically when it's faster/reachable)

### Clone the Workspace

```bash
git clone https://github.com/TimmyOVO/deepseek-ocr.rs.git
cd deepseek-ocr.rs
cargo fetch
```

### Model Assets

The first invocation of the CLI or server downloads the config, tokenizer, and `model-00001-of-000001.safetensors` (~6.3 GB) into `DeepSeek-OCR/`. To prefetch manually:

```bash
cargo run -p deepseek-ocr-cli --release -- --help  # dev profile is extremely slow; always prefer --release
```

> Always include `--release` when running from source; debug builds of this model are extremely slow.

Set `HF_HOME`/`HF_TOKEN` if you store Hugging Face caches elsewhere (ModelScope downloads land alongside the same asset tree). The full model package is ~6.3 GB on disk and typically requires ~13 GB of RAM headroom during inference (model + activations).

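For example, a minimal sketch of redirecting the cache and authenticating before the first download; the path and token value are placeholders to adapt to your environment:

```bash
# Illustrative values only: point the Hugging Face cache somewhere else and supply a token.
export HF_HOME="$HOME/hf-cache"      # custom cache root
export HF_TOKEN="<your-hf-token>"    # only needed when the repo requires authentication

# Any CLI invocation triggers the asset download on first run.
cargo run -p deepseek-ocr-cli --release -- --help
```
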
## Configuration & Overrides

The CLI and server share the same configuration. On first launch we create a `config.toml` populated with defaults; later runs reuse it so both entrypoints stay in sync.

| Platform | Config file (default) | Model cache root |
| --- | --- | --- |
| Linux | `~/.config/deepseek-ocr/config.toml` | `~/.cache/deepseek-ocr/models/<id>/…` |
| macOS | `~/Library/Application Support/deepseek-ocr/config.toml` | `~/Library/Caches/deepseek-ocr/models/<id>/…` |
| Windows | `%APPDATA%\deepseek-ocr\config.toml` | `%LOCALAPPDATA%\deepseek-ocr\models\<id>\…` |

- Override the location with `--config /path/to/config.toml` (available on both CLI and server). Missing files are created automatically.
- Each `[models.entries."<id>"]` record can point to custom `config`, `tokenizer`, or `weights` files. When omitted we fall back to the cache directory above and download/update assets as required.
- Runtime values resolve in this order: command-line flags → values stored in `config.toml` → built-in defaults. The HTTP API adds a final layer where request payload fields (for example `max_tokens`) override everything else for that call; see the sketch after this list.

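A hypothetical illustration of that precedence, using flags documented in the CLI section below (the image path is a placeholder):

```bash
# config.toml may persist max_new_tokens = 512; the flag below wins for this run only.
deepseek-ocr-cli \
  --config ~/.config/deepseek-ocr/config.toml \
  --max-new-tokens 256 \
  --prompt "<image>\nConvert this page to markdown." \
  --image ./page.png
```
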
The generated file starts with the defaults below; adjust them to persistently change behaviour:

```toml
[models]
active = "deepseek-ocr"

[models.entries.deepseek-ocr]

[inference]
device = "cpu"
template = "plain"
base_size = 1024
image_size = 640
crop_mode = true
max_new_tokens = 512
use_cache = true

[server]
host = "0.0.0.0"
port = 8000
```

- `[models]` picks the active model and lets you add more entries (each entry can point to its own config/tokenizer/weights).
- `[inference]` controls notebook-friendly defaults shared by the CLI and server (device, template, vision sizing, decoding budget, cache usage).
- `[server]` sets the network binding and the model identifier reported by `/v1/models`.

See `crates/cli/README.md` and `crates/server/README.md` for concise override tables.

## Benchmark Snapshot

Single-request Rust CLI (Accelerate backend on macOS) compared with the reference Python pipeline on the same prompt and image:

| Stage | Rust total (ms) | Rust avg (ms) | Python total (ms) | Python / Rust |
| --- | --- | --- | --- | --- |
| Decode – Overall (`decode.generate`) | 30077.840 | 30077.840 | 56554.873 | 1.88x |
| Decode – Token Loop (`decode.iterative`) | 26930.216 | 26930.216 | 39227.974 | 1.46x |
| Decode – Prompt Prefill (`decode.prefill`) | 3147.337 | 3147.337 | 5759.684 | 1.83x |
| Prompt – Build Tokens (`prompt.build_tokens`) | 0.466 | 0.466 | 45.434 | 97.42x |
| Prompt – Render Template (`prompt.render`) | 0.005 | 0.005 | 0.019 | 3.52x |
| Vision – Embed Images (`vision.compute_embeddings`) | 6391.435 | 6391.435 | 3953.459 | 0.62x |
| Vision – Prepare Inputs (`vision.prepare_inputs`) | 62.524 | 62.524 | 45.438 | 0.73x |

## Command-Line Interface

Build and run directly from the workspace:

```bash
cargo run -p deepseek-ocr-cli --release -- \
  --prompt "<image>\n<|grounding|>Convert this receipt to markdown." \
  --image baselines/sample/images/test.png \
  --device cpu --max-new-tokens 512
```

> Tip: `--release` is required for reasonable throughput; debug builds can be 10x slower.

> macOS tip: append `--features metal` to the `cargo run`/`cargo build` commands to compile with the Accelerate + Metal backends.
>
> CUDA tip (Linux/Windows): append `--features cuda` and run with `--device cuda --dtype f16` to target NVIDIA GPUs; the feature is still alpha, so be ready for quirks.
>
> Intel MKL preview: install Intel oneMKL, then build with `--features mkl` for faster CPU matmuls on x86.

Install the CLI as a binary:

```bash
cargo install --path crates/cli
deepseek-ocr-cli --help
```

Key flags:

- `--prompt` / `--prompt-file`: text with `<image>` slots
- `--image`: path(s) matching the `<image>` placeholders
- `--device` and `--dtype`: choose `metal` + `f16` on Apple Silicon or `cuda` + `f16` on NVIDIA GPUs
- `--max-new-tokens`: decoding budget
- Sampling controls: `--do-sample`, `--temperature`, `--top-p`, `--top-k`, `--repetition-penalty`, `--no-repeat-ngram-size`, `--seed`
- By default decoding stays deterministic (`do_sample=false`, `temperature=0.0`, `no_repeat_ngram_size=20`)
- To use stochastic sampling, set `--do-sample true --temperature 0.8` (and optionally adjust the other knobs); see the sketch after this list

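A minimal sampled-run sketch, assuming the installed `deepseek-ocr-cli` binary and the sample image shipped in `baselines/`; the sampling values are illustrative, not recommendations:

```bash
# Hypothetical example: switch from the deterministic default to stochastic sampling.
deepseek-ocr-cli \
  --prompt "<image>\n<|grounding|>Convert this receipt to markdown." \
  --image baselines/sample/images/test.png \
  --do-sample true --temperature 0.8 --top-p 0.9 --seed 42 \
  --max-new-tokens 512
```
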
### Switching Models

The autogenerated `config.toml` now contains two model entries:

- `deepseek-ocr` (default) – the original DeepSeek vision-language stack.
- `paddleocr-vl` – the PaddleOCR-VL 0.9B SigLIP + Ernie release.

Pick which one to load via `--model`:

```bash
deepseek-ocr-cli --model paddleocr-vl --prompt "<image> Summarise"
```

The CLI (and server) will download the matching config/tokenizer/weights from the appropriate repository (`deepseek-ai/DeepSeek-OCR` or `PaddlePaddle/PaddleOCR-VL`) into your cache on first use. You can still override paths with `--model-config`, `--tokenizer`, or `--weights` if you maintain local fine-tunes.

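If you do keep a local fine-tune, a hypothetical override could look like this; the file paths are placeholders and only the flags named above are assumed to exist:

```bash
# Hypothetical local fine-tune: point the documented override flags at your own files.
deepseek-ocr-cli \
  --model deepseek-ocr \
  --model-config ./finetune/config.json \
  --tokenizer ./finetune/tokenizer.json \
  --weights ./finetune/model.safetensors \
  --prompt "<image>\nConvert this page to markdown." \
  --image ./scan.png
```
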
## HTTP Server

Launch an OpenAI-compatible endpoint:

```bash
cargo run -p deepseek-ocr-server --release -- \
  --host 0.0.0.0 --port 8000 \
  --device cpu --max-new-tokens 512
```

> Keep `--release` on the server as well; the debug profile is far too slow for inference workloads.
>
> macOS tip: add `--features metal` to the `cargo run -p deepseek-ocr-server` command when you want the server binary to link against Accelerate + Metal (and pair it with `--device metal` at runtime).
>
> CUDA tip: add `--features cuda` and start the server with `--device cuda --dtype f16` to offload inference to NVIDIA GPUs (alpha-quality support).
>
> Intel MKL preview: install Intel oneMKL before building with `--features mkl` to accelerate CPU workloads on x86.

Notes:

- Use `data:` URLs or remote `http(s)` links for images; local paths are rejected (see the request sketch below).
- The server collapses multi-turn chat inputs to the latest user message to keep prompts OCR-friendly.
- Works out of the box with tools such as [Open WebUI](https://github.com/open-webui/open-webui) or any OpenAI-compatible client; just point the base URL at your server (`http://localhost:8000/v1`) and select either the `deepseek-ocr` or `paddleocr-vl` model ID exposed in `/v1/models`.
- Adjust the request body limit with Rocket config if you routinely send large images.

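For reference, a hypothetical `curl` call in the standard OpenAI chat-completions shape; the image URL is a placeholder, and support for fields beyond the documented `max_tokens` override is an assumption rather than a guarantee:

```bash
# Sketch of an OpenAI-style request against the local server.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-ocr",
        "max_tokens": 512,
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "Convert this receipt to markdown."},
            {"type": "image_url", "image_url": {"url": "https://example.com/receipt.png"}}
          ]
        }]
      }'
```
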

|
| 223 |
+
|
| 224 |
+
## GPU Acceleration

- **Metal (macOS 13+ Apple Silicon)** – pass `--device metal --dtype f16` and build binaries with `--features metal` so Candle links against Accelerate + Metal.
- **CUDA (alpha, NVIDIA GPUs)** – install the CUDA 12.2+ toolkit, build with `--features cuda`, and launch the CLI/server with `--device cuda --dtype f16`; still experimental.
- **Intel MKL (preview)** – install Intel oneMKL and build with `--features mkl` to speed up CPU workloads on x86.
- For any backend, prefer release builds (e.g. `cargo build --release -p deepseek-ocr-cli --features metal` or `--features cuda`) to maximise throughput; see the sketch after this list.
- Combine GPU runs with `--max-new-tokens` and crop tuning flags to balance latency vs. quality.

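As an illustration, a minimal Metal build-and-run sketch assuming the default cargo target directory and binary name; swap `metal` for `cuda` and `--device metal` for `--device cuda` on NVIDIA hardware:

```bash
# Build a release binary with the Metal backend enabled (Apple Silicon).
cargo build --release -p deepseek-ocr-cli --features metal

# Run on the GPU in FP16; the prompt, image, and token budget mirror the CLI example above.
./target/release/deepseek-ocr-cli \
  --device metal --dtype f16 \
  --prompt "<image>\n<|grounding|>Convert this receipt to markdown." \
  --image baselines/sample/images/test.png \
  --max-new-tokens 512
```
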
## Repository Layout

- `crates/core` – shared inference pipeline, model loaders, conversation templates.
- `crates/cli` – command-line frontend (`deepseek-ocr-cli`).
- `crates/server` – Rocket server exposing OpenAI-compatible endpoints.
- `crates/assets` – asset management (configuration, tokenizer, Hugging Face + ModelScope download helpers).
- `baselines/` – reference inputs and outputs for regression testing.

Detailed CLI usage lives in [`crates/cli/README.md`](crates/cli/README.md). The server's OpenAI-compatible interface is covered in [`crates/server/README.md`](crates/server/README.md).

## Troubleshooting

- **Where do assets come from?** – downloads automatically pick between Hugging Face and ModelScope based on latency; the CLI prints the chosen source for each file.
- **Slow first response** – model load and GPU warm-up (Metal/CUDA alpha) happen on the initial request; later runs are faster.
- **Large image rejection** – increase the Rocket JSON limits in `crates/server/src/main.rs` or downscale the input.

## Roadmap

- ✅ Apple Metal backend with FP16 support and CLI/server parity on macOS.
- ✅ NVIDIA CUDA backend (alpha) – build with `--features cuda`, run with `--device cuda --dtype f16` for Linux/Windows GPUs; polishing in progress.
- **Parity polish** – finish projector normalisation and crop tiling alignment; extend the intermediate-tensor diff suite beyond the current sample baseline.
- **Grounding & streaming** – port the Python post-processing helpers (box extraction, markdown polish) and refine SSE streaming ergonomics.
- **Cross-platform acceleration** – continue tuning CUDA kernels, add automatic device detection across CPU/Metal/CUDA, and publish opt-in GPU benchmarks.
- **Packaging & ops** – ship binary releases with deterministic asset checksums, richer logging/metrics, and Helm/Docker references for server deploys.
- **Structured outputs** – optional JSON schema tooling for downstream automation once parity gaps close.

## License

This repository inherits the licenses of its dependencies and the upstream DeepSeek-OCR model. Refer to `DeepSeek-OCR/LICENSE` for model terms and apply the same restrictions to downstream use.