DigiNews

Tech Watch Articles


Show HN: Ocrbase – pdf → .md/.json document OCR and structured extraction API

Quality: 7/10 Relevance: 9/10

Summary

Ocrbase is an OCR and structured data extraction API that converts PDFs to Markdown or JSON. It is built on PaddleOCR with LLM-powered parsing, and it provides a type-safe TypeScript SDK with React hooks, real-time updates over WebSocket, and a self-hosting option for scalable document processing.

🚀 Service built by Johan Denoyer