Michael

@michaelg600

PDF to JSON and Markdown Output Review

Allemagne

Allemand, Anglais

Certaines informations sont présentées en anglais.

À propos de moi

I work on PDF and document parsing cleanup with Python. I turn existing parser output from tools like Docling or PyMuPDF into reviewable JSON blocks, clean Markdown, JSONL chunk records, and short quality reports. I focus on traceability: source PDF, page number, bounding box, section context, and parser provenance where available. I am best suited for small pilots, sample reviews, and structured output preparation, not OCR guarantees, compliance ownership, or full RAG system builds.... Plus d’infos

Compétences