@zexigh/docx: round-trip Word ↔ markdown for LM Studio

Two LM Studio tools to read and write .docx files locally: no cloud, no
vision model. Designed to slot into pipelines like
read_docx_to_markdown → anonymize_text → write_markdown_to_docx.
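
The pipeline above can be sketched as plain function composition. This is an illustration only: in LM Studio the tools are invoked by the model rather than imported as functions, so the `ReadDocx`, `Anonymize` and `WriteDocx` types and the `anonymizeDocx` helper are hypothetical. Only the tool names and the `markdown` / `warnings` field names come from this README; the field types are assumed.

```typescript
// Hypothetical shapes: field names follow the README, types are assumed.
interface ReadResult {
  markdown: string;
  warnings: string[];
}

type ReadDocx = (path: string) => Promise<ReadResult>;
type Anonymize = (text: string) => Promise<string>;
type WriteDocx = (path: string, markdown: string) => Promise<{ warnings: string[] }>;

// Chain read -> anonymize -> write and collect the warnings from both ends.
async function anonymizeDocx(
  tools: { read: ReadDocx; anonymize: Anonymize; write: WriteDocx },
  src: string,
  dst: string,
): Promise<string[]> {
  const { markdown, warnings: readWarnings } = await tools.read(src);
  const cleaned = await tools.anonymize(markdown);
  const { warnings: writeWarnings } = await tools.write(dst, cleaned);
  return [...readWarnings, ...writeWarnings];
}
```

Passing the tools in as arguments keeps the sketch independent of how the host actually dispatches tool calls, and makes each stage easy to replace with a stub.
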

Reads the .docx XML directly: it's a structural
conversion, not OCR. A 30-page document takes a fraction of a second.

How this differs from @zexigh/read-pdf: PDFs need a vision model
because they're images-on-paper; .docx is structured XML, so we read it
directly. The right tool for the right format.
directly. The right tool for the right format.read_docx_to_markdown(path, preserve_styles?, include_metadata?)Returns { markdown, warnings, source_chars, metadata }. Headings, bold,
italic, lists, blockquotes, links, and basic tables are preserved.
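
Because heading levels survive the conversion, the returned markdown can be used to build a document outline. The sketch below is a hedged illustration: the `ReadDocxResult` field names come from the signature above, but the field types are assumptions, `outline` is a hypothetical helper, and it assumes headings are emitted in ATX style (`#`, `##`, ...).

```typescript
// Field names come from the README; the field types are assumptions.
interface ReadDocxResult {
  markdown: string;
  warnings: string[];
  source_chars: number;
  metadata?: Record<string, unknown>;
}

interface OutlineEntry {
  level: number; // 1 for "#", 2 for "##", ...
  text: string;
}

// Collect ATX headings line by line. Lines such as "#tag" (no space after
// the hashes) are not treated as headings.
function outline(result: ReadDocxResult): OutlineEntry[] {
  const entries: OutlineEntry[] = [];
  for (const line of result.markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.*\S)\s*$/.exec(line);
    if (match) {
      entries.push({ level: match[1].length, text: match[2] });
    }
  }
  return entries;
}
```
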

write_markdown_to_docx(path, markdown, title?)

Single-call write. Returns { written: { path, bytes, paragraphs },
backup_path, warnings }. If the target file exists, the previous
version is copied to <root>/.lmstudio-fs-backup/<rel>.<ts>.bak before
being overwritten.
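
The backup location described above can be sketched as a small path computation. This is not the tool's actual code: `backupPathFor` is a hypothetical helper, it assumes `<rel>` is the target path relative to its allowed root, and it takes `<ts>` as a caller-supplied string because the real timestamp format is not documented here. POSIX-style paths are used to keep the example deterministic.

```typescript
import * as path from "node:path";

// Assumption: <rel> is the target path relative to its root, and <ts> is an
// opaque timestamp string chosen by the caller.
function backupPathFor(root: string, target: string, ts: string): string {
  const rel = path.posix.relative(root, target);
  // An empty or upward-pointing relative path means the target is not a file under root.
  if (rel === "" || rel === ".." || rel.startsWith("../")) {
    throw new Error(`${target} is not inside ${root}`);
  }
  return path.posix.join(root, ".lmstudio-fs-backup", `${rel}.${ts}.bak`);
}
```

Keeping the relative path inside the backup directory means two files with the same name in different subfolders do not overwrite each other's backups.
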

| Field | Default | Use |
|---|---|---|
| allowedPaths | [] | Roots under which .docx can be read or written. ~ is expanded. Set at least one. |
| defaultFontFamily | Calibri | Body font of generated docs. |
| defaultFontSizePt | 11 | Body size in points. |
| pageMargin | normal | normal = 2.54 cm, narrow = 1.27 cm, wide = 5.08 cm. |
| preserveListStyles | true | Keep numbered vs bulleted from the markdown source. |
| maxFileSizeMb | 50 | Reject .docx larger than this on read. |
| verboseLogging | false | Log each conversion. |

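
The configuration fields can be written down as a typed object with the documented defaults. The sketch below also shows one way an allowed-roots check with `~` expansion could work. It is an assumption-laden illustration, not the plugin's implementation: `DocxConfig`, `DEFAULTS`, `MARGIN_CM`, `expandHome` and `isAllowed` are hypothetical names, and the home directory is passed in explicitly so the example stays deterministic.

```typescript
import * as path from "node:path";

type PageMargin = "normal" | "narrow" | "wide";

interface DocxConfig {
  allowedPaths: string[];
  defaultFontFamily: string;
  defaultFontSizePt: number;
  pageMargin: PageMargin;
  preserveListStyles: boolean;
  maxFileSizeMb: number;
  verboseLogging: boolean;
}

// Defaults copied from the configuration table.
const DEFAULTS: DocxConfig = {
  allowedPaths: [],
  defaultFontFamily: "Calibri",
  defaultFontSizePt: 11,
  pageMargin: "normal",
  preserveListStyles: true,
  maxFileSizeMb: 50,
  verboseLogging: false,
};

// Margin presets in centimetres, as listed in the table.
const MARGIN_CM: Record<PageMargin, number> = { normal: 2.54, narrow: 1.27, wide: 5.08 };

// Expand a leading "~" to the given home directory (hypothetical helper).
function expandHome(p: string, home: string): string {
  if (p === "~") return home;
  return p.startsWith("~/") ? path.posix.join(home, p.slice(2)) : p;
}

// True when `target` is one of the allowed roots or sits beneath one.
// With the default empty allowedPaths, nothing is allowed.
function isAllowed(target: string, cfg: DocxConfig, home: string): boolean {
  const resolved = path.posix.resolve(expandHome(target, home));
  return cfg.allowedPaths.some((root) => {
    const base = path.posix.resolve(expandHome(root, home));
    const rel = path.posix.relative(base, resolved);
    return rel === "" || (rel !== ".." && !rel.startsWith("../"));
  });
}
```

Resolving both paths before comparing them means a target such as `~/Documents/../secret.docx` is checked against where it really points, not against its textual prefix.
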
A document written by write_markdown_to_docx and read back by
read_docx_to_markdown produces markdown that's structurally identical:
heading levels, list types, bold/italic, links and paragraph boundaries
are preserved.
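
One way to check "structurally identical" in a test is to reduce both markdown strings to a sequence of block kinds and compare those sequences. The sketch below is a simplified, hypothetical checker (`structureOf` and `sameStructure` are not part of the tools): it looks only at heading levels, list types and paragraph boundaries, and ignores inline formatting and links.

```typescript
// Reduce markdown to a list of block kinds: "h1".."h6", "bullet",
// "numbered", or "p" for a paragraph. Consecutive text lines without a
// blank line between them count as one paragraph.
function structureOf(markdown: string): string[] {
  const kinds: string[] = [];
  let inParagraph = false;
  for (const line of markdown.split("\n")) {
    const t = line.trim();
    if (t === "") {
      inParagraph = false;
      continue;
    }
    const heading = /^(#{1,6})\s/.exec(t);
    if (heading) {
      kinds.push(`h${heading[1].length}`);
      inParagraph = false;
    } else if (/^[-*+]\s/.test(t)) {
      kinds.push("bullet");
      inParagraph = false;
    } else if (/^\d+[.)]\s/.test(t)) {
      kinds.push("numbered");
      inParagraph = false;
    } else {
      if (!inParagraph) kinds.push("p");
      inParagraph = true;
    }
  }
  return kinds;
}

// Two documents have the same structure when their block kinds match in order.
function sameStructure(a: string, b: string): boolean {
  const ka = structureOf(a);
  const kb = structureOf(b);
  return ka.length === kb.length && ka.every((k, i) => k === kb[i]);
}
```
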

Known small losses, kept as-is for i1:

- Inline `code` round-trips as plain text (mammoth doesn't detect
  Consolas runs as code).
- A `> blockquote` round-trips as a single *italic* indented paragraph.
  Reason: mammoth's markdown writer (≤1.12) has no handler for `<blockquote>`;
  rendering blockquotes via italic+indent at least keeps paragraph
  boundaries intact.
- Fenced code blocks (```) round-trip as one monospace paragraph per line
  rather than a fenced block.

No images, no complex tables (cell merges, custom borders), no tracked
changes, no Word comments, no custom paragraph styles, no equations,
no footnotes. Each of these is targeted in a later iteration (see
PROMPT.md §11).

MIT.