HTTP API Reference
Koharu exposes a local HTTP API under:
http://127.0.0.1:<PORT>/api/v1
This is the same API used by the desktop UI and headless Web UI.
Runtime model
Important behavior from the current implementation:
- the API is served by the same process as the GUI or headless runtime
- the server binds to `127.0.0.1` by default
- the API and MCP server share the same loaded documents, models, and pipeline state
- when no `--port` is provided, Koharu chooses a random local port
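As a minimal sketch of talking to the API, the helpers below build URLs under the `/api/v1` root and fetch a path as JSON. The function names `api_url` and `get_json` are hypothetical, and decoding the body as JSON is an assumption about the response format rather than something this reference states.

```python
import json
import urllib.request


def api_url(port: int, path: str) -> str:
    """Build a URL under the local API root for the given port."""
    return f"http://127.0.0.1:{port}/api/v1/{path.lstrip('/')}"


def get_json(port: int, path: str):
    """GET an API path and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(api_url(port, path), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With Koharu started on a known port, for example via `--port 4000` (the number is only an illustration), `get_json(4000, "/meta")` would request `http://127.0.0.1:4000/api/v1/meta`.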
Common response shapes
Frequently used types include:
- `MetaInfo`: app version and ML device
- `DocumentSummary`: document id, name, size, revision, layer availability, and text-block count
- `DocumentDetail`: full document metadata plus text blocks
- `JobState`: current pipeline job progress
- `LlmState`: current LLM load state
- `ImportResult`: imported document count and summaries
- `ExportResult`: count of exported files
Endpoints
Meta and fonts
| Method | Path | Purpose |
|---|---|---|
| GET | `/meta` | get app version and active ML backend |
| GET | `/fonts` | list font families available for rendering |
Documents
| Method | Path | Purpose |
|---|---|---|
| GET | `/documents` | list loaded documents |
| POST | `/documents/import?mode=replace` | replace the current document set with uploaded images |
| POST | `/documents/import?mode=append` | append uploaded images to the current document set |
| GET | `/documents/{documentId}` | get one document and all text-block metadata |
| GET | `/documents/{documentId}/thumbnail` | get a thumbnail image |
| GET | `/documents/{documentId}/layers/{layer}` | fetch one image layer |
The import endpoint uses multipart form data with repeated `files` fields.
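One way to produce such a body with only the Python standard library is sketched below. The helper names `encode_files` and `import_request` are hypothetical, and the per-part `application/octet-stream` content type is an assumption; the server may expect or ignore a specific image type.

```python
import urllib.request
import uuid


def encode_files(files, field="files"):
    """Encode (filename, content) pairs as multipart/form-data.

    Every part reuses the same field name, so the field is repeated
    once per uploaded image.
    """
    boundary = uuid.uuid4().hex
    body = b""
    for filename, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
        )
        body += header.encode("utf-8") + content + b"\r\n"
    body += f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def import_request(port, files, mode="replace"):
    """Build (but do not send) the import POST; mode is 'replace' or 'append'."""
    if mode not in ("replace", "append"):
        raise ValueError(f"unsupported mode: {mode}")
    body, content_type = encode_files(files)
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/documents/import?mode={mode}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the response is expected to describe the import result.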
Document layers currently exposed by the implementation include:
- `original`
- `segment`
- `inpainted`
- `brush`
- `rendered`
Page pipeline
| Method | Path | Purpose |
|---|---|---|
| POST | `/documents/{documentId}/detect` | detect text blocks and layout |
| POST | `/documents/{documentId}/ocr` | run OCR on detected text blocks |
| POST | `/documents/{documentId}/inpaint` | remove original text using the current mask |
| POST | `/documents/{documentId}/render` | render translated text |
| POST | `/documents/{documentId}/translate` | generate translations for one block or the full page |
| PUT | `/documents/{documentId}/mask-region` | replace or update part of the segmentation mask |
| PUT | `/documents/{documentId}/brush-region` | write a patch into the brush layer |
| POST | `/documents/{documentId}/inpaint-region` | re-inpaint a rectangular region only |
Useful request details:
- `/render` accepts `textBlockId`, `shaderEffect`, `shaderStroke`, and `fontFamily`
- `/translate` accepts `textBlockId` and `language`
- `/mask-region` accepts `data` plus an optional `region`
- `/brush-region` accepts `data` plus a required `region`
- `/inpaint-region` accepts a rectangular `region`
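The request details above could be turned into calls roughly as follows. This is a sketch that assumes the fields travel in a JSON body, which this reference does not confirm; `page_step_request` is a hypothetical helper, and the value formats for `language` and `textBlockId` are not specified here.

```python
import json
import urllib.request


def page_step_request(port, document_id, step, **fields):
    """Build a POST for one page-pipeline step.

    Fields set to None are left out of the payload, so optional
    parameters such as textBlockId can simply be omitted.
    """
    payload = {key: value for key, value in fields.items() if value is not None}
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/documents/{document_id}/{step}",
        data=json.dumps(payload).encode("utf-8"),  # JSON body is an assumption
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, `page_step_request(4000, "doc1", "translate", language="English")` targets the whole page, while adding `textBlockId=...` would narrow the call to one block. The port, document id, and language value are placeholders.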
Text blocks
| Method | Path | Purpose |
|---|---|---|
| POST | `/documents/{documentId}/text-blocks` | create a new text block from `x`, `y`, `width`, `height` |
| PATCH | `/documents/{documentId}/text-blocks/{textBlockId}` | patch text, translation, box geometry, or style |
| DELETE | `/documents/{documentId}/text-blocks/{textBlockId}` | remove a text block |
The text-block patch shape currently includes:
- `text`
- `translation`
- `x`
- `y`
- `width`
- `height`
- `style`

`style` can include font families, font size, RGBA color, text alignment, italic and bold flags, and stroke configuration.
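A partial update against this patch shape might look like the sketch below. It assumes a JSON body and that omitted fields are left unchanged, neither of which is stated here; `text_block_patch` and `PATCH_FIELDS` are hypothetical names, and the keys inside `style` are deliberately not spelled out because this reference does not list them.

```python
import json
import urllib.request

# Top-level fields listed in the patch shape above.
PATCH_FIELDS = {"text", "translation", "x", "y", "width", "height", "style"}


def text_block_patch(port, document_id, text_block_id, **changes):
    """Build a PATCH request carrying only the fields being changed."""
    unknown = set(changes) - PATCH_FIELDS
    if unknown:
        raise ValueError(f"unknown patch fields: {sorted(unknown)}")
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/documents/{document_id}"
        f"/text-blocks/{text_block_id}",
        data=json.dumps(changes).encode("utf-8"),  # JSON body is an assumption
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```

Rejecting unknown keys on the client side catches typos before a request is sent.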
Export
| Method | Path | Purpose |
|---|---|---|
| GET | `/documents/{documentId}/export?layer=rendered` | export one rendered image |
| GET | `/documents/{documentId}/export?layer=inpainted` | export one inpainted image |
| GET | `/documents/{documentId}/export/psd` | export one layered PSD |
| POST | `/exports?layer=rendered` | export all rendered pages |
| POST | `/exports?layer=inpainted` | export all inpainted pages |
Single-document export endpoints return binary file content. Bulk export returns JSON with the number of files written.
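Because single-document exports return binary content, a client would typically stream the response to disk instead of decoding it. The sketch below assumes that approach; `export_url` and `download_export` are hypothetical helpers, and the output file extension is left to the caller because the image format is not stated here.

```python
import shutil
import urllib.request
from urllib.parse import urlencode

# Layers accepted by the single-document export endpoint in the table above.
EXPORT_LAYERS = ("rendered", "inpainted")


def export_url(port, document_id, layer="rendered"):
    """Build the single-document export URL for one layer."""
    if layer not in EXPORT_LAYERS:
        raise ValueError(f"unsupported export layer: {layer}")
    query = urlencode({"layer": layer})
    return f"http://127.0.0.1:{port}/api/v1/documents/{document_id}/export?{query}"


def download_export(port, document_id, dest_path, layer="rendered"):
    """Stream the binary export response into dest_path."""
    url = export_url(port, document_id, layer)
    with urllib.request.urlopen(url, timeout=60) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
```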
LLM control
| Method | Path | Purpose |
|---|---|---|
| GET | `/llm/models` | list local and API-backed translation models |
| GET | `/llm/state` | get the current LLM status |
| POST | `/llm/load` | load a local or API-backed model |
| POST | `/llm/offload` | unload the current model |
| POST | `/llm/ping` | test an OpenAI-compatible base URL |
Useful request details:
- `/llm/models` accepts optional `language` and `openaiCompatibleBaseUrl` query parameters
- `/llm/load` accepts `id`, `apiKey`, `baseUrl`, `temperature`, `maxTokens`, and `customSystemPrompt`
- `/llm/ping` accepts `baseUrl` and optional `apiKey`
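The difference between query parameters on `/llm/models` and body fields on `/llm/load` can be sketched as follows. The helper names are hypothetical, and sending the load fields as JSON is an assumption; model ids and base URLs in any usage are placeholders.

```python
import json
import urllib.request
from urllib.parse import urlencode


def models_url(port, language=None, openai_compatible_base_url=None):
    """Build the /llm/models URL, adding only the query parameters supplied."""
    params = {}
    if language is not None:
        params["language"] = language
    if openai_compatible_base_url is not None:
        params["openaiCompatibleBaseUrl"] = openai_compatible_base_url
    url = f"http://127.0.0.1:{port}/api/v1/llm/models"
    return f"{url}?{urlencode(params)}" if params else url


def load_request(port, model_id, **options):
    """Build the /llm/load POST.

    options may carry apiKey, baseUrl, temperature, maxTokens,
    and customSystemPrompt.
    """
    payload = {"id": model_id, **options}
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/llm/load",
        data=json.dumps(payload).encode("utf-8"),  # JSON body is an assumption
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

`urlencode` percent-encodes the base URL, which matters because it contains `:` and `/` characters.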
Provider API keys
| Method | Path | Purpose |
|---|---|---|
| GET | `/providers/{provider}/api-key` | read a saved API key for a provider |
| PUT | `/providers/{provider}/api-key` | store or overwrite a provider API key |
Current built-in provider ids include:
- `openai`
- `gemini`
- `claude`
- `deepseek`
- `openai-compatible`
Pipeline jobs
| Method | Path | Purpose |
|---|---|---|
| POST | `/jobs/pipeline` | start a full processing job |
| DELETE | `/jobs/{jobId}` | cancel a running pipeline job |
The pipeline job request can include:
- `documentId` to target one page, or omit it to process all loaded pages
- LLM settings such as `llmModelId`, `llmApiKey`, `llmBaseUrl`, `llmTemperature`, `llmMaxTokens`, and `llmCustomSystemPrompt`
- render settings such as `shaderEffect`, `shaderStroke`, and `fontFamily`
- `language`
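A job request following the list above could be assembled like this. The sketch assumes a JSON body and uses hypothetical helper names; how the job id is returned by `/jobs/pipeline` is not described here, so `cancel_job_request` simply takes an id the caller already has.

```python
import json
import urllib.request


def start_job_request(port, document_id=None, **settings):
    """Build the POST that starts a pipeline job.

    Leaving document_id as None omits documentId, which per the list
    above means all loaded pages are processed.
    """
    payload = dict(settings)
    if document_id is not None:
        payload["documentId"] = document_id
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/jobs/pipeline",
        data=json.dumps(payload).encode("utf-8"),  # JSON body is an assumption
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def cancel_job_request(port, job_id):
    """Build the DELETE that cancels a running job."""
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/jobs/{job_id}",
        method="DELETE",
    )
```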
Events stream
Koharu also exposes server-sent events at:
GET /events
Current event names are:
- `snapshot`
- `documents.changed`
- `document.changed`
- `job.changed`
- `download.changed`
- `llm.changed`
The stream sends an initial snapshot event and uses a 15-second keepalive.
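A client can consume the stream with a small server-sent events parser such as the sketch below. It follows the standard `event:` / `data:` framing and skips comment lines starting with `:`; whether Koharu sends its keepalive as such a comment is an assumption, as are the helper names and any payload shapes.

```python
import urllib.request


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event, if it carried any data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used for keepalives
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_events(port):
    """Open GET /events and yield parsed events until the connection closes."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/api/v1/events",
        headers={"Accept": "text/event-stream"},
    )
    with urllib.request.urlopen(req) as resp:
        yield from parse_sse(line.decode("utf-8") for line in resp)
```

Iterating `stream_events(port)` should yield the `snapshot` event first, followed by change events as they occur.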
Typical workflow
The normal API order for one page is:
1. `POST /documents/import?mode=replace`
2. `POST /documents/{documentId}/detect`
3. `POST /documents/{documentId}/ocr`
4. `POST /llm/load`
5. `POST /documents/{documentId}/translate`
6. `POST /documents/{documentId}/inpaint`
7. `POST /documents/{documentId}/render`
8. `GET /documents/{documentId}/export?layer=rendered`
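The order above can be captured as data so that a script runs the steps in sequence. The sketch below covers the steps that follow the import, since the document id only becomes known once the import has returned; `page_steps`, `run_page`, and the `send` callback are hypothetical names, and request bodies are left to the callback.

```python
def page_steps(document_id):
    """Return the per-page (method, path) steps that follow the import."""
    doc = f"/documents/{document_id}"
    return [
        ("POST", f"{doc}/detect"),
        ("POST", f"{doc}/ocr"),
        ("POST", "/llm/load"),
        ("POST", f"{doc}/translate"),
        ("POST", f"{doc}/inpaint"),
        ("POST", f"{doc}/render"),
        ("GET", f"{doc}/export?layer=rendered"),
    ]


def run_page(document_id, send):
    """Call send(method, path) for each step in order and collect the results.

    An exception raised by send stops the run, so later steps never
    execute against a page whose earlier step failed.
    """
    return [send(method, path) for method, path in page_steps(document_id)]
```

Keeping the transport behind `send` makes the ordering easy to test without a running server.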
If you want agent-oriented access instead of HTTP endpoint orchestration, see MCP Tools Reference.