HTTP API Reference¶
Koharu exposes a local HTTP API under:
`http://127.0.0.1:<PORT>/api/v1`
This is the same API used by the desktop UI and the headless Web UI.
Runtime model¶
Important current behavior:
- the API is served by the same process as the GUI or headless runtime
- the server binds to `127.0.0.1` by default; use `--host` to bind elsewhere
- the API and MCP server share the same loaded project, models, and pipeline state
- when no `--port` is provided, Koharu chooses a random local port
- everything except `/api/v1/downloads`, `/api/v1/operations`, and `/api/v1/events` returns `503 Service Unavailable` until the app finishes bootstrapping
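Because most routes return `503` until bootstrapping finishes, a client can poll `/meta` before doing real work. A minimal sketch using only the standard library (the base URL is whatever host and port your Koharu instance is actually bound to):

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base: str, timeout: float = 30.0) -> bool:
    """Poll GET {base}/meta until the app stops returning 503 Service Unavailable."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base + "/meta", timeout=2) as resp:
                return resp.status == 200
        except urllib.error.HTTPError as exc:
            if exc.code != 503:
                raise  # a non-503 error is not "still bootstrapping"
        except urllib.error.URLError:
            pass  # server socket not up yet
        time.sleep(0.5)
    return False  # gave up waiting
```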
Resource model¶
The API is project-centric. A single project is open at a time and contains:
- a list of `Pages` indexed by `PageId`
- per-page `Nodes` (image layers, masks, text blocks) referenced by `NodeId`
- a content-addressed `Blob` store that holds raw image bytes by Blake3 hash
- a `Scene` snapshot built from those pieces, advanced by an `epoch` counter
- a history of `Op` mutations that can be undone or redone
Mutations always go through the history layer (`POST /history/apply`) so the scene, autosave, and event subscribers stay in sync.
Common response shapes¶
Frequently used response types include:
- `MetaInfo` — app version and ML device label
- `EngineCatalog` — installable engine ids per pipeline stage
- `ProjectSummary` — id, name, path, page count, last opened
- `SceneSnapshot` — `{ epoch, scene }`
- `LlmState` — current LLM load state (status, target, error)
- `LlmCatalog` — local + provider models grouped by family
- `JobSummary` — `{ id, kind, status, error }`
- `DownloadProgress` — package id, byte counts, status
Endpoints¶
Meta¶
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/meta` | get app version and active ML backend |
| `GET` | `/engines` | list registered pipeline engines per stage |
Fonts¶
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/fonts` | combined system + Google Fonts catalog for rendering |
| `GET` | `/google-fonts` | Google Fonts catalog as a standalone list |
| `POST` | `/google-fonts/{family}/fetch` | download and cache one Google Fonts family |
| `GET` | `/google-fonts/{family}/{file}` | serve the cached TTF/WOFF file |
Projects¶
Every project lives under the managed `{data.path}/projects/` directory; clients never supply filesystem paths.
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/projects` | list managed projects |
| `POST` | `/projects` | create a new project (body `{ name }`) |
| `POST` | `/projects/import` | extract a `.khr` archive into a fresh directory and open it |
| `PUT` | `/projects/current` | open a managed project by id |
| `DELETE` | `/projects/current` | close the current session |
| `POST` | `/projects/current/export` | export the current project; returns binary bytes |
`POST /projects/current/export` accepts `{ format, pages? }`, where `format` is one of `khr`, `psd`, `rendered`, `inpainted`. When the format produces multiple files, the response is `application/zip`.
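As a sketch, an export body under the `{ format, pages? }` shape above can be built like this (the page ids are illustrative), with a small helper for detecting the multi-file zip case:

```python
import json

# Request body per the { format, pages? } shape; page ids are illustrative.
export_body = json.dumps({"format": "rendered", "pages": [1, 2]})

def is_multifile(content_type: str) -> bool:
    """Multi-file exports come back as application/zip."""
    return content_type.split(";")[0].strip().lower() == "application/zip"
```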
Pages¶
| Method | Path | Purpose |
|---|---|---|
| `POST` | `/pages` | create pages from N uploaded image files (multipart) |
| `POST` | `/pages/from-paths` | Tauri-only fast path that imports by absolute path |
| `POST` | `/pages/{id}/image-layers` | add a Custom image node from an uploaded file |
| `PUT` | `/pages/{id}/masks/{role}` | upsert a mask node from raw PNG bytes |
| `GET` | `/pages/{id}/thumbnail` | get the page thumbnail (cached as WebP) |
`role` is `segment` or `brushInpaint`. `POST /pages` accepts an optional `replace=true` field; the import is filename-sorted using natural order.
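"Natural order" sorts embedded numbers numerically, so `page2.png` lands before `page10.png`. This sketch only illustrates the ordering; it is not the server's actual implementation:

```python
import re

def natural_key(name: str):
    """Split digit runs out so numeric parts compare as integers."""
    return [int(t) if t.isdigit() else t.lower() for t in re.split(r"(\d+)", name)]

files = ["page10.png", "page2.png", "page1.png"]
print(sorted(files, key=natural_key))  # ['page1.png', 'page2.png', 'page10.png']
```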
Scene and blobs¶
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/scene.json` | full scene snapshot for web/UI clients |
| `GET` | `/scene.bin` | postcard-encoded `Snapshot { epoch, scene }` for the Tauri client |
| `GET` | `/blobs/{hash}` | raw blob bytes by Blake3 hash |
`/scene.bin` includes the current epoch in the `x-koharu-epoch` response header.
History (mutations)¶
All scene mutations go through here. Each response returns `{ epoch }`.
| Method | Path | Purpose |
|---|---|---|
| `POST` | `/history/apply` | apply an `Op` (including `Op::Batch`) |
| `POST` | `/history/undo` | revert the last applied op |
| `POST` | `/history/redo` | re-apply the last undone op |
`Op` is the discriminated union that covers add/remove/update node, add/remove page, batch, and other scene transitions. The request body is the JSON-tagged variant.
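Since `Op` is JSON-tagged, the request body is the tagged variant itself. The variant and field names below are placeholders (consult the real `Op` schema); only the envelope shape, with the variant name as the single top-level key, is the point:

```python
import json

# Hypothetical variant and field names; illustrates only the tagged-enum shape.
op = {"updateNode": {"pageId": 1, "nodeId": 3, "text": "Hello"}}
body = json.dumps(op)
```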
Pipelines¶
| Method | Path | Purpose |
|---|---|---|
| `POST` | `/pipelines` | start a pipeline run as an operation |
Body fields:
- `steps` — engine ids to run in order (validated against the registry)
- `pages` — optional subset of `PageId`s; omit to process the whole project
- `region` — optional bounding box for the inpainter (repair-brush flow)
- `targetLanguage`, `systemPrompt`, `defaultFont` — optional per-run overrides
The response carries an `operationId`. Progress and completion arrive on `/events` as `JobStarted`, `JobProgress`, `JobWarning`, and `JobFinished`.
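Putting the fields together, a run request might look like this sketch (the engine ids here are assumptions; list the real ones via `GET /engines`):

```python
import json

run = {
    "steps": ["detector", "ocr", "translator", "renderer"],  # hypothetical engine ids
    "pages": [1, 2],            # optional PageId subset; omit for the whole project
    "targetLanguage": "en",     # optional per-run override
}
body = json.dumps(run)
```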
Operations¶
`/operations` is the unified registry for in-flight and recently completed jobs (pipelines + downloads).
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/operations` | snapshot of every in-flight or recent operation |
| `DELETE` | `/operations/{id}` | cancel a pipeline run; best-effort eviction for downloads |
Downloads¶
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/downloads` | snapshot of every active or recent download |
| `POST` | `/downloads` | start a model-package download (`{ modelId }`) |
`modelId` is a package id declared via `declare_hf_model_package!` (e.g. `"model:comic-text-detector:yolo-v5"`). The response is `{ operationId }`, reusing the package id.
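Starting a download is a one-field body; the package id below is the example from the text:

```python
import json

# Body for POST /downloads, using the package id quoted above.
body = json.dumps({"modelId": "model:comic-text-detector:yolo-v5"})
```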
LLM control¶
The loaded model is a singleton resource at `/llm/current`.
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/llm/current` | current state (status, target, error) |
| `PUT` | `/llm/current` | load the given target (local or provider) |
| `DELETE` | `/llm/current` | unload / release the model |
| `GET` | `/llm/catalog` | list available local + provider-backed models |
`PUT /llm/current` accepts an `LlmLoadRequest`:

- provider targets — `{ kind: "provider", providerId, modelId }`
- local targets — `{ kind: "local", modelId }`
- optional `options { temperature, maxTokens, customSystemPrompt }`
`PUT /llm/current` returns `204` once the load task is queued. The actual ready state is published as `LlmLoaded` on `/events`.
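The two target shapes can be sketched as plain JSON bodies. The model ids are illustrative, and placing `options` alongside the target fields is an assumption based on the list above:

```python
import json

provider_target = {
    "kind": "provider",
    "providerId": "openai",     # one of the built-in provider ids
    "modelId": "gpt-4o-mini",   # illustrative model id
    "options": {"temperature": 0.3, "maxTokens": 2048},  # optional; placement assumed
}
local_target = {"kind": "local", "modelId": "local-model-id"}  # illustrative id
bodies = [json.dumps(t) for t in (provider_target, local_target)]
```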
Config¶
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/config` | read the current `AppConfig` |
| `PATCH` | `/config` | apply a `ConfigPatch`; persists and broadcasts |
| `PUT` | `/config/providers/{id}/secret` | save (or overwrite) a provider's API key |
| `DELETE` | `/config/providers/{id}/secret` | clear a provider's stored API key |
`AppConfig` exposes top-level `data`, `http`, `pipeline`, and `providers`:

- `data.path` — local data directory used for runtime, model cache, and projects
- `http { connectTimeout, readTimeout, maxRetries }` — shared HTTP client used by downloads and provider-backed requests
- `pipeline { detector, fontDetector, segmenter, bubbleSegmenter, ocr, translator, inpainter, renderer }` — engine id selected for each stage
- `providers[] { id, baseUrl?, apiKey? }` — saved API keys round-trip as the redacted placeholder `"[REDACTED]"`, never the raw secret
Built-in provider ids: `openai`, `gemini`, `claude`, `deepseek`, `deepl`, `google-translate`, `caiyun`, `openai-compatible`.
API keys are stored in the platform credential store, not in `config.toml`. PATCHing `apiKey: ""` clears the saved key; PATCHing `"[REDACTED]"` leaves it unchanged. The dedicated `/config/providers/{id}/secret` routes are the explicit, non-PATCH way to manage one provider's secret.
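The PATCH semantics for `apiKey` can be captured as a tiny decision helper (a sketch of the rules stated above, not library code):

```python
def apikey_patch_effect(value: str) -> str:
    """Map a PATCHed apiKey value to its effect, per the rules above."""
    if value == "":
        return "clear"        # empty string clears the stored key
    if value == "[REDACTED]":
        return "unchanged"    # round-tripped placeholder is a no-op
    return "replace"          # any other string saves a new secret
```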
Events stream¶
Koharu exposes a Server-Sent Events stream at:
`GET /events`
Behavior:
- a fresh connection (no `Last-Event-ID` header) starts with a `Snapshot` event holding the current jobs and downloads registries
- on reconnect, the server replays buffered events with `seq > Last-Event-ID` in order; if the requested id has scrolled out of the ring, the server re-sends a `Snapshot`
- each live event is emitted with its `seq` as the SSE `id:` field
- a 15-second keep-alive is maintained
Event variants currently include:
- `Snapshot` — full state seed for fresh and lag-recovery clients
- `JobStarted`, `JobProgress`, `JobWarning`, `JobFinished` — pipeline job lifecycle
- `DownloadProgress` — package download progress ticks
- `ConfigChanged` — config was applied via `PATCH /config` or a secret route
- `LlmLoaded`, `LlmUnloaded` — LLM lifecycle transitions
- `SceneAdvanced` — emitted when a scene mutation advances the epoch
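A minimal way to consume the stream is to split on blank lines and track the `id:` field for reconnects. This sketch parses a buffered chunk and is not a spec-complete SSE implementation:

```python
def parse_sse(chunk: str):
    """Yield (event_id, data) pairs from a blank-line-delimited SSE chunk."""
    event_id, data = None, []
    for line in chunk.splitlines() + [""]:
        if line.startswith("id:"):
            event_id = line[3:].strip()   # the seq; echo back as Last-Event-ID
        elif line.startswith("data:"):
            data.append(line[5:].strip())
        elif line == "" and data:
            yield event_id, "\n".join(data)
            event_id, data = None, []

events = list(parse_sse('id: 41\ndata: {"type":"SceneAdvanced"}\n\n'))
# events == [('41', '{"type":"SceneAdvanced"}')]
```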
Typical workflow¶
The normal API order for one new project is:
1. `POST /projects` — create the project
2. `POST /pages` (or `/pages/from-paths` from Tauri) — import images
3. `PUT /llm/current` — load a translation model (local or provider)
4. `POST /pipelines` — kick off `detect → ocr → translate → inpaint → render`
5. tail `GET /events` until `JobFinished`
6. `POST /projects/current/export` with `format = "rendered"` or `"psd"`
For finer control, send `POST /history/apply` with explicit `Op` payloads instead of running a full pipeline.
If you want agent-oriented access instead of HTTP endpoint orchestration, see MCP Tools Reference.