HTTP API Reference¶

Koharu exposes a local HTTP API under:

http://127.0.0.1:<PORT>/api/v1

This is the same API used by the desktop UI and the headless Web UI.

Runtime model¶

Important current behavior:

the API is served by the same process as the GUI or headless runtime
the server binds to 127.0.0.1 by default; use --host to bind elsewhere
the API and MCP server share the same loaded project, models, and pipeline state
when no --port is provided, Koharu chooses a random local port
everything except /api/v1/downloads, /api/v1/operations, and /api/v1/events returns 503 Service Unavailable until the app finishes bootstrapping
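
Because most routes answer 503 during startup, a client usually waits before issuing real requests. The following is a minimal sketch, not Koharu's own client code: it assumes that GET /meta answers 200 once bootstrapping has finished, and the helper names (api_base, probe_status, wait_until_ready) are hypothetical.

```python
import time
import urllib.error
import urllib.request


def api_base(port: int, host: str = "127.0.0.1") -> str:
    """Build the API root for an instance listening on host:port."""
    return f"http://{host}:{port}/api/v1"


def probe_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while the app is still bootstrapping
    except OSError:
        return 0  # connection refused, timeout, and similar failures


def wait_until_ready(base, attempts=30, delay=1.0,
                     probe=probe_status, sleep=time.sleep) -> bool:
    """Poll GET /meta until it answers 200 or the attempts run out.

    Assumption: /meta is one of the routes that returns 503 until ready.
    """
    for _ in range(attempts):
        if probe(f"{base}/meta") == 200:
            return True
        sleep(delay)
    return False
```

A caller would pass the port given to --port, for example wait_until_ready(api_base(4000)). The probe and sleep parameters exist only so the loop can be exercised without a running server.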

Resource model¶

The API is project-centric. A single project is open at a time and contains:

a list of Pages indexed by PageId
per-page Nodes (image layers, masks, text blocks) referenced by NodeId
a content-addressed Blob store that holds raw image bytes by Blake3 hash
a Scene snapshot built from those pieces, advanced by an epoch counter
a history of Op mutations that can be undone or redone

Mutations always go through the history layer (POST /history/apply) so the scene, autosave, and event subscribers stay in sync.

Common response shapes¶

Frequently used response types include:

MetaInfo — app version and ML device label
EngineCatalog — installable engine ids per pipeline stage
ProjectSummary — id, name, path, page count, last opened
SceneSnapshot — { epoch, scene }
LlmState — current LLM load state (status, target, error)
LlmCatalog — local + provider models grouped by family
JobSummary — { id, kind, status, error }
DownloadProgress — package id, byte counts, status

Endpoints¶

Meta¶

Method	Path	Purpose
`GET`	`/meta`	get app version and active ML backend
`GET`	`/engines`	list registered pipeline engines per stage

Fonts¶

Method	Path	Purpose
`GET`	`/fonts`	combined system + Google Fonts catalog for rendering
`GET`	`/google-fonts`	Google Fonts catalog as a standalone list
`POST`	`/google-fonts/{family}/fetch`	download and cache one Google Fonts family
`GET`	`/google-fonts/{family}/{file}`	serve the cached TTF/WOFF file

Projects¶

Every project lives under the managed {data.path}/projects/ directory; clients never supply filesystem paths.

Method	Path	Purpose
`GET`	`/projects`	list managed projects
`POST`	`/projects`	create a new project (body `{ name }`)
`POST`	`/projects/import`	extract a `.khr` archive into a fresh dir and open it
`PUT`	`/projects/current`	open a managed project by `id`
`DELETE`	`/projects/current`	close the current session
`POST`	`/projects/current/export`	export the current project; returns binary bytes

POST /projects/current/export accepts { format, pages? } where format is one of khr, psd, rendered, inpainted. When the format produces multiple files, the response is application/zip.
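
As an illustration of the export request, the sketch below builds the JSON body and checks whether a response is a ZIP archive. The helper names are hypothetical, and the element type of pages is left open because this page does not define how a PageId is serialized.

```python
import json

# The four formats listed above.
EXPORT_FORMATS = ("khr", "psd", "rendered", "inpainted")


def export_body(fmt: str, pages=None) -> bytes:
    """Encode the { format, pages? } body for POST /projects/current/export."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    body = {"format": fmt}
    if pages is not None:
        body["pages"] = list(pages)  # optional subset of page ids
    return json.dumps(body).encode("utf-8")


def is_zip_response(content_type: str) -> bool:
    """True when the Content-Type header signals a multi-file (ZIP) export."""
    return content_type.split(";")[0].strip().lower() == "application/zip"
```

Checking the Content-Type before writing the bytes to disk lets a client pick the right file extension for single-file and multi-file exports.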

Pages¶

Method	Path	Purpose
`POST`	`/pages`	create pages from N uploaded image files (multipart)
`POST`	`/pages/from-paths`	Tauri-only fast path that imports by absolute path
`POST`	`/pages/{id}/image-layers`	add a Custom image node from an uploaded file
`PUT`	`/pages/{id}/masks/{role}`	upsert a mask node from raw PNG bytes
`GET`	`/pages/{id}/thumbnail`	get the page thumbnail (cached as WebP)

role is either segment or brushInpaint. POST /pages accepts an optional replace=true field, and imported files are ordered by filename using natural sort order.
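
Natural order means that numeric runs in a filename compare by value, so page2.png sorts before page10.png. The sketch below shows one common way to get that ordering on the client side, for example to predict page order before uploading. It is an illustration only: Koharu's exact rules (such as case handling) are not specified here, and this version assumes case-insensitive comparison.

```python
import re


def natural_key(name: str):
    """Sort key that compares digit runs numerically and the rest as text."""
    # re.split with a capturing group alternates text (even index) and
    # digit runs (odd index), so each position always compares like with like.
    parts = re.split(r"(\d+)", name.lower())
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def natural_sorted(names):
    """Return the names in natural order."""
    return sorted(names, key=natural_key)
```

A plain lexicographic sort would place page10.png before page2.png, which is why the numeric runs are converted to integers.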

Scene and blobs¶

Method	Path	Purpose
`GET`	`/scene.json`	full scene snapshot for web/UI clients
`GET`	`/scene.bin`	postcard-encoded `Snapshot { epoch, scene }` for the Tauri client
`GET`	`/blobs/{hash}`	raw blob bytes by Blake3 hash

/scene.bin includes the current epoch in the x-koharu-epoch response header.

History (mutations)¶

All scene mutations go through here. Each response returns { epoch }.

Method	Path	Purpose
`POST`	`/history/apply`	apply an `Op` (including `Op::Batch`)
`POST`	`/history/undo`	revert the last applied op
`POST`	`/history/redo`	re-apply the last undone op

Op is the discriminated union that covers add/remove/update node, add/remove page, batch, and other scene transitions. The body is the JSON-tagged variant.
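
Since every history response returns { epoch } and a scene snapshot carries its own epoch, a client can compare the two to decide when its cached scene needs refreshing. The sketch below is a hypothetical client-side helper and assumes that the epoch only ever increases.

```python
class SceneCache:
    """Keeps the last fetched SceneSnapshot and tracks whether it is stale."""

    def __init__(self):
        self.epoch = -1   # nothing cached yet
        self.scene = None

    def store(self, snapshot: dict) -> None:
        """Remember a { epoch, scene } snapshot, e.g. from GET /scene.json."""
        self.epoch = snapshot["epoch"]
        self.scene = snapshot["scene"]

    def is_stale(self, reported_epoch: int) -> bool:
        """True when a reported epoch is newer than the cached snapshot.

        Assumption: epochs increase monotonically with each mutation.
        """
        return reported_epoch > self.epoch
```

After POST /history/apply, undo, or redo, the returned epoch can be passed to is_stale to decide whether to fetch the scene again.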

Pipelines¶

Method	Path	Purpose
`POST`	`/pipelines`	start a pipeline run as an operation

Body fields:

steps — engine ids to run in order (validated against the registry)
pages — optional subset of PageIds; omit to process the whole project
region — optional bounding box for the inpainter (repair-brush flow)
targetLanguage, systemPrompt, defaultFont — optional per-run overrides

The response carries an operationId. Progress and completion arrive on /events as JobStarted, JobProgress, JobWarning, and JobFinished.
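
The body fields above can be assembled as in the following sketch. The function name is hypothetical, region is passed through untouched because its exact shape is not documented on this page, and the engine ids in the usage note are placeholders rather than real registry entries.

```python
import json


def pipeline_body(steps, pages=None, region=None, target_language=None,
                  system_prompt=None, default_font=None) -> bytes:
    """Encode a POST /pipelines body, leaving out every unset optional field."""
    body = {"steps": list(steps)}  # engine ids, run in the given order
    optional = {
        "pages": None if pages is None else list(pages),
        "region": region,  # opaque here: the bounding-box shape is an assumption
        "targetLanguage": target_language,
        "systemPrompt": system_prompt,
        "defaultFont": default_font,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(body).encode("utf-8")
```

For example, pipeline_body(["<detector-id>", "<ocr-id>"], target_language="en") yields a body that processes the whole project, because pages is omitted.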

Operations¶

/operations is the unified registry for in-flight and recently-completed jobs (pipelines + downloads).

Method	Path	Purpose
`GET`	`/operations`	snapshot of every in-flight or recent operation
`DELETE`	`/operations/{id}`	cancel a pipeline run; best-effort eviction for downloads

Downloads¶

Method	Path	Purpose
`GET`	`/downloads`	snapshot of every active or recent download
`POST`	`/downloads`	start a model-package download (`{ modelId }`)

modelId is a package id declared via declare_hf_model_package! (e.g. "model:comic-text-detector:yolo-v5"). The response is { operationId }, where operationId reuses the package id.

LLM control¶

The loaded model is a singleton resource at /llm/current.

Method	Path	Purpose
`GET`	`/llm/current`	current state (status, target, error)
`PUT`	`/llm/current`	load the given target (local or provider)
`DELETE`	`/llm/current`	unload / release the model
`GET`	`/llm/catalog`	list available local + provider-backed models

PUT /llm/current accepts an LlmLoadRequest:

provider targets — { kind: "provider", providerId, modelId }
local targets — { kind: "local", modelId }
an optional options object: { temperature, maxTokens, customSystemPrompt }

PUT /llm/current returns 204 No Content once the load task is queued, not when the model is ready. The actual ready state is published as an LlmLoaded event on /events.
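
The two target shapes can be built as shown below. One part of this sketch is a guess: this page does not say how the target and the options are combined inside an LlmLoadRequest, so nesting them under target and options keys is an assumption, and all function names are hypothetical.

```python
import json

# Option names listed above; anything else is rejected by this sketch.
ALLOWED_OPTIONS = ("temperature", "maxTokens", "customSystemPrompt")


def local_target(model_id: str) -> dict:
    """Target for a locally available model."""
    return {"kind": "local", "modelId": model_id}


def provider_target(provider_id: str, model_id: str) -> dict:
    """Target for a provider-backed model."""
    return {"kind": "provider", "providerId": provider_id, "modelId": model_id}


def llm_load_request(target: dict, options=None) -> bytes:
    """Encode a body for PUT /llm/current.

    ASSUMPTION: the envelope is { target, options? }; verify against the
    real LlmLoadRequest schema before relying on it.
    """
    unknown = set(options or {}) - set(ALLOWED_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    body = {"target": target}
    if options:
        body["options"] = dict(options)
    return json.dumps(body).encode("utf-8")
```

Because the PUT only queues the load, a client would still watch /events for LlmLoaded before starting a translation step.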

Config¶

Method	Path	Purpose
`GET`	`/config`	read the current `AppConfig`
`PATCH`	`/config`	apply a `ConfigPatch`; persists and broadcasts
`PUT`	`/config/providers/{id}/secret`	save (or overwrite) a provider's API key
`DELETE`	`/config/providers/{id}/secret`	clear a provider's stored API key

AppConfig exposes top-level data, http, pipeline, and providers:

data.path — local data directory used for runtime, model cache, and projects
http { connectTimeout, readTimeout, maxRetries } — shared HTTP client used by downloads and provider-backed requests
pipeline { detector, fontDetector, segmenter, bubbleSegmenter, ocr, translator, inpainter, renderer } — engine id selected for each stage
providers[] { id, baseUrl?, apiKey? } — saved API keys round-trip as the redacted placeholder "[REDACTED]"; never the raw secret

Built-in provider ids:

openai
gemini
claude
deepseek
deepl
google-translate
caiyun
openai-compatible

API keys are stored in the platform credential store, not in config.toml. PATCHing apiKey: "" clears the saved key; PATCHing "[REDACTED]" leaves it unchanged. The dedicated /config/providers/{id}/secret routes are the explicit, non-PATCH way to manage one provider's secret.
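
The apiKey rules are easy to get wrong when a client reads the config, edits it, and sends it back. The hypothetical helper below spells them out. The "replaced" case is inferred from the fact that saved keys round-trip as the placeholder; it is not stated outright on this page.

```python
REDACTED = "[REDACTED]"  # placeholder returned in place of a saved key


def api_key_patch_effect(api_key: str) -> str:
    """Describe what PATCHing this apiKey value does to the stored secret."""
    if api_key == "":
        return "cleared"     # empty string clears the saved key
    if api_key == REDACTED:
        return "unchanged"   # echoing the placeholder keeps the saved key
    return "replaced"        # assumption: any other value is stored as the new key
```

This means a client can safely send back a provider entry exactly as it was read from GET /config without wiping the stored key.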

Events stream¶

Koharu exposes a Server-Sent Events stream at:

GET /events

Behavior:

a fresh connection (no Last-Event-ID header) starts with a Snapshot event holding the current jobs and downloads registries
on reconnect, the server replays buffered events with seq > Last-Event-ID in order; if the requested id is no longer in the ring buffer, the server re-sends a Snapshot
each live event is emitted with its seq as the SSE id: field
the server sends a keep-alive every 15 seconds

Event variants currently include:

Snapshot — full state seed for fresh and lag-recovery clients
JobStarted, JobProgress, JobWarning, JobFinished — pipeline job lifecycle
DownloadProgress — package download progress ticks
ConfigChanged — config was applied via PATCH /config or a secret route
LlmLoaded, LlmUnloaded — LLM lifecycle transitions
SceneAdvanced — emitted when a scene mutation advances the epoch
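
To resume after a dropped connection, a client has to remember the last id: it saw and send it back as Last-Event-ID. The sketch below parses a stream in the standard text/event-stream format and builds the reconnect headers. It returns each event's data as raw text, since the JSON layout of the event payloads is not described on this page, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (last_event_id, data) for each event in an SSE line sequence."""
    event_id: Optional[str] = None  # persists across events, as in the SSE spec
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                 # a blank line terminates one event
            if data:
                yield event_id, "\n".join(data)
            data = []
            continue
        if line.startswith(":"):       # comment line, often used as keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):      # one optional space after the colon
            value = value[1:]
        if field == "id":
            event_id = value
        elif field == "data":
            data.append(value)


def reconnect_headers(last_id: Optional[str]) -> dict:
    """Headers for GET /events; omit Last-Event-ID on a fresh connection."""
    headers = {"Accept": "text/event-stream"}
    if last_id is not None:
        headers["Last-Event-ID"] = last_id
    return headers
```

A client would feed decoded response lines into iter_sse, keep the most recent id, and pass it to reconnect_headers when reopening the stream. If the server answers with a Snapshot instead of a replay, the client should discard its local job and download state and rebuild it from that event.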

Typical workflow¶

The usual order of API calls for a new project is:

POST /projects — create the project
POST /pages (or /pages/from-paths from Tauri) — import images
PUT /llm/current — load a translation model (local or provider)
POST /pipelines — kick off detect → ocr → translate → inpaint → render
tail GET /events until JobFinished
POST /projects/current/export with format = "rendered" or "psd"

For finer control, send POST /history/apply requests with explicit Op payloads instead of running a full pipeline.
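
The JSON steps of this workflow can be sketched as a list of prepared requests. This is an illustration under several assumptions: bodies are sent as application/json, the caller supplies a complete LlmLoadRequest dict, and the multipart POST /pages upload and the /events wait are left out. The helper names are hypothetical, and nothing is sent until a request is passed to urllib.request.urlopen.

```python
import json
import urllib.request


def make_request(base: str, method: str, path: str, body=None):
    """Build (but do not send) a request against the API root."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(f"{base}{path}", data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")  # assumed encoding
    return req


def workflow_requests(base: str, project_name: str, llm_request: dict,
                      steps, export_format: str = "rendered"):
    """Return the JSON requests of the workflow, in the order to send them."""
    return [
        make_request(base, "POST", "/projects", {"name": project_name}),
        # POST /pages (multipart image upload) belongs here.
        make_request(base, "PUT", "/llm/current", llm_request),
        make_request(base, "POST", "/pipelines", {"steps": list(steps)}),
        # Wait for JobFinished on GET /events before exporting.
        make_request(base, "POST", "/projects/current/export",
                     {"format": export_format}),
    ]
```

Keeping request construction separate from sending makes the sequence easy to inspect or log before anything touches the running app.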

If you want agent-oriented access instead of HTTP endpoint orchestration, see MCP Tools Reference.