Ingest
Plan or queue ingest from a public URL or an uploaded file (PDF, Markdown, plain text).
| Operation | Method | Path |
|---|---|---|
| URL ingest | POST | /v1/ingest/url |
| File ingest | POST | /v1/ingest/file |
| List ingest jobs | GET | /v1/ingest/jobs?limit=20 |
| Inspect a single job | GET | /v1/ingest/jobs/{jobId} |
All four require an API key with scope ingest.
Both POSTs are dry-run by default (dryRun: true). A live write sets dryRun: false and requires distill: true plus a datedAt value in YYYY-MM-DD form. It returns 202 Accepted with a queued job; the ingest worker then drains the queue, fetches the artefact, distills and embeds it, and writes the validated graph rows.
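The dry-run and live rules above can be sketched as a small client-side body builder. This is a minimal illustration, not part of the API: the function name `build_url_ingest_body` and its `live` and `dated_at` parameters are hypothetical, and the server remains the authority on what it accepts.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Strict YYYY-MM-DD shape; strptime alone would also accept "2026-5-12".
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_url_ingest_body(url: str, live: bool = False,
                          dated_at: Optional[str] = None) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/ingest/url (hypothetical helper)."""
    if not live:
        # Dry-run: the API returns a plan and persists nothing.
        return {"url": url, "dryRun": True}
    if dated_at is None or not DATE_RE.fullmatch(dated_at):
        raise ValueError("live ingest needs datedAt in YYYY-MM-DD form")
    # Raises ValueError for impossible dates such as 2026-02-30.
    datetime.strptime(dated_at, "%Y-%m-%d")
    # Live write: distill must be true and datedAt must be present.
    return {"url": url, "dryRun": False, "distill": True, "datedAt": dated_at}
```

Calling `build_url_ingest_body("https://example.com/my-bio")` yields a dry-run body; passing `live=True` without a valid `dated_at` fails locally instead of costing a round trip that would end in 400 invalid_ingest_request.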
URL ingest
Request
POST /v1/ingest/url
{
"url": "https://example.com/my-bio",
"dryRun": false,
"distill": true,
"datedAt": "2026-05-12",
"analyzeStyle": false,
"distillerProvider": "anthropic",
"distillerModel": "claude-haiku-4-5",
"focus": "cross-functional delivery",
"focusContext": 3,
"idempotencyKey": "bio-2026-05-12"
}
| Field | Type | Default | Notes |
|---|---|---|---|
| url | string | | http or https. Localhost and private addresses are blocked |
| dryRun | boolean | true | When true, returns a plan and does not persist anything |
| distill | boolean | false | Required true for live ingest |
| datedAt | string | | YYYY-MM-DD. Required for live ingest |
| analyzeStyle | boolean | false | Sample broader source spans for style evidence |
| distillLimit | integer | | Cap the number of notes distilled in this run |
| distillerProvider | enum | env | "openai" or "anthropic" |
| distillerModel | string | env | Provider-specific model id |
| focus | string | | Clip the fetched page around this term before distillation |
| focusContext | integer | 3 | Lines of context to keep around each focus hit |
| idempotencyKey | string | | 8 ≤ length ≤ 160. Identical key returns the existing queued job instead of duplicating it |
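A client can catch some of these constraints before sending a request. The sketch below is a hypothetical pre-flight check (`preflight_errors` is not part of the API) covering the url scheme, literal localhost and private IP hosts, and the idempotencyKey length. It assumes an empty key means "not supplied", and it cannot tell whether a DNS name resolves to a private address, so the server-side check still applies.

```python
import ipaddress
from typing import List
from urllib.parse import urlparse


def preflight_errors(url: str, idempotency_key: str = "") -> List[str]:
    """Return a list of problems found locally; empty means none were found."""
    errors: List[str] = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        errors.append("url must use http or https")
    host = parsed.hostname or ""
    if host == "localhost":
        errors.append("localhost is blocked")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name (or no host); only the server can resolve it
        if ip is not None and (ip.is_private or ip.is_loopback):
            errors.append("private addresses are blocked")
    # Assumption: an empty key means the caller did not supply one.
    if idempotency_key and not 8 <= len(idempotency_key) <= 160:
        errors.append("idempotencyKey must be 8 to 160 characters")
    return errors
```

For example, `preflight_errors("https://example.com/my-bio", "bio-2026-05-12")` returns an empty list, while a URL such as `http://10.0.0.5/x` is flagged as a private address.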
Response: 200 OK (dry-run)
{
"schemaVersion": "marrow-url-ingest-plan-v1",
"mode": "dry-run",
"plan": { /* notes, claims, projects, entities, facets, edges, warnings */ }
}
Response: 202 Accepted (live)
{
"schemaVersion": "marrow-ingest-job-v1",
"mode": "queued",
"created": true,
"job": { "id": "5d2a…", "status": "queued", "kind": "url" }
}
created: false indicates the idempotencyKey already had a job; the existing job is returned.
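One way a client might tell the three outcomes apart (a plan, a newly queued job, or an existing job matched by idempotencyKey) is sketched below. The function name and the outcome labels are hypothetical; only the status codes and the mode, created, and job.id fields come from the responses above.

```python
from typing import Any, Dict, Tuple


def summarize_ingest_response(status_code: int,
                              body: Dict[str, Any]) -> Tuple[str, str]:
    """Return (outcome, detail) for a URL ingest response (hypothetical helper)."""
    if status_code == 200 and body.get("mode") == "dry-run":
        # Dry-run: nothing was persisted; detail is the plan's schema version.
        return ("planned", body.get("schemaVersion", ""))
    if status_code == 202 and body.get("mode") == "queued":
        # created: false means the idempotencyKey matched an existing job.
        outcome = "queued" if body.get("created") else "already-queued"
        return (outcome, body["job"]["id"])
    raise ValueError(f"unexpected ingest response: {status_code}")
```

A caller that retries a live request with the same idempotencyKey can treat "already-queued" as success and keep polling the returned job id.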
File ingest
Request
POST /v1/ingest/file
{
"filename": "profile.md",
"contentType": "text/markdown",
"contentBase64": "IyBQcm9maWxlCg==",
"dryRun": false,
"distill": true,
"datedAt": "2026-05-12"
}
Accepted content types include application/pdf, text/markdown, and text/plain. The file size cap is enforced by the API server; oversize uploads return 413 Payload Too Large.
Other fields mirror URL ingest: analyzeStyle, distillLimit, distillerProvider, distillerModel, idempotencyKey.
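Building the file request body mostly comes down to base64-encoding the bytes. The sketch below assumes standard (not URL-safe) base64 with padding, which matches the sample contentBase64 value. The helper name is hypothetical, and the local content-type check covers only the three types named above; the server may accept others.

```python
import base64
from typing import Any, Dict

# Only the types named in this section; the server's list may be longer.
KNOWN_TYPES = {"application/pdf", "text/markdown", "text/plain"}


def build_file_ingest_body(filename: str, content_type: str, data: bytes,
                           dry_run: bool = True) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/ingest/file (hypothetical helper)."""
    if content_type not in KNOWN_TYPES:
        raise ValueError(f"content type not known to this client: {content_type}")
    return {
        "filename": filename,
        "contentType": content_type,
        # Standard base64 with padding, decoded to str for JSON serialization.
        "contentBase64": base64.b64encode(data).decode("ascii"),
        "dryRun": dry_run,
    }
```

Encoding the bytes `b"# Profile\n"` this way reproduces the `IyBQcm9maWxlCg==` value shown in the request example. A live request would additionally set distill and datedAt as described for URL ingest.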
Response
- 200 OK (dry-run) → marrow-file-ingest-plan-v1
- 202 Accepted (live) → marrow-ingest-job-v1 with mode: "queued"
Live file ingest persists the raw artefact first (Supabase Storage, R2, or local in dev), then queues the job.
List ingest jobs
GET /v1/ingest/jobs?limit=20
{
"schemaVersion": "marrow-ingest-job-list-v1",
"jobs": [
{ "id": "5d2a…", "kind": "url", "status": "succeeded", "createdAt": "…", "finishedAt": "…" }
],
"pagination": { "limit": 20, "returned": 1, "hasMore": false, "nextCursor": null }
}
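The pagination object suggests cursor-based paging. How the cursor is sent back to the server is not documented here, so the sketch below leaves that to a caller-supplied `fetch_page` function (hypothetical), which receives the previous nextCursor, or None for the first page, and returns one decoded response body.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_jobs(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every job across pages, following pagination.nextCursor."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["jobs"]
        pagination = page["pagination"]
        cursor = pagination.get("nextCursor")
        # Stop when the server reports no more pages or gives no cursor.
        if not pagination.get("hasMore") or cursor is None:
            return
```

Keeping the HTTP call outside the generator means the loop can be exercised with canned pages, and the request parameter that carries the cursor only has to be filled in in one place once it is known.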
Get a single job
GET /v1/ingest/jobs/{jobId}
{
"schemaVersion": "marrow-ingest-job-v1",
"job": {
"id": "5d2a…",
"kind": "url",
"status": "succeeded",
"attempts": 1,
"sourceRunId": "f1a3…",
"error": null,
"createdAt": "…",
"finishedAt": "…"
}
}
Terminal statuses are succeeded, failed, and quarantined. quarantined means the run produced material that did not connect to any known anchor (profile owner, project, publication, organization, repository, tool); the rows are kept under source_kind=quarantine and not surfaced to read endpoints by default.
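Because a live ingest is asynchronous, a client typically polls the job until it reaches one of the terminal statuses. The sketch below is one way to do that; `wait_for_job`, its polling interval, and its poll limit are illustrative choices rather than documented values, and `fetch_job` is a caller-supplied function returning the decoded body of GET /v1/ingest/jobs/{jobId}.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"succeeded", "failed", "quarantined"}


def wait_for_job(fetch_job: Callable[[], Dict[str, Any]],
                 interval_s: float = 2.0, max_polls: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the job is terminal, then return the job object."""
    for _ in range(max_polls):
        job = fetch_job()["job"]
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(interval_s)  # injectable so tests do not actually wait
    raise TimeoutError("job did not reach a terminal status")
```

A caller would usually treat quarantined separately from succeeded, since quarantined rows are not surfaced to read endpoints by default.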
Errors
| Code | error.code | Reason |
|---|---|---|
| 400 | invalid_ingest_request | Live ingest missing distill: true or datedAt |
| 401 | unauthorized | Missing or invalid API key |
| 403 | forbidden | API key lacks scope ingest |
| 404 | not_found | Job id does not exist for this account |
| 413 | payload_too_large | File upload exceeded the configured size cap |
| 422 | validation_error | Body or query failed validation |
| 429 | rate_limit_exceeded | Rate limit hit |
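A client might turn these responses into a single exception type, as in the sketch below. It assumes the error body nests the code as `{"error": {"code": ...}}`, which the error.code column header suggests but this section does not show. The class and function names are hypothetical, and treating only 429 as retryable is a common convention rather than documented behaviour.

```python
from typing import Any, Dict


class IngestApiError(Exception):
    """Raised for 4xx/5xx ingest responses (hypothetical client-side type)."""

    def __init__(self, status_code: int, code: str) -> None:
        super().__init__(f"{status_code} {code}")
        self.status_code = status_code
        self.code = code
        # Assumption: only rate limiting is worth retrying unchanged.
        self.retryable = status_code == 429


def raise_for_ingest_error(status_code: int, body: Dict[str, Any]) -> None:
    """Do nothing for success responses; raise IngestApiError otherwise."""
    if status_code < 400:
        return
    # Assumed body shape: {"error": {"code": "..."}}.
    error = body.get("error") or {}
    raise IngestApiError(status_code, error.get("code", "unknown"))
```

Errors such as 400 invalid_ingest_request or 422 validation_error point at the request itself, so resending the same body would not be expected to help.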
CLI mapping
npm run dev -- api ingest-url --dated-at 2026-05-12 https://example.com/profile
npm run dev -- api ingest-url --yes --dated-at 2026-05-12 https://example.com/profile
npm run dev -- api ingest-file --dated-at 2026-05-12 ./profile.md
npm run dev -- api ingest-file --yes --dated-at 2026-05-12 ./profile.md
npm run dev -- api ingest-jobs
npm run dev -- api ingest-job <job_id>