API overview

Vinci exposes an OpenAI-compatible HTTP API. If you’ve used the OpenAI API, you already know how to use Vinci — point your client at the Vinci base URL and use a Vinci key.

Base URL


https://vinci.getsimpledirect.com/api/v1

Authentication

Every request needs a Vinci API key in the Authorization header:


Authorization: Bearer vinci_live_...

Issue keys in the Vinci app under Account → API keys. Keys are shown once and stored only as a hash. Rotate or revoke anytime; a revoked key stops working immediately.

Keys are scoped to your account. Usage on a key counts against your account’s limits, the same as the web app.

Models

Model id	Notes
`vinci-piccolo`	4B — fast and light; great default for coding and chat.
`vinci-bozza`	9B — balanced quality/speed.
`vinci-tela`	27B — most capable; reasoning.
`ds-v4-flash`	Current hosted preview model.

Vinci models are rolling out — the app’s model picker shows what’s live right now. An unknown model falls back to the current default rather than erroring.

Streaming

Set "stream": true to receive Server-Sent Events of chat.completion.chunk objects, terminated by data: [DONE] — the same format as OpenAI. See Chat Completions.

Limits

Free tier — a monthly token allowance shared across the web app and the API. Exceeding it returns 402.
Concurrency — requests are admission-controlled; if the service is momentarily saturated you get 429 with Retry-After. Back off and retry.

Zero Data Retention

The API does not persist your message content. Prompts and completions are not written to logs, analytics, or any datastore — only counts and metadata (model, token totals, status) are recorded for metering. Inference runs in Montreal (ca-central-1); prompt-transiting routes never touch US infrastructure.

The Vinci character

Vinci’s voice and operating principles are applied server-side on every request and cannot be stripped by the client. If you send a system message, it is honored as additional guidance layered on top of — not replacing — the Vinci character.

Errors

Errors use OpenAI’s shape: { "error": { "message": ..., "type": ... } }. See the error reference.