API overview
Vinci exposes an OpenAI-compatible HTTP API. If you’ve used the OpenAI API, you already know how to use Vinci — point your client at the Vinci base URL and use a Vinci key.
Base URL
https://vinci.getsimpledirect.com/api/v1Authentication
Every request needs a Vinci API key in the Authorization header:
Authorization: Bearer vinci_live_...Issue keys in the Vinci app under Account → API keys. Keys are shown once and stored only as a hash. Rotate or revoke anytime; a revoked key stops working immediately.
Keys are scoped to your account. Usage on a key counts against your account’s limits, the same as the web app.
Models
| Model id | Notes |
|---|---|
vinci-piccolo | 4B — fast and light; great default for coding and chat. |
vinci-bozza | 9B — balanced quality/speed. |
vinci-tela | 27B — most capable; reasoning. |
ds-v4-flash | Current hosted preview model. |
Vinci models are rolling out — the app’s model picker shows what’s live right now. An
unknown model falls back to the current default rather than erroring.
Streaming
Set "stream": true to receive Server-Sent Events of chat.completion.chunk objects,
terminated by data: [DONE] — the same format as OpenAI. See
Chat Completions.
Limits
- Free tier — a monthly token allowance shared across the web app and the API.
Exceeding it returns
402. - Concurrency — requests are admission-controlled; if the service is momentarily
saturated you get
429withRetry-After. Back off and retry.
Zero Data Retention
The API does not persist your message content. Prompts and completions are not
written to logs, analytics, or any datastore — only counts and metadata (model, token
totals, status) are recorded for metering. Inference runs in Montreal (ca-central-1);
prompt-transiting routes never touch US infrastructure.
The Vinci character
Vinci’s voice and operating principles are applied server-side on every request and
cannot be stripped by the client. If you send a system message, it is honored as
additional guidance layered on top of — not replacing — the Vinci character.
Errors
Errors use OpenAI’s shape: { "error": { "message": ..., "type": ... } }. See the
error reference.