Chat Completions
POST https://vinci.getsimpledirect.com/api/v1/chat/completions
Generate a model response for a conversation. OpenAI-compatible.
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | yes | A Vinci model id (e.g. vinci-piccolo). Unknown ids fall back to the default. |
| messages | array | yes | { role, content } items. role is system, user, or assistant. |
| stream | boolean | no | true to stream chunks. Defaults to false. |
| temperature | number | no | Sampling temperature. |
| max_tokens | number | no | Max tokens to generate. |
System messages. A system message you send is treated as additional instructions layered on top of the Vinci character: it can shape the response but cannot override Vinci’s core behavior.
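To make the request body concrete, here is a minimal Python sketch that assembles a payload with an optional system message. The helper name build_chat_payload and its client-side checks are assumptions made for this example, derived from the table above; they are not part of the Vinci API, and the server remains the authority on what it accepts.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_payload(model, messages, system=None, stream=False,
                       temperature=None, max_tokens=None):
    """Assemble a chat completions request body (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("each message needs a content field")

    # Assumption: the system message is placed first, following the usual
    # OpenAI convention. The source does not specify an ordering.
    full_messages = list(messages)
    if system is not None:
        full_messages.insert(0, {"role": "system", "content": system})

    payload = {"model": model, "messages": full_messages, "stream": stream}
    # Optional fields are included only when the caller sets them.
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


payload = build_chat_payload(
    "vinci-piccolo",
    [{"role": "user", "content": "Write a haiku about Canada."}],
    system="Answer in French.",
    max_tokens=64,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized and sent as the JSON body of the POST request. The system text here only adds instructions on top of the Vinci character, as described above.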
Response — non-streaming
{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "created": 1781981326,
  "model": "vinci-piccolo",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "…" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 42, "completion_tokens": 18, "total_tokens": 60 }
}
Response — streaming
With "stream": true, the response is a text/event-stream of chat.completion.chunk objects, terminated by a data: [DONE] line:
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":42,"completion_tokens":18,"total_tokens":60}}
data: [DONE]
Examples
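Python (parsing the stream by hand): a minimal sketch of how the data: lines shown above could be decoded without an SDK. The function names iter_sse_chunks and collect are hypothetical, the sample lines mirror the documented events with a placeholder id, and the parser assumes each event fits on a single data: line, which matches the examples but is not a full Server-Sent Events implementation.

```python
import json


def iter_sse_chunks(lines):
    """Yield decoded chat.completion.chunk dicts from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)


def collect(lines):
    """Join content deltas; also return the finish reason and usage, if sent."""
    parts, finish_reason, usage = [], None, None
    for chunk in iter_sse_chunks(lines):
        # The trailing usage chunk carries an empty choices list.
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
        usage = chunk.get("usage") or usage
    return "".join(parts), finish_reason, usage


# Sample events modeled on the documented stream ("chatcmpl-example" is a placeholder id).
sample = [
    'data: {"id":"chatcmpl-example","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    '',
    'data: {"id":"chatcmpl-example","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    '',
    'data: {"id":"chatcmpl-example","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    '',
    'data: {"id":"chatcmpl-example","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":42,"completion_tokens":18,"total_tokens":60}}',
    '',
    'data: [DONE]',
]

text, finish_reason, usage = collect(sample)
print(text, finish_reason, usage["total_tokens"])
```

In a real client, the lines argument would be the decoded lines of the HTTP response body. The guard for an empty choices list matters because the usage chunk arrives after the chunk that carries finish_reason.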
curl (streaming):
curl -N https://vinci.getsimpledirect.com/api/v1/chat/completions \
  -H "Authorization: Bearer $VINCI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vinci-piccolo",
    "stream": true,
    "messages": [{ "role": "user", "content": "Write a haiku about Canada." }]
  }'
Python (streaming):
from openai import OpenAI

# Point the OpenAI SDK at the Vinci endpoint.
client = OpenAI(
    base_url="https://vinci.getsimpledirect.com/api/v1",
    api_key="vinci_live_...",
)

stream = client.chat.completions.create(
    model="vinci-piccolo",
    messages=[{"role": "user", "content": "Write a haiku about Canada."}],
    stream=True,
)

for chunk in stream:
    # The final usage chunk has an empty choices list, so guard the index.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
Errors
Error responses are OpenAI-shaped: { "error": { "message": "...", "type": "..." } }.
| Status | type | Meaning |
|---|---|---|
| 400 | invalid_request_error | Malformed JSON, or messages missing/empty. |
| 401 | authentication_error | Missing/invalid/revoked API key. |
| 402 | insufficient_quota | Free-tier limit reached for the month. |
| 429 | rate_limit_error | Service momentarily saturated; retry. |
| 502 | api_error | Upstream model error. |
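As a sketch of how a client might act on this table, the Python below raises a typed exception for error responses and retries only on 429, the one status the table marks as retryable. The names VinciAPIError, raise_for_error, and call_with_retry are hypothetical, the backoff schedule is an assumption rather than documented guidance, and send stands in for whatever function performs the HTTP request and returns a (status, parsed JSON body) pair.

```python
import time

RETRYABLE_STATUSES = {429}  # the table marks only rate_limit_error as "retry"


class VinciAPIError(Exception):
    """Carries the HTTP status and the OpenAI-shaped error fields."""

    def __init__(self, status, error_type, message):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type


def raise_for_error(status, body):
    """Return body on success; raise VinciAPIError for 4xx/5xx responses."""
    if status < 400:
        return body
    error = (body or {}).get("error", {})
    raise VinciAPIError(status, error.get("type", "unknown"), error.get("message", ""))


def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() -> (status, body), retrying 429 with exponential backoff."""
    for attempt in range(max_attempts):
        status, body = send()
        if status in RETRYABLE_STATUSES and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # assumed schedule: 0.5s, 1s, 2s
            continue
        return raise_for_error(status, body)


# Simulated exchange: one 429, then a successful completion.
responses = iter([
    (429, {"error": {"message": "busy", "type": "rate_limit_error"}}),
    (200, {"choices": [{"index": 0,
                        "message": {"role": "assistant", "content": "ok"},
                        "finish_reason": "stop"}]}),
])
delays = []
body = call_with_retry(lambda: next(responses), sleep=delays.append)
print(body["choices"][0]["message"]["content"], delays)
```

Injecting sleep keeps the example fast and testable. Statuses such as 401 and 402 are raised immediately, since retrying cannot fix a bad key or an exhausted quota.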