# LLM Endpoints

Direct LLM chat with streaming support.
## Stream Chat

`POST /llm/stream`

Streams the LLM response as server-sent events (SSE).
```bash
curl -X POST http://localhost:3000/llm/stream \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Response (SSE stream):

```text
data: {"type":"content","text":"Hello"}
data: {"type":"content","text":"! How"}
data: {"type":"content","text":" can I"}
data: {"type":"content","text":" help?"}
data: {"type":"done"}With System Prompt
### With System Prompt

```bash
curl -X POST http://localhost:3000/llm/stream \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is TypeScript?"}
    ]
  }'
```

### Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `messages` | array | Yes | Chat messages |
| `model` | string | No | Override model |
| `temperature` | number | No | Sampling temperature (0-2) |
| `maxTokens` | number | No | Max response tokens |
| `stream` | boolean | No | Enable streaming (default: `true`) |
### Message Format

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}
```
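Putting the parameter table and the `Message` interface together, the full request body can be described with a single type. `StreamChatRequest` is a name introduced here for illustration only, not part of the API:

```typescript
// Request body implied by the parameter table above; Message repeats the
// interface from the Message Format section so this snippet stands alone.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface StreamChatRequest {
  messages: Message[];   // required: the chat messages
  model?: string;        // override the configured model
  temperature?: number;  // sampling temperature, 0-2
  maxTokens?: number;    // cap on response tokens
  stream?: boolean;      // defaults to true
}
```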
## List Models

`GET /llm/models`

Lists the available models.
```bash
curl http://localhost:3000/llm/models \
  -H "Authorization: Bearer <token>"
```

Response:
```json
{
  "models": [
    {
      "id": "gpt-4o",
      "provider": "openai",
      "name": "GPT-4o"
    },
    {
      "id": "claude-sonnet-4-20250514",
      "provider": "anthropic",
      "name": "Claude Sonnet"
    }
  ],
  "default": "claude-sonnet-4-20250514"
}
```
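A thin typed wrapper over this endpoint might look like the following; `ModelInfo`, `ModelsResponse`, and `listModels` are illustrative names that mirror the response shape above:

```typescript
// Types mirroring the /llm/models response shown above; names are
// illustrative, not part of the documented API surface.
interface ModelInfo {
  id: string;
  provider: string;
  name: string;
}

interface ModelsResponse {
  models: ModelInfo[];
  default: string;
}

async function listModels(token: string): Promise<ModelsResponse> {
  const res = await fetch("http://localhost:3000/llm/models", {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`listModels failed: ${res.status}`);
  return (await res.json()) as ModelsResponse;
}
```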
## Non-Streaming

`POST /llm/chat`

Non-streaming chat (returns the complete response).
```bash
curl -X POST http://localhost:3000/llm/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'
```

Response:
```json
{
  "content": "Hello! How can I help you today?",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "inputTokens": 10,
    "outputTokens": 15
  }
}
```
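For the non-streaming path, a small helper can return the parsed body directly. `ChatResponse` mirrors the JSON above; the helper name is illustrative:

```typescript
// Sketch of a non-streaming call to POST /llm/chat; ChatResponse mirrors
// the documented response body. Names here are illustrative.
interface ChatResponse {
  content: string;
  model: string;
  usage: { inputTokens: number; outputTokens: number };
}

async function chat(token: string, content: string): Promise<ChatResponse> {
  const res = await fetch("http://localhost:3000/llm/chat", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      messages: [{ role: "user", content }],
      stream: false,
    }),
  });
  if (!res.ok) throw new Error(`chat failed: ${res.status}`);
  return (await res.json()) as ChatResponse;
}
```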
## Error Responses

### Model Not Available

```json
{
  "error": "Model not available",
  "message": "The requested model is not configured"
}
```

### Rate Limited

```json
{
  "error": "Rate limited",
  "message": "LLM rate limit exceeded",
  "retryAfter": 60
}
```
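One plausible way to honor `retryAfter` on the client, assuming rate-limited responses arrive with HTTP status 429 (the status code itself is not documented above):

```typescript
// Sketch of retrying on rate limits using the retryAfter field from the
// error body. The 429 status code is an assumption, not documented above.
async function fetchWithRetry(
  input: string,
  init: RequestInit,
  maxAttempts = 3,
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await fetch(input, init);
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    const body = await res.json().catch(() => ({}));
    const waitSeconds =
      typeof body.retryAfter === "number" ? body.retryAfter : 1;
    await new Promise((r) => setTimeout(r, waitSeconds * 1000));
  }
}
```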
## Providers

Configured in `config.json`:

```json
{
  "agent": {
    "llm": {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "apiKey": "{{env.ANTHROPIC_API_KEY}}"
    }
  }
}
```

Supported providers:
- `openai` - GPT-4, GPT-4o, GPT-3.5
- `anthropic` - Claude 3.5, Claude 3
- `google` - Gemini Pro, Gemini Flash
- `lmstudio` - Local models
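The `{{env.ANTHROPIC_API_KEY}}` value suggests environment-variable substitution at config-load time. A sketch of how such a resolver might work in Node; the actual substitution logic is an assumption, and `resolveEnvPlaceholders` is a name invented here:

```typescript
// Illustrative expansion of {{env.VAR}} placeholders in config values.
// How the real loader performs substitution is not documented above.
function resolveEnvPlaceholders(value: string): string {
  return value.replace(/\{\{env\.([A-Z0-9_]+)\}\}/g, (_, name) => {
    const v = process.env[name];
    if (v === undefined) {
      throw new Error(`missing environment variable: ${name}`);
    }
    return v;
  });
}

// e.g. resolveEnvPlaceholders("{{env.ANTHROPIC_API_KEY}}") returns the
// key from the current environment.
```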