API Gateway
The llm.port Gateway provides an OpenAI-compatible interface for chat and embedding workloads.
Supported endpoint patterns
- GET /v1/models
- POST /v1/chat/completions
- POST /v1/embeddings
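For example, assuming the Gateway is reachable at http://localhost:4000 (as in the example request later in this section), listing the models the Gateway exposes is a quick way to verify connectivity and authentication:

curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer <token>"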
Integration model
- Point your SDK or app to the Gateway base URL
- Authenticate with your API token
- Send requests using model aliases (see the embeddings example after this list)
- Monitor usage and policy outcomes in the admin console
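The same pattern applies to embedding workloads. A sketch, assuming an embedding alias such as embedding-default has been configured (the alias name here is illustrative):

curl http://localhost:4000/v1/embeddings \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embedding-default",
    "input": "Text to embed"
  }'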
Why model aliases matter
Aliases decouple application code from provider-specific model names, so you can switch providers or move a model to a different runtime without changing the application.
Streaming support
The Gateway supports Server-Sent Events (SSE) streaming, so response tokens are delivered incrementally as they are generated rather than after the full response completes.
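A sketch of a streamed request, reusing the local Gateway and token from the other examples: setting "stream": true requests SSE delivery, and curl's -N flag disables output buffering so chunks print as they arrive.

curl -N http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assistant-default",
    "messages": [{"role": "user", "content": "Summarize this ticket"}],
    "stream": true
  }'

Each event arrives as a data: line containing a JSON chunk; OpenAI-compatible streams typically end with data: [DONE].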
Production guidance
- Use per-environment API tokens (see the sketch after this list)
- Enforce least-privilege access
- Validate model and policy configuration before rollout
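One way to keep tokens separated per environment is to source them from the environment rather than hard-coding them in scripts; the variable name GATEWAY_API_TOKEN below is illustrative, not a name the Gateway requires.

# Set a different token per environment (e.g. staging vs. production) via your secret manager or CI
export GATEWAY_API_TOKEN="<staging-token>"

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $GATEWAY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "assistant-default", "messages": [{"role": "user", "content": "Summarize this ticket"}]}'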
Example request
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assistant-default",
    "messages": [{"role": "user", "content": "Summarize this ticket"}],
    "stream": false
  }'