
3 posts tagged with "responses_api"


Krrish Dholakia
Ishaan Jaffer

Deploy this version​

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.67.4-stable

Key Highlights​

  • Improved User Management: This release enables search and filtering across users, keys, teams, and models.
  • Responses API Load Balancing: Route requests across provider regions and ensure session continuity.
  • UI Session Logs: Group several requests to LiteLLM into a session.

Improved User Management​


This release makes it easier to manage users and keys on LiteLLM. You can now search and filter across users, keys, teams, and models, and control user settings more easily.

New features include:

  • Search for users by email, ID, role, or team.
  • See all of a user's models, teams, and keys in one place.
  • Change user roles and model access right from the Users Tab.

These changes help you spend less time on user setup and management on LiteLLM.

Responses API Load Balancing​


This release introduces load balancing for the Responses API, allowing you to route requests across provider regions and ensure session continuity. It works as follows (see the sketch after this list):

  • If a previous_response_id is provided, LiteLLM routes the request to the original deployment that generated the prior response, ensuring session continuity.
  • If no previous_response_id is provided, LiteLLM load-balances requests across your available deployments.
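
Here's a minimal client-side sketch using the OpenAI Python SDK pointed at a LiteLLM proxy; the base URL, key, and model name are placeholders, not values from this release:

from openai import OpenAI

# Placeholders: your proxy address and a LiteLLM virtual key.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# No previous_response_id: LiteLLM load-balances across deployments.
first = client.responses.create(
    model="gpt-4o",
    input="Remember this number: 42.",
)

# With previous_response_id: LiteLLM routes to the deployment that
# generated the prior response, preserving session continuity.
follow_up = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="What number did I ask you to remember?",
)
print(follow_up.output_text)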

Read more

UI Session Logs​


This release allows you to group requests to the LiteLLM proxy into a session. If you specify a litellm_session_id in your request, LiteLLM automatically groups all logs with the same session id, so you can easily track usage and request content per session.
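
A minimal sketch, assuming the session id is passed in the request body via the OpenAI SDK's extra_body; the proxy URL and key are placeholders:

import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

session_id = "session-abc-123"  # any stable identifier you choose

# Both requests carry the same litellm_session_id, so their logs are
# grouped into one session.
for question in ["What is LiteLLM?", "How do I deploy it?"]:
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
        extra_body={"litellm_session_id": session_id},
    )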

Read more

New Models / Updated Models​

  • OpenAI
    1. Added gpt-image-1 cost tracking Get Started
    2. Bug fix: added cost tracking for gpt-image-1 when quality is unspecified PR
  • Azure
    1. Fixed timestamp granularities passing to whisper in Azure Get Started
    2. Added azure/gpt-image-1 pricing Get Started, PR
    3. Added cost tracking for azure/computer-use-preview, azure/gpt-4o-audio-preview-2024-12-17, azure/gpt-4o-mini-audio-preview-2024-12-17 PR
  • Bedrock
    1. Added support for all compatible Bedrock parameters when model="arn:.." (Bedrock application inference profile models) Get started, PR
    2. Fixed wrong system prompt transformation PR
  • VertexAI / Google AI Studio
    1. Allow setting budget_tokens=0 for gemini-2.5-flash Get Started, PR
    2. Ensure returned usage includes thinking token usage PR
    3. Added cost tracking for gemini-2.5-pro-preview-03-25 PR
  • Cohere
    1. Added support for cohere command-a-03-2025 Get Started, PR
  • SageMaker
    1. Added support for max_completion_tokens parameter Get Started, PR
  • Responses API
    1. Added support for GET and DELETE operations - /v1/responses/{response_id} Get Started (see the sketch after this list)
    2. Added session management support for non-OpenAI models PR
    3. Added routing affinity to maintain model consistency within sessions Get Started, PR
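
As referenced above, a sketch of the new GET and DELETE operations through the OpenAI SDK against a LiteLLM proxy; URL, key, and model are placeholders:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.responses.create(model="gpt-4o", input="Hello!")

# GET /v1/responses/{response_id}
fetched = client.responses.retrieve(resp.id)
print(fetched.status)

# DELETE /v1/responses/{response_id}
client.responses.delete(resp.id)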

Spend Tracking Improvements​

  • Bug Fix: Fixed spend tracking bug, ensuring default litellm params aren't modified in memory PR
  • Deprecation Dates: Added deprecation dates for Azure, VertexAI models PR

Management Endpoints / UI​

Users​

  • Filtering and Searching:

    • Filter users by user_id, role, team, sso_id
    • Search users by email

  • User Info Panel: Added a new user information pane PR

    • View teams, keys, models associated with User
    • Edit user role, model permissions

Teams​

  • Filtering and Searching:

    • Filter teams by Organization, Team ID PR
    • Search teams by Team Name PR

Keys​

  • Key Management:
    • Support for cross-filtering and filtering by key hash PR
    • Fixed key alias reset when resetting filters PR
    • Fixed table rendering on key creation PR

UI Logs Page​

  • Session Logs: View request logs grouped by session (see UI Session Logs above)

UI Authentication & Security​

  • Required Authentication: Authentication now required for all dashboard pages PR
  • SSO Fixes: Fixed SSO user login invalid token error PR
  • [BETA] Encrypted Tokens: Moved UI to encrypted token usage PR
  • Token Expiry: Support token refresh by re-routing to login page (fixes issue where expired token would show a blank page) PR

UI General fixes​

  • Fixed UI Flicker: Addressed UI flickering issues in Dashboard PR
  • Improved Terminology: Better loading and no-data states on Keys and Tools pages PR
  • Azure Model Support: Fixed editing Azure public model names and changing model names after creation PR
  • Team Model Selector: Bug fix for team model selection PR

Logging / Guardrail Integrations​

  • Datadog:
    1. Fixed Datadog LLM observability logging Get Started, PR
  • Prometheus / Grafana:
    1. Enable datasource selection on LiteLLM Grafana Template Get Started, PR
  • AgentOps:
    1. Added AgentOps Integration Get Started, PR
  • Arize:
    1. Added missing attributes for Arize & Phoenix Integration Get Started, PR

General Proxy Improvements​

  • Caching: Fixed caching to account for thinking or reasoning_effort when calculating cache key PR
  • Model Groups: Fixed handling for cases where user sets model_group inside model_info PR
  • Passthrough Endpoints: Ensured PassthroughStandardLoggingPayload is logged with method, URL, request/response body PR
  • Fix SQL Injection: Fixed potential SQL injection vulnerability in spend_management_endpoints.py PR

Helm​

  • Fixed serviceAccountName on migration job PR

Full Changelog​

The complete list of changes can be found in the GitHub release notes.

Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.63.11-stable.

This release brings:

  • LLM Translation Improvements (MCP Support and Bedrock Application Profiles)
  • Perf improvements for Usage-based Routing
  • Streaming guardrail support via websockets
  • Azure OpenAI client perf fix (from previous release)

Docker Run LiteLLM Proxy​

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.63.14-stable.patch1

Demo Instance​

Here's a Demo Instance to test changes:

New Models / Updated Models​

  • Azure gpt-4o - fixed pricing to latest global pricing - PR
  • O1-Pro - add pricing + model information - PR
  • Azure AI - mistral 3.1 small pricing added - PR
  • Azure - gpt-4.5-preview pricing added - PR

LLM Translation​

  1. New LLM Features
  • Bedrock: Support bedrock application inference profiles Docs (see the sketch after this list)
    • Infer AWS region from the Bedrock application profile id - (arn:aws:bedrock:us-east-1:...)
  • Ollama - support calling via /v1/completions Get Started
  • Bedrock - support us.deepseek.r1-v1:0 model name Docs
  • OpenRouter - OPENROUTER_API_BASE env var support Docs
  • Azure - add audio model parameter support - Docs
  • OpenAI - PDF File support Docs
  • OpenAI - o1-pro Responses API streaming support Docs
  • [BETA] MCP - Use MCP Tools with LiteLLM SDK Docs
  2. Bug Fixes
  • Voyage: prompt token on embedding tracking fix - PR
  • Sagemaker - Fix 'Too little data for declared Content-Length' error - PR
  • OpenAI-compatible models - fix issue when calling openai-compatible models w/ custom_llm_provider set - PR
  • VertexAI - Embedding 'outputDimensionality' support - PR
  • Anthropic - return consistent json response format on streaming/non-streaming - PR
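
For the Bedrock application inference profile support above, a minimal sketch with the LiteLLM SDK; the ARN is made up, and AWS credentials are assumed to be configured in the environment:

import litellm

# LiteLLM infers the AWS region (us-east-1) from the profile ARN itself.
response = litellm.completion(
    model="bedrock/arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/example-profile-id",
    messages=[{"role": "user", "content": "Hello from Bedrock!"}],
)
print(response.choices[0].message.content)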

Spend Tracking Improvements​

  • litellm_proxy/ - support reading the LiteLLM response cost header from the proxy when using the client SDK (see the sketch after this list)
  • Reset Budget Job - fix budget reset error on keys/teams/users PR
  • Streaming - Prevents final chunk w/ usage from being ignored (impacted bedrock streaming + cost tracking) PR
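
A sketch of reading the response cost header client-side; the header name x-litellm-response-cost is an assumption here, and the URL/key are placeholders:

import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# with_raw_response exposes HTTP headers alongside the parsed body.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
)
print("cost:", raw.headers.get("x-litellm-response-cost"))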

UI​

  1. Users Page
    • Feature: Control default internal user settings PR
  2. Icons:
    • Feature: Replace external "artificialanalysis.ai" icons with local SVGs PR
  3. Sign In/Sign Out
    • Fix: Default login when default_user_id user does not exist in DB PR

Logging Integrations​

  • Support post-call guardrails for streaming responses Get Started
  • Arize Get Started
    • fix invalid package import PR
    • migrate to using standardloggingpayload for metadata, ensures spans land successfully PR
    • fix logging to just log the LLM I/O PR
    • Dynamic API Key/Space param support Get Started
  • StandardLoggingPayload - Log litellm_model_name in payload. Allows knowing what the model sent to API provider was Get Started
  • Prompt Management - Allow building custom prompt management integration Get Started

Performance / Reliability improvements​

  • Redis Caching - add 5s default timeout, prevents a hanging Redis connection from impacting LLM calls PR
  • Allow disabling all spend updates / writes to DB - patch to allow disabling all spend updates to DB with a flag PR
  • Azure OpenAI - correctly re-use azure openai client, fixes perf issue from previous Stable release PR
  • Azure OpenAI - uses litellm.ssl_verify on Azure/OpenAI clients PR
  • Usage-based routing - Wildcard model support Get Started (see the sketch after this list)
  • Usage-based routing - Support batch writing increments to redis - reduces latency to the same as 'simple-shuffle' PR
  • Router - show reason for model cooldown on 'no healthy deployments available' error PR
  • Caching - add max value limit for an item in the in-memory cache (1MB) - prevents OOM errors when large image URLs are sent through the proxy PR
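
For the wildcard model support above, a sketch of a Router using usage-based routing; the model names and API key are placeholders:

from litellm import Router

router = Router(
    model_list=[
        {
            # Wildcard group: matches any openai/<model> request.
            "model_name": "openai/*",
            "litellm_params": {"model": "openai/*", "api_key": "sk-..."},
        },
    ],
    routing_strategy="usage-based-routing-v2",
)

response = router.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
)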

General Improvements​

  • Passthrough Endpoints - support returning api-base on pass-through endpoints Response Headers Docs
  • SSL - support reading ssl security level from env var - Allows user to specify lower security settings Get Started
  • Credentials - only poll Credentials table when STORE_MODEL_IN_DB is True PR
  • Image URL Handling - new architecture doc on image url handling Docs
  • OpenAI - bump to pip install "openai==1.68.2" PR
  • Gunicorn - security fix - bump gunicorn==23.0.0 PR

Complete Git Diff​

Here's the complete git diff

Krrish Dholakia
Ishaan Jaffer

These are the changes since v1.63.2-stable.

This release is primarily focused on:

  • [Beta] Responses API Support
  • Snowflake Cortex Support, Amazon Nova Image Generation
  • UI - Credential Management, re-use credentials when adding new models
  • UI - Test Connection to LLM Provider before adding a model

Known Issues​

  • 🚨 Known issue on Azure OpenAI - We don't recommend upgrading if you use Azure OpenAI. This version failed our Azure OpenAI load test

Docker Run LiteLLM Proxy​

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.63.11-stable

Demo Instance​

Here's a Demo Instance to test changes:

New Models / Updated Models​

  • Image Generation support for Amazon Nova Canvas Getting Started (see the sketch after this list)
  • Add pricing for Jamba new models PR
  • Add pricing for Amazon EU models PR
  • Add Bedrock Deepseek R1 model pricing PR
  • Update Gemini pricing: Gemma 3, Flash 2 thinking update, LearnLM PR
  • Mark Cohere Embedding 3 models as Multimodal PR
  • Add Azure Data Zone pricing PR
    • LiteLLM Tracks cost for azure/eu and azure/us models
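
For the Amazon Nova Canvas support above, a minimal sketch with the LiteLLM SDK; AWS credentials are assumed to be configured in the environment:

import litellm

response = litellm.image_generation(
    model="bedrock/amazon.nova-canvas-v1:0",
    prompt="A watercolor painting of a lighthouse at dusk",
)
# The generated image is returned base64-encoded.
print(len(response.data[0].b64_json or ""))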

LLM Translation​

  1. New Endpoints
  2. New LLM Providers
  3. New LLM Features
  4. Bug Fixes
  • OpenAI: Return code, param and type on bad request error More information on litellm exceptions
  • Bedrock: Fix converse chunk parsing to only return empty dict on tool use PR
  • Bedrock: Support extra_headers PR
  • Azure: Fix Function Calling Bug & Update Default API Version to 2025-02-01-preview PR
  • Azure: Fix AI services URL PR
  • Vertex AI: Handle HTTP 201 status code in response PR
  • Perplexity: Fix incorrect streaming response PR
  • Triton: Fix streaming completions bug PR
  • Deepgram: Support bytes.IO when handling audio files for transcription PR
  • Ollama: Fixed the "system" role has become unacceptable error PR
  • All Providers (Streaming): Fixed the string "data:" being stripped from content in streamed responses PR

Spend Tracking Improvements​

  1. Support Bedrock converse cache token tracking Getting Started
  2. Cost Tracking for Responses API Getting Started
  3. Fix Azure Whisper cost tracking Getting Started

UI​

Re-Use Credentials on UI​

You can now onboard LLM provider credentials on the LiteLLM UI. Once these credentials are added, you can re-use them when adding new models Getting Started

Test Connections before adding models​

Before adding a model, you can test the connection to the LLM provider to verify you have set up your API Base + API Key correctly

General UI Improvements​

  1. Add Models Page
    • Allow adding Cerebras, Sambanova, Perplexity, Fireworks, OpenRouter, TogetherAI, and Text-Completion OpenAI models on the Admin UI
    • Allow adding EU OpenAI models
    • Fix: Instantly show edit + deletes to models
  2. Keys Page
    • Fix: Instantly show newly created keys on Admin UI (don't require refresh)
    • Fix: Allow clicking into Top Keys when showing users Top API Key
    • Fix: Allow Filter Keys by Team Alias, Key Alias and Org
    • UI Improvements: Show 100 Keys Per Page, Use full height, increase width of key alias
  3. Users Page
    • Fix: Show correct count of internal user keys on Users Page
    • Fix: Metadata not updating in Team UI
  4. Logs Page
    • UI Improvements: Keep expanded log in focus on LiteLLM UI
    • UI Improvements: Minor improvements to logs page
    • Fix: Allow internal user to query their own logs
    • Allow switching off storing Error Logs in DB Getting Started
  5. Sign In/Sign Out

Security​

  1. Support for Rotating Master Keys Getting Started
  2. Fix: Internal User Viewer Permissions, don't allow internal_user_viewer role to see Test Key Page or Create Key Button More information on role based access controls
  3. Emit audit logs on All user + model Create/Update/Delete endpoints Getting Started
  4. JWT
    • Support multiple JWT OIDC providers Getting Started
    • Fix JWT access with Groups not working when team is assigned All Proxy Models access
  5. Using K/V pairs in 1 AWS Secret Getting Started

Logging Integrations​

  1. Prometheus: Track Azure LLM API latency metric Getting Started
  2. Athina: Added tags, user_feedback and model_options to additional_keys which can be sent to Athina Getting Started

Performance / Reliability improvements​

  1. Redis + litellm router - Fix Redis cluster mode for litellm router PR

General Improvements​

  1. OpenWebUI Integration - display thinking tokens

Complete Git Diff​

Here's the complete git diff