
Krrish Dholakia
Ishaan Jaffer

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.4-stable

Key Highlights

  • Improved User Management: This release enables search and filtering across users, keys, teams, and models.
  • Responses API Load Balancing: Route requests across provider regions and ensure session continuity.
  • UI Session Logs: Group several requests to LiteLLM into a session.

Improved User Management


This release makes it easier to manage users and keys on LiteLLM. You can now search and filter across users, keys, teams, and models, and control user settings more easily.

New features include:

  • Search for users by email, ID, role, or team.
  • See all of a user's models, teams, and keys in one place.
  • Change user roles and model access right from the Users Tab.

These changes help you spend less time on user setup and management on LiteLLM.

Responses API Load Balancing


This release introduces load balancing for the Responses API, allowing you to route requests across provider regions and ensure session continuity. It works as follows:

  • If a previous_response_id is provided, LiteLLM will route the request to the original deployment that generated the prior response — ensuring session continuity.
  • If no previous_response_id is provided, LiteLLM will load-balance requests across your available deployments.
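
Here is a minimal sketch of what this looks like from the client side, using the OpenAI Python SDK pointed at a LiteLLM proxy (the base URL, key, and model name are placeholder assumptions):

from openai import OpenAI

# Point the OpenAI SDK at your LiteLLM proxy (placeholder URL and key).
client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-key")

# First request: no previous_response_id, so LiteLLM load-balances it
# across the available deployments for this model group.
first = client.responses.create(
    model="gpt-4o",
    input="Write a haiku about load balancing.",
)

# Follow-up request: previous_response_id routes the call back to the
# deployment that produced the first response, preserving the session.
follow_up = client.responses.create(
    model="gpt-4o",
    input="Now translate it to French.",
    previous_response_id=first.id,
)
print(follow_up.output_text)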

Read more

UI Session Logs


This release allows you to group requests to the LiteLLM proxy into a session. If you specify a litellm_session_id in your request, LiteLLM automatically groups all logs with that session ID into the same session, making it easy to track usage and request content per session.
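
A minimal sketch of tagging two requests with the same session ID, assuming the proxy accepts litellm_session_id as an extra body parameter (the URL and key are placeholders):

import uuid
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-key")

session_id = str(uuid.uuid4())  # one ID shared by every request in the session

# Both calls carry the same litellm_session_id, so LiteLLM groups their
# logs into a single session.
for prompt in ["What is LiteLLM?", "How do I deploy it?"]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        extra_body={"litellm_session_id": session_id},
    )
    print(response.choices[0].message.content)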

Read more

New Models / Updated Models

  • OpenAI
    1. Added gpt-image-1 cost tracking Get Started
    2. Bug fix: added cost tracking for gpt-image-1 when quality is unspecified PR
  • Azure
    1. Fixed timestamp granularities passing to whisper in Azure Get Started
    2. Added azure/gpt-image-1 pricing Get Started, PR
    3. Added cost tracking for azure/computer-use-preview, azure/gpt-4o-audio-preview-2024-12-17, azure/gpt-4o-mini-audio-preview-2024-12-17 PR
  • Bedrock
    1. Added support for all compatible Bedrock parameters when model="arn:.." (Bedrock application inference profile models) Get started, PR
    2. Fixed wrong system prompt transformation PR
  • VertexAI / Google AI Studio
    1. Allow setting budget_tokens=0 for gemini-2.5-flash Get Started, PR
    2. Ensure returned usage includes thinking token usage PR
    3. Added cost tracking for gemini-2.5-pro-preview-03-25 PR
  • Cohere
    1. Added support for cohere command-a-03-2025 Get Started, PR
  • SageMaker
    1. Added support for max_completion_tokens parameter Get Started, PR
  • Responses API
    1. Added support for GET and DELETE operations - /v1/responses/{response_id} Get Started
    2. Added session management support for non-OpenAI models PR
    3. Added routing affinity to maintain model consistency within sessions Get Started, PR

Spend Tracking Improvements

  • Bug Fix: Fixed spend tracking bug, ensuring default litellm params aren't modified in memory PR
  • Deprecation Dates: Added deprecation dates for Azure, VertexAI models PR

Management Endpoints / UI

Users

  • Filtering and Searching:

    • Filter users by user_id, role, team, sso_id
    • Search users by email

  • User Info Panel: Added a new user information pane PR

    • View teams, keys, models associated with User
    • Edit user role, model permissions

Teams

  • Filtering and Searching:

    • Filter teams by Organization, Team ID PR
    • Search teams by Team Name PR

Keys

  • Key Management:
    • Support for cross-filtering and filtering by key hash PR
    • Fixed key alias reset when resetting filters PR
    • Fixed table rendering on key creation PR

UI Logs Page

UI Authentication & Security

  • Required Authentication: Authentication now required for all dashboard pages PR
  • SSO Fixes: Fixed SSO user login invalid token error PR
  • [BETA] Encrypted Tokens: Moved UI to encrypted token usage PR
  • Token Expiry: Support token refresh by re-routing to login page (fixes issue where expired token would show a blank page) PR

UI General fixes

  • Fixed UI Flicker: Addressed UI flickering issues in Dashboard PR
  • Improved Terminology: Better loading and no-data states on Keys and Tools pages PR
  • Azure Model Support: Fixed editing Azure public model names and changing model names after creation PR
  • Team Model Selector: Bug fix for team model selection PR

Logging / Guardrail Integrations

  • Datadog:
    1. Fixed Datadog LLM observability logging Get Started, PR
  • Prometheus / Grafana:
    1. Enable datasource selection on LiteLLM Grafana Template Get Started, PR
  • AgentOps:
    1. Added AgentOps Integration Get Started, PR
  • Arize:
    1. Added missing attributes for Arize & Phoenix Integration Get Started, PR

General Proxy Improvements

  • Caching: Fixed caching to account for thinking or reasoning_effort when calculating cache key PR
  • Model Groups: Fixed handling for cases where user sets model_group inside model_info PR
  • Passthrough Endpoints: Ensured PassthroughStandardLoggingPayload is logged with method, URL, request/response body PR
  • Fix SQL Injection: Fixed potential SQL injection vulnerability in spend_management_endpoints.py PR

Helm

  • Fixed serviceAccountName on migration job PR

Full Changelog

The complete list of changes can be found in the GitHub release notes.

Krrish Dholakia
Ishaan Jaffer

Key Highlights

  • SCIM Integration: Enables identity providers (Okta, Azure AD, OneLogin, etc.) to automate user and team (group) provisioning, updates, and deprovisioning
  • Team and Tag based usage tracking: You can now see usage and spend by team and tag at 1M+ spend logs.
  • Unified Responses API: Support for calling Anthropic, Gemini, Groq, etc. via OpenAI's new Responses API.

Let's dive in.

SCIM Integration

This release adds SCIM support to LiteLLM. This allows your SSO provider (Okta, Azure AD, etc.) to automatically create, update, and delete users, teams, and memberships on LiteLLM. For example, when you remove a team on your SSO provider, the corresponding team is automatically deleted on LiteLLM.
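
As an illustration, a SCIM client (normally your identity provider) talks to the proxy over standard SCIM v2 REST endpoints. The sketch below lists users, assuming the endpoints live under /scim/v2 and accept a LiteLLM admin key as a bearer token (the URL and key are placeholders):

import requests

PROXY_BASE_URL = "http://localhost:4000"   # placeholder proxy URL
ADMIN_KEY = "sk-litellm-admin-key"         # placeholder admin key

# List users the way an identity provider would during a sync,
# via the assumed SCIM v2 Users endpoint.
resp = requests.get(
    f"{PROXY_BASE_URL}/scim/v2/Users",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
)
resp.raise_for_status()
for user in resp.json().get("Resources", []):
    print(user.get("userName"))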

Read more

Team and Tag based usage tracking

This release improves team and tag based usage tracking at 1M+ spend logs, making it easy to monitor your LLM API spend in production. This covers:

  • View daily spend by teams + tags
  • View usage / spend by key, within teams
  • View spend by multiple tags
  • Allow internal users to view spend of teams they're a member of
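
For example, the new /team/daily/activity API (described in the Management Endpoints section below) can be queried directly. The date parameters in this sketch are assumptions about the query interface, and the URL and key are placeholders:

import requests

PROXY_BASE_URL = "http://localhost:4000"   # placeholder proxy URL
ADMIN_KEY = "sk-litellm-admin-key"         # placeholder admin key

# Fetch aggregated daily team spend; parameter names are assumed.
resp = requests.get(
    f"{PROXY_BASE_URL}/team/daily/activity",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    params={"start_date": "2025-04-01", "end_date": "2025-04-20"},
)
resp.raise_for_status()
print(resp.json())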

Read more

Unified Responses API

This release allows you to call Azure OpenAI, Anthropic, AWS Bedrock, and Google Vertex AI models via the POST /v1/responses endpoint on LiteLLM. This means you can now use popular tools like OpenAI Codex with your own models.
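
A minimal sketch of calling a non-OpenAI model through the unified endpoint, using the OpenAI SDK against a LiteLLM proxy (the URL, key, and model name are placeholder assumptions for a proxy with an Anthropic deployment configured):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-litellm-key")

# POST /v1/responses works the same regardless of the underlying provider.
response = client.responses.create(
    model="anthropic/claude-3-7-sonnet-20250219",
    input="Summarize the benefits of a unified Responses API.",
)
print(response.output_text)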

Read more

New Models / Updated Models

  • OpenAI
    1. gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o4-mini pricing - Get Started, PR
    2. o4 - correctly map o4 to openai o_series model
  • Azure AI
    1. Phi-4 output cost per token fix - PR
    2. Responses API support Get Started, PR
  • Anthropic
    1. redacted message thinking support - Get Started, PR
  • Cohere
    1. /v2/chat Passthrough endpoint support w/ cost tracking - Get Started, PR
  • Azure
    1. Support azure tenant_id/client_id env vars - Get Started, PR
    2. Fix response_format check for 2025+ api versions - PR
    3. Add gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-mini, o4-mini pricing
  • VLLM
    1. Files - Support 'file' message type for VLLM video URLs - Get Started, PR
    2. Passthrough - new /vllm/ passthrough endpoint support Get Started, PR
  • Mistral
    1. new /mistral passthrough endpoint support Get Started, PR
  • AWS
    1. New mapped bedrock regions - PR
  • VertexAI / Google AI Studio
    1. Gemini - Response format - Retain schema field ordering for google gemini and vertex by specifying propertyOrdering - Get Started, PR
    2. Gemini-2.5-flash - return reasoning content Google AI Studio, Vertex AI
    3. Gemini-2.5-flash - pricing + model information PR
    4. Passthrough - new /vertex_ai/discovery route - enables calling AgentBuilder API routes Get Started, PR
  • Fireworks AI
    1. return tool calling responses in tool_calls field (fireworks incorrectly returns this as a json str in content) PR
  • Triton
    1. Removed fixed bad_words / stop words from /generate call - Get Started, PR
  • Other
    1. Support for all litellm providers on Responses API (works with Codex) - Get Started, PR
    2. Fix combining multiple tool calls in streaming response - Get Started, PR

Spend Tracking Improvements

  • Cost Control - inject cache control points in prompt for cost reduction Get Started, PR
  • Spend Tags - spend tags in headers - support x-litellm-tags even if tag based routing not enabled Get Started, PR
  • Gemini-2.5-flash - support cost calculation for reasoning tokens PR

Management Endpoints / UI

  • Users

    1. Show created_at and updated_at on users page - PR
  • Virtual Keys

    1. Filter by key alias - https://github.com/BerriAI/litellm/pull/10085
  • Usage Tab

    1. Team based usage

      • New LiteLLM_DailyTeamSpend Table for aggregate team based usage logging - PR

      • New Team based usage dashboard + new /team/daily/activity API - PR

      • Return team alias on /team/daily/activity API - PR

      • Allow internal users to view spend for teams they belong to - PR

      • Allow viewing top keys by team - PR

    2. Tag Based Usage

      • New LiteLLM_DailyTagSpend Table for aggregate tag based usage logging - PR
      • Restrict to only Proxy Admins - PR
      • Allow viewing top keys by tag
      • Return tags passed in request (i.e. dynamic tags) on /tag/list API - PR
    3. Track prompt caching metrics in daily user, team, tag tables - PR

    4. Show usage by key (on all up, team, and tag usage dashboards) - PR

    5. Swapped the old usage tab for the new usage tab

  • Models

    1. Make columns resizable/hideable - PR
  • API Playground

    1. Allow internal users to call the API playground - PR
  • SCIM

    1. Add LiteLLM SCIM Integration for Team and User management - Get Started, PR

Logging / Guardrail Integrations

  • GCS
    1. Fix gcs pub sub logging with env var GCS_PROJECT_ID - Get Started, PR
  • AIM
    1. Add litellm call id passing to Aim guardrails on pre- and post-hook calls - Get Started, PR
  • Azure blob storage
    1. Ensure logging works in high throughput scenarios - Get Started, PR

General Proxy Improvements

  • Support setting litellm.modify_params via env var PR
  • Model Discovery - Check provider’s /models endpoints when calling proxy’s /v1/models endpoint - Get Started, PR
  • /utils/token_counter - fix retrieving custom tokenizer for db models - Get Started, PR
  • Prisma migrate - handle existing columns in db table - PR

Krrish Dholakia
Ishaan Jaffer

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.0-stable

v1.66.0-stable is live now. Here are the key highlights of this release:

Key Highlights

  • Realtime API Cost Tracking: Track cost of realtime API calls
  • Microsoft SSO Auto-sync: Auto-sync groups and group members from Azure Entra ID to LiteLLM
  • xAI grok-3: Added support for xai/grok-3 models
  • Security Fixes: Fixed CVE-2025-0330 and CVE-2024-6825 vulnerabilities

Let's dive in.

Realtime API Cost Tracking

This release adds Realtime API logging + cost tracking.

  • Logging: LiteLLM now logs the complete response from realtime calls to all logging integrations (DB, S3, Langfuse, etc.)
  • Cost Tracking: You can now set 'base_model' and custom pricing for realtime models. Custom Pricing
  • Budgets: Your key/user/team budgets now work for realtime models as well.
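
As a rough illustration of custom pricing in the Python SDK (this assumes litellm.register_model accepts a model-to-cost mapping; the model name and per-token rates below are made up, and on the proxy the same settings would normally live in the model's config):

import litellm

# Register placeholder per-token pricing for a realtime model so that
# cost tracking can price its usage (rates here are illustrative only).
litellm.register_model({
    "openai/gpt-4o-realtime-preview": {
        "litellm_provider": "openai",
        "input_cost_per_token": 5e-06,
        "output_cost_per_token": 2e-05,
    }
})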

Start here

Microsoft SSO Auto-sync

Auto-sync groups and members from Azure Entra ID to LiteLLM

This release adds support for auto-syncing groups and members on Microsoft Entra ID with LiteLLM. This means that LiteLLM proxy administrators can spend less time managing teams and members, as LiteLLM handles the following:

  • Auto-create teams that exist on Microsoft Entra ID
  • Sync team members on Microsoft Entra ID with LiteLLM teams

Get started with this here

New Models / Updated Models

  • xAI

    1. Added reasoning_effort support for xai/grok-3-mini-beta Get Started
    2. Added cost tracking for xai/grok-3 models PR
  • Hugging Face

    1. Added inference providers support Get Started
  • Azure

    1. Added azure/gpt-4o-realtime-audio cost tracking PR
    2. Updated Azure Phi-4 pricing PR
  • VertexAI

    1. Added enterpriseWebSearch tool support Get Started
    2. Moved to only passing keys accepted by the Vertex AI response schema PR
  • Google AI Studio

    1. Added cost tracking for gemini-2.5-pro PR
    2. Fixed pricing for 'gemini/gemini-2.5-pro-preview-03-25' PR
    3. Fixed handling file_data being passed in PR
  • Databricks

    1. Removed reasoning_effort from parameters PR
    2. Fixed custom endpoint check for Databricks PR
  • General

    1. Added litellm.supports_reasoning() util to track if an LLM supports reasoning Get Started (see the sketch after this list)
    2. Function Calling - Handle pydantic base model in message tool calls, handle tools = [], and support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 PR
    3. LiteLLM Proxy - Allow passing thinking param to litellm proxy via client sdk PR
    4. Fixed correctly translating 'thinking' param for litellm PR
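
A minimal sketch of the new util (the model names here are illustrative assumptions):

import litellm

# Check which models advertise reasoning support in litellm's model map.
for model in ["anthropic/claude-3-7-sonnet-20250219", "openai/gpt-4o"]:
    print(model, litellm.supports_reasoning(model=model))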

Spend Tracking Improvements

  • OpenAI, Azure
    1. Realtime API Cost tracking with token usage metrics in spend logs Get Started
  • Anthropic
    1. Fixed Claude Haiku cache read pricing per token PR
    2. Added cost tracking for Claude responses with base_model PR
    3. Fixed Anthropic prompt caching cost calculation and trimmed logged message in db PR
  • General
    1. Added token tracking and log usage object in spend logs PR
    2. Handle custom pricing at deployment level PR

Management Endpoints / UI

  • Test Key Tab

    1. Added rendering of Reasoning content, ttft, usage metrics on test key page PR

      View input, output, reasoning tokens, ttft metrics.

  • Tag / Policy Management

    1. Added Tag/Policy Management. Create routing rules based on request metadata. This allows you to enforce that requests with tags="private" only go to specific models. Get Started


      Create and manage tags.

  • Redesigned Login Screen

    1. Polished login screen PR
  • Microsoft SSO Auto-Sync

    1. Added debug route to allow admins to debug SSO JWT fields PR
    2. Added ability to use MSFT Graph API to assign users to teams PR
    3. Connected litellm to Azure Entra ID Enterprise Application PR
    4. Added ability for admins to set default_team_params for when litellm SSO creates default teams PR
    5. Fixed MSFT SSO to use correct field for user email PR
    6. Added UI support for setting Default Team setting when litellm SSO auto creates teams PR
  • UI Bug Fixes

    1. Prevented team, key, org, model numerical values changing on scrolling PR
    2. Instantly reflect key and team updates in UI PR

Logging / Guardrail Improvements

  • Prometheus
    1. Emit Key and Team Budget metrics on a cron job schedule Get Started

Security Fixes

  • Fixed CVE-2025-0330 and CVE-2024-6825 vulnerabilities

Helm

  • Added service annotations to litellm-helm chart PR
  • Added extraEnvVars to the helm deployment PR

Demo

Try this on the demo instance today

Complete Git Diff

See the complete git diff since v1.65.4-stable here.

Krrish Dholakia
Ishaan Jaffer


0 Critical/High Vulnerabilities

What changed?

  • LiteLLM Base image now uses cgr.dev/chainguard/python:latest-dev

Why the change?

To ensure there are 0 critical/high vulnerabilities on the LiteLLM Docker image.

Migration Guide

  • If you use a custom Dockerfile with litellm as a base image + apt-get

Use apk instead of apt-get; the base litellm image no longer ships with apt-get.

You are only impacted if you use apt-get in your Dockerfile.

# Use the provided base image
FROM ghcr.io/berriai/litellm:main-latest

# Set the working directory
WORKDIR /app

# Install dependencies - CHANGE THIS to `apk`
RUN apt-get update && apt-get install -y dumb-init

Before Change

RUN apt-get update && apt-get install -y dumb-init

After Change

RUN apk update && apk add --no-cache dumb-init