LiteLLM Integration

Connect your LiteLLM gateway to Capsule Security for centralized governance of every completion request routed through the proxy: inline policy enforcement before the upstream model call, and token and latency auditing after it.

Overview

This integration uses a LiteLLM custom callback that runs in-process inside your LiteLLM proxy. Before each completion is forwarded to the upstream model provider, the callback posts the request to your Capsule tenant and blocks on an allow/deny verdict. After the upstream call settles, it posts token usage and latency as fire-and-forget telemetry.

LiteLLM is operator-deployed — you run the proxy in your own infrastructure, so there is no SaaS install. The only thing exchanged at install time is a signed token that scopes the plugin to your tenant and environment.

The following hooks are configured:

Hook	Type	Description
async_pre_call_hook	Blocking	Receives the completion request (model and messages) before it reaches the upstream provider; can deny the request on a policy violation
async_log_success_event	Observation	Reports token usage and latency after a successful completion
async_log_failure_event	Observation	Reports latency and error detail when the upstream call fails
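
To make the division of labor between the three hooks concrete, here is a minimal sketch of the flow in plain Python. It is not the plugin's source: `SketchCapsuleCallback`, `check_policy`, and `send_event` are hypothetical names, the event fields are invented for illustration, and reading the error from `kwargs["exception"]` is an assumption. A real LiteLLM callback would subclass `litellm.integrations.custom_logger.CustomLogger`; the sketch omits that so it runs without LiteLLM installed.

```python
import asyncio


class SketchCapsuleCallback:
    """Illustrative stand-in that mirrors the three hook names."""

    def __init__(self, check_policy, send_event):
        # Both callables are hypothetical async functions supplied by the caller.
        self.check_policy = check_policy  # async (payload) -> "allow" | "deny"
        self.send_event = send_event      # async (event) -> None
        self._tasks = set()               # holds references to in-flight tasks

    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        # Blocking: wait for the verdict before the request goes upstream.
        payload = {"model": data.get("model"), "messages": data.get("messages")}
        if await self.check_policy(payload) == "deny":
            # Assumption: this sketch rejects the request by raising.
            raise PermissionError("request denied by policy")
        return data

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        usage = getattr(response_obj, "usage", None)
        self._fire({
            "event": "success",
            "model": kwargs.get("model"),
            "total_tokens": getattr(usage, "total_tokens", None),
            "latency_ms": (end_time - start_time).total_seconds() * 1000,
        })

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        self._fire({
            "event": "failure",
            "model": kwargs.get("model"),
            "error": str(kwargs.get("exception")),  # assumed key
            "latency_ms": (end_time - start_time).total_seconds() * 1000,
        })

    def _fire(self, event):
        # Fire-and-forget: schedule the post without awaiting its result.
        task = asyncio.create_task(self.send_event(event))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
```

Only the pre-call hook awaits Capsule, so it is the only hook that adds latency to the request path; the two observation hooks schedule their posts and return immediately.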

Prerequisites

Before you begin, ensure you have:

A running LiteLLM proxy (Python 3.10 or later)
A Capsule Security account with admin access
Network access from your LiteLLM proxy to your Capsule agentsecurity endpoint (e.g. https://agents.capsule.security)

Step 1: Generate a Plugin Token

Log in to the Capsule Security portal.
Navigate to Integrations and locate LiteLLM.
Click Install. Capsule generates a JWT scoped to your tenant and LiteLLM environment. The token contains the tenant and environment claims required by the agentsecurity endpoint; treat it as a secret.
Copy the generated endpoint and token. You will reference them from your LiteLLM proxy environment in the next steps.
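
If you want to confirm which tenant and environment a token was issued for before deploying it, the payload segment of a JWT can be decoded locally. The sketch below uses only the standard library; `peek_jwt_claims` is a hypothetical helper, and the claim names in the usage comment are placeholders because the actual claim names are not documented here. It does not verify the signature, so use it for inspection only, and avoid printing the token itself.

```python
import base64
import json


def peek_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


# Usage (claim names are placeholders, not the plugin's real schema):
#   claims = peek_jwt_claims(os.environ["CAPSULE_LITELLM_TOKEN"])
#   print(sorted(claims))
```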

Step 2: Install the Plugin

Install the Capsule plugin into the same environment that runs your LiteLLM proxy:

git clone https://github.com/capsulesecurity/capsule.git
cd capsule/plugins/litellm-plugin
pip install -e .

Step 3: Configure the Plugin

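Register the callback in the LiteLLM config file that you pass to the proxy (config.yaml in Step 4):

litellm_settings:
  callbacks: ["capsule_litellm_plugin.capsule_callback"]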

The callback reads its configuration from environment variables. Set these where the proxy runs:

export CAPSULE_LITELLM_ENDPOINT="https://agents.capsule.security/v1/litellm/hooks"
export CAPSULE_LITELLM_TOKEN="<token from Step 1>"

Configuration Options

Variable	Type	Default	Description
`CAPSULE_LITELLM_ENDPOINT`	`string`	(required)	Capsule LiteLLM hooks endpoint, e.g. `https://agents.capsule.security/v1/litellm/hooks`
`CAPSULE_LITELLM_TOKEN`	`string`	(required)	JWT generated in Step 1, scoped to your tenant and environment
`CAPSULE_LITELLM_BLOCK_ON_RISK`	`boolean`	`true`	Apply server `deny` verdicts inline. Set to `false` to run in observe-only mode
`CAPSULE_LITELLM_FAIL_OPEN`	`boolean`	`true`	When Capsule is unreachable or returns an error, allow the request to proceed. Set to `false` to fail closed
`CAPSULE_LITELLM_TIMEOUT_MS`	`number`	`5000`	Per-request timeout in milliseconds
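
The table above can be read as a small settings schema. The following sketch shows one way to load it from the environment with the documented defaults. It is illustrative rather than the plugin's own loader: `CapsuleSettings`, `load_settings`, and `_as_bool` are hypothetical names, and the set of accepted boolean spellings is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CapsuleSettings:
    endpoint: str
    token: str
    block_on_risk: bool = True
    fail_open: bool = True
    timeout_ms: int = 5000


def _as_bool(raw: Optional[str], default: bool) -> bool:
    # Assumption: the spellings the plugin accepts are not documented here.
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Optional[CapsuleSettings]:
    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    token = env.get("CAPSULE_LITELLM_TOKEN", "").strip()
    if not endpoint or not token:
        # Both values are required; without them there is nothing to configure.
        return None
    return CapsuleSettings(
        endpoint=endpoint,
        token=token,
        block_on_risk=_as_bool(env.get("CAPSULE_LITELLM_BLOCK_ON_RISK"), True),
        fail_open=_as_bool(env.get("CAPSULE_LITELLM_FAIL_OPEN"), True),
        timeout_ms=int(env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check a candidate configuration before rolling it out.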

Token Storage

Never commit the JWT to version control. Recommended approaches:

Read the token from an environment variable (CAPSULE_LITELLM_TOKEN)
Source it from your secret manager (1Password, AWS Secrets Manager, HashiCorp Vault) at proxy startup
Deliver it through the same channel that distributes the rest of your LiteLLM deployment configuration

Step 4: Restart the LiteLLM Proxy

For the callback to take effect, restart the proxy with the updated configuration:

litellm --config config.yaml

On startup the plugin logs capsule_litellm_plugin: enabled when it has a valid endpoint and token.

Step 5: Verify the Installation

Send a completion through the proxy to generate activity:

curl -sS http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
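
If curl is not available on the proxy host, the same request can be sent from Python with the standard library. This is a sketch equivalent to the curl command above; `build_completion_request` and `send` are hypothetical helper names, and the base URL and key are the same placeholders the curl example uses.

```python
import json
import urllib.request


def build_completion_request(base_url: str, api_key: str,
                             model: str = "gpt-4o",
                             prompt: str = "Hello") -> urllib.request.Request:
    """Build the same POST that the curl command sends."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

For example, `send(build_completion_request("http://localhost:4000", os.environ["LITELLM_API_KEY"]))` issues one completion through the proxy.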

Log in to the Capsule Security portal.
Navigate to Inventory > Agents and confirm a LiteLLM agent (named after the model) appears.
Click on the agent and review the audit logs to verify the pre-call and post-call events are captured.

Troubleshooting

If events are not appearing:

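Verify the endpoint is reachable from the proxy host. Any HTTP status code in the output, even a 4xx, confirms that the host can be reached; 000 or a curl error points to a network, DNS, or TLS problem: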

curl -sS -o /dev/null -w "%{http_code}\n" https://agents.capsule.security/v1/litellm/hooks

Verify the environment variables are set — the plugin logs capsule_litellm_plugin: disabled and stays inert if CAPSULE_LITELLM_ENDPOINT or CAPSULE_LITELLM_TOKEN is missing.
Check for timeouts — if your network has high latency to the Capsule endpoint, increase CAPSULE_LITELLM_TIMEOUT_MS or confirm CAPSULE_LITELLM_FAIL_OPEN is set appropriately.
Confirm the callback is loaded — LiteLLM logs registered callbacks on startup; the Capsule callback registers from capsule_litellm_plugin.capsule_callback.
Contact Capsule Security support if issues persist.
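
Several of the checks above can be run before restarting the proxy. The sketch below inspects an environment mapping and returns a list of human-readable findings; `diagnose` is a hypothetical helper, not part of the plugin, and the rule that the timeout must be a positive integer is an assumption.

```python
from typing import List, Mapping
from urllib.parse import urlsplit


def diagnose(env: Mapping[str, str]) -> List[str]:
    """Return configuration problems that would explain missing events."""
    findings = []
    for name in ("CAPSULE_LITELLM_ENDPOINT", "CAPSULE_LITELLM_TOKEN"):
        if not env.get(name, "").strip():
            findings.append(f"{name} is not set; the plugin stays disabled")

    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    if endpoint and urlsplit(endpoint).scheme != "https":
        findings.append("CAPSULE_LITELLM_ENDPOINT is not an https:// URL")

    raw_timeout = env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000").strip()
    # Assumption: the timeout is expected to be a positive integer.
    if not raw_timeout.isdigit() or int(raw_timeout) <= 0:
        findings.append("CAPSULE_LITELLM_TIMEOUT_MS is not a positive integer")
    return findings
```

Calling `diagnose(os.environ)` in the same shell that launches the proxy shows what the plugin will see at startup; an empty list means none of these checks failed.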

Security Considerations

The Capsule callback runs in-process inside the LiteLLM proxy and observes every completion request. Before deploying:

Protect the JWT — anyone with the token can post events on behalf of your tenant. Rotate it through the portal if it is exposed.
Choose CAPSULE_LITELLM_FAIL_OPEN deliberately — true (the default) prioritizes availability and lets requests through when Capsule is unreachable; false prioritizes policy enforcement and rejects requests when Capsule cannot be reached.
Pin a plugin version in production and review changelogs before upgrading.
Use TLS-only endpoints. The plugin sends a Bearer token on every request — never configure an http:// endpoint outside local development.
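
The interaction between CAPSULE_LITELLM_BLOCK_ON_RISK, CAPSULE_LITELLM_FAIL_OPEN, and CAPSULE_LITELLM_TIMEOUT_MS can be summarized as a small decision function. The sketch below is one reading of the documented behavior, not the plugin's code: `fetch_verdict` and `resolve` are hypothetical names, and it assumes a timeout is treated like any other failure to reach Capsule, and that the fail-open setting applies in observe-only mode as well.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def fetch_verdict(check: Callable[[], Awaitable[str]],
                        timeout_ms: int) -> Optional[str]:
    """Return "allow" or "deny", or None if Capsule could not be reached in time."""
    try:
        return await asyncio.wait_for(check(), timeout=timeout_ms / 1000)
    except (asyncio.TimeoutError, OSError):
        # Timeouts and connection errors both count as "unreachable".
        return None


def resolve(verdict: Optional[str], *,
            block_on_risk: bool = True,
            fail_open: bool = True) -> str:
    """Map a verdict (or its absence) to the action taken on the request."""
    if verdict is None:
        # No verdict available: fail open (allow) or fail closed (reject).
        return "allow" if fail_open else "reject"
    if verdict == "deny" and block_on_risk:
        return "reject"
    # "allow", or "deny" while running in observe-only mode.
    return "allow"
```

With the defaults, a `deny` verdict rejects the request while an outage lets it through; setting `fail_open=False` turns the outage case into a rejection as well.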

Support

For help with this integration:

Email: support@capsule.security
Include: Your organization ID, integration status, plugin version, and any error messages
