Connect your LiteLLM gateway to Capsule Security for centralized governance of every completion request routed through the proxy — with inline policy enforcement before the upstream model call and token/latency audit after it.
This integration uses a LiteLLM custom callback that runs in-process inside your LiteLLM proxy. Before each completion is forwarded to the upstream model provider, the callback posts the request to your Capsule tenant and blocks on an allow/deny verdict. After the upstream call settles, it posts token usage and latency as fire-and-forget telemetry.
LiteLLM is operator-deployed — you run the proxy in your own infrastructure, so there is no SaaS install. The only thing exchanged at install time is a signed token that scopes the plugin to your tenant and environment.
The following hooks are configured:
| Hook | Type | Description |
|---|---|---|
| async_pre_call_hook | Blocking | The completion request (model + messages) before it reaches the upstream provider — can deny on policy violation |
| async_log_success_event | Observation | Token usage and latency after a successful completion |
| async_log_failure_event | Observation | Latency and error detail when the upstream call fails |
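
The hook flow in the table can be sketched as follows. This is a minimal, self-contained illustration and not the Capsule plugin's code: `SketchCallback`, `PolicyDenied`, `get_verdict`, and `handle` are hypothetical names, and the hook signatures are simplified (LiteLLM passes additional arguments to its real hooks).

```python
import asyncio
import time


class PolicyDenied(Exception):
    """Raised by the blocking hook when the verdict is 'deny'."""


class SketchCallback:
    """Illustrative stand-in; it does not subclass any LiteLLM class."""

    def __init__(self, get_verdict):
        # get_verdict: hypothetical async callable (model, messages) -> "allow" or "deny"
        self.get_verdict = get_verdict
        self.telemetry = []  # stands in for fire-and-forget telemetry posts

    async def async_pre_call_hook(self, data):
        # Blocking hook: wait for the verdict before the upstream call.
        verdict = await self.get_verdict(data["model"], data["messages"])
        if verdict == "deny":
            raise PolicyDenied(f"request to {data['model']} denied by policy")
        return data

    async def async_log_success_event(self, usage, started, ended):
        # Observation hook: record token usage and latency.
        self.telemetry.append({
            "event": "success",
            "total_tokens": usage["total_tokens"],
            "latency_ms": int((ended - started) * 1000),
        })

    async def async_log_failure_event(self, error, started, ended):
        # Observation hook: record error detail and latency.
        self.telemetry.append({
            "event": "failure",
            "error": str(error),
            "latency_ms": int((ended - started) * 1000),
        })


async def handle(callback, data, upstream):
    """Run one completion through the hooks in the documented order."""
    await callback.async_pre_call_hook(data)  # may raise PolicyDenied
    started = time.monotonic()
    try:
        response = await upstream(data)
    except Exception as exc:
        await callback.async_log_failure_event(exc, started, time.monotonic())
        raise
    await callback.async_log_success_event(response["usage"], started, time.monotonic())
    return response
```

In this sketch a denied request never reaches `upstream`, so no telemetry is recorded for it; only requests that pass the blocking hook produce a success or failure event.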
Before you begin, ensure you have:
- A running LiteLLM proxy (Python 3.10 or later)
- A Capsule Security account with admin access
- Network access from your LiteLLM proxy to your Capsule agentsecurity endpoint (e.g. https://agents.capsule.security)
Log in to the Capsule Security portal.
Navigate to Integrations and locate LiteLLM.
Click Install — Capsule generates a JWT scoped to your tenant and LiteLLM environment. The token contains the tenant and environment claims required by the agentsecurity endpoint; treat it as a secret.
Copy the generated endpoint and token. You will reference them from your LiteLLM proxy environment in the next steps.
Install the Capsule plugin into the same environment that runs your LiteLLM proxy:
git clone https://github.com/capsulesecurity/capsule.git
cd capsule/plugins/litellm-plugin
pip install -e .

Register the callback in your LiteLLM config.yaml:

litellm_settings:
  callbacks: ["capsule_litellm_plugin.capsule_callback"]

The callback reads its configuration from environment variables. Set these where the proxy runs:

export CAPSULE_LITELLM_ENDPOINT="https://agents.capsule.security/v1/litellm/hooks"
export CAPSULE_LITELLM_TOKEN="<token from Step 1>"

| Variable | Type | Default | Description |
|---|---|---|---|
| CAPSULE_LITELLM_ENDPOINT | string | (required) | Capsule LiteLLM hooks endpoint, e.g. https://agents.capsule.security/v1/litellm/hooks |
| CAPSULE_LITELLM_TOKEN | string | (required) | JWT generated in Step 1, scoped to your tenant and environment |
| CAPSULE_LITELLM_BLOCK_ON_RISK | boolean | true | Apply server deny verdicts inline. Set to false to run in observe-only mode |
| CAPSULE_LITELLM_FAIL_OPEN | boolean | true | When Capsule is unreachable or returns an error, allow the request to proceed. Set to false to fail closed |
| CAPSULE_LITELLM_TIMEOUT_MS | number | 5000 | Per-request timeout in milliseconds |
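
As a rough illustration of how the variables and defaults in this table fit together, the sketch below loads them into a small config object. It is not the plugin's implementation: `CapsuleConfig`, `load_config`, and `_as_bool` are hypothetical names, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CapsuleConfig:
    endpoint: str
    token: str
    block_on_risk: bool = True
    fail_open: bool = True
    timeout_ms: int = 5000


def _as_bool(raw, default):
    # Assumption: "true"/"1"/"yes"/"on" count as true; the plugin's parsing may differ.
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env=None):
    """Return a CapsuleConfig, or None when a required variable is missing."""
    env = os.environ if env is None else env
    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    token = env.get("CAPSULE_LITELLM_TOKEN", "").strip()
    if not endpoint or not token:
        return None  # both variables are required
    return CapsuleConfig(
        endpoint=endpoint,
        token=token,
        block_on_risk=_as_bool(env.get("CAPSULE_LITELLM_BLOCK_ON_RISK"), True),
        fail_open=_as_bool(env.get("CAPSULE_LITELLM_FAIL_OPEN"), True),
        timeout_ms=int(env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000")),
    )
```

Calling `load_config()` with no argument reads the process environment; passing a plain dict makes the same logic easy to check in isolation.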
Never commit the JWT to version control. Recommended approaches:
- Read the token from an environment variable (CAPSULE_LITELLM_TOKEN)
- Source it from your secret manager (1Password, AWS Secrets Manager, HashiCorp Vault) at proxy startup
- Deliver it through the same channel that distributes the rest of your LiteLLM deployment configuration
For the callback to take effect, restart the proxy with the updated configuration:
litellm --config config.yaml

On startup, the plugin logs `capsule_litellm_plugin: enabled` when it has a valid endpoint and token.
Send a completion through the proxy to generate activity:
curl -sS http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

Log in to the Capsule Security portal.
Navigate to Inventory > Agents and confirm a LiteLLM agent (named after the model) appears.
Click on the agent and review the audit logs to verify the pre-call and post-call events are captured.
If events are not appearing:
Verify the endpoint is reachable from the proxy host:
curl -sS -o /dev/null -w "%{http_code}\n" https://agents.capsule.security/v1/litellm/hooks

Verify the environment variables are set: the plugin logs `capsule_litellm_plugin: disabled` and stays inert if `CAPSULE_LITELLM_ENDPOINT` or `CAPSULE_LITELLM_TOKEN` is missing.
Check for timeouts: if your network has high latency to the Capsule endpoint, increase `CAPSULE_LITELLM_TIMEOUT_MS` or confirm `CAPSULE_LITELLM_FAIL_OPEN` is set appropriately.
Confirm the callback is loaded: LiteLLM logs registered callbacks on startup; the Capsule callback registers from `capsule_litellm_plugin.capsule_callback`.
Contact Capsule Security support if issues persist.
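
Some of the checks above can be run offline before restarting the proxy. The helper below is a hypothetical diagnostic (`diagnose` is not part of the plugin) that inspects a mapping of environment variables and reports the configuration problems named in this section.

```python
from urllib.parse import urlparse

REQUIRED = ("CAPSULE_LITELLM_ENDPOINT", "CAPSULE_LITELLM_TOKEN")


def diagnose(env):
    """Return a list of human-readable problems found in env (a dict of strings)."""
    problems = []
    for name in REQUIRED:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")

    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    if endpoint:
        parsed = urlparse(endpoint)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("CAPSULE_LITELLM_ENDPOINT should be an https:// URL")

    # The documented default is 5000 when the variable is unset.
    raw_timeout = env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000").strip()
    if not raw_timeout.isdigit() or int(raw_timeout) == 0:
        problems.append("CAPSULE_LITELLM_TIMEOUT_MS should be a positive integer")
    return problems
```

For example, `diagnose(dict(os.environ))` (after `import os`) run in the proxy's shell returns an empty list when the required variables are present and well-formed. It does not test network reachability, which still needs the `curl` check.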
The Capsule callback runs in-process inside the LiteLLM proxy and observes every completion request. Before deploying:
- Protect the JWT — anyone with the token can post events on behalf of your tenant. Rotate it through the portal if it is exposed.
- Choose `CAPSULE_LITELLM_FAIL_OPEN` deliberately: `true` (the default) prioritizes availability and lets requests through when Capsule is unreachable; `false` prioritizes policy enforcement and rejects requests when Capsule cannot be reached.
- Pin a plugin version in production and review changelogs before upgrading.
- Use TLS-only endpoints. The plugin sends a Bearer token on every request, so never configure an `http://` endpoint outside local development.
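
To make the fail-open trade-off concrete, the function below sketches how a verdict might be resolved under the two documented flags. `resolve` is a hypothetical helper, not the plugin's code, and it assumes that `CAPSULE_LITELLM_BLOCK_ON_RISK` and `CAPSULE_LITELLM_FAIL_OPEN` are evaluated independently; this guide does not state how they combine when both are false.

```python
def resolve(verdict, *, block_on_risk=True, fail_open=True):
    """Return True to forward the request upstream, False to reject it.

    verdict is "allow", "deny", or None when Capsule was unreachable,
    timed out, or returned an error.
    """
    if verdict is None:
        # No verdict available: fail open (allow) or fail closed (reject).
        return fail_open
    if verdict == "deny":
        # Observe-only mode (block_on_risk=False) lets denied requests through.
        return not block_on_risk
    return True
```

With the defaults, `resolve("deny")` rejects the request and `resolve(None)` allows it; setting `fail_open=False` makes `resolve(None, fail_open=False)` reject as well, which is the stricter posture described above.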
For help with this integration:
- Email: support@capsule.security
- Include: Your organization ID, integration status, plugin version, and any error messages