LiteLLM Integration

Connect your LiteLLM gateway to Capsule Security for centralized governance of every completion request routed through the proxy — with inline policy enforcement before the upstream model call and token/latency audit after it.

Overview

This integration uses a LiteLLM custom callback that runs in-process inside your LiteLLM proxy. Before each completion is forwarded to the upstream model provider, the callback posts the request to your Capsule tenant and blocks on an allow/deny verdict. After the upstream call settles, it posts token usage and latency as fire-and-forget telemetry.

LiteLLM is operator-deployed — you run the proxy in your own infrastructure, so there is no SaaS install. The only thing exchanged at install time is a signed token that scopes the plugin to your tenant and environment.

The following hooks are configured:

Hook | Type | Description
async_pre_call_hook | Blocking | The completion request (model + messages) before it reaches the upstream provider; can deny on policy violation
async_log_success_event | Observation | Token usage and latency after a successful completion
async_log_failure_event | Observation | Latency and error detail when the upstream call fails
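To make the three hooks concrete, here is a minimal sketch of a LiteLLM custom callback with the same shape. It is not the Capsule plugin's source: the class name `SketchCallback`, the `check_policy` helper, the in-memory `telemetry` list, and the choice to deny by raising `PermissionError` are all assumptions made for illustration. The hook names and argument lists follow LiteLLM's `CustomLogger` interface.

```python
import asyncio
from datetime import datetime, timedelta

try:
    # LiteLLM's base class for custom callbacks.
    from litellm.integrations.custom_logger import CustomLogger
except ImportError:
    CustomLogger = object  # fallback so the sketch runs without LiteLLM installed


class SketchCallback(CustomLogger):
    """Hypothetical stand-in for illustration; not the Capsule plugin's code."""

    def __init__(self, check_policy):
        super().__init__()
        # check_policy: async callable(dict) -> bool, True means allow (assumed helper).
        self.check_policy = check_policy
        self.telemetry = []  # stands in for fire-and-forget posts

    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        # Blocking hook: runs before the request is forwarded upstream.
        request = {"model": data.get("model"), "messages": data.get("messages")}
        if not await self.check_policy(request):
            raise PermissionError("request denied by policy")
        return data

    def _record(self, outcome, start_time, end_time, **extra):
        latency_ms = (end_time - start_time).total_seconds() * 1000
        self.telemetry.append({"outcome": outcome, "latency_ms": latency_ms, **extra})

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # Observation hook: token usage and latency after a successful call.
        self._record("success", start_time, end_time,
                     usage=getattr(response_obj, "usage", None))

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # Observation hook: latency and error detail after a failed call.
        self._record("failure", start_time, end_time,
                     error=str(kwargs.get("exception")))


async def demo():
    async def allow_all(request):  # hypothetical policy check
        return True

    cb = SketchCallback(allow_all)
    data = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}
    await cb.async_pre_call_hook(None, None, data, "completion")
    start = datetime(2024, 1, 1)
    await cb.async_log_success_event({}, None, start, start + timedelta(milliseconds=120))
    return cb.telemetry
```

Calling `asyncio.run(demo())` walks one request through the blocking hook and then the success hook. In a real deployment the policy check and the telemetry sink would be network calls to your Capsule endpoint.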

Prerequisites

Before you begin, ensure you have:

  • A running LiteLLM proxy (Python 3.10 or later)
  • A Capsule Security account with admin access
  • Network access from your LiteLLM proxy to your Capsule agentsecurity endpoint (e.g. https://agents.capsule.security)

Step 1: Generate a Plugin Token

  1. Log in to the Capsule Security portal.

  2. Navigate to Integrations and locate LiteLLM.

  3. Click Install — Capsule generates a JWT scoped to your tenant and LiteLLM environment. The token contains the tenant and environment claims required by the agentsecurity endpoint; treat it as a secret.

  4. Copy the generated endpoint and token. You will reference them from your LiteLLM proxy environment in the next steps.
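If you want to confirm which tenant and environment a token is scoped to before deploying it, the payload of a JWT can be inspected locally. The sketch below decodes the payload segment without verifying the signature, so it is only suitable for inspection. The function name `jwt_claims` is hypothetical, and the exact claim names Capsule uses are not documented here, so print the result and read them off. Run it only on a trusted machine, since the token is a secret.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, `print(jwt_claims(os.environ["CAPSULE_LITELLM_TOKEN"]))` (after `import os`) lists the claims of the token your proxy will use.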


Step 2: Install the Plugin

Install the Capsule plugin into the same environment that runs your LiteLLM proxy:

git clone https://github.com/capsulesecurity/capsule.git
cd capsule/plugins/litellm-plugin
pip install -e .

Step 3: Configure the Plugin

Register the callback in your LiteLLM config.yaml:

litellm_settings:
  callbacks: ["capsule_litellm_plugin.capsule_callback"]

The callback reads its configuration from environment variables. Set these where the proxy runs:

export CAPSULE_LITELLM_ENDPOINT="https://agents.capsule.security/v1/litellm/hooks"
export CAPSULE_LITELLM_TOKEN="<token from Step 1>"

Configuration Options

Variable | Type | Default | Description
CAPSULE_LITELLM_ENDPOINT | string | (required) | Capsule LiteLLM hooks endpoint, e.g. https://agents.capsule.security/v1/litellm/hooks
CAPSULE_LITELLM_TOKEN | string | (required) | JWT generated in Step 1, scoped to your tenant and environment
CAPSULE_LITELLM_BLOCK_ON_RISK | boolean | true | Apply server deny verdicts inline. Set to false to run in observe-only mode
CAPSULE_LITELLM_FAIL_OPEN | boolean | true | When Capsule is unreachable or returns an error, allow the request to proceed. Set to false to fail closed
CAPSULE_LITELLM_TIMEOUT_MS | number | 5000 | Per-request timeout in milliseconds
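As a rough illustration of how these variables and defaults fit together, the following sketch reads them into a typed configuration object. It is not the plugin's actual loader: `CapsuleConfig`, `load_config`, and the set of strings accepted as "true" are assumptions. Only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: Optional[str], default: bool) -> bool:
    # Assumed parsing rule: unset or empty means "use the default".
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class CapsuleConfig:
    endpoint: str
    token: str
    block_on_risk: bool = True
    fail_open: bool = True
    timeout_ms: int = 5000


def load_config(env: Mapping[str, str] = os.environ) -> Optional[CapsuleConfig]:
    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    token = env.get("CAPSULE_LITELLM_TOKEN", "").strip()
    if not endpoint or not token:
        return None  # both required values are needed; otherwise stay inert
    return CapsuleConfig(
        endpoint=endpoint,
        token=token,
        block_on_risk=_as_bool(env.get("CAPSULE_LITELLM_BLOCK_ON_RISK"), True),
        fail_open=_as_bool(env.get("CAPSULE_LITELLM_FAIL_OPEN"), True),
        timeout_ms=int(env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a planned configuration before exporting it on the proxy host.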

Token Storage

Never commit the JWT to version control. Recommended approaches:

  • Read the token from an environment variable (CAPSULE_LITELLM_TOKEN)
  • Source it from your secret manager (1Password, AWS Secrets Manager, HashiCorp Vault) at proxy startup
  • Deliver it through the same channel that distributes the rest of your LiteLLM deployment configuration

Step 4: Restart the LiteLLM Proxy

For the callback to take effect, restart the proxy with the updated configuration:

litellm --config config.yaml

On startup the plugin logs capsule_litellm_plugin: enabled when it has a valid endpoint and token.


Step 5: Verify the Installation

  1. Send a completion through the proxy to generate activity:

    curl -sS http://localhost:4000/v1/chat/completions \
      -H "Authorization: Bearer $LITELLM_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
  2. Log in to the Capsule Security portal.

  3. Navigate to Inventory > Agents and confirm a LiteLLM agent (named after the model) appears.

  4. Click on the agent and review the audit logs to verify the pre-call and post-call events are captured.
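If you would rather script the verification request than use curl, a standard-library equivalent might look like the sketch below. The helper names `build_request` and `send` are hypothetical. The URL, headers, and body mirror the curl command in step 1, and `http://localhost:4000` is the proxy address that command uses.

```python
import json
import urllib.request


def build_request(base_url: str, api_key: str, model: str = "gpt-4o") -> urllib.request.Request:
    """Build the same chat completion request as the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": "Hello"}]}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

For example, `send(build_request("http://localhost:4000", os.environ["LITELLM_API_KEY"]))` (after `import os`) issues one completion against a local proxy, after which the corresponding events should be visible in the portal.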

Troubleshooting

If events are not appearing:

  1. Verify the endpoint is reachable from the proxy host:

    curl -sS -o /dev/null -w "%{http_code}\n" https://agents.capsule.security/v1/litellm/hooks
  2. Verify the environment variables are set — the plugin logs capsule_litellm_plugin: disabled and stays inert if CAPSULE_LITELLM_ENDPOINT or CAPSULE_LITELLM_TOKEN is missing.

  3. Check for timeouts — if your network has high latency to the Capsule endpoint, increase CAPSULE_LITELLM_TIMEOUT_MS or confirm CAPSULE_LITELLM_FAIL_OPEN is set appropriately.

  4. Confirm the callback is loaded — LiteLLM logs registered callbacks on startup; the Capsule callback registers from capsule_litellm_plugin.capsule_callback.

  5. Contact Capsule Security support if issues persist.
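The configuration-related checks above can be automated with a small script run in the proxy's environment. This is a sketch under stated assumptions: `diagnose` is a hypothetical helper, not part of the plugin, and it only inspects environment values without contacting Capsule.

```python
from typing import Mapping
from urllib.parse import urlsplit

REQUIRED = ("CAPSULE_LITELLM_ENDPOINT", "CAPSULE_LITELLM_TOKEN")


def diagnose(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems found in the given environment mapping."""
    problems = [f"{name} is not set" for name in REQUIRED
                if not env.get(name, "").strip()]

    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    if endpoint:
        parts = urlsplit(endpoint)
        if not parts.netloc:
            problems.append("CAPSULE_LITELLM_ENDPOINT has no host")
        elif parts.scheme != "https":
            problems.append("CAPSULE_LITELLM_ENDPOINT does not use https")

    timeout = env.get("CAPSULE_LITELLM_TIMEOUT_MS", "").strip()
    if timeout and not timeout.isdigit():
        problems.append("CAPSULE_LITELLM_TIMEOUT_MS is not a whole number of milliseconds")
    return problems
```

Running `diagnose(os.environ)` on the proxy host (after `import os`) returns an empty list when the basic settings look sane. A non-empty list points to the variable to fix before restarting the proxy.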


Security Considerations

The Capsule callback runs in-process inside the LiteLLM proxy and observes every completion request. Before deploying:

  1. Protect the JWT — anyone with the token can post events on behalf of your tenant. Rotate it through the portal if it is exposed.
  2. Choose CAPSULE_LITELLM_FAIL_OPEN deliberately: true (the default) prioritizes availability and lets requests through when Capsule is unreachable; false prioritizes policy enforcement and rejects requests when Capsule cannot be reached.
  3. Pin a plugin version in production and review changelogs before upgrading.
  4. Use TLS-only endpoints. The plugin sends a Bearer token on every request — never configure an http:// endpoint outside local development.
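The interaction between the deny verdict, observe-only mode, and the fail-open setting can be summarized as a small decision function. This is a sketch of the documented behavior, not the plugin's code: `should_forward` and the verdict strings "allow" and "deny" are assumptions, and `None` stands for any case where Capsule could not be reached, timed out, or returned an error. How the real plugin combines observe-only mode with fail-closed is not specified here; this sketch lets the fail-open setting decide on its own.

```python
from typing import Optional


def should_forward(verdict: Optional[str], *,
                   block_on_risk: bool = True,
                   fail_open: bool = True) -> bool:
    """Decide whether the proxy forwards the request to the upstream model.

    verdict: "allow", "deny", or None when no verdict was obtained (assumed encoding).
    """
    if verdict is None:
        # No verdict: availability (fail open) versus enforcement (fail closed).
        return fail_open
    if verdict == "deny":
        # Observe-only mode records the deny but lets the request through.
        return not block_on_risk
    return True
```

With the defaults, a deny blocks the request and an outage does not. Setting `fail_open=False` makes an outage block requests as well.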

Support

For help with this integration:

  • Email: support@capsule.security
  • Include: Your organization ID, integration status, plugin version, and any error messages
