# LiteLLM Integration

Connect your [LiteLLM](https://docs.litellm.ai) gateway to Capsule Security for centralized governance of every completion request routed through the proxy, with inline policy enforcement before the upstream model call and token/latency audit after it.

## Overview

This integration uses a LiteLLM [custom callback](https://docs.litellm.ai/docs/observability/custom_callback) that runs in-process inside your LiteLLM proxy. Before each completion is forwarded to the upstream model provider, the callback posts the request to your Capsule tenant and blocks on an allow/deny verdict. After the upstream call settles, it posts token usage and latency as fire-and-forget telemetry.

LiteLLM is operator-deployed: you run the proxy in your own infrastructure, so there is no SaaS install. The only thing exchanged at install time is a signed token that scopes the plugin to your tenant and environment.

The following hooks are configured:

| Hook | Type | Description |
| --- | --- | --- |
| **async_pre_call_hook** | Blocking | The completion request (model + messages) before it reaches the upstream provider; can deny on policy violation |
| **async_log_success_event** | Observation | Token usage and latency after a successful completion |
| **async_log_failure_event** | Observation | Latency and error detail when the upstream call fails |

## Prerequisites

Before you begin, ensure you have:

- A running **LiteLLM proxy** (Python 3.10 or later)
- A **Capsule Security** account with admin access
- Network access from your LiteLLM proxy to your Capsule agentsecurity endpoint (e.g. `https://agents.capsule.security`)

## Step 1: Generate a Plugin Token

1. Log in to the **Capsule Security** portal.
2. Navigate to **Integrations** and locate **LiteLLM**.
3. Click **Install**. Capsule generates a JWT scoped to your tenant and LiteLLM environment. The token contains the tenant and environment claims required by the agentsecurity endpoint; treat it as a secret.
4. Copy the generated **endpoint** and **token**. You will reference them from your LiteLLM proxy environment in the next steps.

## Step 2: Install the Plugin

Install the Capsule plugin into the same environment that runs your LiteLLM proxy:

```bash
git clone https://github.com/capsulesecurity/capsule.git
cd capsule/plugins/litellm-plugin
pip install -e .
```

## Step 3: Configure the Plugin

Register the callback in your LiteLLM `config.yaml`:

```yaml
litellm_settings:
  callbacks: ["capsule_litellm_plugin.capsule_callback"]
```

The callback reads its configuration from environment variables. Set these where the proxy runs:

```bash
export CAPSULE_LITELLM_ENDPOINT="https://agents.capsule.security/v1/litellm/hooks"
export CAPSULE_LITELLM_TOKEN=""
```

### Configuration Options

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `CAPSULE_LITELLM_ENDPOINT` | `string` | *(required)* | Capsule LiteLLM hooks endpoint, e.g. `https://agents.capsule.security/v1/litellm/hooks` |
| `CAPSULE_LITELLM_TOKEN` | `string` | *(required)* | JWT generated in Step 1, scoped to your tenant and environment |
| `CAPSULE_LITELLM_BLOCK_ON_RISK` | `boolean` | `true` | Apply server `deny` verdicts inline. Set to `false` to run in observe-only mode |
| `CAPSULE_LITELLM_FAIL_OPEN` | `boolean` | `true` | When Capsule is unreachable or returns an error, allow the request to proceed. Set to `false` to fail closed |
| `CAPSULE_LITELLM_TIMEOUT_MS` | `number` | `5000` | Per-request timeout in milliseconds |

### Token Storage

Never commit the JWT to version control.
Recommended approaches:

- Read the token from an environment variable (`CAPSULE_LITELLM_TOKEN`)
- Source it from your secret manager (1Password, AWS Secrets Manager, HashiCorp Vault) at proxy startup
- Deliver it through the same channel that distributes the rest of your LiteLLM deployment configuration

## Step 4: Restart the LiteLLM Proxy

For the callback to take effect, restart the proxy with the updated configuration:

```bash
litellm --config config.yaml
```

On startup the plugin logs `capsule_litellm_plugin: enabled` when it has a valid endpoint and token.

## Step 5: Verify the Installation

1. Send a completion through the proxy to generate activity:

   ```bash
   curl -sS http://localhost:4000/v1/chat/completions \
     -H "Authorization: Bearer $LITELLM_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
   ```

2. Log in to the **Capsule Security** portal.
3. Navigate to **Inventory > Agents** and confirm a LiteLLM agent (named after the model) appears.
4. Click on the agent and review the audit logs to verify the pre-call and post-call events are captured.

### Troubleshooting

If events are not appearing:

1. **Verify the endpoint is reachable** from the proxy host:

   ```bash
   curl -sS -o /dev/null -w "%{http_code}\n" https://agents.capsule.security/v1/litellm/hooks
   ```

2. **Verify the environment variables are set.** The plugin logs `capsule_litellm_plugin: disabled` and stays inert if `CAPSULE_LITELLM_ENDPOINT` or `CAPSULE_LITELLM_TOKEN` is missing.
3. **Check for timeouts.** If your network has high latency to the Capsule endpoint, increase `CAPSULE_LITELLM_TIMEOUT_MS` or confirm `CAPSULE_LITELLM_FAIL_OPEN` is set appropriately.
4. **Confirm the callback is loaded.** LiteLLM logs registered callbacks on startup; the Capsule callback registers from `capsule_litellm_plugin.capsule_callback`.
5. **Contact Capsule Security support** if issues persist.

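
Parts of checks 2 and 3 can be scripted before restarting the proxy. The following is a minimal sketch and not part of the Capsule plugin: the `preflight` helper and its messages are hypothetical, and the plugin's own validation may differ (for example in how it treats whitespace or a non-numeric timeout). It only inspects the environment variables listed under Configuration Options, assumes the documented default of `5000` for the timeout, and does not contact Capsule.

```python
import os
from typing import Mapping
from urllib.parse import urlparse

# Variables the guide marks as required.
REQUIRED = ("CAPSULE_LITELLM_ENDPOINT", "CAPSULE_LITELLM_TOKEN")


def preflight(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means nothing obvious is wrong.

    Hypothetical helper for illustration; it does not mirror the plugin's internals.
    """
    problems = []

    # Both required variables must be present and non-empty.
    for name in REQUIRED:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    # The endpoint should be an https:// URL with a host.
    endpoint = env.get("CAPSULE_LITELLM_ENDPOINT", "").strip()
    if endpoint:
        parsed = urlparse(endpoint)
        if parsed.scheme != "https":
            problems.append("CAPSULE_LITELLM_ENDPOINT does not use https://")
        if not parsed.netloc:
            problems.append("CAPSULE_LITELLM_ENDPOINT has no host")

    # The timeout, when set, should be a positive integer number of milliseconds.
    raw_timeout = env.get("CAPSULE_LITELLM_TIMEOUT_MS", "5000").strip()
    if not raw_timeout.isdigit() or int(raw_timeout) <= 0:
        problems.append("CAPSULE_LITELLM_TIMEOUT_MS is not a positive integer")

    return problems


if __name__ == "__main__":
    issues = preflight(os.environ)
    for issue in issues:
        print(f"problem: {issue}")
    if not issues:
        print("no obvious configuration problems found")
```

Run it in the same shell or container that launches the proxy so it sees the same environment. A clean result only rules out missing or malformed settings; reachability and callback registration still need the manual checks above.
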
## Security Considerations

The Capsule callback runs in-process inside the LiteLLM proxy and observes every completion request. Before deploying:

1. **Protect the JWT.** Anyone with the token can post events on behalf of your tenant. Rotate it through the portal if it is exposed.
2. **Choose `CAPSULE_LITELLM_FAIL_OPEN` deliberately.** `true` (the default) prioritizes availability and lets requests through when Capsule is unreachable; `false` prioritizes policy enforcement and rejects requests when Capsule cannot be reached.
3. **Pin a plugin version** in production and review changelogs before upgrading.
4. **Use TLS-only endpoints.** The plugin sends a Bearer token on every request, so never configure an `http://` endpoint outside local development.

## Support

For help with this integration:

- **Email**: support@capsule.security
- **Include**: Your organization ID, integration status, plugin version, and any error messages

## References

- [LiteLLM Documentation](https://docs.litellm.ai)
- [LiteLLM Custom Callbacks](https://docs.litellm.ai/docs/observability/custom_callback)
- [LiteLLM Hooks API Reference](/apis/litellm-hooks)