# Using Stigg at Scale

> How to handle high-volume entitlement checks without adding latency or REST API overhead

The REST API is the right tool for **provisioning operations** — creating customers, managing subscriptions, reporting usage. These are write-heavy, low-frequency operations that fit naturally into a REST request/response model.

**Access checks are different.** Entitlement checks happen on every authenticated request in your product: gating feature access, enforcing usage limits, checking plan-based permissions. The API does expose `GET /api/v1/customers/{id}/entitlements` for retrieving a customer's effective entitlements, but even at moderate scale, routing those checks through the REST API introduces problems:

* Each check requires a network round-trip to `api.stigg.io`
* Latency adds directly to your users' request path
* There is no request-level caching — identical checks hit the network every time
* The API is designed for provisioning mutations, not for high-throughput, sub-millisecond reads
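
As a rough illustration of the latency cost, the sketch below adds up sequential, uncached checks on a single request. The check count and round-trip time are assumed values chosen for the example, not measured figures for `api.stigg.io`, and `added_latency_ms` is a hypothetical helper.

```python
def added_latency_ms(checks_per_request: int, round_trip_ms: float) -> float:
    """Latency added to one request when every check is a sequential network call."""
    return checks_per_request * round_trip_ms


# Assumed numbers: 3 entitlement checks per request, 100 ms per round-trip.
print(added_latency_ms(3, 100.0))  # 300.0
```

Because nothing is cached, the same cost is paid again on the next identical request.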

## The Sidecar

For production access checks, Stigg provides the **Sidecar**: a lightweight service that runs alongside your application and serves entitlement evaluations from a local cache.

Key characteristics:

* **Low latency** — cached reads are served from the Sidecar's local cache, with no call to the Stigg API. On a cache miss, the Sidecar fetches from the Stigg Edge API (\~100 ms) and populates the cache for subsequent requests.
* **Real-time updates** — the Sidecar subscribes to Stigg's event stream and refreshes cached entitlements and usage automatically when plans or subscriptions change.
* **Fail-safe** — if the upstream Stigg API becomes unreachable, the Sidecar continues serving reads from its cache. If the cache is also unavailable, it falls back to configurable static default values.
* **Language-neutral** — the Sidecar exposes a gRPC interface defined with Protocol Buffers, so any language can call it.
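
The lookup order described above (local cache, then an upstream fetch, then static defaults) can be modeled as a small read-through cache. This is a simplified sketch, not the Sidecar's implementation: `EntitlementCache`, `fetch_upstream`, and the `defaults` mapping are hypothetical names, entitlements are reduced to booleans, and the real Sidecar is reached over gRPC rather than used as a Python class.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (customer_id, feature_id)


class EntitlementCache:
    """Illustrative read-through cache with a static-default fallback (hypothetical)."""

    def __init__(
        self,
        fetch_upstream: Callable[[str, str], bool],
        defaults: Dict[str, bool],
    ) -> None:
        self._fetch_upstream = fetch_upstream  # stand-in for a remote API call
        self._defaults = defaults              # static per-feature fallback values
        self._cache: Dict[Key, bool] = {}

    def has_access(self, customer_id: str, feature_id: str) -> bool:
        key = (customer_id, feature_id)
        # 1. Cache hit: answer locally, no upstream call.
        if key in self._cache:
            return self._cache[key]
        # 2. Cache miss: fetch upstream and populate the cache.
        try:
            value = self._fetch_upstream(customer_id, feature_id)
        except ConnectionError:
            # 3. Upstream unreachable and nothing cached: use the static default.
            return self._defaults.get(feature_id, False)
        self._cache[key] = value
        return value

    def invalidate(self, customer_id: str) -> None:
        """Drop a customer's cached entries, e.g. after a subscription-change event."""
        for key in [k for k in self._cache if k[0] == customer_id]:
            del self._cache[key]
```

In this model a change event simply evicts the affected entries so the next read refetches them; the Sidecar is described as refreshing cached data automatically, so treat `invalidate` as a simplification.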

<Note>
  **Node.js applications** do not need the Sidecar. The [Node.js SDK](https://node-sdk-docs.stigg.io/classes/stigg) provides full feature parity — including low-latency entitlement checks, local caching, and real-time updates — natively in-process.
</Note>

## Deployment options

The Sidecar can run in two configurations:

**Sidecar pattern** — one Sidecar container per application instance, co-located in the same network namespace (e.g., the same Kubernetes Pod). Requests stay on localhost with no external hops.

**Standalone service** — a single shared Sidecar instance that multiple application instances access over a private network port. Simpler to operate, but adds a small internal network hop and shares cache miss load across callers.

See [Sidecar Architecture](/documentation/high-availability-and-scale/sidecar/architecture) for diagrams and caching configuration details.

## Getting started

<CardGroup cols={2}>
  <Card title="Sidecar Overview" icon="server" href="/documentation/high-availability-and-scale/sidecar/overview">
    Understand fail-safe design, latency guarantees, and isolation properties
  </Card>

  <Card title="Running the Sidecar" icon="play" href="/documentation/high-availability-and-scale/sidecar/running-sidecar">
    Docker setup, environment variables, and health endpoints
  </Card>

  <Card title="Sidecar SDK" icon="code" href="/api-and-sdks/integration/backend/sidecar">
    Install and call the Sidecar from Python, Ruby, Go, Java, or .NET
  </Card>

  <Card title="Production guide" icon="rocket" href="/guides/i-want-to/run-stigg-sidecar-api-in-production">
    Scaling, monitoring, alerting, and troubleshooting
  </Card>
</CardGroup>
