What happens if Stigg is down?
Learn how Stigg mitigates risk in mission-critical components
Stigg's mission is to provide a reliable and highly available solution that won't interrupt the mission-critical functionality of your application. To make sure that critical paths are not affected by a network error, or if our services are unavailable for any reason, we have several guardrails in place.
We prioritized the following SDK functions as "mission-critical":
- Accessing customer entitlements (P95 latency of 100ms)
- Reporting usage measurements (P95 latency of 200ms)
Accessible from anywhere
Stigg's API is globally scalable and highly available thanks to a content delivery network (CDN): our code is executed at the location (regional edge) closest to the user. This means our services are accessible from anywhere in the world without compromising performance or latency. Due to the nature of this function-as-a-service (FaaS) architecture, our API can handle a large volume of concurrent requests by automatically scaling up and allocating new functions as needed.

Leveraging the CDN for improved performance and latency
When accessing entitlement, usage, and catalog data through our API, reads are performed on pre-calculated data that is stored in a distributed cache and automatically replicated across multiple regions. This allows us to achieve lightning-fast read response times and improves our API's overall availability. The cache is kept up to date by recalculating the data on every customer subscription update and catalog modification.
Local caching and fallback strategy
Stigg SDKs support local caching: customer entitlements are stored in-memory to avoid network round trips on subsequent entitlement checks. This also makes the SDK resilient when our API is unreachable for any reason (such as a network outage), since the SDK can continue to evaluate access checks even while offline.
The local cache is invalidated either by frequently polling for updates, or by listening to real-time updates delivered over a WebSocket streaming connection to our API. With the latter, changes are propagated almost instantaneously.
const client = await Stigg.initialize({
  // set the polling interval; defaults to 5 seconds
  pollingIntervalMs: 5000,
  // alternatively, use a streaming connection instead of polling
  subscribeToUpdates: true
});
If the local entitlement data is missing and our API is unreachable, the SDK supports fallback values that can be provided on each entitlement check, for example:
const entitlement = stiggClient.getMeteredEntitlement({
  featureId: 'feature-number-of-seats',
  customerId: 'customer-demo-id',
  options: {
    // if an error occurs, allow access with unlimited usage
    fallback: {
      hasAccess: true,
      isUnlimited: true
    }
  }
});
While entitlement checks continue to work in offline mode, usage metering is postponed until the SDK comes back online. During that period, usage measurements are buffered in-memory and are not counted as actual usage until they're sent and reach Stigg's servers.
In case of a long-term outage, the measurements can be stored in a persistent cache (if one is provided) to avoid data loss if the process crashes while the SDK is offline.
Persistent cache
By default, the SDK's in-memory cache stores entitlements data for faster subsequent entitlement checks. Both the server and client SDKs support this functionality. If you restart the process that uses the SDK, or re-initialize the SDK, the local data is lost. Usually this behavior is acceptable: if the entitlements data for a specific customer is missing, the SDK performs a network request and pulls it from Stigg's API.
Alternatively, you can provide an external database to persist the customer entitlements data, so it survives restarts and can be accessed by multiple processes. If your processes run on serverless infrastructure, where each process is transient and can be de-provisioned after a limited period of time, this can greatly reduce cache misses. This is only possible with the Server SDK: whenever an application that uses the client SDK restarts (or the page reloads, in the case of a web application), the entitlement data is always fetched over the network from the Stigg API.

Retrieve data from persistent cache first, fetch over the network on cache miss
Currently, the Stigg Server SDK supports Redis as a persistent cache, for example:
import Stigg from '@stigg/node-server-sdk';

const client = await Stigg.initialize({
  apiKey: 'YOUR_SERVER_API_KEY',
  persistentCache: {
    type: 'redis',
    host: 'localhost',
    port: 6379,
  },
  inMemoryCache: {
    ttlSeconds: 60
  }
});

export default client;
During initialization, the SDK will try to connect to the Redis instance. Once connected, all the entitlement checks will be evaluated against data from the persistent cache instead of fetching it over the network.
The persistent cache is updated the same way the in-memory cache is: if customer data is missing, it is fetched over the network and persisted in the cache. To avoid accessing the persistent cache on every entitlement check, the SDK keeps a copy of the data in-memory with a predefined TTL.
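The read path described above (fresh in-memory copy first, then the persistent cache, then the network on a double miss) can be sketched as a small tiered cache. All names here are illustrative, and a plain `Map` stands in for Redis; this is not the Stigg SDK's actual code:

```typescript
// Two-tier read path sketch: an in-memory copy with a TTL in front of a
// persistent cache, falling back to a network fetch on a double miss.
// Illustrative only; a Map stands in for Redis.
class TieredCache<V> {
  private memory = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private persistent: Map<string, V>,       // stands in for Redis
    private fetchRemote: (key: string) => V,  // network fetch on cache miss
    private ttlMs: number,
    private now: () => number = Date.now,
  ) {}

  get(key: string): V {
    const hit = this.memory.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value; // fresh in-memory copy

    let value = this.persistent.get(key);     // persistent cache next
    if (value === undefined) {
      value = this.fetchRemote(key);          // network on a double miss
      this.persistent.set(key, value);        // persist for other processes
    }
    this.memory.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

The in-memory TTL bounds how often the persistent cache is touched, while the persistent tier bounds how often the network is.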
While connected to the persistent cache, usage measurements are also persisted in the cache before they are reported to the Stigg API. This way no data is lost when an outage occurs, and measurements can be resent once the system is back online with network connectivity.
If you're currently using a caching framework other than Redis, let us know and we'd be happy to explore ways to support it.
Errors, retries, and usage buffering
Stigg's Server SDK automatically handles network errors and retries periodically until connectivity is restored. Usage measurements are buffered in-memory or in a persistent cache (if configured) and are sent to the API as soon as possible. During that time, the SDK gracefully returns results from its cache, or uses fallback values if the cache is empty.
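A common shape for this kind of periodic retry is exponential backoff: wait a little after the first failure, and progressively longer after each subsequent one. The sketch below is a generic illustration of that pattern under assumed parameters, not the SDK's actual retry policy:

```typescript
// Generic retry-with-exponential-backoff sketch (illustrative only;
// the real SDK's retry policy may differ).
async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  maxRetries: number,
  baseDelayMs: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      // double the delay after each failed attempt
      if (i < maxRetries) await sleep(baseDelayMs * 2 ** i);
    }
  }
  throw lastError;
}
```

Injecting the `sleep` function keeps the backoff schedule testable without real delays.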