
Rate Limiting

Primatomic enforces per-tenant rate limits to ensure fair resource allocation. This document specifies the rate limiting behavior and client requirements.

Parameter         Value
Sustained rate    100 requests per second
Burst capacity    50 requests
Scope             Per tenant (shared across all API keys)

All API keys belonging to a tenant share the same rate limit bucket.

The service uses a token bucket algorithm with these properties:

Property          Specification
Refill rate       100 tokens per second
Bucket capacity   150 tokens (sustained + burst)
Request cost      1 token per request
Rejection         Requests are rejected with 429 when the bucket is empty
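The behavior described above can be sketched as a small token bucket. This is an illustration using the documented parameters, not the service's actual implementation:

```typescript
// Minimal token bucket sketch: 100 tokens/s refill, 150-token capacity.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number = 150,
    private readonly refillPerSecond: number = 100
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Returns true if the request is admitted (costs 1 token),
  // false if it should be rejected with 429.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A full bucket admits up to 150 back-to-back requests; after that, requests are admitted only as fast as tokens refill.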

When rate limited, the service responds:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: text/plain

Rate limit exceeded

The Retry-After header indicates the minimum seconds to wait before retrying.

Requirement                        Level
Respect Retry-After header         Required
Implement exponential backoff      Required
Add jitter to backoff              Recommended
Implement client-side throttling   Recommended

Clients should implement retry logic for 429 responses:

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function withRateLimitRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number = 5
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      if (error.status !== 429) {
        throw error;
      }
      // Respect the Retry-After header (seconds)
      const retryAfter = parseInt(error.headers.get('Retry-After') || '1', 10);
      // Exponential backoff on top of the server's hint
      const baseDelay = retryAfter * 1000 * Math.pow(2, attempt);
      // Jitter spreads retries from concurrent clients apart
      const jitter = Math.random() * 1000;
      await sleep(baseDelay + jitter);
    }
  }
  throw new Error('Rate limit: max retries exceeded');
}

The /logs/{log_name}/append_batch endpoint accepts multiple events in a single request, reducing overhead and rate limit consumption:

// Single batch request with raw bytes
const response = await api.appendBatch(logName, events);
// response.sequences contains sequence numbers for all events
const lastSequence = response.sequences[response.sequences.length - 1];

Batch requests:

  • Accept up to 10 MB total request size
  • Return sequence numbers for all events in order
  • Count as a single request against the rate limit
  • Fail atomically on validation errors
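Because batches fail atomically on validation errors, clients may want to split large event arrays before sending. A minimal sketch, assuming events are raw `Uint8Array` payloads (the 10 MB figure is the documented request-size limit; per-request framing overhead is ignored here):

```typescript
// Split events into batches that each stay under the 10 MB request limit.
const MAX_BATCH_BYTES = 10 * 1024 * 1024;

function chunkEvents(events: Uint8Array[]): Uint8Array[][] {
  const batches: Uint8Array[][] = [];
  let current: Uint8Array[] = [];
  let currentSize = 0;
  for (const event of events) {
    // Start a new batch when adding this event would exceed the limit
    if (current.length > 0 && currentSize + event.byteLength > MAX_BATCH_BYTES) {
      batches.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(event);
    currentSize += event.byteLength;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each resulting batch can then be sent with `api.appendBatch`, costing one request against the rate limit per batch rather than one per event.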

For single-event appends, use controlled concurrency:

import pLimit from 'p-limit';

// Limit to 50 concurrent requests (within burst capacity)
const limit = pLimit(50);
const results = await Promise.all(
  events.map(event => limit(() => api.appendLog(logName, event)))
);

Implement client-side throttling to avoid hitting limits:

import pLimit from 'p-limit';

// Limit concurrent requests to stay within burst capacity
const limit = pLimit(50);
const results = await Promise.all(
  items.map(item => limit(() => api.process(item)))
);

Cache responses when freshness is not critical:

const cache = new Map<string, { data: Uint8Array; expires: number }>();

async function cachedQuery(
  logName: string,
  viewName: string
): Promise<Uint8Array> {
  const key = `${logName}:${viewName}`;
  const cached = cache.get(key);
  if (cached && cached.expires > Date.now()) {
    return cached.data; // Cache hit: no request, no rate limit cost
  }
  const data = await api.queryView(logName, viewName, new Uint8Array());
  cache.set(key, {
    data,
    expires: Date.now() + 5000 // 5 second TTL
  });
  return data;
}

Use stale reads to reduce request volume when consistency is not required:

// With sequence: waits for consistency, counts toward rate limit
await api.queryView(logName, viewName, input, sequence);
// Without sequence: immediate response, may be stale
await api.queryView(logName, viewName, input);

Monitor usage to anticipate rate limiting:

curl https://api.primatomic.com/billing/usage \
  -H "Authorization: Bearer $TOKEN"

Example response:

{
  "storage_gb_seconds": 1234.56,
  "compute_seconds": 789.01
}

Storage is metered in GB-seconds. Divide by 3,600 to convert to GB-hours.
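As a quick sanity check of that conversion (the helper name is illustrative):

```typescript
// Convert the metered storage figure from GB-seconds to GB-hours.
function gbSecondsToGbHours(gbSeconds: number): number {
  return gbSeconds / 3600;
}

// e.g. the 1234.56 GB-seconds in the response above is about 0.343 GB-hours
```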

Contact support if you require higher limits. Available options may include:

  • Increased per-tenant rate limits
  • Dedicated infrastructure
  • Batch API endpoints

Rate limit increases are evaluated based on use case and tenant history.