Quotas / Resource Limits

Golem 1.5 introduces resource quotas that limit resource usage across agents within an environment. A quota can enforce a rate limit, a capacity limit, or a concurrency limit, each with a configurable enforcement action.

Note: Quotas are configured at the environment level and apply to all agents deployed in that environment.

Resource limit types

Golem supports three types of resource limits:

| Type | Fields | Description |
|---|---|---|
| Rate | value, period, max | Rate limit per time period. Periods: second, minute, hour, day, month, year. max is the burst limit. |
| Capacity | value | Total capacity limit |
| Concurrency | value | Maximum concurrent usage |
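
One common way to read a Rate limit with a burst ceiling is as a token bucket: tokens refill at value per period, and the bucket never holds more than max. The sketch below illustrates that reading only. It is an assumption about the semantics, not Golem's implementation; the class name `TokenBucket`, the field name `burst` (standing in for max), and the 30-day month and 365-day year are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Seconds per period name. The month and year lengths are assumptions.
PERIOD_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}


@dataclass
class TokenBucket:
    value: int          # tokens added per period (the `value` field)
    period: str         # one of the period names above (the `period` field)
    burst: int          # most tokens the bucket can hold (the `max` field)
    tokens: float = 0.0
    last: float = 0.0   # timestamp of the previous refill, in seconds

    def __post_init__(self) -> None:
        # Start full so an initial burst of up to `burst` is allowed.
        self.tokens = float(self.burst)

    def try_acquire(self, now: float, amount: int = 1) -> bool:
        """Refill for the elapsed time, then take `amount` tokens if available."""
        rate = self.value / PERIOD_SECONDS[self.period]  # tokens per second
        self.tokens = min(float(self.burst), self.tokens + (now - self.last) * rate)
        self.last = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False
```

Under this reading, a limit with value 10, period second, and max 20 admits 20 requests immediately, then 10 more for each second that passes.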

Enforcement actions

Each quota has an enforcement action that determines what happens when the limit is exceeded:

| Action | Description |
|---|---|
| reject | Reject requests exceeding the limit |
| throttle | Slow down requests exceeding the limit |
| terminate | Terminate the agent when the limit is exceeded |

Configuring quotas in golem.yaml

Quotas are defined per environment under the resourceDefaults key. In the example below, local is the environment name:

resourceDefaults:
  local:
    - name: api-calls
      limit:
        type: Rate
        value: 100
        period: minute
        max: 1000
      enforcementAction: reject
      unit: request
      units: requests
    - name: storage
      limit:
        type: Capacity
        value: 1073741824  # 1 GiB
      enforcementAction: reject
      unit: byte
      units: bytes
    - name: connections
      limit:
        type: Concurrency
        value: 50
      enforcementAction: throttle
      unit: connection
      units: connections
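
As a rough aid for checking entries like the ones above before deploying, the following sketch validates one quota entry that has already been parsed into a dictionary. The allowed types, periods, and actions come from the tables on this page. Treating every listed field as required is an assumption, and `validate_quota` is a hypothetical helper, not part of Golem's tooling.

```python
PERIODS = {"second", "minute", "hour", "day", "month", "year"}
ACTIONS = {"reject", "throttle", "terminate"}
# Assumption: every field listed for a limit type is required.
REQUIRED_LIMIT_FIELDS = {
    "Rate": {"value", "period", "max"},
    "Capacity": {"value"},
    "Concurrency": {"value"},
}


def validate_quota(quota: dict) -> list[str]:
    """Return a list of problems found in one quota entry (empty if none)."""
    problems = []
    name = quota.get("name", "<unnamed>")
    limit = quota.get("limit", {})
    limit_type = limit.get("type")

    if limit_type not in REQUIRED_LIMIT_FIELDS:
        problems.append(f"{name}: unknown limit type {limit_type!r}")
    else:
        missing = REQUIRED_LIMIT_FIELDS[limit_type] - set(limit)
        for field in sorted(missing):
            problems.append(f"{name}: {limit_type} limit is missing {field!r}")
        period = limit.get("period")
        if limit_type == "Rate" and period is not None and period not in PERIODS:
            problems.append(f"{name}: unknown period {period!r}")

    action = quota.get("enforcementAction")
    if action not in ACTIONS:
        problems.append(f"{name}: unknown enforcementAction {action!r}")
    return problems
```

Called on the api-calls entry from the example configuration, the function returns an empty list. Removing max from that entry yields a single problem naming the missing field.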

Managing quotas via REST API

Resources can also be managed via the REST API, which exposes CRUD operations on /v1/envs/{environment_id}/resources. See the REST API reference for details.
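
To make the endpoint path concrete, here is a minimal sketch that builds (but does not send) a request against the resources collection. Only the path comes from this page. The base URL is a placeholder, and the choice of GET for reading, the Accept header, and the helper names are assumptions; consult the REST API reference for the actual methods, payloads, and authentication.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with the address of your Golem API.
BASE_URL = "https://golem.example.com"


def resources_url(environment_id: str, base_url: str = BASE_URL) -> str:
    """Build the collection URL from the path documented above."""
    env = urllib.parse.quote(environment_id, safe="")
    return f"{base_url}/v1/envs/{env}/resources"


def build_list_request(environment_id: str) -> urllib.request.Request:
    """Prepare a read request. Assumption: reads use GET and return JSON."""
    return urllib.request.Request(
        resources_url(environment_id),
        headers={"Accept": "application/json"},
        method="GET",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the base URL and any required credentials are in place.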

How quotas work internally

Quota enforcement uses a lease-based system. Worker executor nodes acquire resource leases from the shard manager, track the granted credit locally, and renew their leases periodically. This keeps enforcement efficient because no per-request coordination with the shard manager is needed.
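
The general idea of leasing credit can be shown with a toy model. The sketch below is not Golem's implementation: `ShardManagerStub` and `ExecutorQuota` are hypothetical names, and for simplicity the executor renews its lease when local credit runs out rather than on a timer. It shows only why most requests need no round trip to the central authority.

```python
class ShardManagerStub:
    """Stand-in for the central authority that owns the total budget."""

    def __init__(self, total: int) -> None:
        self.remaining = total

    def acquire_lease(self, requested: int) -> int:
        """Grant up to `requested` credits, limited by what is left."""
        granted = min(requested, self.remaining)
        self.remaining -= granted
        return granted


class ExecutorQuota:
    """Per-node view: spends leased credit locally and renews when it runs out."""

    def __init__(self, manager: ShardManagerStub, lease_size: int) -> None:
        self.manager = manager
        self.lease_size = lease_size
        self.credit = 0
        self.renewals = 0  # number of round trips to the manager

    def try_consume(self, amount: int = 1) -> bool:
        if self.credit < amount:
            # Local credit exhausted: one round trip to renew the lease.
            self.credit += self.manager.acquire_lease(self.lease_size)
            self.renewals += 1
        if self.credit >= amount:
            self.credit -= amount
            return True
        return False
```

With a total budget of 100 and a lease size of 25, an executor serves 100 requests with only 4 round trips to the manager, and the 101st request is refused because no credit remains to lease.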