We have two types of limits:

  1. Spend limits set a maximum monthly cost an organization can incur for API usage.
  2. Rate limits restrict the number of API requests an organization can make over a defined period of time.

We enforce service-configured limits at the organization level, but you may also set user-configurable limits for your organization’s workspaces.

About our limits

  • Limits are designed to prevent API abuse, while minimizing impact on common customer usage patterns.
  • Limits are defined by usage tier, where each tier is associated with a different set of spend and rate limits.
  • Your organization will increase tiers automatically as you reach certain thresholds while using the API.
  • Limits are set at the organization level. You can see your organization’s limits in Plans and Billing in the Anthropic Console.
  • Rate limits may be enforced over intervals shorter than the stated period. For instance, a rate of 60 requests per minute (RPM) may be enforced as 1 request per second, so short bursts of requests at high volume can exceed the rate limit and result in rate limit errors.
  • The limits outlined below are our standard limits and apply to the “Build” API plan. If you’re seeking higher, custom limits, contact sales by clicking “Select Plan” in the Anthropic Console to move to our custom “Scale” plan.
  • We use the token bucket algorithm to do rate limiting.
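
To illustrate the token bucket idea mentioned above, here is a minimal client-side sketch. It is not the service's implementation: the class name, the `capacity` parameter, and the injectable `clock` are assumptions made for illustration, and the actual bucket sizes used by the API are not stated in this document.

```python
import time


class TokenBucket:
    """Illustrative token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, rate_per_minute, capacity, clock=time.monotonic):
        self.rate_per_second = rate_per_minute / 60.0
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # the bucket starts full
        self.clock = clock             # injectable so the bucket can be tested
        self.last_refill = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_second)
        self.last_refill = now

    def try_acquire(self, cost=1.0):
        """Spend `cost` tokens and return True if available, else return False."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(rate_per_minute=60, capacity=1)`, a second request sent within the same second is rejected even though fewer than 60 requests were made in that minute, which matches the burst behavior described in the list above.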

Spend limits

Each usage tier limits how much you can spend on the API each calendar month. Once you reach your tier’s spend limit, you cannot use the API again until the next month, unless you qualify for the next tier first.

To qualify for the next tier, you must meet a deposit requirement and a mandatory wait period. Higher tiers require longer wait periods. Note: to minimize the risk of overfunding your account, you cannot deposit more than your monthly spend limit.

Requirements to advance tier

Usage Tier     Credit Purchase   Wait After First Purchase   Max Usage per Month
Build Tier 1   $5                0 days                      $100
Build Tier 2   $40               7 days                      $500
Build Tier 3   $200              7 days                      $1,000
Build Tier 4   $400              14 days                     $5,000
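
The table can be read as a simple eligibility check. The sketch below encodes it as data and returns the highest tier whose requirements are met. It assumes that "Credit Purchase" means the total credits purchased so far and that the wait is counted in days since the first purchase; the `Tier` type and the function name are hypothetical and not part of any API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    credit_purchase_usd: int
    wait_days: int
    max_monthly_usage_usd: int


# Values copied from the table above.
TIERS = [
    Tier("Build Tier 1", 5, 0, 100),
    Tier("Build Tier 2", 40, 7, 500),
    Tier("Build Tier 3", 200, 7, 1000),
    Tier("Build Tier 4", 400, 14, 5000),
]


def highest_qualifying_tier(credits_purchased_usd, days_since_first_purchase):
    """Return the highest tier whose purchase and wait requirements are both met.

    Assumption: purchases are cumulative. Returns None if no tier qualifies.
    """
    qualified = None
    for tier in TIERS:  # ordered from lowest to highest
        if (credits_purchased_usd >= tier.credit_purchase_usd
                and days_since_first_purchase >= tier.wait_days):
            qualified = tier
    return qualified
```

For example, under these assumptions an organization that has purchased $200 of credits but made its first purchase only 3 days ago would still be in Build Tier 1, because the 7-day wait for Tiers 2 and 3 has not elapsed.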

Rate limits

Our rate limits are currently measured in requests per minute, tokens per minute, and tokens per day for each model class. If you exceed any of the rate limits, you will get a 429 error.

Model               Requests per minute (RPM)   Tokens per minute (TPM)   Tokens per day (TPD)
Claude 3.5 Sonnet   50                          40,000                    1,000,000
Claude 3 Opus       50                          20,000                    1,000,000
Claude 3 Sonnet     50                          40,000                    1,000,000
Claude 3 Haiku      50                          50,000                    5,000,000
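
A 429 response can be handled on the client by waiting and retrying. The sketch below shows one possible approach and is not an official client: `send` is a hypothetical callable standing in for whatever HTTP call you make, assumed to return a `(status, headers, body)` tuple with lowercase header names. It waits for the `retry-after` value when the response carries one and otherwise falls back to exponential backoff, which is a choice made for this sketch.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body).
    Returns the (status_code, body) of the last attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; report the final 429 to the caller
        # Prefer the server's retry-after hint; otherwise back off exponentially.
        retry_after = headers.get("retry-after")
        delay = float(retry_after) if retry_after is not None else float(2 ** attempt)
        sleep(delay)
    return status, body
```

The `sleep` parameter is injectable so that the waiting behavior can be tested without real delays.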

User-configurable limits

In addition to service-configured limits, you may also configure spend limits and rate limits for individual workspaces. A workspace’s limit can be no higher than the organization’s overall limit. For example, if the service-configured limit is 80,000 tokens per minute, you can set an individual workspace’s rate limit to 30,000 tokens per minute. Then, the remaining 50,000 tokens per minute are available to the rest of your organization to use. You cannot set limits on the Default workspace.

If unconfigured, workspace limits will be the same as the equivalent service-configured limit. Service-configured limits are always enforced. In the above example, if you configure a second workspace’s limit to 70,000 tokens per minute, your organization will still be limited to 80,000 tokens per minute total.
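
The interaction between workspace and organization limits in the example above can be sketched as a check against both limits at once. This is an illustration of the arithmetic only; the function name and parameters are hypothetical, and the usage counters stand in for whatever accounting the service actually performs.

```python
def can_consume(org_limit, org_used, workspace_limit, workspace_used, tokens):
    """Return True if `tokens` more fit under both the workspace and organization limits.

    All values are in the same unit, for example tokens per minute.
    """
    fits_workspace = workspace_used + tokens <= workspace_limit
    fits_organization = org_used + tokens <= org_limit
    return fits_workspace and fits_organization


ORG_LIMIT = 80_000    # service-configured limit from the example
WS_A_LIMIT = 30_000   # first workspace
WS_B_LIMIT = 70_000   # second workspace

# Workspace A has used its full 30,000 and workspace B has used 50,000,
# so the organization as a whole is at 80,000.
org_used = 30_000 + 50_000

# B is still under its own 70,000 limit, but the organization limit blocks it.
b_allowed = can_consume(ORG_LIMIT, org_used, WS_B_LIMIT, 50_000, 1_000)
```

Here `b_allowed` is `False`: the configured workspace limits add up to 100,000, but the organization-wide 80,000 limit is still enforced.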

Response headers

The API response includes headers that show you the rate limit enforced, current usage, and when the limit will be reset.

The following headers are returned:

anthropic-ratelimit-requests-limit: The maximum number of requests allowed within any rate limit period.
anthropic-ratelimit-requests-remaining: The number of requests remaining before being rate limited.
anthropic-ratelimit-requests-reset: The time when the request rate limit will reset, provided in RFC 3339 format.
anthropic-ratelimit-tokens-limit: The maximum number of tokens allowed within any rate limit period.
anthropic-ratelimit-tokens-remaining: The number of tokens remaining (rounded to the nearest thousand) before being rate limited.
anthropic-ratelimit-tokens-reset: The time when the token rate limit will reset, provided in RFC 3339 format.
retry-after: The number of seconds until you can retry the request.

The tokens rate limit headers display the values for the limit (daily or per-minute) with fewer tokens remaining. For example, if you have exceeded the daily token limit but have not sent any tokens within the last minute, the headers will contain the daily token rate limit values.
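
As a sketch of how a client might use these headers, the helper below reads a reset header and computes how long to wait. It assumes the headers have already been collected into a dict with lowercase names; the function names and the `kind` parameter are hypothetical. Only Python standard library calls are used.

```python
from datetime import datetime, timezone


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp such as '2024-07-01T12:00:30Z'."""
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_until_reset(headers, kind="requests", now=None):
    """Seconds until the `requests` or `tokens` limit resets, never negative."""
    if now is None:
        now = datetime.now(timezone.utc)
    reset = parse_rfc3339(headers[f"anthropic-ratelimit-{kind}-reset"])
    return max(0.0, (reset - now).total_seconds())


def tokens_remaining(headers):
    """Tokens left before being rate limited, as reported by the API."""
    return int(headers["anthropic-ratelimit-tokens-remaining"])
```

Because the token headers report whichever limit (daily or per-minute) has fewer tokens remaining, a long wait returned by `seconds_until_reset(headers, kind="tokens")` may indicate that the daily limit, not the per-minute one, is the binding constraint.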