Rate Limits

Overview

What are rate limits?

A rate limit is a restriction that an API imposes on the number of times a user or client can access the server within a specified period of time.

Why do we have rate limits?

Rate limits are a common practice for APIs, and they're put in place for a few different reasons:

  • They help protect against abuse or misuse of the API. For example, a malicious actor could flood the API with requests in an attempt to overload it or cause disruptions in service. By setting rate limits, OpenAI can prevent this kind of activity.

  • Rate limits help ensure that everyone has fair access to the API. If one person or organization makes an excessive number of requests, it could bog down the API for everyone else. By throttling the number of requests that a single user can make, OpenAI ensures that the most number of people have an opportunity to use the API without experiencing slowdowns.

  • Rate limits can help Cooler manage the aggregate load on its infrastructure. If requests to the API increase dramatically, it could tax the servers and cause performance issues. By setting rate limits, Cooler can help maintain a smooth and consistent experience for all API consumers.

We enforce rate limits at the organization level, not user level, based on the specific endpoint used as well as the type of account you have. Rate limits are measured in one way RPM (requests per minute). Cooler limits the number of requests to 25 operations per second in the production environment and 10 operations per second in the sandbox environment. Systems that send too many requests in quick succession may see error responses that show up as status code 429, which gates access to the resource until the timeout expires.

Last updated