Understanding Rate Limits
APIs have limits:
- Requests per minute (RPM)
- Tokens per minute (TPM)
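Example: the Claude API might allow 10 requests per second (600 RPM) and 100,000 tokens per minute (100k TPM). These figures are illustrative; actual limits depend on the provider and your plan.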
In most cases, personal and small-business workflows won't hit these limits. But high-volume workflows might.
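If your workflow runs 100 times per minute and each call uses 600 tokens, that's 60,000 tokens per minute, or 60% of a 100k TPM limit. That leaves limited headroom: a spike that pushes traffic to about 1.7 times its normal level would exceed the limit and cause failures.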
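The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the 100k TPM limit is the illustrative figure from the example, not a real quota, and the function names are hypothetical.

```python
# Illustrative limit taken from the example above, not a real quota.
TPM_LIMIT = 100_000  # tokens per minute


def tokens_per_minute(calls_per_minute: int, tokens_per_call: int) -> int:
    """Total tokens consumed per minute at a steady call rate."""
    return calls_per_minute * tokens_per_call


def headroom(usage: int, limit: int) -> float:
    """Factor by which traffic can grow before it reaches the limit."""
    return limit / usage


usage = tokens_per_minute(100, 600)
print(usage)                                 # 60000
print(f"{usage / TPM_LIMIT:.0%} of limit")   # 60% of limit
print(f"{headroom(usage, TPM_LIMIT):.2f}x")  # 1.67x
```

Running the same calculation with your own call rate and average token count shows how much of a spike your workflow can absorb.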
Solutions:
- Add delays: Put a 1-second delay between API calls. This spaces them out.
- Use exponential backoff: If a request fails due to rate limits, wait, then retry. Double the wait time on each retry.
- Distribute load: Instead of making all 100 calls in the same minute, spread them across 5 minutes.
- Request higher limits: Contact your provider (for example, OpenAI or Anthropic) and ask for increased rate limits. Paying customers can often get them raised.
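The backoff and load-spreading ideas above could be sketched roughly as follows. This is an illustration under stated assumptions, not any provider's official client: `RateLimitError`, `call_with_backoff`, and `spacing_interval` are hypothetical names, and in practice you would catch whatever exception your API library raises for a rate-limit (HTTP 429) response.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimitError wait, then retry, doubling the wait each time."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap


def spacing_interval(num_calls: int, window_seconds: float) -> float:
    """Seconds to wait between calls to spread them evenly over a window."""
    return window_seconds / num_calls


# Demo: a fake API call that is rate-limited twice, then succeeds.
attempts = {"n": 0}


def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


waits = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_call, sleep=waits.append)
print(result)                      # ok
print(waits)                       # [1.0, 2.0]
print(spacing_interval(100, 300))  # 3.0
```

The `sleep` parameter exists only so the demo can record the waits without pausing; real code would leave it at the default `time.sleep`. Spreading 100 calls across 5 minutes works out to one call every 3 seconds.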