2d25f6db2b
Rather than relying on reactive 65s retries, each semaphore slot is held for at least MinBatchIntervalSeconds (20s). With 2 concurrent slots that limits throughput to ~3 batches/min × ~2k tokens = ~6k output TPM, safely under the 8k/min limit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>