Reduce to 2 concurrent batches to avoid Haiku output TPM bursting
Three concurrent batches hit the rate limit simultaneously, then retry in unison, causing repeated 429s. Two concurrent batches keep the output rate lower.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -22,7 +22,7 @@ public class AiCatalogPriceCheckService : IAiCatalogPriceCheckService
 
     private const string Model = "claude-haiku-4-5-20251001";
     private const int BatchSize = 25;
-    private const int MaxConcurrentBatches = 3; // Haiku has generous rate limits; retry logic handles any 429s
+    private const int MaxConcurrentBatches = 2; // 3 concurrent bursts past Haiku's output TPM limit
     private const int RateLimitRetrySeconds = 65; // wait just past the 60s window before retrying a 429
 
     private static readonly JsonSerializerOptions JsonOpts = new() { PropertyNameCaseInsensitive = true };
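
For context, a cap like MaxConcurrentBatches is commonly enforced with a SemaphoreSlim gate. The diff does not show how this service applies the constant, so the following is only a minimal sketch under that assumption. BoundedBatchRunner, RunAsync, and processBatch are hypothetical names for illustration, not part of AiCatalogPriceCheckService.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not taken from the service in this commit.
public static class BoundedBatchRunner
{
    // Runs processBatch over every batch with at most maxConcurrent in flight.
    // Returns the highest number of batches observed running at the same time.
    public static async Task<int> RunAsync<T>(
        IReadOnlyList<T> batches,
        int maxConcurrent,
        Func<T, Task> processBatch)
    {
        using var gate = new SemaphoreSlim(maxConcurrent);
        var sync = new object();
        int inFlight = 0;
        int peak = 0;

        var tasks = batches.Select(async batch =>
        {
            // Wait for a free slot before starting this batch.
            await gate.WaitAsync();
            lock (sync)
            {
                inFlight++;
                if (inFlight > peak) peak = inFlight;
            }
            try
            {
                await processBatch(batch);
            }
            finally
            {
                lock (sync) { inFlight--; }
                // Free the slot so the next queued batch can start.
                gate.Release();
            }
        }).ToList();

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

Calling `await BoundedBatchRunner.RunAsync(batches, 2, SendBatchAsync)` (with SendBatchAsync standing in for whatever issues the API request) keeps the observed peak at 2 or below, which matches the intent of this change: fewer simultaneous responses means a lower burst of output tokens per minute.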