Feature/en 352/google api rate limits #222
- Added a global cap for Gemini API requests, limiting them to 10 requests per minute.
- Integrated a rate limiter into the AiService to manage API call frequency.
- Updated the EntixQueueHandler to defer processing when the rate limit is exceeded.
- Enhanced tests to verify behavior under rate limit conditions.
Deploying with

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | entix-app-staging | af44ad0 | Commit Preview URL, Branch Preview URL | Jun 10 2026, 11:27 PM |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af44ad0a1d
    timestamps.push(now);
    await this.kv.set(key, JSON.stringify(timestamps), KV_TTL_SECONDS);
Use an atomic reservation for RPM slots
When multiple queue batches or HTTP requests call acquire() concurrently, they can all read the same timestamp list before any of them writes it back, see spare capacity, and then each append a slot. Because this KV read/modify/write sequence is not atomic, the new limiter can admit far more than 10 Gemini calls per minute in exactly the bursty queue scenarios it is meant to protect against. Use a serialization point or an atomic primitive for the reservation instead of a plain KV get followed by a set.
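To illustrate what a serialization point could look like, here is a minimal in-process sketch. It is not the PR's implementation: `SerializedRpmLimiter`, `RateLimitExceededError`, the injected clock, and the in-memory timestamp array are all hypothetical, and only the method name `acquire()` and the 10-per-minute cap come from the discussion above. Every reservation is queued on a single promise chain, so the check and the append can never interleave between two callers.

```typescript
// Hypothetical error type; the PR's actual error class is not shown in the source.
class RateLimitExceededError extends Error {
  retryAfterMs: number;
  constructor(retryAfterMs: number) {
    super("rate limit exceeded, retry in " + retryAfterMs + " ms");
    this.name = "RateLimitExceededError";
    this.retryAfterMs = retryAfterMs;
  }
}

// Hypothetical limiter: a sliding window guarded by a promise chain.
class SerializedRpmLimiter {
  private timestamps: number[] = [];
  // Tail of the chain; each acquire() runs only after the previous one settles.
  private tail: Promise<void> = Promise.resolve();
  private limit: number;
  private windowMs: number;
  private now: () => number;

  constructor(limit: number = 10, windowMs: number = 60000, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  acquire(): Promise<void> {
    const run = this.tail.then(() => this.reserve());
    // Swallow the rejection on the chain only, so later callers still run.
    this.tail = run.catch(() => undefined);
    return run;
  }

  private async reserve(): Promise<void> {
    const now = this.now();
    const cutoff = now - this.windowMs;
    // Drop slots that have left the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) {
      // The oldest slot determines when capacity frees up again.
      throw new RateLimitExceededError(this.timestamps[0] + this.windowMs - now);
    }
    // Stand-in for storage latency; safe because reservations are serialized.
    await Promise.resolve();
    this.timestamps.push(now);
  }
}
```

A chain like this only serializes callers inside one runtime instance. If queue consumers and HTTP handlers run in separate instances, the same idea would need a single owner for the counter state or a store with an atomic increment or compare-and-set, which the source does not specify.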
    if (this.rateLimiter) {
      await this.rateLimiter.acquire();
    }
Avoid logging planned throttles as pipeline failures
When a vocabulary.process-text job is throttled here, acquire() throws before the Gemini fetch. That exception is caught by VocabularyProcessingService.processText, which invokes the queue's logPipelineFailure hook and inserts a vocabulary.pipeline_failed audit row before the queue handler recognizes the rate-limit error and retries. As a result, normal limiter deferrals create false pipeline-failure audit records for every delayed vocabulary item.
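One way to keep planned deferrals out of the audit trail is to recognize the limiter's error in the catch block and rethrow it before the failure hook runs. The sketch below assumes this shape; `ThrottleDeferredError`, `isThrottleDeferral`, `processTextGuarded`, and the `FailureLogger` signature are hypothetical stand-ins, since the real processText body and the limiter's error class are not shown here. Only the `logPipelineFailure` hook name comes from the comment above.

```typescript
// Hypothetical error thrown by the limiter when a call must be deferred.
class ThrottleDeferredError extends Error {
  retryAfterMs: number;
  constructor(retryAfterMs: number) {
    super("deferred by rate limiter");
    this.name = "ThrottleDeferredError";
    this.retryAfterMs = retryAfterMs;
  }
}

// Checks the name rather than the prototype, so it also works if the
// error has crossed a compilation target that breaks subclassing Error.
function isThrottleDeferral(err: unknown): err is ThrottleDeferredError {
  return err instanceof Error && err.name === "ThrottleDeferredError";
}

// Assumed shape of the queue's failure hook.
type FailureLogger = (jobId: string, err: unknown) => Promise<void>;

// Hypothetical wrapper standing in for the catch logic inside processText.
async function processTextGuarded(
  jobId: string,
  work: () => Promise<string>,
  logPipelineFailure: FailureLogger,
): Promise<string> {
  try {
    return await work();
  } catch (err) {
    // Planned deferral: let the queue handler retry, write no audit row.
    if (isThrottleDeferral(err)) {
      throw err;
    }
    // Genuine failure: record it, then propagate.
    await logPipelineFailure(jobId, err);
    throw err;
  }
}
```

With this ordering the queue handler still sees the deferral error and can schedule the retry, while the failure hook fires only for errors that are not rate-limit deferrals.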
Summary
Ticket Link
Type of Change
Changes Made
How to Test
/org/:slug/dashboard
Risks / Rollback Notes
Screenshots (If UI changed)
🏗 Checklist
- `npm run typecheck:api` passes.
- `npm run test:api` (or relevant unit tests) passes.
- `get*` methods have corresponding `NotFoundError` test cases.
- `docs/` updated if new architecture patterns were introduced.
- `docs/AI.md` updated if canonical rules changed.