fix: thread-safe cache writes and feature update handling by vazarkevych · Pull Request #114 · growthbook/growthbook-python

vazarkevych · 2026-04-23T14:28:52Z

Problem

Several race conditions existed in cache and feature-update handling:

InMemoryFeatureCache had no locking - concurrent reads/writes could corrupt cache entries.
FeatureRepository.load_features / load_features_async had no fetch coalescing — on a cold cache, many threads/coroutines requesting the same SDK payload could all hit the GrowthBook API/CDN at once (cache-miss stampede). Note: Python's GIL does not help here, as it is released during the blocking HTTP fetch.
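_feature_update_callbacks was mutated and iterated without a lock - concurrent add/remove/notify could skip or repeat callbacks mid-delivery (a Python list does not raise when it is mutated during iteration; a dict or set in the same position would raise RuntimeError: ... changed size during iteration).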
_sticky_bucket_cache_lock was a boolean flag, not a real lock - the spin-loop was not thread-safe and silently returned {} when the "lock" was held.
FeatureCache.get_current_state returned a mutable reference to savedGroups instead of a copy.
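
The callback hazard in the list above is commonly handled by copying the list under a lock and iterating the copy. The following is a minimal sketch of that pattern, not the SDK's code: CallbackRegistry and its method names are hypothetical stand-ins for whatever owns _feature_update_callbacks.

```python
import threading
from typing import Callable, Dict, List

Callback = Callable[[Dict], None]


class CallbackRegistry:
    """Hypothetical stand-in for the object that owns the callback list."""

    def __init__(self) -> None:
        self._callbacks: List[Callback] = []
        self._lock = threading.Lock()

    def add(self, cb: Callback) -> None:
        with self._lock:
            if cb not in self._callbacks:
                self._callbacks.append(cb)

    def remove(self, cb: Callback) -> None:
        with self._lock:
            if cb in self._callbacks:
                self._callbacks.remove(cb)

    def notify(self, payload: Dict) -> None:
        # Take a snapshot while holding the lock...
        with self._lock:
            snapshot = list(self._callbacks)
        # ...then call outside it, so a slow callback cannot block add/remove
        # and a callback may unsubscribe itself without deadlocking.
        for cb in snapshot:
            cb(payload)
```

Because delivery runs over the snapshot, a callback that removes itself during notify still lets every other callback registered at that moment run exactly once.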

Changes

This branch was rebuilt on top of current main (post remote-eval) to avoid the stale conflicts from the original PR:

InMemoryFeatureCache: threading.Lock around get / set / clear.
Per-key fetch coalescing in load_features / load_features_async: on a miss, only the first caller fetches for a given cache key; others wait and read the freshly-cached value (double-checked under the lock). Cache hits return before acquiring any lock, so there is no overhead on the hot path. Async locks are keyed by (event-loop id, cache key) to avoid reusing a lock bound to a finished loop (the cross-loop asyncio.Lock hang fixed earlier in main).
Callback delivery: add/remove guarded by a dedicated _callbacks_lock; _notify copies the list under the lock and iterates outside it, preventing both the iteration error and deadlocks from slow callbacks.
FeatureCache.get_current_state: returns a dict() copy of savedGroups.
Sticky buckets: replaced the boolean flag with a real asyncio.Lock() and simplified _refresh_sticky_buckets (re-check under the lock; removed the silent {} fallback).
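
The per-key coalescing described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: CoalescingLoader, load, and the injected fetch callable are hypothetical names, and the dict-backed cache stands in for InMemoryFeatureCache.

```python
import threading
from typing import Callable, Dict, Optional


class CoalescingLoader:
    """Hypothetical loader: one fetch per cache key on a cold cache."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch                      # stands in for the HTTP call
        self._cache: Dict[str, dict] = {}
        self._cache_lock = threading.Lock()      # guards the cache dict itself
        self._key_locks: Dict[str, threading.Lock] = {}
        self._key_locks_guard = threading.Lock() # guards the lock table

    def _cache_get(self, key: str) -> Optional[dict]:
        with self._cache_lock:
            return self._cache.get(key)

    def _lock_for(self, key: str) -> threading.Lock:
        with self._key_locks_guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def load(self, key: str, force_refresh: bool = False) -> dict:
        # Fast path: a hit returns without touching the per-key fetch lock.
        if not force_refresh:
            cached = self._cache_get(key)
            if cached is not None:
                return cached
        with self._lock_for(key):
            # Re-check under the lock: a waiter usually finds the value the
            # first caller just stored. force_refresh skips this re-check.
            if not force_refresh:
                cached = self._cache_get(key)
                if cached is not None:
                    return cached
            value = self._fetch(key)
            with self._cache_lock:
                self._cache[key] = value
            return value
```

In this sketch the lock table only grows; a real implementation would likely bound or prune it.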

Preserved from current main

Remote-eval cache keys (_compute_cache_key) are untouched; the remote-eval path keeps its existing _remote_eval_inflight coalescing and is intentionally left out of the new lock.
force_refresh semantics are preserved: the re-check under the lock also honors force_refresh, so SSE invalidation still triggers a refetch.
SSE invalidation and async client behavior are unchanged.

madhuchavva · 2026-06-18T20:12:29Z

@vazarkevych - thanks for identifying these tricky issues.

I think the most pressing issue here is the cache-miss stampede in FeatureRepository.load_features / load_features_async: if many threads or coroutines ask for the same uncached SDK payload at once, they can all hit the GrowthBook API/CDN simultaneously.

The P2 list includes callback-list mutation during notification and the sticky-bucket boolean lock, but the blast radius there is limited. So I'd salvage this by porting these ideas onto current main: thread-safe InMemoryFeatureCache, snapshot callback delivery, per-key fetch coalescing for sync/async loads, savedGroups copy semantics, and a real sticky-bucket lock. Many changes have gone in since, and we'll need to preserve current remote-eval cache keys, force_refresh, SSE invalidation, and async client behavior.

madhuchavva

Please resolve the conflicts and address the review comments.

vazarkevych · 2026-06-25T12:58:48Z

Thanks for the advice, agreed on the priorities. I've rebuilt this branch on top of current main (post remote-eval) rather than merging, so the stale conflicts are gone.

vazarkevych · 2026-06-25T13:00:33Z

All the items from your list are in: thread-safe InMemoryFeatureCache, snapshot callback delivery, per-key fetch coalescing for both sync and async loads, savedGroups copy, and a real sticky-bucket lock. Remote-eval cache keys, force_refresh, SSE invalidation and async behavior are preserved (the remote-eval path is left out of the new lock since it already coalesces via _remote_eval_inflight).

The only non-obvious bit is the async coalescing lock — it's keyed by (event-loop id, cache key) to avoid reusing a lock bound to a finished loop.
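
A rough sketch of what keying the async lock by (event-loop id, cache key) can look like. This is an assumption-laden illustration, not the PR's code: AsyncCoalescingLoader and its fetch parameter are hypothetical. One caveat it does not handle: id() values can be recycled after a loop is garbage collected, so stale entries would need pruning in real code.

```python
import asyncio
import threading
from typing import Awaitable, Callable, Dict, Tuple


class AsyncCoalescingLoader:
    """Hypothetical async loader with one lock per (loop id, cache key)."""

    def __init__(self, fetch: Callable[[str], Awaitable[dict]]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, dict] = {}
        self._locks: Dict[Tuple[int, str], asyncio.Lock] = {}
        # Plain thread lock for the table, since loops may live in
        # different threads. It is never held across an await.
        self._locks_guard = threading.Lock()

    def _lock_for(self, key: str) -> asyncio.Lock:
        # An asyncio.Lock must not be shared across event loops, so the
        # running loop's identity is part of the table key.
        table_key = (id(asyncio.get_running_loop()), key)
        with self._locks_guard:
            lock = self._locks.get(table_key)
            if lock is None:
                lock = asyncio.Lock()
                self._locks[table_key] = lock
            return lock

    async def load(self, key: str) -> dict:
        cached = self._cache.get(key)
        if cached is not None:
            return cached                # hit: no lock involved
        async with self._lock_for(key):
            cached = self._cache.get(key)
            if cached is not None:
                return cached            # another coroutine fetched it
            value = await self._fetch(key)
            self._cache[key] = value
            return value
```

With this shape, a burst of coroutines awaiting load for the same key on one loop results in a single call to fetch; coroutines on another loop get their own lock object for that key.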

vazarkevych requested a review from madhuchavva April 23, 2026 14:29

madhuchavva requested changes Jun 18, 2026

fix: port thread-safe cache/coalescing fixes onto current main

3d78f53

vazarkevych force-pushed the fix/thread-safe-cache-writes branch from 874a5b2 to 3d78f53 Compare June 25, 2026 12:41

vazarkevych requested a review from madhuchavva June 25, 2026 13:00

vazarkevych wants to merge 1 commit into growthbook:main from vazarkevych:fix/thread-safe-cache-writes
