⚡ Bolt: optimize Fisher-Yates shuffle in RANSAC utilities #250
suraj-ranganath wants to merge 1 commit into
Vectorize `stream.rand()` calls and use `math.floor(x + 0.5)` for faster scalar rounding in `rand_sample` and `rand_permutation`. This results in a measurable speedup of 30-50% while maintaining exact parity with EEGLAB's random number generation sequence.

- Optimized `src/eegprep/plugins/clean_rawdata/private/ransac.py`
- Documented findings in `.jules/bolt.md`

Co-authored-by: suraj-ranganath <14310165+suraj-ranganath@users.noreply.github.com>
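As a rough illustration of the change described in this commit message, the sketch below contrasts a per-iteration draw with a single batched draw in a Fisher-Yates style sampler. Everything here is hypothetical: `ToyStream`, `sample_per_call`, and `sample_batched` are illustrative names rather than code from `ransac.py`, and the index formula is only one plausible rounding-based scheme. It assumes that a batched `rand(n)` call returns the same values, in the same order, as `n` successive scalar calls.

```python
import math
import random


class ToyStream:
    """Hypothetical stand-in for the PR's `stream` object (NOT the real class)."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def rand(self, n=None):
        # Scalar draw when n is omitted, otherwise n draws in sequence order.
        if n is None:
            return self._rng.random()
        return [self._rng.random() for _ in range(n)]


def sample_per_call(n, m, stream):
    """Pick m distinct values from range(n), one stream.rand() call per step."""
    pool = list(range(n))
    picked = []
    for k in range(m):
        last = n - 1 - k                       # last active slot in the pool
        j = math.floor(last * stream.rand() + 0.5)
        picked.append(pool[j])
        pool[j] = pool[last]                   # swap-remove the chosen entry
    return picked


def sample_batched(n, m, stream):
    """Same selection logic, but all m uniforms are drawn in one call."""
    draws = stream.rand(m)                     # single call up front
    pool = list(range(n))
    picked = []
    for k in range(m):
        last = n - 1 - k
        j = math.floor(last * draws[k] + 0.5)
        picked.append(pool[j])
        pool[j] = pool[last]
    return picked


if __name__ == "__main__":
    a = sample_per_call(1000, 50, ToyStream(42))
    b = sample_batched(1000, 50, ToyStream(42))
    print("identical output:", a == b)
```

Because both variants consume the uniforms in the same order, they agree whenever the batched call preserves the draw order, which is the property the parity claim depends on.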
Claude finished @suraj-ranganath's task in 1m 45s

PR Review: Optimize Fisher-Yates shuffle in RANSAC utilities

Verdict
The optimization is correct and parity-preserving. Both changes are sound:
1. Vectorized RNG: parity holds.
2.
3. The

Minor notes (non-blocking)

Testing
I couldn't run

Overall: clean, minimal, well-scoped optimization with no behavioral regression. 👍
💡 What
This optimization targets the Fisher-Yates shuffle implementations in `rand_sample` and `rand_permutation` within `src/eegprep/plugins/clean_rawdata/private/ransac.py`. It introduces two key improvements:

- Uses a single `stream.rand(n)` call instead of calling `stream.rand()` in every loop iteration.
- Replaces the `round_mat` utility with `math.floor(x + 0.5)` for internal scalar index calculations, which is significantly faster for non-negative numbers in tight loops.

🎯 Why
Fisher-Yates shuffles are used extensively in RANSAC and ICA. The overhead of individual Python function calls and of `round_mat` (which handles arrays and multiple decimal places) accumulates in these tight loops.

📊 Impact
Expected performance improvement:

- `rand_sample`: ~50% faster.
- `rand_permutation`: ~30% faster.

Exact parity with EEGLAB/MATLAB random number generation is preserved.
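The restriction to non-negative numbers matters for the rounding shortcut. The sketch below is a minimal check, not project code: `round_half_up_scalar` and `round_half_away` are hypothetical helper names, and the second one models the round-half-away-from-zero convention that MATLAB's `round` follows. It shows that `math.floor(x + 0.5)` agrees with that convention for non-negative inputs, differs for negative ties, and differs from Python's built-in `round`, which rounds ties to the nearest even integer.

```python
import math


def round_half_up_scalar(x):
    """The shortcut from the PR description: floor(x + 0.5) on a scalar."""
    return math.floor(x + 0.5)


def round_half_away(x):
    """Reference rule (hypothetical helper): ties round away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


if __name__ == "__main__":
    for x in [0.0, 0.5, 1.5, 2.5, 3.49, 7.0, -2.5]:
        print(x, round_half_up_scalar(x), round_half_away(x), round(x))
```

For index calculations that are never negative, the shortcut and the reference rule pick the same integer in these cases, so the cheaper scalar form can stand in for a general-purpose rounding helper.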
🔬 Measurement
Verified using a benchmark script comparing original vs. optimized implementations over 1000 iterations for `n=1000`. Parity was confirmed by comparing output arrays.

✅ Verification
`tests/test_utils_ransac.py` and `tests/test_parity_rng.py` passed.

PR created automatically by Jules for task 14941794764479403253 started by @suraj-ranganath
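The benchmark script mentioned under Measurement is not included in the PR text, so the following is only a guess at its general shape. All names (`ToyStream`, `permutation_original`, `permutation_optimized`, `benchmark`) are hypothetical, and the toy stream is built on Python's `random` module, so its timings say nothing about the real speedup. What it does show is the method: run both variants from the same seed, time them, and compare the output arrays.

```python
import math
import random
import time


class ToyStream:
    """Hypothetical stand-in for `stream`; NOT the project's RNG class."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def rand(self, n=None):
        if n is None:
            return self._rng.random()
        return [self._rng.random() for _ in range(n)]


def permutation_original(n, stream):
    """Hypothetical baseline: one stream.rand() call per swap."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = math.floor(i * stream.rand() + 0.5)   # rounded index in [0, i]
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def permutation_optimized(n, stream):
    """Hypothetical optimized form: all uniforms drawn in a single call."""
    draws = stream.rand(max(n - 1, 0))
    perm = list(range(n))
    for k, i in enumerate(range(n - 1, 0, -1)):
        j = math.floor(i * draws[k] + 0.5)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def benchmark(fn, n, iterations, seed=0):
    """Return (elapsed_seconds, last_output) for repeated calls of fn."""
    stream = ToyStream(seed)
    out = None
    start = time.perf_counter()
    for _ in range(iterations):
        out = fn(n, stream)
    return time.perf_counter() - start, out


if __name__ == "__main__":
    t_old, out_old = benchmark(permutation_original, 1000, 1000)
    t_new, out_new = benchmark(permutation_optimized, 1000, 1000)
    assert out_old == out_new, "parity check failed"
    print(f"original:  {t_old:.3f} s")
    print(f"optimized: {t_new:.3f} s")
```

Checking parity on the final iteration is stricter than checking the first one, since any difference in how many draws each variant consumes would desynchronize the two streams over repeated calls.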