Clarify ICSD24 oxidation-state filtering thresholds (fixes #627) #630
Aligns documentation and defaults across the three places that expose the ICSD24 oxidation-state set, so the same element no longer silently returns different lists depending on the code path:

- data_loader.lookup_element_oxidation_states_icsd24: the docstring said "≥5 reports", but the underlying file actually applies a strict ">5" cut-off (species with results_count == 5 are excluded). Corrected the docstring and added a pointer to ICSD24FilterConfig so users see the static-vs-dynamic threshold difference.
- Element.oxidation_states (__init__.py): expanded the docstring to spell out that the attribute is the static >5-reports file, and to note that smact_validity applies a different (consensus=3) cut-off by default via ICSD24FilterConfig. Points users to ICSD24OxStatesFilter for a configurable threshold.
- ICSD24FilterConfig.commonality (screening.py): default changed from "medium" to "low" to match ICSD24OxStatesFilter.filter(). Two public APIs should not silently apply different defaults for the same parameter. Updated the dataclass docstring with what each level means and refreshed the smact_validity docstring that quoted the old default.
- docs/examples/property_prediction.ipynb: comment updated to reflect the new default.

Behaviour-changing: smact_validity callers that relied on the implicit "medium" filter will now get a more permissive result. Worth a CHANGELOG entry on release.

Fixes #627.
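The off-by-one between "≥5 reports" and ">5 reports" is easy to miss, so here is a minimal sketch of the difference. It does not read the real ICSD24 file or call smact. The `filter_species` helper is hypothetical, and the report counts are invented for illustration, apart from Ti(+1) having exactly 5 reports, which the description above states.

```python
# Illustrative report counts keyed by (element, oxidation state).
# Only the Ti(+1) == 5 entry reflects the PR description; the rest are made up.
report_counts = {
    ("Ti", 1): 5,
    ("Ti", 2): 40,
    ("Ti", 3): 6,
    ("Ti", 4): 9000,
}


def filter_species(counts, threshold=5, strict=True):
    """Hypothetical helper: keep species whose report count passes the cut-off.

    strict=True  -> count >  threshold (what the filtered file actually does)
    strict=False -> count >= threshold (what the old docstring claimed)
    """
    if strict:
        return sorted(sp for sp, n in counts.items() if n > threshold)
    return sorted(sp for sp, n in counts.items() if n >= threshold)


strict_kept = filter_species(report_counts, strict=True)
inclusive_kept = filter_species(report_counts, strict=False)

# Species that the old docstring promised but the file never contained.
only_in_inclusive = sorted(set(inclusive_kept) - set(strict_kept))
print(only_in_inclusive)
```

Under these assumed counts, the only species that differs between the two readings is the one sitting exactly on the threshold, which is the case the corrected docstring now describes.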
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #630   +/-   ##
=======================================
  Coverage   89.31%   89.31%
=======================================
  Files          49       49
  Lines        4988     4988
=======================================
  Hits         4455     4455
  Misses        533      533
Fixes #627.
@dylancjohn flagged three places where the ICSD24 oxidation-state set is exposed inconsistently. This PR addresses all three.
Changes
- `smact/data_loader.py`: the `lookup_element_oxidation_states_icsd24` docstring said `≥5 reports`, but the underlying file (`oxidation_states_icsd24_filtered.txt`) actually uses a strict `>5` cut-off. Verified empirically: every species with `results_count == 5` (P⁻⁴, Cl⁴, Ti¹, Co⁵, Se³, Sb¹, Sm⁴, Tb¹, Ra², Pu⁵, Cm⁴) is excluded from the filtered file. Corrected the docstring and added a pointer to `ICSD24FilterConfig`.
- `smact/__init__.py`: expanded the `Element.oxidation_states` docstring to spell out that the attribute is the static `>5`-reports file, and to flag that `smact_validity` applies a different (`consensus=3`) cut-off by default via `ICSD24FilterConfig`. Points users at `ICSD24OxStatesFilter` for a configurable threshold.
- `smact/screening.py`: aligned the `ICSD24FilterConfig.commonality` default from `"medium"` to `"low"`, matching `ICSD24OxStatesFilter.filter()`. Two public APIs should not silently apply different defaults for the same parameter. Also refreshed the dataclass docstring and the `smact_validity` docstring that still quoted the old default.
- `docs/examples/property_prediction.ipynb`: one comment updated to reflect the new default.

Behaviour change ⚠
`smact_validity` callers who relied on the implicit `"medium"` filter will now get a more permissive result (no proportion filter; only `consensus=3` is applied). Callers that pass `icsd_filter=ICSD24FilterConfig(commonality="medium")` explicitly are unaffected.

Test plan
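To picture why only implicit callers are affected by the default change, here is a rough sketch using a stand-in dataclass rather than smact itself. `FilterConfigSketch` and `resolve_commonality` are hypothetical names introduced for illustration; the real `ICSD24FilterConfig` may carry other fields and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FilterConfigSketch:
    """Hypothetical stand-in for ICSD24FilterConfig (only one field modelled)."""

    # Default after this PR; it was "medium" before.
    commonality: str = "low"


def resolve_commonality(config: Optional[FilterConfigSketch] = None) -> str:
    """Return the commonality level a caller effectively ends up with."""
    if config is None:
        # Implicit callers inherit whatever the dataclass default is.
        config = FilterConfigSketch()
    return config.commonality


# Caller that never passed a config: picks up the new default.
implicit = resolve_commonality()

# Caller that pinned the level explicitly: unchanged by the new default.
explicit = resolve_commonality(FilterConfigSketch(commonality="medium"))

print(implicit, explicit)
```

In real code the equivalent of the explicit case is the `icsd_filter=ICSD24FilterConfig(commonality="medium")` call quoted above, which keeps the stricter filtering regardless of what the library default becomes.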