Expose internal ICLabel and ASR metrics in QC command output #239
Open
google-labs-jules[bot] wants to merge 1 commit into
Context & Rationale
Currently, automated validation pipelines and AI agents rely on brittle, manually replicated thresholds because the CLI only provides raw metrics (such as SNR and variance). External scripts must therefore hardcode internal logic, such as the 0.9 probability threshold for ICLabel, to determine whether a dataset is "clean."
This PR enhances the `qc` manifest by exposing internal quality indicators that are already computed during the cleaning phase. By providing direct access to these metrics, we reduce the complexity of autonomous validation pipelines and provide researchers with greater transparency into the signal quality auditing process.
Key Changes
1. Enhanced ICLabel Reporting
Added a new helper `_iclabel_metrics` to `qc.py` to extract and expose the mean probabilities for all 7 ICLabel artifact classes. These are now structured within `metrics["ica"]["iclabel"]`.
2. ASR & RMS Transparency
Modified `clean_channels.py` and `clean_windows.py` to attach internal computed values to the `EEG['etc']` structure. This includes:
- `noisiness` and `znoise` (ASR noisiness Z-scores).
- `w_rms` (RMS power) and `wz` (temporal Z-scores).
3. Structured Manifest Integration
Implemented `_asr_metrics` in `qc.py` to securely merge these internal values into the machine-readable output under `metrics["data_quality"]["asr"]`.
4. Machine-Readable Formatting
To ensure JSON compliance and compatibility with downstream agentic tools:
- Array values are converted with `.tolist()`.
- `_as_list` and `_float_or_none` utilities maintain consistent data types.
Success Criteria
- `qc` command output contains a structured section for all 7 ICLabel class probabilities.