Python: information-flow control prompt injection defense #5331
eavanvalkenburg wants to merge 2 commits into main
* fides integration
* documentation
* documentation
* documentation
* human-approval on policy violation
* numenous hyena 'works'
* IFC based implementation
* minor edits in documentation
* rebasing the branch and running the email example
* Add security tests for IFC middleware
* Fix Role.TOOL NameError in approval handling
* tiered labelling scheme
* 3 tier labelling scheme in middleware
* Adapt security middleware to list[Content] tool results
* Refactor SecureAgentConfig as context provider and address Copilot review comments
* Update FIDES docs to reflect context provider pattern and update code for ContextProvider rename
* Fix security examples: use OpenAIChatClient instead of non-existent AzureOpenAIChatClient
* Address PR review: consolidate security modules, remove ContentLineage, update docs
* remove unrelated files
* remove comment from _tools.py and rename decision file
* Fix CI failures: Bandit B110, broken md links, hosted approval passthrough
* apply template to decision doc 0024
* minor fixes to decision doc 0024

---------

Co-authored-by: Aashish <[email protected]>
* Python: follow up FIDES security flow

  Refine the secure approval path, mark the security classes with the FIDES experimental feature label, and clean up the related docs/tests. Also fix workspace-level validation regressions uncovered while running the full Python check suite.

  Co-authored-by: Copilot <[email protected]>

* Python: remove FIDES GitHub MCP sample

  Drop the GitHub MCP security sample from the FIDES follow-up branch while keeping the remaining security docs and samples intact.

  Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Pull request overview
This PR brings the Python FIDES information-flow control (integrity + confidentiality) prompt-injection/data-exfiltration defense closer to main, adding core security primitives/middleware, DevUI support for policy-violation approvals, and end-to-end samples/docs/tests.
Changes:
- Adds FIDES security samples and documentation (quick start + developer guide + implementation summary + ADR).
- Extends DevUI mapping/execution to round-trip policy-violation metadata through approval events.
- Adjusts core function-invocation plumbing and test suites to support the new approval/policy flow and keep validation green.
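The integrity and confidentiality tracking described in the overview can be pictured with a small label model. The following is a minimal sketch only, not the FIDES API from this PR: `Integrity`, `Confidentiality`, `Label`, `ToolPolicy`, and `check_call` are hypothetical names, and the three confidentiality tiers are illustrative. It assumes that combining content takes the least trusted and most confidential of the inputs, and that a tool call is checked against the combined label of everything in context.

```python
from dataclasses import dataclass
from enum import IntEnum


class Integrity(IntEnum):
    """Hypothetical integrity levels; a higher value means less trusted."""
    TRUSTED = 0
    UNTRUSTED = 1


class Confidentiality(IntEnum):
    """Hypothetical three-tier confidentiality scale (illustrative only)."""
    PUBLIC = 0
    INTERNAL = 1
    SECRET = 2


@dataclass(frozen=True)
class Label:
    integrity: Integrity = Integrity.TRUSTED
    confidentiality: Confidentiality = Confidentiality.PUBLIC

    def join(self, other: "Label") -> "Label":
        """Combine two labels: the result is as tainted and as secret as either input."""
        return Label(
            integrity=max(self.integrity, other.integrity),
            confidentiality=max(self.confidentiality, other.confidentiality),
        )


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-tool policy for a sink such as a public posting tool."""
    requires_trusted_context: bool = False
    max_confidentiality: Confidentiality = Confidentiality.SECRET


def check_call(context: Label, policy: ToolPolicy) -> list[str]:
    """Return a list of violations; an empty list means the call may proceed."""
    violations = []
    if policy.requires_trusted_context and context.integrity == Integrity.UNTRUSTED:
        violations.append("untrusted content may be steering a privileged tool")
    if context.confidentiality > policy.max_confidentiality:
        violations.append("context is more confidential than the sink allows")
    return violations


# The context label is the join of everything the model has seen so far.
user_prompt = Label()
fetched_email = Label(integrity=Integrity.UNTRUSTED)
private_repo_file = Label(confidentiality=Confidentiality.SECRET)
context = user_prompt.join(fetched_email).join(private_repo_file)

post_publicly = ToolPolicy(
    requires_trusted_context=True,
    max_confidentiality=Confidentiality.PUBLIC,
)
violations = check_call(context, post_publicly)
print(violations)
```

In a flow like the one this PR describes, a non-empty violation list would be the point at which a human approval request is raised instead of silently running the tool.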
Reviewed changes
Copilot reviewed 14 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| python/samples/02-agents/security/repo_confidentiality_example.py | New sample demonstrating confidentiality labels + exfiltration prevention. |
| python/samples/02-agents/security/email_security_example.py | New sample demonstrating integrity taint handling + quarantined processing. |
| python/samples/02-agents/security/README.md | New quick-start documentation for SecureAgentConfig/FIDES patterns. |
| python/samples/02-agents/security/FIDES_DEVELOPER_GUIDE.md | New deep-dive developer guide for the FIDES model and APIs. |
| python/packages/core/tests/test_security.py | New comprehensive unit tests for labeling/middleware/policy enforcement. |
| python/packages/core/agent_framework/_tools.py | Updates tool invocation + approval replacement behavior for policy approvals. |
| python/packages/core/agent_framework/__init__.py | Exports new FIDES/security APIs and ai_function alias. |
| python/packages/core/agent_framework/_feature_stage.py | Adds ExperimentalFeature.FIDES. |
| python/packages/core/agent_framework/observability.py | Hardens finish_reason attribute handling. |
| python/packages/devui/agent_framework_devui/_mapper.py | Emits policy violation details in approval request events when present. |
| python/packages/devui/agent_framework_devui/_executor.py | Preserves policy-violation metadata on approval responses. |
| python/packages/devui/tests/devui/test_ui_memory_regression.py | Skips the test when CDP websocket URL cannot be obtained. |
| python/packages/foundry/tests/foundry/test_foundry_embedding_client.py | Makes env patching deterministic via clear=True. |
| docs/features/FIDES_IMPLEMENTATION_SUMMARY.md | Adds an implementation summary for FIDES components/deliverables. |
| docs/decisions/0024-prompt-injection-defense.md | New ADR documenting the design rationale and tradeoffs. |
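The `clear=True` change listed for `test_foundry_embedding_client.py` presumably refers to `unittest.mock.patch.dict`. As a minimal sketch of why that makes a test deterministic (the variable names `EXAMPLE_ENDPOINT` and `EXAMPLE_MODEL` and the helper `read_endpoint` are hypothetical, not taken from the PR): without `clear=True`, values already present in the developer's shell leak into the test; with it, only the patched keys exist for the duration of the block.

```python
import os
from typing import Optional
from unittest.mock import patch


def read_endpoint() -> Optional[str]:
    """Hypothetical code under test that reads an optional setting."""
    return os.environ.get("EXAMPLE_ENDPOINT")


# Simulate a value that happens to be set in the ambient environment.
os.environ["EXAMPLE_ENDPOINT"] = "https://leaked.example"

# Without clear=True the ambient value is still visible inside the patch.
with patch.dict(os.environ, {"EXAMPLE_MODEL": "m"}):
    leaked = read_endpoint()

# With clear=True the environment contains only the patched keys.
with patch.dict(os.environ, {"EXAMPLE_MODEL": "m"}, clear=True):
    isolated = read_endpoint()

# patch.dict restores the original environment when the block exits.
restored = read_endpoint()
print(leaked, isolated, restored)
```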
    - Message-level tracking tests (Phase 1)
    - Data exfiltration prevention tests

    4. **`docs/decisions/0011-prompt-injection-defense.md`**
This summary references docs/decisions/0011-prompt-injection-defense.md, but that ADR file doesn’t exist in the repo; the prompt-injection defense ADR added in this PR is docs/decisions/0024-prompt-injection-defense.md. Update the reference so readers can navigate to the correct ADR.
Suggested change:

    - 4. **`docs/decisions/0011-prompt-injection-defense.md`**
    + 4. **`docs/decisions/0024-prompt-injection-defense.md`**
    # Email security (prompt injection defense)
    PYTHONPATH=packages/core python samples/getting_started/security/email_security_example.py

    # Repository confidentiality (data exfiltration prevention)
    PYTHONPATH=packages/core python samples/getting_started/security/repo_confidentiality_example.py
    ```
The run commands reference samples/getting_started/security/..., but the examples in this PR are located under samples/02-agents/security/. As written, these commands will fail for users following the README.
    - Full documentation: `python/packages/core/FIDES_DEVELOPER_GUIDE.md`
    - Test suite: `python/packages/core/tests/test_security.py`
    - Email example: `python/samples/getting_started/security/email_security_example.py`
    - Repo example: `python/samples/getting_started/security/repo_confidentiality_example.py`
These “More Information” links point to paths that don’t exist (e.g., python/packages/core/FIDES_DEVELOPER_GUIDE.md and python/samples/getting_started/security/...). Update them to the actual locations in this PR (e.g., python/samples/02-agents/security/FIDES_DEVELOPER_GUIDE.md and python/samples/02-agents/security/*.py).
    # Build a map of call_id -> actual result for replacing placeholders
    result_by_call_id: dict[str, Content] = {}
    for resp in fcc_todo.values():
        if resp.approved and resp.function_call is not None and resp.function_call.call_id is not None:
            # Map the call_id from the function_call to be replaced
            call_id = resp.function_call.call_id
            if call_id not in result_by_call_id and approved_function_results:
                idx = len(result_by_call_id)
                if idx < len(approved_function_results):
                    result_by_call_id[call_id] = approved_function_results[idx]
result_by_call_id is populated by assigning approved results based on positional index (idx = len(result_by_call_id)) rather than matching the actual call_id of each approved_function_results entry. If there are multiple approvals/results, this can associate the wrong tool result with a given call_id. Build the mapping by matching on Content.call_id instead of relying on list/dict iteration order.
Suggested change:

    - # Build a map of call_id -> actual result for replacing placeholders
    - result_by_call_id: dict[str, Content] = {}
    - for resp in fcc_todo.values():
    -     if resp.approved and resp.function_call is not None and resp.function_call.call_id is not None:
    -         # Map the call_id from the function_call to be replaced
    -         call_id = resp.function_call.call_id
    -         if call_id not in result_by_call_id and approved_function_results:
    -             idx = len(result_by_call_id)
    -             if idx < len(approved_function_results):
    -                 result_by_call_id[call_id] = approved_function_results[idx]
    + # Build a map of call_id -> actual result for replacing placeholders.
    + # Match results by their actual call_id instead of relying on positional order.
    + approved_call_ids = {
    +     resp.function_call.call_id
    +     for resp in fcc_todo.values()
    +     if resp.approved and resp.function_call is not None and resp.function_call.call_id is not None
    + }
    + result_by_call_id: dict[str, Content] = {}
    + for result in approved_function_results:
    +     call_id = getattr(result, "call_id", None)
    +     if call_id is not None and call_id in approved_call_ids and call_id not in result_by_call_id:
    +         result_by_call_id[call_id] = result
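To make the failure mode in this comment concrete, here is a small self-contained sketch. `FakeResult`, `map_by_position`, and `map_by_call_id` are hypothetical stand-ins written for illustration; they mirror the shape of the two snippets above but are not the code in `_tools.py`. The sketch assumes results can arrive in a different order than the approvals are iterated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for a function-result Content item."""
    call_id: Optional[str]
    value: str


def map_by_position(approved_ids, results):
    """Positional pairing: the Nth approved call gets the Nth result."""
    mapping = {}
    for call_id in approved_ids:
        idx = len(mapping)
        if call_id not in mapping and idx < len(results):
            mapping[call_id] = results[idx]
    return mapping


def map_by_call_id(approved_ids, results):
    """Pair each result with the approved call that shares its call_id."""
    wanted = set(approved_ids)
    mapping = {}
    for result in results:
        if result.call_id in wanted and result.call_id not in mapping:
            mapping[result.call_id] = result
    return mapping


approved_ids = ["call_a", "call_b"]
# Results arrive in the opposite order from the approvals.
results = [
    FakeResult("call_b", "result for b"),
    FakeResult("call_a", "result for a"),
]

positional = map_by_position(approved_ids, results)
matched = map_by_call_id(approved_ids, results)

print(positional["call_a"].value)  # result for b (wrong pairing)
print(matched["call_a"].value)     # result for a
```

Matching on `call_id` also skips results whose `call_id` is missing or not among the approved calls, which the positional version cannot detect.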
Motivation and Context
This draft PR brings the Python information-flow control prompt injection defense work from `feature/python-fides` toward `main` for review and integration.
Description
Contribution Checklist