fix: filter wildcard DNS false positives during resolution #37

Open
0xParth wants to merge 1 commit into hadriansecurity:main from 0xParth:fix/wildcard-dns-filtering

0xParth commented Apr 18, 2026

Problem

Domains with wildcard DNS records (*.example.com) cause every predicted subdomain to resolve successfully, producing massive false positive counts that compound through the recursive inference loop.

Tested on a real domain with 21 known subdomains:

  • Without this fix: 2,199 "resolved" subdomains — all false positives matching the same wildcard catch-all (CloudFront)
  • With this fix: only subdomains resolving to IPs outside the wildcard set are returned

The recursive loop in _get_domains_for_group() amplifies the problem — wildcard-resolved predictions are fed back as seeds for the next inference round, generating even more predictions that all resolve again.
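The feedback effect can be illustrated with a toy model. This is only a sketch: `run_rounds` and `toy_predict` are hypothetical stand-ins, not the code in `_get_domains_for_group()`, and the predictor simply prepends two fixed prefixes to each known name. It shows how the set of "found" names keeps growing when every prediction resolves, and stops growing when only real names resolve.

```python
from typing import Callable, List, Set


def toy_predict(known: Set[str]) -> Set[str]:
    # Hypothetical stand-in for the model: two variants per known name.
    return {f"{prefix}-{host}" for host in known for prefix in ("dev", "stg")}


def run_rounds(
    seeds: Set[str],
    predict: Callable[[Set[str]], Set[str]],
    resolves: Callable[[str], bool],
    rounds: int,
) -> List[int]:
    """Feed resolved predictions back in as seeds; return the set size per round."""
    known = set(seeds)
    sizes = [len(known)]
    for _ in range(rounds):
        new = {h for h in predict(known) if h not in known and resolves(h)}
        if not new:
            break  # nothing new resolved, so the loop converges
        known |= new
        sizes.append(len(known))
    return sizes


seeds = {"api.example.com"}

# Wildcard zone: every name resolves, so each round adds more seeds.
wildcard_sizes = run_rounds(seeds, toy_predict, lambda host: True, rounds=3)
print(wildcard_sizes)  # [1, 3, 7, 15]

# Normal zone: only names that really exist resolve.
real_hosts = {"api.example.com", "dev-api.example.com"}
normal_sizes = run_rounds(seeds, toy_predict, lambda host: host in real_hosts, rounds=3)
print(normal_sizes)  # [1, 2]
```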

Fix

Adds wildcard detection to resolve.py before resolving predictions:

  1. detect_wildcard(apex_domain) — probes 3 random subdomains (e.g. a8k2m9x4p1q7w3.example.com). If all resolve and share common IPs, a wildcard is present.
  2. get_registered_domains() — new optional apex_domain parameter. When provided and a wildcard is detected, only returns predictions that resolve to at least one IP outside the wildcard set.
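The probing step might look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the `resolver` and `probes` parameters, the `random_label` helper, the use of `socket.getaddrinfo`, and the choice to return the union of probe IPs as the "wildcard set" are all hypothetical. The only behaviour taken from the description is that random labels are probed, that a wildcard is reported when all probes resolve and share IPs, and that `None` is returned otherwise.

```python
import random
import socket
import string
from typing import Callable, Optional, Set

Resolver = Callable[[str], Set[str]]


def resolve_ips(hostname: str) -> Set[str]:
    """Return the IP addresses a hostname resolves to (empty set on failure)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def random_label(length: int = 14) -> str:
    """Build a random label that is very unlikely to exist as a real subdomain."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choices(alphabet, k=length))


def detect_wildcard(
    apex_domain: str,
    probes: int = 3,
    resolver: Resolver = resolve_ips,  # injectable so the logic can be tested offline
) -> Optional[Set[str]]:
    """Return the wildcard IP set if random subdomains resolve to shared IPs, else None."""
    common: Optional[Set[str]] = None
    seen: Set[str] = set()
    for _ in range(probes):
        ips = set(resolver(f"{random_label()}.{apex_domain}"))
        if not ips:
            return None  # a random name failed to resolve, so no catch-all record
        common = ips if common is None else common & ips
        seen |= ips
    # Assumption: report every IP seen across probes, but only if the probes overlap.
    return seen if common else None


# Usage (performs real DNS lookups, so it is left commented out):
# wildcard_ips = detect_wildcard("example.com")
```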

Why IP-based filtering (not just "resolves = exists")

On wildcard domains, DNS resolution alone is meaningless — everything resolves. But real subdomains with explicit A records often point to different IPs than the wildcard. This approach catches those while filtering out the noise.
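A minimal sketch of the filtering rule, assuming the wildcard IP set is already known. The function name `filter_wildcard_hits`, the `resolver` callable, and the in-memory DNS table are hypothetical and exist only to make the example runnable offline. The rule itself follows the description: keep a prediction only if it resolves to at least one IP outside the wildcard set.

```python
from typing import Callable, Dict, Iterable, List, Set


def filter_wildcard_hits(
    candidates: Iterable[str],
    wildcard_ips: Set[str],
    resolver: Callable[[str], Set[str]],
) -> List[str]:
    """Keep candidates that resolve to at least one IP outside the wildcard set."""
    kept = []
    for host in candidates:
        ips = resolver(host)
        # Set difference: non-empty only if some IP is not a wildcard IP.
        # Unresolved hosts yield an empty set and are dropped as well.
        if ips - wildcard_ips:
            kept.append(host)
    return kept


# Hypothetical DNS answers standing in for real lookups.
fake_dns: Dict[str, Set[str]] = {
    "api.example.com": {"203.0.113.10"},                    # explicit A record
    "foo.example.com": {"198.51.100.1"},                    # wildcard answer only
    "mixed.example.com": {"198.51.100.1", "203.0.113.20"},  # one non-wildcard IP
}

result = filter_wildcard_hits(
    ["api.example.com", "foo.example.com", "mixed.example.com", "nx.example.com"],
    wildcard_ips={"198.51.100.1"},
    resolver=lambda host: fake_dns.get(host, set()),
)
print(result)  # ['api.example.com', 'mixed.example.com']
```

One consequence of this rule is that a real subdomain whose records point only at the wildcard IPs is indistinguishable from noise and is dropped too.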

Changes

| File | Change |
| --- | --- |
| subwiz/resolve.py | Add detect_wildcard(), _resolve_ips() helper, update get_registered_domains() with optional apex_domain param |
| subwiz/main.py | Pass apex_domain=apex to get_registered_domains() in the recursive loop |
| tests/test_resolve.py | Add tests for wildcard detection on non-wildcard domain, resolution with apex param |

Backward Compatibility

  • apex_domain defaults to None — existing callers are unaffected
  • When apex_domain is not provided, the original fast path (just is_registered()) is used
  • No new dependencies
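The dispatch between the two paths could be sketched as follows. This is not the real `get_registered_domains()` signature: the function name, the injected `resolver` and `detect` callables, and the synchronous style are assumptions made so the example stays self-contained. It also assumes that the original check is used when an apex is given but no wildcard is found. It only illustrates how an optional `apex_domain` defaulting to `None` leaves existing callers on the original path.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set

Resolver = Callable[[str], Set[str]]
WildcardDetector = Callable[[str], Optional[Set[str]]]


def get_registered_domains_sketch(
    domains: Iterable[str],
    resolver: Resolver,
    detect: WildcardDetector,
    apex_domain: Optional[str] = None,  # optional, so old call sites keep working
) -> List[str]:
    """Return resolving domains, filtering wildcard answers when an apex is given."""
    domains = list(domains)
    wildcard_ips = detect(apex_domain) if apex_domain is not None else None
    if not wildcard_ips:
        # Original path: no apex supplied, or no wildcard found ("resolves = exists").
        return [d for d in domains if resolver(d)]
    # Wildcard path: require at least one IP outside the wildcard set.
    return [d for d in domains if resolver(d) - wildcard_ips]


# Hypothetical DNS answers standing in for real lookups.
table: Dict[str, Set[str]] = {
    "api.example.com": {"203.0.113.10"},
    "foo.example.com": {"198.51.100.1"},
}


def lookup(host: str) -> Set[str]:
    return table.get(host, set())


def always_wildcard(apex: str) -> Optional[Set[str]]:
    return {"198.51.100.1"}


names = ["api.example.com", "foo.example.com", "nx.example.com"]

legacy = get_registered_domains_sketch(names, lookup, always_wildcard)
filtered = get_registered_domains_sketch(names, lookup, always_wildcard, apex_domain="example.com")
print(legacy)    # ['api.example.com', 'foo.example.com']
print(filtered)  # ['api.example.com']
```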

Test Plan

  • test_registered_domains — existing test passes (no regression)
  • test_wildcard_detection_non_wildcard — confirms detect_wildcard() returns None for hadrian.io
  • test_registered_domains_with_apex — confirms non-wildcard domains still resolve correctly when apex_domain is provided
  • Manual: tested against a wildcard domain (*.narad.io → CloudFront) — 2,199 false positives reduced to 0

Domains with wildcard DNS records (*.example.com) cause every predicted
subdomain to resolve successfully, producing massive false positive counts.
On a tested domain with 21 known subdomains, this inflated results from
~500 predictions to 2,199 "resolved" subdomains — all matching the same
wildcard catch-all endpoint.

This adds wildcard detection by probing random subdomains before resolving
predictions. When a wildcard is detected, only predictions that resolve to
IPs outside the wildcard set are returned.

- Add detect_wildcard() that probes 3 random subdomains per apex
- Modify get_registered_domains() to accept apex_domain for wildcard filtering
- Pass apex_domain from main.py recursive resolution loop
- Add tests for wildcard detection on non-wildcard domains
- Backward compatible: apex_domain is optional, existing behavior preserved

Co-Authored-By: WHO ELSE BUT!!!!

AI-Session-Id: 8ef18548-aeed-4766-afa2-7d0cfcfcc6a2
AI-Tool: claude-code
AI-Model: unknown