Skip to content

Send resolve queries from the receive socket (fix firewall-asymmetry resolution failures)#278

Open
cboulay wants to merge 2 commits into
devfrom
cboulay/resolve_from_recv_socket
Open

Send resolve queries from the receive socket (fix firewall-asymmetry resolution failures)#278
cboulay wants to merge 2 commits into
devfrom
cboulay/resolve_from_recv_socket

Conversation

@cboulay

@cboulay cboulay commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fixes the long-standing "streams aren't resolvable until I disable my firewall" failure by making the resolver send its query from the same socket it listens for replies on, so a stateful firewall sees a matching request/reply flow.

Root cause

resolve_attempt_udp advertised one port but transmitted from another:

  • The query payload embeds the recv_socket_ port as the return port (LSL:shortinfo … <return_port> <query_id>).
  • But the query was sent from one of three separate sockets (unicast_/broadcast_/multicast_socket_), each with its own ephemeral source port.

The outlet correctly replies to the advertised return port, but to a stateful firewall that reply is unsolicited inbound UDP (no prior outbound flow from that local port), so it's dropped. Disabling the firewall "fixes" exactly this. Time sync was never affected because time_receiver sends and receives on one socket — this change brings resolution in line with that.

Fix

  • Route every query flavor (unicast / broadcast / multicast) through recv_socket_.
  • Give recv_socket_ the broadcast option, the multicast TTL, and the per-interface outbound_interface option it needs to send all flavors.
  • Remove the three now-unused send sockets.

Net diff is small: src/resolve_attempt_udp.{h,cpp} shrinks (one socket instead of four).

Test

New testing/int/resolve_query_port.cpp (own executable, like lsl_test_runtime_config, because it sets the api_config singleton). It points discovery at loopback (AddressesOverride={127.0.0.1}), stands up a fake responder on :16571, captures the query, and asserts:

datagram source port == advertised return port

This is deterministic and needs no firewall — it verifies the exact property that makes a firewall accept the reply. The first commit adds the test (red: 16572 != <ephemeral>); the second makes it green.

Scope (intentionally narrow)

This PR is only the send-from-receive-socket change (firewall asymmetry). Deliberately not included, to keep review tight:

  • The throwing set_option(outbound_interface(...)) on bad interfaces (still uses the throwing overload here; separate fix).
  • Port-range exhaustion / unicast probe coverage.
  • A follow-up lab.ResolveOverTCP option (separate PR).

Testing

Full suite green locally (macOS, universal build): new test passes, and discovery (live multicast/broadcast/unicast resolution, 730 assertions), network, tcpserver, and runtime_config all pass — confirming real discovery isn't regressed.

Compatibility

Wire protocol unchanged; this only changes which local socket emits the query. No API or config changes.

cboulay added 2 commits June 16, 2026 20:45
A resolve query advertises a return port in its payload but is
transmitted from a different, unbound socket, so the datagram's source
port does not match. A stateful firewall therefore sees the outlet's
reply (sent to the advertised return port) as unsolicited inbound and
drops it, so streams fail to resolve until the firewall is disabled
(NETWORK_FAILURE_MODES.md 1a).

This test points discovery at loopback, stands up a fake responder on
:16571, captures the query, and asserts the datagram source port equals
the return port embedded in the payload. It is deterministic and needs
no firewall. It fails on the current code (source port is an ephemeral
port, != return port); the following commit makes it pass.

Sets the api_config singleton, so it lives in its own executable like
lsl_test_runtime_config.
Route every resolve query (unicast / broadcast / multicast) through the
socket that also receives replies, instead of three separate send
sockets whose ephemeral source ports differ from the advertised return
port. The receive socket is given the broadcast option, the multicast
TTL, and the per-interface outbound-interface option so it can carry all
query flavors.

The query's source port now equals the return port embedded in it, so a
stateful firewall sees a matching request/reply flow and no longer drops
the reply. This is the long-standing "can't resolve until I disable the
firewall" failure (NETWORK_FAILURE_MODES.md 1a), and it makes the
regression test from the previous commit pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant