
Local Network Traversal - Multicast Discovery #1

Closed
35 tasks done
joshuakarp opened this issue May 26, 2021 · 69 comments
Labels
epic Big issue with multiple subissues r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices


joshuakarp commented May 26, 2021

Created by @CMCDragonkai

Specification

Untitled-2023-06-09-1740
There are two types of data flow in the MDNS system: polling (pull) and announcements/responses (push). When a node joins the MDNS group, its records are pushed to all other nodes. However, for the joined node to discover other nodes, it needs to conduct polling queries that other nodes respond to.

Sending Queries

image
The MDNS spec states that query packets can carry additional records, but we won't do this as it isn't necessary.
Our queries won't carry any other records, much like a standard DNS packet (although an mDNS query packet can contain multiple questions).

In the case that a responder is bound to two interfaces connected to the same network (such as a laptop with both WiFi and Ethernet connected), queries asking for the IP of one of the responder's hostnames will receive multiple responses with different IP addresses.
Untitled-2023-06-09-1740 excalidraw

This behavior is documented in RFC 6762, Section 14.

Control Flow

Unlike other mDNS libraries, we're going to use an AsyncIterator to give the consumer more control over querying. An example of this would be:

async function* query({...}: Service, minimumDelay: number = 1, maximumDelay: number = 3600) {
  let delay = minimumDelay;
  while (true) {
    await this.sendPacket(...);
    yield delay;
    // Cap the exponential backoff so that maximumDelay is respected.
    delay = Math.min(delay * 2, maximumDelay);
  }
}

The query system's runtime will be contained within MDNS rather than being consumer-driven. This means that scheduled background queries will have to be managed by a TaskManager (similar to Polykey).
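As a sketch of how MDNS might drive these background queries internally (names like `Service`, `sendPacket`, and `runQueryTask` are illustrative assumptions, not the final API):

```typescript
// Hypothetical sketch: MDNS consumes the backoff generator itself, so the
// consumer never has to drive the iterator.
type Service = { type: string };

async function* queryDelays(
  minimumDelay: number = 1,
  maximumDelay: number = 3600,
): AsyncGenerator<number> {
  let delay = minimumDelay;
  while (true) {
    yield delay;
    // Exponential backoff capped at maximumDelay.
    delay = Math.min(delay * 2, maximumDelay);
  }
}

// A TaskManager-scheduled loop would await each delay (in seconds) before
// re-sending the query; here we just collect the schedule for inspection.
async function runQueryTask(
  sendPacket: (service: Service) => Promise<void>,
  service: Service,
  rounds: number,
): Promise<number[]> {
  const delays: number[] = [];
  for await (const delay of queryDelays()) {
    await sendPacket(service);
    delays.push(delay);
    if (delays.length >= rounds) break;
    // Real implementation: schedule the next invocation `delay` seconds
    // from now via TaskManager instead of sleeping inline.
  }
  return delays;
}
```

The backoff schedule (1s, 2s, 4s, ... capped at 3600s) matches the generator example above; only the scheduling responsibility moves inside MDNS.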

Data Flow

Untitled-2023-06-09-1740(1)

Receiving Announcements/Responses (Pull)

Data Flow

Because queries are basically fire-and-forget, the main work comes in the form of receiving query responses from the multicast group. Hence, our querier needs to be able to collect records with a fan-in approach, using a muxer that is reactive:

Untitled-2023-06-09-1740(3)

This can also be interpreted as a series of state transitions to completely build a service.
Untitled-2023-06-09-1740(3)

We also need to consider that if the threshold for a muxer to complete is not reached, additional queries must be sent off in order to reach the finished state.
Untitled-2023-06-09-1740(2)

The decision tree for such would be as follows:
Untitled-2023-06-09-1740(4)
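A minimal sketch of such a muxer, assuming a service is complete once SRV, TXT, and at least one address record have arrived (the record shape and the completion threshold are assumptions for illustration):

```typescript
// Hypothetical muxer: accumulates records for one service instance and
// reports which record types are still missing, so the querier knows what
// additional queries to send.
type RecordType = 'SRV' | 'TXT' | 'A' | 'AAAA';
type MdnsRecord = { type: RecordType; name: string; data: unknown };

class ServiceMuxer {
  private records = new Map<RecordType, MdnsRecord>();
  // Each group must be satisfied by at least one record type.
  private required: RecordType[][] = [['SRV'], ['TXT'], ['A', 'AAAA']];

  push(record: MdnsRecord): boolean {
    this.records.set(record.type, record);
    return this.isFinished();
  }

  isFinished(): boolean {
    return this.required.every((group) =>
      group.some((type) => this.records.has(type)),
    );
  }

  // Record types to re-query when the completion threshold isn't reached.
  missing(): RecordType[] {
    return this.required
      .filter((group) => !group.some((type) => this.records.has(type)))
      .map((group) => group[0]);
  }
}
```

`missing()` corresponds to the decision tree above: whatever is still absent is what the follow-up queries ask for.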

Control Flow

Instances of MDNS will extend EventTarget in order to emit events for service discovery/removal/etc.

class MDNS extends EventTarget {
}

The cache will be managed using a single timer set to the soonest record TTL, rather than one timer per record. The cache will also need to be an LRU to ensure that malicious responders cannot overwhelm it.
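A sketch of that policy, assuming a Map-based LRU and a `soonestExpiry()` value that the single timer would be armed against (capacity and entry shape are assumptions):

```typescript
// Hypothetical cache sketch: LRU eviction bounds memory against malicious
// responders; soonestExpiry() is what a single TTL timer would be set to.
type CacheEntry = { expiresAt: number };

class RecordCache {
  // Map iteration order doubles as LRU order (oldest first).
  private entries = new Map<string, CacheEntry>();
  constructor(private capacity: number) {}

  set(key: string, ttlSeconds: number, now: number = Date.now()): void {
    // Delete + set moves the key to the most-recently-used position.
    this.entries.delete(key);
    this.entries.set(key, { expiresAt: now + ttlSeconds * 1000 });
    while (this.entries.size > this.capacity) {
      const lru = this.entries.keys().next().value as string;
      this.entries.delete(lru);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  // A single timer would be re-armed to fire at this timestamp, instead of
  // keeping one timer per record.
  soonestExpiry(): number | undefined {
    let soonest: number | undefined;
    for (const { expiresAt } of this.entries.values()) {
      if (soonest === undefined || expiresAt < soonest) soonest = expiresAt;
    }
    return soonest;
  }
}
```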

Sending Announcements

Control Flow

This will need to be experimented with a little. Currently the decisions are:

  • registerService cannot be called before start is called.
  • create should take in services in place of this.
  • stop should deregister all services
  • destroy should remove everything from the instance
class MDNS extends EventTarget {
  create()
  start()
  stop()
  register()
  deregister()
}

Types

Messages can be Queries or Announcements or Responses.
This can be expressed as:

type MessageType = "query" | "announcement" | "response";
type Message =
  | ["query", QuestionRecord]
  | ["announcement" | "response", ResourceRecord];
const message: Message = ["query", {...}];
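Assuming hypothetical `QuestionRecord`/`ResourceRecord` shapes, the discriminant in the first tuple slot then narrows the payload type for consumers:

```typescript
// Illustrative record shapes (assumptions, not the real types).
type QuestionRecord = { name: string; qtype: string };
type ResourceRecord = { name: string; rtype: string; data: string };

type MessageType = 'query' | 'announcement' | 'response';
type Message =
  | ['query', QuestionRecord]
  | ['announcement' | 'response', ResourceRecord];

function describeMessage(message: Message): string {
  if (message[0] === 'query') {
    // message[1] is narrowed to QuestionRecord here.
    return `query for ${message[1].name}`;
  }
  // message[1] is narrowed to ResourceRecord here.
  return `${message[0]} of ${message[1].name} -> ${message[1].data}`;
}
```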

Parser / Generator

Parsing and generation are not inverses (not isomorphic), as different Uint8Array packets can parse to the same packet structure.

Every worker parser function will return the value wrapped in an object of this type:

type Parsed<T> = {
  data: T;
  remainder: Uint8Array;
};

The point of this is that whatever hasn't been parsed gets returned in .remainder, so we don't need to keep track of the offset manually. This means that each worker function also needs to take in a second Uint8Array representing the original data structure.
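For example, a worker function for the 16-bit packet ID might look like the following (a sketch of the convention, not the final code):

```typescript
type Parsed<T> = {
  data: T;
  remainder: Uint8Array;
};

// Consume 2 bytes for the DNS packet ID and hand back the rest, so the
// caller never tracks offsets manually.
function parseId(input: Uint8Array): Parsed<number> {
  if (input.byteLength < 2) {
    throw new Error('ErrorDNSParse: id parse failed');
  }
  const view = new DataView(input.buffer, input.byteOffset, input.byteLength);
  return {
    data: view.getUint16(0, false), // DNS integers are big-endian
    remainder: input.subarray(2),
  };
}
```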

  1. DNS Packet Parser Generator Utilities
  • Parser - parsePacket(Uint8Array): Packet
    • Headers - parseHeader(Uint8Array): {id: ..., flags: PacketFlags, counts: {...}}
    • Id - parseId(Uint8Array): number
    • Flags - parseFlags(Uint8Array): PacketFlags
    • Counts - parseCount(Uint8Array): number
    • Question Records - parseQuestionRecords(Uint8Array): {...}
      • parseQuestionRecord(Uint8Array): {...}
    • Resource Records - parseResourceRecords(Uint8Array): {...}
      • parseResourceRecord(Uint8Array): {...}
      • parseResourceRecordName(Uint8Array): string
      • parseResourceRecordType(Uint8Array): A/CNAME
      • parseResourceRecordClass(Uint8Array): IN
      • parseResourceRecordLength(Uint8Array): number
      • parseResourceRecordData(Uint8Array): {...}
        • parseARecordData(Uint8Array): {...}
        • parseAAAARecordData(Uint8Array): {...}
        • parseCNAMERecordData(Uint8Array): {...}
        • parseSRVRecordData(Uint8Array): {...}
        • parseTXTRecordData(Uint8Array): Map<string, string>
        • parseOPTRecordData(Uint8Array): {...}
        • parseNSECRecordData(Uint8Array): {...}
    • String Pointer Cycle Detection
      • Every time a string is parsed, we take a reference to the beginning and end of the string so that pointers cannot point to the start of a string in a way that would cause an infinite loop. A separate index table records the path of dereferences to make sure cycles cannot occur.
    • Errors at each parsing function instead of letting the DataView fail
      • ErrorDNSParse - Generic error whose message contains information for the different exceptions, i.e. "id parse failed at ...".
    • Record Keys - parseResourceRecordKey and parseQuestionRecordKey and parseRecordKey - parseLabels.
  • Generator - generatePacket(Packet): Uint8Array
    • Header - generateHeader(id, flags, counts...)
      • Id
      • Flags - generateFlags({ ... }): Uint8Array
      • Counts - generateCount(number): Uint8Array
    • Question Records - generateQuestionRecords(): Uint8Array - flatMap(generateQuestion)
      • generateQuestionRecord(): Uint8Array
    • Resource Records (KV) - generateResourceRecords()
      • generateRecord(): Uint8Array
      • generateRecordName - "abc.com" - ...RecordKey
      • generateRecordType - A/CNAME
      • generateRecordClass - IN
      • generateRecordLength
      • generateRecordData
        • generateARecordData(string): Uint8Array
        • generateAAAARecordData(string): Uint8Array
        • generateCNAMERecordData(string): Uint8Array
        • generateSRVRecordData(SRVRecordValue): Uint8Array
        • generateTXTRecordData(Map<string, string>): Uint8Array
        • generateOPTRecordData(Uint8Array): Uint8Array
        • generateNSECRecordData(): Uint8Array
  • Integrated into MDNS
  2. MDNS
  • Querying
    • MDNS.query()
    • query services of a type
    • MDNS.registerService()
    • MDNS.unregisterService()
  • Responding
    • Listening to queries
    • Responding to all queries with all records
    • Respond to unicast
    • Truncated bit

Testing

We can use two MDNS instances to interact with each other to test both query and respond on separate ports.

Additional Context

The following discussion from 'Refactoring Network Module' MR should be addressed:

Tasks

  • Parser - 5.5 days
    • Packet Header - 0.5 days
    • Packet Flags - 0.5 days
    • Questions - 0.5 days
    • Resource Records - 4 days
  • Generator - 5.5 days
    • Packet Header - 0.5 days
    • Packet Flags - 0.5 days
    • Questions - 0.5 days
    • Resource Records - 4 days
  • MDNSCache - 2.5 days
    • Multi-Keyed Maps for ResourceRecord Querying - 0.5 days
    • TTL Expiry Record Invalidation - 0.5 days
    • Reverse Host to Record Mapping - 0.5 days
    • LRU to Prevent DoS - 0.5 days
    • Support use as local resource record in-memory database - 0.5 days
  • TaskManager - ? days
    • Migrate to in-memory - ? days
  • MDNS - 11.5 days
    • UDP Networking Logic - 2 days
      • Socket Binding to Multiple Interfaces - 1 day
      • Error Handling - 1 day
    • Querier - 4 days
      • Service Querying - 2.5 days
        • Record Aggregation for Muxing Together Services - 0.5 days
        • Querying For a Service's Missing Records - 0.5 days
        • Emitting Expired Services - 0.5 days
      • Unicast - 1.5 days
        • Checking for Unicast Availability - 1 day
        • Sending queries with unicast enabled - 0.5 days
    • Responder - 5.5 days
      • Service Registration - 0.5 days
      • Filter Messages Received from Multicast Interface Loopback - 0.5 days
      • Unicast
        • Responding to Unicast Queries - 0.5 days
@CMCDragonkai
Member

The old networking code is located here: https://github.com/MatrixAI/js-polykey/tree/3340fc7508e46a6021d1bd6d9005c99ea598e205/src-old/network

There may be some artifacts worth fetching out. Especially implementation of the local network discovery.

@CMCDragonkai
Member

CMCDragonkai commented Jul 7, 2022

Just a note: AWS subnets in a VPC don't support multicast by default, and thus no mDNS.

However this can be enabled by creating a transit gateway: https://docs.aws.amazon.com/vpc/latest/tgw/working-with-multicast.html.

Haven't tried it though.

Multicast is a way of doing "automatic service discovery" (by bootstrapping off a known location).

Alternative ways include using AWS's own service discovery, but that's not portable, and limited specifically to ECS or whatever AWS provides there. And the only usage for that is to be able to auto-discover a seed node cluster in AWS like we are doing for testnet and mainnet.

Without automatic service discovery, deployment of agents into testnet and mainnet has to occur one by one, where each subsequent agent is deployed like a conga line and is given knowledge of the other agents.

I wonder though... perhaps if I did use the auto-SD, maybe I could pass the SD domain/hostname directly as the seed node specification, and rely on our DHT to make use of it. Then we would be using AWS's SD but in a portable manner.


For the seed nodes, this can be worked around by using the DNS hostname testnet.polykey.io instead of having the different IPs... (different IPs is complex due to local IPs vs external IPs).

@CMCDragonkai CMCDragonkai added the epic Big issue with multiple subissues label Jul 9, 2022
@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices label Jul 24, 2022
@CMCDragonkai CMCDragonkai changed the title Refactor Multicast Discovery Local Network Traversal - Multicast Discovery, Hairpinning, PMP, PCP Oct 31, 2022
@CMCDragonkai
Member

Manual testing in MatrixAI/Polykey#487 (comment) has revealed an urgent need for some kind of "local network traversal".

Basically if 2 nodes are on the same subnet/LAN and thus have the same public IP address, hole punching using the relay signalling message will not work.

same-subnet-hole-punch

This is because the router may not support hairpinning. And therefore the packets just get dropped.

This can happen if 2 nodes are running on the same computer, and are using different ports. And it can also happen if 2 nodes are running on the same subnet, and are using different private IPs and ports.

It could also happen in a more general way in larger corporate networks.

image

Or even in larger CGNAT scenarios, where the home routers themselves are not given a public IP address. Imagine buying a bunch of Nokia 5G routers for home usage; now every home in a local area may be part of the same CGNAT IP.

This can seriously hamper connectivity!

@CMCDragonkai
Member

Local multicast is necessary for local discovery. This means that a given NodeId may have multiple "valid" IP addresses/ports to access, depending on who is asking the question.

To deal with this, we have to refactor the NG to be capable of dealing with the inherent ambiguity of node addresses.

We are changing our NG key path to BucketIndex/NodeId/Host/Port, because different mechanisms may end up discovering different valid host and port combinations.

Tailscale seems to support some sort of local discovery, combined with detections of whether hairpinning works, and whether the immediate router supports PMP or PCP.

»» ~
 ♖ tailscale netcheck                                                                                    pts/2 16:22:29

Report:
	* UDP: true
	* IPv4: yes, 120.18.72.95:2703
	* IPv6: no
	* MappingVariesByDestIP: false
	* HairPinning: false
	* PortMapping: 
	* Nearest DERP: Sydney
	* DERP latency:
		- syd: 56.3ms  (Sydney)
		- sin: 126.8ms (Singapore)
		- blr: 173.2ms (Bangalore)
		- tok: 173.3ms (Tokyo)
		- hkg: 197.2ms (Hong Kong)
		- sfo: 197.3ms (San Francisco)
		- lax: 201.7ms (Los Angeles)
		- sea: 207.5ms (Seattle)
		- den: 207.6ms (Denver)
		- ord: 221.5ms (Chicago)
		- dfw: 250.8ms (Dallas)
		- tor: 250.9ms (Toronto)
		- nyc: 257.7ms (New York City)
		- hnl: 257.7ms (Honolulu)
		- mia: 257.8ms (Miami)
		- lhr: 314.2ms (London)
		- par: 325.1ms (Paris)
		- ams: 330.6ms (Amsterdam)
		- mad: 330.8ms (Madrid)
		- fra: 330.8ms (Frankfurt)
		- sao: 370.5ms (São Paulo)
		- waw: 370.7ms (Warsaw)
		- dbi: 471.1ms (Dubai)
		- jnb: 505.1ms (Johannesburg)

Meaning that a multitude of methods can be tried before falling back on some centralised relay.

@CMCDragonkai
Member

For the most immediate use case, I think we can solve the problem of:

  1. 2 or more nodes on the same machine
  2. 2 or more nodes on the same home LAN

with the introduction of multicast discovery and by expanding our NG to take that ambiguity and resolve it.

@CMCDragonkai
Member

The fact that signalling does work though means that the signalling node is a common source of coordination. It can help do relaying, but it can also help the 2 nodes try to discover each other on the local networks too.

If we aren't afraid of "leaking private data", it's possible to provide private network information to the seed node.

If we want to hide that information from the seed node, it's possible to encrypt this data for the other node, and rely on the seed node to relay encrypted information to each other. This kind of leads to zero knowledge protocols too. https://www.theguardian.com/technology/2022/oct/29/privacy-problem-tech-enhancing-data-political-legal (I think these are called privacy enhancing protocols).

@CMCDragonkai
Member

Wanted to mention that we should look to Tailscale for inspiration.

They also keep track of all local IP:port bindings and send them off to the Tailscale server, which then distributes them to other clients, so other clients can actually just "attempt" direct connections to the local IP:port. This sometimes works very well, and avoids any need for complicated mDNS, multicast, PMP, etc. protocols.

This is still a signaling system.

@amydevs first attempt to prototype the multicast system locally on a single computer, then between computers on a home router (office router). And then look at the network and node graph modules to see how it can be integrated.

You need to have a read of the MDNS and multicast RFCs. There are existing libraries for this, but only use those as a guide. You do not need a library for implementation here.

@amydevs
Contributor

amydevs commented May 12, 2023

What I think I've found so far:
For DNS-SD (Service Discovery) to work through mdns, two things need to happen.

  1. A host announcement needs to be made to register an 'A' record between the local IP of the device and its hostname. This is typically done automatically by the OS on Windows and macOS, and by Avahi on Linux. It would be adequate to use the existing hostnames in these cases. Otherwise, we can check if the 'A' record already exists, and if not, create our own in case the user has network discovery disabled. The hostname of the device can be derived from os.hostname(), whether or not network discovery is enabled.
  2. A service announcement needs to be made with a 'SRV' record with the host set to the hostname of the device in the first step. A port and service type can be set as well. We can set this to the QUIC RPC-Server port, with the service being something like _polykey._udp. (More is needed than just a SRV record; a TXT record is also needed, will add on this later.)

These records are created by sending a DNS Response packet to the well-known mdns address.
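Put together, the record set for one announcement might look like this sketch (hostname, IP, port, and service names are illustrative assumptions; the PTR record is what lets service browsers enumerate instances of the service type under DNS-SD):

```typescript
import * as os from 'os';

// Hypothetical record set for a single DNS-SD announcement.
const hostname = `${os.hostname()}.local`;
const serviceType = '_polykey._udp.local';
const serviceInstance = `polykey.${serviceType}`;

const announcement = [
  // 1. Host announcement: A record mapping the hostname to a local IP.
  { name: hostname, type: 'A', data: '192.168.1.10' },
  // 2. Service announcement: SRV pointing at the host and port, plus the
  //    TXT record that DNS-SD requires alongside it.
  { name: serviceInstance, type: 'SRV', data: { target: hostname, port: 1314 } },
  { name: serviceInstance, type: 'TXT', data: {} },
  // PTR record so service browsers can discover this instance by type.
  { name: serviceType, type: 'PTR', data: serviceInstance },
];
```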

A quick way to check if the service records have been correctly created is with avahi-browse --all on Linux or dns-sd -B _services._dns-sd._udp on Mac.

@amydevs
Contributor

amydevs commented May 12, 2023

The RFC recommends that all clients on a machine use a single shared mDNS implementation (Bonjour, Avahi, etc.). I think we can ignore this recommendation because:

  1. We want this to work even if the user hasn't installed avahi / enabled network discovery (as is the default on windows machines).
  2. Linking into native libraries for mdns is a bit overkill when all we're doing is sending data through sockets.

@CMCDragonkai
Member

Does the naming matter? Or do we just choose Polykey?

@CMCDragonkai
Member

Our standard port is 1314. Does this need to be fixed or can we do this for any port PK binds to?

@CMCDragonkai
Member

Just wondering do we need to do a host announcement? It seems like this is an OS thing. Can we just expect that it is already done?

@CMCDragonkai
Member

There was a commit that I forgot to push before.

I removed the type getter. It didn't make sense because MDNS could have multiple sockets, each with different types.

At the same time, if you are binding to ipv4, you get ipv4 type, ipv6 to ipv6 type, and if you use ipv4 mapped ipv6... I forgot what the type should be, but it should be the same as whatever js-quic does.

Furthermore each socket object should have type, host and port properties attached.

This means something like this:

image

@CMCDragonkai
Member

CMCDragonkai commented Jul 5, 2023

Releasing 1.1.0 of js-table with the iterator changed to give you [number, R]; this gives you access to the row index too, so you can delete things while iterating. Can be useful for limiting the cache size. Cache size limitation can just be a "drop old row" policy rather than going for a full-blown LRU.

Also the table indexes options support hashing functions. The default hashing function has changed to now support null and undefined. But you can provide your own hashing function. Prefer producing a string, unless you have a specific reason.

@CMCDragonkai
Member

You should also create your own special XRow type, where X is the thing you are putting into the table. This XRow type should be flattened as much as possible, so that you're able to index whatever deep keys you want.

@amydevs
Contributor

amydevs commented Jul 5, 2023

The uniqueness of a record in mDNS (even for shared records) is defined by the record's name, class, type, and data. Hence, MDNSCache needs a way to compoundly index by those four fields. The main problem field is data: calling JSON.stringify on the parsed structure is not canonical, so we should use canonicalization instead.
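A sketch of such a compound key, using a hand-rolled sorted-key canonicalizer in place of a real JCS-style library (the record field shapes are assumptions):

```typescript
// Deterministic JSON: objects are serialized with sorted keys, so two
// structurally equal data payloads always produce the same string.
function canonicalJson(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(',')}]`;
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`);
  return `{${entries.join(',')}}`;
}

// Compound uniqueness key over the four identifying fields.
function resourceRecordKey(record: {
  name: string;
  class: number;
  type: number;
  data: unknown;
}): string {
  return canonicalJson([record.name, record.class, record.type, record.data]);
}
```

With this, two records whose data objects differ only in key order map to the same cache key, which plain JSON.stringify does not guarantee.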

@amydevs
Contributor

amydevs commented Jul 7, 2023

I've moved and renamed MDNSCache to ResourceRecordCache, in a separate folder, to better reflect what it actually does. The reason for moving it to a separate folder is so that I can separate the utils, events, and errors for the cache.

@amydevs
Contributor

amydevs commented Jul 7, 2023

What's left is:

  • Splitting responses to separate interfaces
  • Managing promises

@amydevs
Contributor

amydevs commented Jul 7, 2023

nodejs/node#1690 (comment)

After all, it seems that it is impossible to receive multicast messages on a socket bound directly to a specific interface. This makes things a lot more complicated...

The solution ciao, bonjour-js, mdns-js, etc. have all chosen is the first method:

binding only to the PORT the multicast message is sent and joining the multicast group i.e.: as in receive-all-addresses.js. This way, you can receive every message sent to that port whether multicast or not.

However, as ciao states, this is leaky, and kind of defeats the whole purpose of binding to each interface individually! https://github.com/homebridge/ciao/blob/a294fce273b19cac06fbc5dcdbb8db5e77caa68d/src/MDNSServer.ts#L518

The second option seems more sane:

binding to the PORT and the multicast address and joining the multicast group. This restricts the messages received to multicast only. As in receive-specific-address.js but binding to MULTICAST_IP instead of LOCAL_IP.

However, the obvious caveat is that we can't receive unicast messages.

My plan is to focus on multicast for now. So we can bind onto the multicast group address, then use setMulticastInterface and addMembership targeting specific interface addresses in order to have a separate socket (and hence handler) for each interface.

However, only multicast messages will be able to be received on those sockets. Hence, unicast sockets will also need to be bound later if we need them...
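A sketch of that plan using Node's dgram API (the group and port are the well-known IPv4 mDNS constants; the interface address and error handling are assumptions, and Windows needs the wildcard-bind variant noted later in this thread):

```typescript
import * as dgram from 'dgram';

const MDNS_PORT = 5353;
const MDNS_GROUP_IPV4 = '224.0.0.251';

// Hypothetical per-interface multicast-only socket: bind to the group
// address itself so unicast traffic is excluded, then scope membership and
// outbound multicast to one interface.
function createMulticastSocket(interfaceAddress: string): dgram.Socket {
  const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });
  socket.on('error', () => socket.close());
  socket.bind(MDNS_PORT, MDNS_GROUP_IPV4, () => {
    try {
      socket.addMembership(MDNS_GROUP_IPV4, interfaceAddress);
      socket.setMulticastInterface(interfaceAddress);
    } catch {
      // Some interfaces (e.g. loopback on some platforms) can't join.
      socket.close();
    }
  });
  return socket;
}
```

One such socket would be created per discovered interface address, each with its own message handler.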

@CMCDragonkai
Member

Does this mean 2 sockets for every IP address?

@amydevs
Contributor

amydevs commented Jul 7, 2023

Does this mean 2 sockets for every IP address?

Yes, but that is only if we decide to do unicast response/request.

For now, I'm just focusing on not touching any of that @CMCDragonkai

@CMCDragonkai
Member

We're going to have to do it the way ciao does it. 1 socket per interface, but bound to wildcard.

Make sure that the sockets will be able to join both the IPv4 multicast group and IPv6 multicast group.

@CMCDragonkai CMCDragonkai removed their assignment Jul 10, 2023
@amydevs
Contributor

amydevs commented Jul 11, 2023

Some tests to consider:

  • DNS Packet Parsing/Generating
  • Cache Insertion
  • Cache Querying
  • Cache Expiry
  • Cache Overload
  • Creating/Receiving Announcements
    • Cancelling Announcements (via stop())
  • Creating/Receiving Responses
  • Creating/Receiving Queries
    • Cancelling Queries
  • Creating/Receiving Unicast Response
  • Port Conflict

@CMCDragonkai
Member

@amydevs

  1. Logging should use INFO for most things. The only warnings right now should only be for when os.networkInterfaces provide invalid information and when you can proceed.
  2. Use X, Xing, Xed for the logging first keyword. Where X is the verb. Only using Xing when it's a one-off thing. X, Xed is for pre-X and post-X respectively.
  3. Mock testing just means mocking the input/output to MDNS. We don't need to simulate multiple interfaces or multiple sockets. But by controlling the input/output, we can simulate an entire interaction of MDNS in different scenarios.

@amydevs
Contributor

amydevs commented Jul 14, 2023

On Linux, because Node sets the IP_MULTICAST_ALL flag to true, sockets bound to a wildcard address (::, ::0, 0.0.0.0) will receive all multicast traffic from all groups joined on the system! This is not the same behavior as on Windows/macOS/BSD.

nodejs/node#19954

@CMCDragonkai
Member

Note that https://github.com/clshortfuse/node-getsockethandleaddress indicates that you can get the sockfd integer just by doing socket._handle.fd, specifically for Linux and macOS I think.

You should confirm if this fix is needed for macos.

You'll still need to write the NAPI code to actually do something with that file descriptor number.

The _handle.fd does come with some sort of warning message. See if you can suppress that...?

@amydevs
Contributor

amydevs commented Jul 17, 2023

The intended behavior is that a socket bound to "0.0.0.0" or "::0", with IP_MULTICAST_ALL disabled, will not receive any multicast messages at all.
Furthermore, when addMembership is called specifying a specific interface, multicast packets will only be received from that group on that interface.

It would seem that disabling IP_MULTICAST_ALL works as intended on udp4 sockets.

On udp6, disabling IPV6_MULTICAST_ALL without calling addMembership also works as intended.
However, as soon as I call addMembership, the socket seems to start receiving multicast packets from every interface, even when I've only specified one specific interface.

@amydevs
Contributor

amydevs commented Jul 17, 2023

It seems that there are several options that add an IPv6 socket to a multicast group, namely IPV6_JOIN_GROUP and IPV6_ADD_MEMBERSHIP. They use the ipv6_mreq struct as a configuration option rather than the ip_mreq struct.

The key difference between these is that ipv6_mreq takes in the interface index, whilst ip_mreq takes an interface ip address.

Node, upon calling addMembership, will call uv_udp_set_membership with JOIN_GROUP, passing in the interface IP address.
If it is udp4, libuv will throw ENODEV when an interface does not correspond to the address you have provided.
However, on udp6, libuv tries to look up the index (scope ID) of the interface with the address you've provided using uv_ip6_addr. It SHOULD give an ENODEV error if it is invalid, but it is not bubbling up to Node for some reason.

On udp6, libuv chooses the scope ID from whatever is after the % sign; if it can't be found, it's ignored. That is why addMembership with just an IPv6 address is not enough: the scope ID needs to be provided after the percentage sign. This is only on Windows (on Linux, providing the network interface name after the % sign is correct).

On udp6, using either IPV6_ADD_MEMBERSHIP or IPV6_JOIN_GROUP will, for whatever reason, make your socket listen to multicast packets on all interfaces rather than just the single specified one. The native code that I tested is:

#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <sys/socket.h>

int AddMulticastMembership6(int sockfd, char* group, char* interface) {
  struct ipv6_mreq mreq;
  memset(&mreq, 0, sizeof(mreq));
  inet_pton(AF_INET6, group, &mreq.ipv6mr_multiaddr);
  mreq.ipv6mr_interface = if_nametoindex(interface);
  bool success = setsockopt(sockfd, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq)) >= 0;
  return success ? (int)if_nametoindex(interface) : -1;
}

@amydevs
Contributor

amydevs commented Jul 17, 2023

I've found that IP_BLOCK_SOURCE also exists. This could be useful in filtering out our own traffic. However, we would need to implement a platform-agnostic solution if we wanted to use this across all platforms. For now, having a set IP to filter out seems fine to me.

@amydevs
Contributor

amydevs commented Jul 17, 2023

As a workaround to #1 (comment), I'm trying to bind a unicast socket first, then bind all the multicast sockets after. This is done so that the first unicast socket will catch all of the necessary unicast traffic.

I'm at the point of implementing this. However, even though I've made sure that the unicast socket is the first thing to be bound on a particular port, as soon as I bind other sockets, none of the sockets seem to be receiving any unicast traffic at all!

I wonder if the first socket bound to a port on an interface with reuseaddr set to true receiving all unicast traffic is deterministic...

@amydevs
Contributor

amydevs commented Jul 18, 2023

On macOS, tests run correctly; just some counted references are making the cleanup (afterAll) of MDNS hang. I've pinned it down to the sending of the goodbye packets, but I'm still figuring out a solution.

@amydevs
Contributor

amydevs commented Jul 18, 2023

On Windows, it is not possible to bind to a multicast address like you can on any Unix system. On Windows systems, I am binding each multicast socket to "::" instead. This is functionally the same as binding to the multicast address in my case, as I'm binding a unicast socket before all the other multicast sockets are bound. Windows makes sure that only the first socket that you've bound will receive multicast traffic.

@CMCDragonkai
Member

On macOS, tests run correctly; just some counted references are making the cleanup (afterAll) of MDNS hang. I've pinned it down to the sending of the goodbye packets, but I'm still figuring out a solution.

Are you tracking all resources between start and stop? Always make sure to keep track of them. We already have problems with memory leaks and we have to be very strict here.

@CMCDragonkai
Member

Merged into staging now, doing the release.

@tegefaulkes
Contributor

Is this fully addressed by MDNS? Are there still plans to handle Hairpinning, PMP and PCP?

@CMCDragonkai
Member

PMP and PCP should be done separately.

Hairpinning not sure how that would be achieved.

@CMCDragonkai CMCDragonkai changed the title Local Network Traversal - Multicast Discovery, Hairpinning, PMP, PCP Local Network Traversal - Multicast Discovery Jul 25, 2023
@CMCDragonkai
Member

I created MatrixAI/Polykey#536 to track PCP/PMP via UPnP. I did find a project that could be wrapped in JS to make use of.

@CMCDragonkai
Member

@amydevs please tick off everything that was done above too.
