implement UDP reliability layer #32

Open
ainghazal opened this issue Sep 29, 2022 · 1 comment

ainghazal commented Sep 29, 2022

OpenVPN implements its own reliability layer on top of UDP (it's worth noting that it's used for TCP too). We should follow the reference implementation as closely as possible.

This basically boils down to keeping track of acknowledged packets in a fixed-size structure, and resending packets if acks have not been received after a given period. Retries follow an exponential back-off, and incoming packets with ids beyond a certain acknowledgment window are dropped, since sequentiality cannot be guaranteed by the data structure.

edit: for some reason I had originally written "sentimentality" in the last sentence.

Relevant pointers:

ainghazal commented Oct 6, 2022

From the work done to remediate all the vulnerabilities uncovered by the Red Team Lab, this has proven to be the most interesting.

I have partially implemented OpenVPN's reliability layer, and added the provided dos_exploit script in the E2E folder. One thing I noticed while running the DoS script locally is that different injection intervals lead to different sensitivity to the injected bogus packets.

The initial handshake proves to be the most sensitive part in my implementation. I've added retry loops in the different steps of the handshake, and I manage to get ~100% success rates with injection intervals above 200 ms. Anything below that tends to stall, which seems to indicate the TLS handshake is still sensitive to injection.

However, OpenVPN also seems to get into a TLS handshake stall from time to time at the same injection rates. This is probabilistic, and the approximate failure rate should be determined more accurately:

2022-10-06 23:49:52 library versions: OpenSSL 3.0.2 15 Mar 2022, LZO 2.10
2022-10-06 23:49:52 TCP/UDP: Preserving recently used remote address: [AF_INET]51.158.144.32:1194
2022-10-06 23:49:52 Socket Buffers: R=[212992->212992] S=[212992->212992]
2022-10-06 23:49:52 UDPv4 link local (bound): [AF_INET][undef]:1194
2022-10-06 23:49:52 UDPv4 link remote: [AF_INET]51.158.144.32:1194
2022-10-06 23:49:52 TLS: Initial packet from [AF_INET]51.158.144.32:1194, sid=41445041 434b4554
2022-10-06 23:49:52 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:52 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_ACK_V1)
2022-10-06 23:49:52 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_ACK_V1)
2022-10-06 23:49:53 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:53 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:53 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:54 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:54 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_ACK_V1)
2022-10-06 23:49:54 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:57 TLS Error: local/remote TLS keys are out of sync: [AF_INET]51.158.144.32:1194 [0]
2022-10-06 23:49:57 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_CONTROL_V1)
2022-10-06 23:49:57 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_CONTROL_V1)
2022-10-06 23:49:58 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:58 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_ACK_V1)
2022-10-06 23:49:58 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:58 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:58 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
2022-10-06 23:49:58 TLS ERROR: local/remote key IDs out of sync (0/2) ID:  [key#0 state=S_PRE_START id=0 sid=41445041 434b4554] [key#1 state=S_UNDEF id=0 sid=00000000 00000000] [key#2 state=S_UNDEF id=0 sid=00000000 00000000]
^C2022-10-06 23:50:02 event_wait : Interrupted system call (code=4)
2022-10-07 00:04:18 TLS: new session incoming connection from [AF_INET]51.158.144.32:1194
2022-10-07 00:04:18 OpenSSL: error:0A0003E7:SSL routines::invalid session id
2022-10-07 00:04:18 TLS_ERROR: BIO read tls_read_plaintext error
2022-10-07 00:04:18 TLS Error: TLS object -> incoming plaintext read error
2022-10-07 00:04:18 TLS Error: TLS handshake failed
2022-10-07 00:04:18 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_CONTROL_V1)
2022-10-07 00:04:18 TLS Error: Unroutable control packet received from [AF_INET]51.158.144.32:1194 (si=3 op=P_ACK_V1)

It's worth noting that OpenVPN's tendency to stall at the same injection rate seems lower than that of my implementation (based on superficial observation).

I'm tempted to keep debugging the TLS handshake, and perhaps find ways to make the current implementation follow OpenVPN's more closely, but at this point I feel that for the purposes of monitoring services, this is approaching a good state: if a censor is successfully injecting packets at such a rate, we want to measure a failure rate that is approximately equal to that of the reference implementation. Perhaps I could determine this failure rate empirically, so that I have a reference for the order of magnitude I need to approach.
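
One possible way to compare the two failure rates empirically is to count stalled handshakes over repeated runs and report an interval rather than a single number. The sketch below computes a 95% Wilson score interval in Go; the function name is hypothetical and the counts in `main` are placeholders, not measurements.

```go
package main

import (
	"fmt"
	"math"
)

// wilson returns the 95% Wilson score interval for the failure probability,
// given the number of failed handshakes out of the total number of trials.
func wilson(failures, trials int) (lo, hi float64) {
	if trials == 0 {
		return 0, 1 // no data: the rate could be anything
	}
	const z = 1.96 // normal quantile for a 95% interval
	n := float64(trials)
	p := float64(failures) / n
	denom := 1 + z*z/n
	center := (p + z*z/(2*n)) / denom
	half := z * math.Sqrt(p*(1-p)/n+z*z/(4*n*n)) / denom
	return math.Max(0, center-half), math.Min(1, center+half)
}

func main() {
	// Placeholder counts for illustration only, not measured results.
	lo, hi := wilson(10, 100)
	fmt.Printf("failure rate, 95%% interval: [%.3f, %.3f]\n", lo, hi)
}
```

If the intervals for the two implementations at the same injection interval overlap, the observed difference may not be meaningful at that sample size.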

One useful thing might be to log the number of drops and/or bad packets (most of the cases will be drops because the incoming packet id falls outside the receive window, which by default has a length of 8 packets in the reference implementation).
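
A rough Go sketch of such counters, assuming a sliding receive window of 8 packet ids. The type and field names are hypothetical, and packet-id wraparound is deliberately ignored to keep the example short.

```go
package main

import "fmt"

// windowSize is the receive window length mentioned above.
const windowSize = 8

// dropStats holds the counters that could be logged per session.
type dropStats struct {
	outOfWindow int // id beyond the capacity of the receive window
	duplicate   int // id already received or already delivered
}

// receiver tracks which ids inside the current window have arrived.
type receiver struct {
	next  uint32           // lowest id not yet delivered in order
	seen  [windowSize]bool // seen[i] marks id next+i as received
	stats dropStats
}

// accept reports whether the packet id is new and fits in the window,
// updating the drop counters otherwise.
func (r *receiver) accept(id uint32) bool {
	if id < r.next {
		r.stats.duplicate++
		return false
	}
	off := id - r.next
	if off >= windowSize {
		r.stats.outOfWindow++
		return false
	}
	if r.seen[off] {
		r.stats.duplicate++
		return false
	}
	r.seen[off] = true
	// Slide the window past every contiguous id received so far.
	for r.seen[0] {
		copy(r.seen[:], r.seen[1:])
		r.seen[windowSize-1] = false
		r.next++
	}
	return true
}

func main() {
	r := &receiver{}
	for _, id := range []uint32{0, 1, 42, 1, 3} {
		r.accept(id)
	}
	// 42 is outside the window and the second 1 is a duplicate.
	fmt.Printf("%+v\n", r.stats)
}
```

Logging these two counters separately would make it possible to tell injected out-of-window packets apart from ordinary retransmissions.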

On the other hand, I also feel that priority-wise, time would be better spent implementing --tls-auth, --tls-crypt and --tls-crypt-v2, which are the DoS protections in the OpenVPN spec.

The time-to-bootstrap will be a good proxy of anomalies.

ainghazal added a commit to ainghazal/minivpn that referenced this issue Oct 7, 2022
This is not a complete implementation of the reliability layer. Rather,
I've tried to achieve the minimal incremental change that adds resilience
in the face of network noise.

To achieve that, the simple thing to do was to make session an object
owned by an implementation of reliableTransport. I've reused the
reliableUDP implementation in govpn, and I like the simplicity of that
implementation a lot. A lot of our current logic (ackqueue/retries)
needs to move from the tlsTransport minivpn implementation into
reliableTransport.

- Related: ooni#32
ainghazal added a commit to ainghazal/minivpn that referenced this issue Nov 21, 2022
ainghazal added a commit to ainghazal/minivpn that referenced this issue Dec 12, 2022
I've tried to achieve the minimal incremental change that adds resilience
in the face of network noise.

To achieve that, the simple thing to do was to make session an object
owned by an implementation of reliableTransport. I've reused the
reliableUDP implementation in govpn, and I like the simplicity of that
implementation a lot. A lot of our current logic (ackqueue/retries)
needed to move from the tlsTransport minivpn implementation into
reliableTransport.

Although the DoS documented in the MIV-01 report is not done, we add the
e2e testing script to facilitate further development.

- Related: ooni#32

more tests
ainghazal added a commit to ainghazal/minivpn that referenced this issue Mar 15, 2023
ainghazal added a commit to ainghazal/minivpn that referenced this issue May 8, 2023