protocols/kad: Fix right shift overflow panic in record_received #1492

mxinden · 2020-03-11T12:30:44Z

Within Behaviour::record_received, the exponentially decreasing
expiration of a record, based on its distance to the target, is
calculated as follows:

1. Calculate the number of nodes between us and the record key beyond
   the k replication constant as n.
2. Shift the configured record time-to-live n times to the right to
   calculate an exponentially decreasing expiration.

The configured record time-to-live, in seconds, is a u64. If n is
greater than or equal to 64, the right shift overflows, which panics
in debug mode.

This patch uses a checked right shift instead, defaulting to 0 (now + 0) for the expiration on overflow.

Fixes #1491.
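
As a rough illustration of the checked shift described above, the following sketch shows how `checked_shr` on a `u64` behaves for small and large shift amounts. The helper name `shifted_ttl_secs` and the 36-hour time-to-live are assumptions made for this example and are not taken from the patch.

```rust
/// Hypothetical helper: halve `ttl_secs` `n` times, falling back to 0
/// when `n` is not a valid shift amount for a `u64` (that is, `n >= 64`).
fn shifted_ttl_secs(ttl_secs: u64, n: u32) -> u64 {
    // `checked_shr` returns `None` instead of panicking when `n >= 64`.
    ttl_secs.checked_shr(n).unwrap_or(0)
}

fn main() {
    let ttl_secs: u64 = 36 * 60 * 60; // assumed TTL of 36 hours

    for &n in &[0u32, 1, 2, 63, 64, 200] {
        println!("n = {:>3}: {} s", n, shifted_ttl_secs(ttl_secs, n));
    }

    // A plain `ttl_secs >> n` with `n >= 64` would panic in a debug build
    // ("attempt to shift right with overflow").
    assert_eq!(ttl_secs.checked_shr(63), Some(0));
    assert_eq!(ttl_secs.checked_shr(64), None);
}
```

In release builds without overflow checks the shift amount is masked instead of causing a panic, so a plain shift by 64 would leave the time-to-live unchanged rather than reducing it to 0. The checked variant avoids both outcomes.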

protocols/kad/src/behaviour/test.rs

twittner · 2020-03-11T13:57:13Z

protocols/kad/src/handler.rs

@@ -364,6 +364,15 @@ pub struct KademliaRequestId {
    connec_unique_id: UniqueConnecId,
 }

+#[cfg(test)]
+impl Default for KademliaRequestId {


As long as KademliaRequestId implements Clone, this Default impl might as well be available all the time.

This is an opaque implementation detail. Implementing Default on it doesn't make any sense. Maybe it could contain a channel or an Arc in the future.

I understood KademliaRequestId to be some sort of unique one-time token and Clone defeats that purpose but I did not look much into it so feel free to ignore the criticism.

I am fine either way. I can't come up with a use-case for Default outside of testing though.

> I understood KademliaRequestId to be some sort of unique one-time token and Clone defeats that purpose but I did not look much into it so feel free to ignore the criticism.

Just to give some context: KademliaRequestId did not use to be Clone but it had to be as a result of #1440 requiring Clone for the TInEvent of any handler in libp2p-swarm, since it is now possible to "broadcast" a single such event to multiple handlers (which in turn was a requirement discovered during the substrate integration in paritytech/substrate#5066) and KademliaRequestId is part of these events in libp2p-kad. It would of course be better to not have that Clone, but I didn't come up with a way to do so.

tomaka · 2020-03-11T14:22:09Z

protocols/kad/src/behaviour/test.rs

+        vec![],
+    );
+
+    kad.record_received(target, connection_id, request_id, record);


We shouldn't be testing private implementation details. The parameters of this method are totally arbitrary. The fact that it's a separate method at all is also totally arbitrary and subject to change. All we're doing with this test is locking the current implementation as it is.

I understand your concern. What are you advocating for instead: (1) remove the test or (2) write a larger test that reproduces the same behaviour?

> All we're doing with this test is locking the current implementation as it is.

I don't think a test should ever lock an implementation. While I am in favor of having a test for each regression, I am also in favor of removing outdated tests if necessary.

protocols/kad/src/behaviour.rs

romanb · 2020-03-13T11:01:51Z

Thanks for fixing this, I wasn't even aware bit shifts could overflow in this way. A regression test is a good idea, but I don't see why it needs to be so complicated. Why not extract the code that calculates the expiry into a standalone function and write a small property and/or regression test for that? Even if you want the function to take a KBucketsTable as an argument, so that it uses count_nodes_between internally, there shouldn't be a need for such probabilistic generation of so many peer IDs, since Key::for_distance can be used in combination with the ClosestBucketsIter to deterministically fill the buckets as desired (this test may give inspiration).

In other words, when testing a particular edge case requires too much surrounding boilerplate, it is often a good idea to partition the code further so that the part you're interested in is easier to test in isolation, without so much surrounding ceremony. This seems to be such a case.
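
A minimal sketch of what such an isolated function and its tests could look like. The name `exp_decrease`, its signature, and the test names are assumptions for illustration; the property check is a plain loop rather than a property-testing framework.

```rust
use std::time::Duration;

/// Exponentially decrease the given duration (base 2).
/// Assumed signature; yields zero when `exp >= 64`.
fn exp_decrease(ttl: Duration, exp: u32) -> Duration {
    Duration::from_secs(ttl.as_secs().checked_shr(exp).unwrap_or(0))
}

fn main() {
    let ttl = Duration::from_secs(36 * 60 * 60);
    println!("{:?}", exp_decrease(ttl, 3));
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Regression test: exponents at or beyond the bit width must not panic.
    #[test]
    fn exp_decrease_does_not_panic_on_large_exponents() {
        let ttl = Duration::from_secs(36 * 60 * 60);
        assert_eq!(exp_decrease(ttl, 64), Duration::from_secs(0));
        assert_eq!(exp_decrease(ttl, u32::MAX), Duration::from_secs(0));
    }

    /// Property: a larger exponent never yields a longer duration.
    #[test]
    fn exp_decrease_never_increases() {
        let ttl = Duration::from_secs(u64::MAX);
        let mut previous = ttl;
        for exp in 0..=128 {
            let current = exp_decrease(ttl, exp);
            assert!(current <= previous);
            previous = current;
        }
    }
}
```

Because the function depends only on its two arguments, the edge case can be exercised without constructing a routing table or generating peer IDs.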

mxinden · 2020-03-16T11:21:44Z

@romanb thanks for the input. Can you take another look?

romanb

I left a few simplifying suggestions, let me know if you disagree with any of them.

romanb · 2020-03-16T11:50:33Z

protocols/kad/src/behaviour.rs

-        let expiration = self.record_ttl.map(|ttl|
-            now + Duration::from_secs(ttl.as_secs() >> num_beyond_k)
-        );
+        let expiration = exp_decr_expiration(self.record_ttl, num_beyond_k);


Suggested change

-        let expiration = exp_decr_expiration(self.record_ttl, num_beyond_k);
+        let expiration = self.record_ttl.map(|ttl| now + exp_decrease(ttl, num_beyond_k));

romanb · 2020-03-16T11:51:33Z

protocols/kad/src/behaviour.rs

@@ -1030,6 +1026,15 @@ where
    }
 }

+/// Calculate exponentially decreasing expiration from a default time-to-live by a factor.
+fn exp_decr_expiration(default_ttl: Option<Duration>, factor: u32) -> Option<Instant> {


Suggested change

-fn exp_decr_expiration(default_ttl: Option<Duration>, factor: u32) -> Option<Instant> {
+fn exp_decrease(ttl: Duration, exp: u32) -> Duration {

romanb · 2020-03-16T11:54:19Z

protocols/kad/src/behaviour.rs

@@ -1030,6 +1026,15 @@ where
    }
 }

+/// Calculate exponentially decreasing expiration from a default time-to-live by a factor.


Suggested change

-/// Calculate exponentially decreasing expiration from a default time-to-live by a factor.
+/// Exponentially decrease the given duration (base 2).

romanb · 2020-03-16T11:58:50Z

protocols/kad/src/behaviour.rs

@@ -1030,6 +1026,15 @@ where
    }
 }

+/// Calculate exponentially decreasing expiration from a default time-to-live by a factor.
+fn exp_decr_expiration(default_ttl: Option<Duration>, factor: u32) -> Option<Instant> {
+    default_ttl.map(|ttl| Instant::now() + Duration::from_secs(


Suggested change

-    default_ttl.map(|ttl| Instant::now() + Duration::from_secs(
+    Duration::from_secs(ttl.as_secs().checked_shr(exp).unwrap_or(0))

romanb · 2020-03-16T12:04:46Z

protocols/kad/src/behaviour.rs

@@ -946,8 +946,6 @@ where
            return
        }

-        let now = Instant::now();


I think it is preferable to leave this here, as it can be reused multiple times (cf. #1496). And since Instant::now() is a side-effecting function for obtaining some notion of system time, both testability and performance are usually improved by calling it mostly in top-level functions and passing the instant down to others where needed (see also the other comments).
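
The point about passing the instant down can be sketched as follows. The function `expiration_at` and its parameter names are hypothetical and only mirror the shape of the code under review; because `now` is an argument, the result can be checked exactly.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: compute a record's expiration relative to a
/// caller-supplied `now` instead of calling `Instant::now()` internally.
fn expiration_at(
    now: Instant,
    record_ttl: Option<Duration>,
    num_beyond_k: u32,
) -> Option<Instant> {
    record_ttl.map(|ttl| {
        // Checked shift: a shift amount of 64 or more yields 0 seconds.
        let secs = ttl.as_secs().checked_shr(num_beyond_k).unwrap_or(0);
        now + Duration::from_secs(secs)
    })
}

fn main() {
    // Obtain the system time once, at the top level, and reuse it.
    let now = Instant::now();
    let ttl = Some(Duration::from_secs(3600));

    assert_eq!(expiration_at(now, ttl, 0), Some(now + Duration::from_secs(3600)));
    assert_eq!(expiration_at(now, ttl, 2), Some(now + Duration::from_secs(900)));
    assert_eq!(expiration_at(now, ttl, 64), Some(now));
    assert_eq!(expiration_at(now, None, 2), None);
}
```

Had the helper called `Instant::now()` itself, the assertions could only compare against a range, since each call may return a slightly later instant.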

mxinden · 2020-03-17T18:20:05Z

Suggested simplifications look good to me. @romanb would you mind taking another look?

mxinden added 2 commits March 11, 2020 13:22

protocols/kad: Add test to reproduce right shift overflow panic

e6a57ba

mxinden requested review from romanb and arkpar March 11, 2020 12:30

twittner reviewed Mar 11, 2020

protocols/kad/src/behaviour/test.rs

twittner reviewed Mar 11, 2020

tomaka reviewed Mar 11, 2020

arkpar reviewed Mar 11, 2020

protocols/kad/src/behaviour.rs

mxinden added 2 commits March 12, 2020 14:46

protocols/kad: Put attribute below comment

a3271c2

Merge branch 'libp2p/master' into kad-right-shift

b08f509

mxinden mentioned this pull request Mar 12, 2020

protocols/kad: Do not attempt to store expired record in record store #1496

Merged

protocols/kad: Extract shifting logic and rework test

9e27e25

Extract right shift into isolated function and replace complex regression test with small isolated one.

romanb suggested changes Mar 16, 2020

protocols/kad/src/behaviour: Refactor exp_decr_expiration

36f6b93

romanb approved these changes Mar 17, 2020

Merge branch 'master' into kad-right-shift

ccdf994

romanb merged commit 2947146 into libp2p:master Mar 18, 2020
