Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swarm: Deprecate Swarm::with_XYZ and enforce creation via SwarmBuilder #3186

Closed
thomaseizinger opened this issue Dec 2, 2022 · 14 comments · Fixed by #3588
Closed

swarm: Deprecate Swarm::with_XYZ and enforce creation via SwarmBuilder #3186

thomaseizinger opened this issue Dec 2, 2022 · 14 comments · Fixed by #3588
Labels
getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted priority:nicetohave

Comments

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Dec 2, 2022

Description

With the introduction of the Swarm::with_executor APIs (#3068), we have a fair bit of duplication in APIs on how to create a Swarm. Additionally, the current SwarmBuilder and Swarm APIs don't work well with the QUIC and WebRTC transports as those don't go through the typical builder pattern on Transport and hereby don't construct a StreamMuxerBox (see #3179).

We should streamline the creation of a Swarm to work with any Transport and be easy to discover. @mxinden suggested to achieve this by forcing users to always go through SwarmBuilder.

Motivation

  • Less API surface and layers of indirection.

Open questions

Are you planning to do it yourself in a pull request?

Maybe.

@mxinden
Copy link
Member

mxinden commented Dec 8, 2022

Using the builder pattern to construct a Swarm enforces at compile time that a user can not reconfigure the below values at runtime after Swarm creation. I would like to uphold this enforcement at compile time, or explicitly allow it for a subset of the values below. The latter would require more thought, though at this point I don't see a use-case where a user would need to reconfigure these after creation during runtime.

  • notify_handler_buffer_size

  • connection_event_buffer_size

  • dial_concurrency_factor

  • connection_limits

  • substream_upgrade_protocol_override

  • max_negotiating_inbound_streams

With that in mind I am advocating against removing SwarmBuilder.

  • Less API surface

In case we want to improve here, I would suggest going the opposite route, namely to enforce the creation of a Swarm through the SwarmBuilder, thus reducing the paths to creating a Swarm to a single one, namely through the SwarmBuilder.

@thomaseizinger
Copy link
Contributor Author

  • Less API surface

In case we want to improve here, I would suggest going the opposite route, namely to enforce the creation of a Swarm through the SwarmBuilder, thus reducing the paths to creating a Swarm to a single one, namely through the SwarmBuilder.

That sounds reasonable to me!

@thomaseizinger
Copy link
Contributor Author

Using the builder pattern to construct a Swarm enforces at compile time that a user can not reconfigure the below values at runtime after Swarm creation. I would like to uphold this enforcement at compile time, or explicitly allow it for a subset of the values below. The latter would require more thought, though at this point I don't see a use-case where a user would need to reconfigure these after creation during runtime.

FWIW: Making them reconfigurable wasn't the goal of this, I also don't see why anyone would want it but that doesn't mean we can't allow it as long as the resulting behaviour is sane (which is ensured by the linked PRs).

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Dec 12, 2022

I like the idea of enforcing creation of Swarm through SwarmBuilder. It means we can guide the user through a type-safe API on what they need to provide and make Swarm::with_XYZ crate-private.

@thomaseizinger thomaseizinger changed the title swarm: Deprecate SwarmBuilder swarm: Deprecate Swarm::new and enforce creation via SwarmBuilder Dec 12, 2022
@thomaseizinger thomaseizinger added priority:nicetohave help wanted getting-started Issues that can be tackled if you don't know the internals of libp2p very well and removed decision-pending Marks issues where a decision is pending before we can move forward. labels Dec 12, 2022
@thomaseizinger thomaseizinger changed the title swarm: Deprecate Swarm::new and enforce creation via SwarmBuilder swarm: Deprecate Swarm::with_XYZ and enforce creation via SwarmBuilder Dec 12, 2022
@vnermolaev
Copy link
Contributor

vnermolaev commented Mar 9, 2023

I want to tackle it. I'd like to align on the approach first,

For the struct SwarmBuilder

pub struct SwarmBuilder<TBehaviour> {
    local_peer_id: PeerId,
    transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
    behaviour: TBehaviour,
    pool_config: PoolConfig,
    connection_limits: ConnectionLimits,
}

I will make all fields configurable and leave the current configuration functions in place, such as,

pub fn notify_handler_buffer_size(mut self, n: NonZeroUsize) -> Self {
        self.pool_config = self.pool_config.with_notify_handler_buffer_size(n);
        self
    }

Mark items such as the following deprecated. I will not change the visibility; it's a breaking change unless you're OK with introducing such changes immediately.

pub fn with_executor(
        transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
        behaviour: TBehaviour,
        local_peer_id: PeerId,
        executor: impl Executor + Send + 'static,
    ) -> Self

Additionally, convenience functions such as below in Swarm could be removed (or flagged as deprecated) in favor of using SwarmBuilder consistently

impl<TBehaviour> Swarm<TBehaviour>
where
    TBehaviour: NetworkBehaviour,
{
    /// Builds a new `Swarm` with a provided executor.
    pub fn with_executor(
        transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
        behaviour: TBehaviour,
        local_peer_id: PeerId,
        executor: impl Executor + Send + 'static,
    ) -> Self {
        SwarmBuilder::with_executor(transport, behaviour, local_peer_id, executor).build()
    }

@thomaseizinger
Copy link
Contributor Author

I want to tackle it. I'd like to align on the approach first,

Thanks!

For the struct SwarmBuilder

pub struct SwarmBuilder<TBehaviour> {
    local_peer_id: PeerId,
    transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
    behaviour: TBehaviour,
    pool_config: PoolConfig,
    connection_limits: ConnectionLimits,
}

I will make all fields configurable and leave the current configuration functions in place, such as,

A builder pattern typically starts with all mandatory fields and then allows you to configure optional fields. PeerId, transport and behaviour are mandatory in our case, so those shouldn't be configurable.

Mark items such as the following deprecated. I will not change the visibility; it's a breaking change unless you're OK with introducing such changes immediately.

pub fn with_executor(
        transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
        behaviour: TBehaviour,
        local_peer_id: PeerId,
        executor: impl Executor + Send + 'static,
    ) -> Self

Those functions on SwarmBuilder should remain. They should be the main entry point for the user. In a 2nd step, I'd want to change the transport parameter to a type parameter with Transport bounds and do the boxing within the function to solve the linked issue with the QUIC transport.

Does that make sense? This should be another PR though as it will likely be breaking unless we introduce new functions and deprecate these.

Additionally, convenience functions such as below in Swarm could be removed (or flagged as deprecated) in favor of using SwarmBuilder consistently

impl<TBehaviour> Swarm<TBehaviour>
where
    TBehaviour: NetworkBehaviour,
{
    /// Builds a new `Swarm` with a provided executor.
    pub fn with_executor(
        transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
        behaviour: TBehaviour,
        local_peer_id: PeerId,
        executor: impl Executor + Send + 'static,
    ) -> Self {
        SwarmBuilder::with_executor(transport, behaviour, local_peer_id, executor).build()
    }

Yes those should be deprecated and eventually removed.

@vnermolaev
Copy link
Contributor

Ok, let me give you a taste what it would like

impl<TBehaviour> SwarmBuilder<TBehaviour>
where
    TBehaviour: NetworkBehaviour,
{
    /// Creates a new [`SwarmBuilder`] from the given transport, behaviour and local peer ID. The
    /// `Swarm` with its underlying `Network` is obtained via [`SwarmBuilder::build`].
    ///
    /// ## ⚠️  Performance warning
    /// BY default, all connections will be polled on the current task, thus quite bad performance
    /// characteristics should be expected. Whenever possible, use an executor which could be set
    /// either selected from available executors: [`SwarmBuilder::tokio_executor`] or
    /// [`SwarmBuilder::async_std_executor`]. Alternatively, a custom executor can be set with
    /// [`SwarmBuilder::executor`].
    pub fn new(
        transport: transport::Boxed<(PeerId, StreamMuxerBox)>,
        behaviour: TBehaviour,
        local_peer_id: PeerId,
    ) -> Self {
        Self {
            local_peer_id,
            transport,
            behaviour,
            pool_config: PoolConfig::new(None),
            connection_limits: Default::default(),
        }
    }

    /// Sets executor of this [`SwarmBuilder`] to a given value.
    pub fn executor(mut self, executor: impl Executor + Send + 'static) -> Self {
        self.pool_config = PoolConfig::new(Some(Box::new(executor)));
        self
    }

    /// Set executor for this  [`SwarmBuilder`] to the `tokio` executor.
    #[cfg(all(
        feature = "tokio",
        not(any(target_os = "emscripten", target_os = "wasi", target_os = "unknown"))
    ))]
    pub fn tokio_executor(mut self) -> Self {
        self.executor(crate::executor::TokioExecutor)
    }

    /// Set executor for this [`SwarmBuilder`] to the `async-std` executor.
    #[cfg(all(
        feature = "async-std",
        not(any(target_os = "emscripten", target_os = "wasi", target_os = "unknown"))
    ))]
    pub fn async_std_executor(mut self) -> Self {
        self.executor(crate::executor::AsyncStdExecutor)
    }
    ...
}

So the building will be SwarmBuilder::new().tokio_executor().build()
Does it align with your expectations?
Notice there is no with_ prefix. This way, the names are aligned with the rest of the methods in the SwarmBuilder impl, e.g.,

pub fn notify_handler_buffer_size(mut self, n: NonZeroUsize) -> Self {
        self.pool_config = self.pool_config.with_notify_handler_buffer_size(n);
        self
    }

@thomaseizinger
Copy link
Contributor Author

Interesting idea, thank you for sharing.

One of the reasons we introduced these runtime-based builder methods is that for tokio for example, it is required that the connection tasks run on the tokio runtime, otherwise they will panic at runtime. Thus, I think it should not be optional to call .executor. Users will do the minimal thing to make things compile and there is kind of an expectation in Rust that software that compiles, works, at least to some degree.

One thing we can do is make SwarmBuilder type-safe by using type-states that force you to call one of tokio_executor, no_executor, async_std_executor etc functions after you called SwarmBuilder::new and before you can call build.

I don't mind whether we:

  • Have separate with_tokio_executor, with_async_std_executor etc builder constructors
  • Use a type-state builder that forces you to call one of tokio_executor etc functions between new and build

But I'd like to have some form of guidance in the type-system here otherwise, using libp2p with tokio is too much of a footgun based on past user reports.

Let me know what you think!

@vnermolaev
Copy link
Contributor

Listing with_tokio_executor, etc. in fact is the starting point for this issue :) So we'd like to get rid of them, as far as I understand.

It seems to me that having a state type on SwarmBuilder already will govern what kind of executor must be used. Except for the case when the executor is custom, then indeed it is possible to make executor(executor: impl Executor + ...) available.

However, regardless of the selected approach, the type system won't prevent users from creating a swarm with, f.e., async std in the tokio context. This coupling is external to the swarm lib. I don't believe it is enforcable.

@thomaseizinger
Copy link
Contributor Author

Listing with_tokio_executor, etc. in fact is the starting point for this issue :) So we'd like to get rid of them, as far as I understand.

The starting point for this issue is that we have this API on Swarm AND on SwarmBuilder. That is the duplication I'd like to get rid of.

However, regardless of the selected approach, the type system won't prevent users from creating a swarm with, f.e., async std in the tokio context. This coupling is external to the swarm lib. I don't believe it is enforcable.

That is true. I'd expect users to only enable either the async-std or the tokio feature and thus not actually have both available. However, there is only so much we can do as you said.

@vnermolaev
Copy link
Contributor

The starting point for this issue is that we have this API on Swarm AND on SwarmBuilder. That is the duplication I'd like to get rid of.

Understood.

All right. As far as I can see, I can do one of the following:

  • stick to my proposition, f.e., SwarBuilder::new(transport, behaviour, local_peer_id).tokio_executor().X.Y.build()
  • have explicit with_T's on SwarmBuilder, f.e., SwarmBuilder::with_T_executor(transport, behaviour, local_peer_id).X.Y.build()
  • both of them, with the latter implemented as convenience functions on top of the former option, i.e., SwarmBuilder::with_T_executor(transport, behaviour, local_peer_id) = SwarmBuilder::new(transport, behaviour, local_peer_id).T_executor()

IMO, the first approach is the cleanest and follows the builder pattern to the letter.
What would you prefer?

@thomaseizinger
Copy link
Contributor Author

How is the first approach going to help new users configuring the right executor? I am worried that people will just forget calling executor and being hit with a runtime panic when they use tokio.

The number of functions should be the same between option 1 and 2, thus I'd prefer 2 as it avoids the above footgun because users must choose one of these constructors.

@thomaseizinger
Copy link
Contributor Author

Regarding builder pattern: All required arguments for a component should be part of the ctor of the builder, with optional parameters then being available as functions.

I'd consider the executor a required parameter and thus argue that option 2 also follows the builder pattern correctly.

What do you think?

@vnermolaev
Copy link
Contributor

I am fine with either approach.

In the approach I propose, there will be a default executor

Self {
            local_peer_id,
            transport,
            behaviour,
            pool_config: PoolConfig::new(None),
            connection_limits: Default::default(),
        }

I trust you that this will be inadequate in specific contexts.
All right, I will then draft option 2.

@mergify mergify bot closed this as completed in #3588 Mar 13, 2023
mergify bot pushed a commit that referenced this issue Mar 13, 2023
Mark constructors `Swarm::with_X_executor` as deprecated.
Move the deprecated functionality to `SwarmBuilder::with_X_executor`
Use `SwarmBuilder` throughout.

Resolves #3186.
Resolves #3107.

Pull-Request: #3588.
umgefahren pushed a commit to umgefahren/rust-libp2p that referenced this issue Mar 8, 2024
Mark constructors `Swarm::with_X_executor` as deprecated.
Move the deprecated functionality to `SwarmBuilder::with_X_executor`
Use `SwarmBuilder` throughout.

Resolves libp2p#3186.
Resolves libp2p#3107.

Pull-Request: libp2p#3588.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted priority:nicetohave
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants