PeerId string identifier #680

wemeetagain · 2020-06-19T21:09:03Z

Type:

Question

Description:

Currently, js-libp2p uses PeerId objects to identify peers, which are compared using the equals method and printed with toB58String. In many cases, the b58 string is used to index peers (e.g. in Maps, Objects, Sets) and is therefore implicitly used for id equality.

Has any thought been given to using b58 string encoded peer-ids as the canonical peer identifier throughout the codebase? In this case, the keystore would be relied on more heavily to retrieve public/private keys, since they wouldn't be (optionally) attached as they are in PeerId objects. In many cases, checking the validity of a string identifier would still be required, but strings are convenient to use for indexing and equality checking. This may also result in general speedups and reduced memory usage, as string equality checking is generally faster than Buffer equality checking and public keys can all be stored in one place.
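
To illustrate the indexing point, here is a minimal sketch. `FakePeerId` is a hypothetical stand-in and not the real js-libp2p class, and hex is used only as a placeholder for the base58 encoding. It shows why a Map keyed by objects misses an equal-but-distinct id while a Map keyed by the encoded string finds it.

```typescript
// Hypothetical stand-in for a PeerId object; not the real js-libp2p class.
class FakePeerId {
  constructor (private readonly bytes: Uint8Array) {}

  // Byte-wise comparison, similar in spirit to PeerId's equals method.
  equals (other: FakePeerId): boolean {
    if (this.bytes.length !== other.bytes.length) return false
    return this.bytes.every((b, i) => b === other.bytes[i])
  }

  // Hex is used here only as a placeholder for the base58 encoding.
  toString (): string {
    let out = ''
    for (let i = 0; i < this.bytes.length; i++) {
      out += ('0' + this.bytes[i].toString(16)).slice(-2)
    }
    return out
  }
}

const a = new FakePeerId(new Uint8Array([1, 2, 3]))
const b = new FakePeerId(new Uint8Array([1, 2, 3]))

// Keyed by object: lookup uses reference identity, so an equal-but-distinct id misses.
const byObject = new Map<FakePeerId, string>([[a, 'peer A']])
const objectHit = byObject.get(b)

// Keyed by string: lookup uses value equality, so the encoded id finds the entry.
const byString = new Map<string, string>([[a.toString(), 'peer A']])
const stringHit = byString.get(b.toString())
```

Here `objectHit` is `undefined` even though `a.equals(b)` is true, while `stringHit` is `'peer A'`.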

I definitely haven't thought thru the intricacies, just more an idea after seeing lots of id.toB58String() in my own code, and seeing that peer.ID is also a string in go-libp2p.

Curious more than anything if this is being considered or on the roadmap, and if this has been discussed before.

vasco-santos · 2020-06-21T19:14:36Z

This seems like a good idea! Totally in favour of moving in the direction of using a string identifier and relying on the PeerStore.
We plan to move away from id.toB58String() in favour of id.toString(). When we work on this change, I think we should definitely consider this suggestion.

wemeetagain · 2020-12-08T21:32:34Z

I noticed, profiling lodestar, that requesting peers from the peer store (calling libp2p.peerStore.peers) is not very performant, taking roughly 500ms per call (with roughly 50 or so peers).
It appears to mainly be a result of creating PeerId objects.

I'd expect that resolving this issue in the direction of using string peer-ids directly will result in a significant performance improvement.

vasco-santos · 2020-12-09T09:27:44Z

Thanks for the data @wemeetagain
So, once we use peerIdStr everywhere, we will also be able to return the string in peerStore.peers.

However, we should look into what is causing this performance degradation. The bottleneck is probably in https://github.com/libp2p/js-peer-id/blob/master/src/index.js#L27 , creating the b58string which we already have, or in https://github.com/libp2p/js-peer-id/blob/master/src/index.js#L248 , creating the cid. Both have sync encodings/decodings.

@jacobheun do you think we can get rid of b58string usage by default in all libp2p modules for 0.31?

jacobheun · 2020-12-09T14:44:45Z

> However, we should look into what is causing this performance degradation.

Yes, we need to get more benchmarks in place and run those regularly (not necessarily as PRs, perhaps daily crons). The fact that this is reassembling the data from the individual books each time is problematic, and the code is doing a lot of iteration on the data to reassemble it.

> @jacobheun do you think we can get rid of b58string usage by default in all libp2p modules for 0.31?

@vasco-santos yes, we can internally change our base (likely base36, based on subdomain usefulness) and just start using toString(). If applications need b58 for some reason, they can do the conversion before printing.

dapplion · 2021-03-24T14:37:56Z

See this CPU profile showing the bottleneck in performance.

dapplion · 2021-03-24T14:48:21Z

After refactoring peer management in Lodestar, I strongly believe moving from a class PeerId to a string as the canonical identifier should be a priority. It would simplify the codebase significantly and improve performance by omitting costly serialization and deserialization.

BigLep · 2022-06-28T15:54:07Z

2022-06-28 conversation: in general, it has been intentional for js-libp2p to not just use strings, to avoid ambiguity around peerIds, multiaddrs, etc. There have been improvements in recent js-libp2p releases. Another look at the Lodestar performance impacts will be taken once ChainSafe/lodestar#4114 is addressed.

GlenDC · 2023-01-06T09:05:02Z

I am not sure if related, but if I have code such as the following:

const node = await createLibp2p({
    // ...
})

// ...

client.publish('peers', { id: node.peerId, addrs: listenAddrs })
const peers = []
const peerTopic = await client.subscribe('peers')
for (let i = 0; i < runenv.testInstanceCount; i++) {
    const result = await peerTopic.wait.next()
    peers.push(result.value)
}
peerTopic.cancel()

I can only compare the peerIds (to make sure I do not dial my own node, for example) by using == rather than the triple variant (===):

peers[0].id == node.peerId // true
peers[0].id === node.peerId // false

I suppose because I might need to convert it back to a type from a string, when receiving it from over the network?
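
A minimal sketch of why the loose comparison can succeed while the strict one fails, under the assumption that the received id arrived as a plain string and the local one is an object with a toString method. `FakePeerId` and the `'peer-1'` value are hypothetical placeholders, not the real js-libp2p types.

```typescript
// Hypothetical stand-in for a PeerId object with a toString method.
class FakePeerId {
  constructor (private readonly id: string) {}

  toString (): string {
    return this.id
  }
}

// Typed as `any` so the compiler accepts comparing an object with a string.
const local: any = new FakePeerId('peer-1')
// Assumption: the id arrived over the network as a plain string.
const received: any = 'peer-1'

// Loose equality coerces the object to a primitive through toString().
const loose = received == local
// Strict equality never coerces, so a string is never identical to an object.
const strict = received === local
// Comparing both sides as strings makes the intent explicit.
const viaString = received === local.toString()
```

Under these assumptions `loose` and `viaString` are `true` and `strict` is `false`, so comparing the string forms explicitly avoids relying on coercion.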

wemeetagain · 2023-06-12T14:11:29Z

Related to the concerns about avoiding ambiguity around peer ids, multiaddrs, other string types:

Some discussion around "nominal types" that provides some insights: microsoft/TypeScript#202

We may use "branded types" to differentiate between various string types and functionally achieve nominal types.
As an example, see https://github.com/ChainSafe/ts-peer-id/blob/master/src/index.ts
It achieves code that works like so:

import {PeerIdString, validateString} from '@chainsafe/ts-peer-id'

const str = '...'

function doWithPeerId(id: PeerIdString): void {...}

doWithPeerId(str) // type error, `string` doesn't satisfy `PeerIdString`

validateString(str) // asserts that the `str` is of type `PeerIdString`

doWithPeerId(str) // now there's no type error

const str2 = '...' as PeerIdString // alternatively, a string can be explicitly cast to `PeerIdString`
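
For readers unfamiliar with the pattern, here is a self-contained sketch of how a branded string type can be built without any library. The names `isPeerIdString`, `assertPeerIdString` and `describePeer` are hypothetical, and the regular expression only checks for base58 characters as a placeholder; it is not how ts-peer-id validates.

```typescript
// Brand marker: exists only at compile time; at runtime the value is a plain string.
type PeerIdString = string & { readonly __brand: 'PeerIdString' }

// Hypothetical check: only tests for base58 characters, not a real peer id decode.
function isPeerIdString (value: string): value is PeerIdString {
  return /^[1-9A-HJ-NP-Za-km-z]+$/.test(value)
}

// Assertion function: narrows `value` to PeerIdString or throws.
function assertPeerIdString (value: string): asserts value is PeerIdString {
  if (!isPeerIdString(value)) {
    throw new Error('not a valid peer id string: ' + value)
  }
}

// Accepts only branded strings; passing a plain `string` is a compile-time error.
function describePeer (id: PeerIdString): string {
  return 'peer ' + id
}

const candidate: string = 'QmPeerAbc123'
assertPeerIdString(candidate)
// After the assertion, `candidate` is narrowed to PeerIdString.
const described = describePeer(candidate)
```

The brand costs nothing at runtime, so this keeps the indexing and equality convenience of strings while still letting the compiler distinguish peer id strings from other strings.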

achingbrain · 2024-04-07T11:34:34Z

I'm doing some memory analysis of processes under heavy load, and I think this might be worth experimenting with. It should be extended to multiaddrs and CIDs too. We keep a lot of Uint8Arrays in memory, which are famously inefficient: the JS heap size is under control, but the overall RSS of the process increases until it dies. Storing these types as strings should help here, though CPU usage may go up when we need to operate on these types, and it'll be a breaking change.

achingbrain · 2024-04-09T12:15:33Z

One thing we could do that might be a better middle ground, is to have the PeerId/Multiaddr objects store strings internally and not Uint8Arrays - at the moment we store the PeerId multihash (which contains a byte array) and the Multiaddr as a string/byte array/tuples/string tuples.

This would be a non-breaking change and should be lighter on memory use as we'd not be storing Uint8Arrays in long-lived objects. It still gives us type safety and is a lot less disruptive than changing everything to strings.
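
A rough sketch of what this middle ground could look like, assuming a hypothetical `StringBackedId` class. Hex stands in for the real multibase encoding, and this is not the actual PeerId implementation. The object keeps only the encoded string and decodes to bytes on demand.

```typescript
// Hypothetical string-backed identifier; hex stands in for the real multibase encoding.
class StringBackedId {
  constructor (private readonly encoded: string) {}

  toString (): string {
    return this.encoded
  }

  // Equality is a plain string comparison.
  equals (other: StringBackedId): boolean {
    return this.encoded === other.encoded
  }

  // Decode on demand so no Uint8Array is retained by the long-lived object.
  toBytes (): Uint8Array {
    const out = new Uint8Array(this.encoded.length / 2)
    for (let i = 0; i < out.length; i++) {
      out[i] = parseInt(this.encoded.slice(i * 2, i * 2 + 2), 16)
    }
    return out
  }
}

const id = new StringBackedId('0a0b0c')
const same = new StringBackedId('0a0b0c')
const bytes = id.toBytes()
```

The trade-off is the one noted above: callers that need bytes pay a decode cost on each `toBytes()` call, in exchange for not holding a byte array per identifier.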

SgtPooki · 2024-09-20T12:17:30Z

JS Colo 2024 rough discussion

PeerId interface has two multihash

what do we use more often, string version of peerid or multihash version of peerId?

when we use the public key, we pull it out of the multihash digest, which contains the protobuf with the keytype and the key, when we use protons to read that out, it returns a subarray of the main array so we're not copying.

what we could do:

deploy two bootstrappers & compare CPU and mem usage
- one where peerId stores the multihash as a multihash digest object (current)
- one where it stores the multihash as a string (potential improvement)
  - this work is a non-breaking change and should be fairly easy. It depends on getting https://github.com/libp2p/js-libp2p-amino-dht-bootstrapper deployed in prod, ensuring it's stable, and then setting up a twin we can use for staging.

Action item:

determine whether we operate on peerIds more as strings (a lot of toString()) or the multihash digest version of a peerId
1. deploy two bootstrappers & compare CPU and mem usage
  - one where peerId stores the multihash as a multihash digest object (current)
  - one where it stores the multihash as a string (potential improvement)
    - this work is a non-breaking change and should be fairly easy. It depends on getting https://github.com/libp2p/js-libp2p-amino-dht-bootstrapper deployed in prod, ensuring it's stable, and then setting up a twin we can use for staging.

vasco-santos added the kind/enhancement A net-new feature or improvement to an existing feature label Jun 21, 2020

jacobheun added the status/ready Ready to be worked label Jul 9, 2020

vasco-santos mentioned this issue Jan 21, 2021

fix: store provider multiaddrs during find providers #865

Merged

vasco-santos mentioned this issue Apr 14, 2021

chore: config typescript #904

Merged

vasco-santos mentioned this issue Apr 29, 2021

[Types] SwarmAPI.connect requires a Multiaddr but libp2p.dial allows PeerId|Multiaddr|string ipfs/js-ipfs#3638

Closed

wemeetagain mentioned this issue Dec 2, 2021

Incompatible argument types libp2p/js-libp2p-interfaces#101

Closed

wemeetagain mentioned this issue Jun 22, 2022

PeerId string identifier libp2p/js-libp2p-interfaces#256

Closed

wemeetagain mentioned this issue Aug 25, 2022

Log bad PeerId format ChainSafe/lodestar#4479

Merged

wemeetagain mentioned this issue Aug 12, 2024

Remove private key from peer id #2659

Closed
