fetch: Change identifier pb type from string to bytes? #656

Open
achingbrain opened this issue Jan 7, 2025 · 3 comments · May be fixed by #657

Comments

@achingbrain
Member

The fetch protocol uses this protobuf:

message FetchRequest {
	string identifier = 1;
}

One use case for fetch is resolving IPNS records directly from other nodes (ref).

The identifier for this operation is "/ipns/" + public-key-multihash-bytes.

Treating identifier as a string means we need to stringify the value before serializing it to pb bytes, and we'll also interpret any received value as a string.

Go strings are arbitrary byte sequences, so go-libp2p can carry public-key-multihash-bytes in a string unchanged and this works as expected.

JavaScript strings are always UTF-16, so a series of bytes may not survive a round trip through a string, for example when some of the byte values are interpreted as part of a multi-byte character or are not valid text at all.
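A minimal sketch of the round-trip problem, assuming the string is produced by UTF-8 decoding (which is how protobuf `string` fields are normally read). The helpers `roundTrip` and `bytesEqual` are hypothetical, and the suffix bytes are made up for illustration rather than taken from a real public key multihash.

```typescript
// Illustrative only: `roundTrip` and `bytesEqual` are hypothetical helpers.

// Decode bytes as UTF-8 into a JS string, then encode the string back to bytes.
function roundTrip (bytes: Uint8Array): Uint8Array {
  const asString = new TextDecoder().decode(bytes)
  return new TextEncoder().encode(asString)
}

// Byte-wise equality check.
function bytesEqual (a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false
  }
  return true
}

const prefix = new TextEncoder().encode('/ipns/')

// Made-up suffix standing in for multihash bytes.
// 0xff and 0xfe can never appear in valid UTF-8.
const suffix = new Uint8Array([0x00, 0x24, 0xff, 0xfe])

const identifier = new Uint8Array(prefix.length + suffix.length)
identifier.set(prefix, 0)
identifier.set(suffix, prefix.length)

// Plain ASCII survives the trip, the binary identifier does not.
const asciiSurvives = bytesEqual(roundTrip(prefix), prefix)
const binarySurvives = bytesEqual(roundTrip(identifier), identifier)
console.log({ asciiSurvives, binarySurvives })
```

In this sketch the decoder substitutes a replacement character for each invalid byte, so the re-encoded identifier is longer than the original and no longer matches it.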

Since the on-the-wire format of string and bytes in protobuf is identical, JS could just unilaterally treat identifier as bytes to make the problem go away, or we can change the type in the spec definition so it's clear for implementers?
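To illustrate why the two types are interchangeable on the wire, here is a hand-rolled sketch of how field 1 would be encoded as a length-delimited value (wire type 2), which is what protobuf uses for both `string` and `bytes`. The function names are hypothetical and this is not how any libp2p implementation actually serializes the message.

```typescript
// Illustrative only: a hand-rolled encoder for a single length-delimited field.

// Encode a small non-negative integer as a protobuf varint.
function encodeVarint (value: number): number[] {
  const out: number[] = []
  while (value > 0x7f) {
    out.push((value & 0x7f) | 0x80)
    value >>>= 7
  }
  out.push(value)
  return out
}

// Field number 1, wire type 2 (length-delimited): tag byte is (1 << 3) | 2 = 0x0a.
function encodeIdentifierField (payload: Uint8Array): Uint8Array {
  const tag = (1 << 3) | 2
  const length = encodeVarint(payload.length)
  const out = new Uint8Array(1 + length.length + payload.length)
  out[0] = tag
  out.set(length, 1)
  out.set(payload, 1 + length.length)
  return out
}

// `string identifier = 1`: the payload is the UTF-8 encoding of the string.
function encodeAsString (identifier: string): Uint8Array {
  return encodeIdentifierField(new TextEncoder().encode(identifier))
}

// `bytes identifier = 1`: the payload is the bytes as-is.
function encodeAsBytes (identifier: Uint8Array): Uint8Array {
  return encodeIdentifierField(identifier)
}

const fromString = encodeAsString('/ipns/abc')
const fromBytes = encodeAsBytes(new TextEncoder().encode('/ipns/abc'))
console.log(fromString, fromBytes)
```

Both calls produce the same tag, length and payload, so a peer cannot tell which declared type the sender used.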

cc @aschmahmann

github-project-automation bot moved this to Triage in libp2p Specs Jan 7, 2025
achingbrain changed the title from "Change fetch key pb type from string to bytes?" to "fetch: Change identifier pb type from string to bytes?" Jan 7, 2025
@MarcoPolo
Contributor

This is a backwards compatible change, as you say. Given that we are already treating it as bytes, I think this is a reasonable change. Could you please add a link to the code that treats this as bytes just for reference?

@aschmahmann
Contributor

Makes sense to me, this is likely just a bug in the original spec since it is used in practice for arbitrary bytes (e.g. the IPNS example you gave).

@achingbrain
Member Author

> Could you please add a link to the code that treats this as bytes just for reference?

I think it's this line:

https://github.com/libp2p/go-libp2p-pubsub-router/blob/ea64ffe9cd424c492a129e4d4dc39031cb3a174d/pubsub.go#L311C2-L311C65

getLocal is passed to newFetchProtocol as the getData function here:

https://github.com/libp2p/go-libp2p-pubsub-router/blob/ea64ffe9cd424c492a129e4d4dc39031cb3a174d/pubsub.go#L106

...which receives the identifier here:

https://github.com/libp2p/go-libp2p-pubsub-router/blob/ea64ffe9cd424c492a129e4d4dc39031cb3a174d/fetch.go#L49

Unless I'm misreading the code, I think the problem is that the fetch protocol as implemented in go-libp2p-pubsub-router accepts datastore keys as the identifier.

If it had some sort of per-prefix lookup function or other indirection we could use the IPNS Name string representation instead and identifier could remain as a string, which generally has better DX?
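One way the per-prefix indirection could look, as a rough sketch only. `PrefixRouter`, `Lookup` and the `example-name` key are hypothetical and do not come from go-libp2p-pubsub-router or js-libp2p; the point is just that the identifier stays a printable string and each prefix decides how to map it to stored data.

```typescript
// Hypothetical sketch: none of these names come from an existing libp2p API.

// A lookup receives the part of the identifier that follows its prefix.
type Lookup = (name: string) => Uint8Array | undefined

class PrefixRouter {
  private readonly routes: Array<[string, Lookup]> = []

  register (prefix: string, lookup: Lookup): void {
    this.routes.push([prefix, lookup])
  }

  // Dispatch to the first registered prefix that matches the identifier.
  resolve (identifier: string): Uint8Array | undefined {
    for (const [prefix, lookup] of this.routes) {
      if (identifier.slice(0, prefix.length) === prefix) {
        return lookup(identifier.slice(prefix.length))
      }
    }
    return undefined
  }
}

// In-memory stand-in for wherever records actually live.
// 'example-name' is a placeholder, not a valid IPNS Name.
const records: Record<string, Uint8Array> = {
  'example-name': new Uint8Array([1, 2, 3])
}

const router = new PrefixRouter()
router.register('/ipns/', (name) => records[name])

const found = router.resolve('/ipns/example-name')
const missing = router.resolve('/other/example-name')
console.log(found, missing)
```

With something like this, the `/ipns/` handler would be responsible for converting the string name into whatever binary datastore key it needs internally.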

achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 8, 2025
To allow use cases like fetching IPNS records in a way compatible
with go-libp2p we need to send binary as fetch identifiers.

JavaScript strings are UTF-16 so we can't round-trip binary
reliably since some byte sequences are interpreted as multi-byte
characters.

Instead we need to accept Uint8Arrays and send them over the wire
as-is.

Refs: libp2p/specs#656

BREAKING CHANGE: registered lookup functions now receive a Uint8Array identifier instead of a string
achingbrain added a commit that referenced this issue Jan 9, 2025
Allows the use case of fetching IPNS records via go-libp2p-pubsub-router's fetch implementation from environments such as JavaScript where you cannot reliably round-trip bytes to strings and back due to strings being UTF-16.

Fixes #656
achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 9, 2025
To allow use cases like fetching IPNS records in a way compatible with go-libp2p we need to send binary as fetch identifiers.

JavaScript strings are UTF-16 so we can't round-trip binary reliably since some byte sequences are interpreted as multi-byte or otherwise non-printable characters.

Instead we need to accept Uint8Arrays and send them over the wire as-is.

This is a backwards compatible change as far as interop goes since protobuf `bytes` and `string` types are identical on the wire, but it's breaking for API consumers in that the lookup function now needs to accept a `Uint8Array` identifier instead of a `string`.

Refs: libp2p/specs#656

BREAKING CHANGE: registered lookup functions now receive a Uint8Array identifier instead of a string
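
For API consumers, a lookup function that takes a `Uint8Array` identifier might look roughly like the sketch below. All names here (`lookupIpns`, `hasPrefix`, `toHex`, `recordsByKey`) are hypothetical and this is not the js-libp2p API; it is synchronous for brevity, and the key bytes are made up. It only shows matching a prefix and keying a store without ever decoding the identifier into a string.

```typescript
// Hypothetical sketch: names below are illustrative, not the js-libp2p API.

const IPNS_PREFIX = new TextEncoder().encode('/ipns/')

// Byte-wise prefix check, so the identifier is never decoded into a string.
function hasPrefix (identifier: Uint8Array, prefix: Uint8Array): boolean {
  if (identifier.length < prefix.length) return false
  for (let i = 0; i < prefix.length; i++) {
    if (identifier[i] !== prefix[i]) return false
  }
  return true
}

// Hex is a lossless string form of the bytes, safe to use as an object key.
function toHex (bytes: Uint8Array): string {
  let out = ''
  for (let i = 0; i < bytes.length; i++) {
    out += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16)
  }
  return out
}

// In-memory stand-in for a record store, keyed by hex-encoded key bytes.
const recordsByKey: Record<string, Uint8Array> = {}

// Lookup that receives the identifier as raw bytes.
function lookupIpns (identifier: Uint8Array): Uint8Array | undefined {
  if (!hasPrefix(identifier, IPNS_PREFIX)) return undefined
  return recordsByKey[toHex(identifier.subarray(IPNS_PREFIX.length))]
}

// Usage with made-up key bytes that would not survive a string round trip.
const keyBytes = new Uint8Array([0x00, 0x24, 0xff, 0xfe])
recordsByKey[toHex(keyBytes)] = new Uint8Array([42])

const id = new Uint8Array(IPNS_PREFIX.length + keyBytes.length)
id.set(IPNS_PREFIX, 0)
id.set(keyBytes, IPNS_PREFIX.length)

const record = lookupIpns(id)
console.log(record)
```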