Add streams/web module #761

jackkleeman · 2024-12-31T18:27:39Z

Issue # (if available)

Description of changes

This PR adds ReadableStream and WritableStream. TransformStream is still left to do. Almost all the WPT tests for these two objects are passing, with a couple commented out and explained.

Yes, this PR is very large. It is hard to avoid this as simply getting a working readable stream requires many functions to be implemented. If useful I can try to split out readable and writable although I don't think the line count is going to be much better - most of it is in readable anyway.

I have tried to stay as faithful as possible to the spec (https://streams.spec.whatwg.org/) in the implementation. I have also borrowed occasionally from the reference implementation https://github.com/whatwg/streams/tree/main/reference-implementation for error messages and also for the pipeTo implementation which is not tightly defined by the spec.

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Great care is needed throughout to ensure that user objects are not held when we release control to the user, as they can do something re-entrant (ie, call one of those objects) and we will fail to obtain an owned borrow. The wpt tests do a very good job of finding these cases but there could be more and they create a panic. Maybe fuzzing over the public API could work here. It seems ok if we put an experimental label on the streams/web package for the time being. The other thing to be aware of when releasing control is that the reader/writer may have changed from the one you had a handle on before which isn't necessarily a panic but can lead to bugs. Again, the wpt should catch a lot of these.

Key things for review:

Any changes outside of llrt_stream_web
Try it out

Possible optimizations:

pipeTo could have a more native impl instead of copying the JS reference implementation
Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection
~~Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar~~

Checklist

Created unit tests in tests/unit and/or in Rust for my feature if needed
Ran make fix to format JS and apply Clippy auto fixes
Made sure my code didn't add any additional warnings: make check
Added relevant type info in types/ directory
Updated documentation if needed (API.md/README.md/Other)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

nabetti1720 · 2025-01-01T07:35:08Z

I think it's a great job. Just a comment. :)

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Can you expose these new classes from globalThis? That would make it more web standards compliant.
https://min-common-api.proposal.wintercg.org/

jackkleeman · 2025-01-01T15:22:13Z

sure, thats easy to do

nabetti1720 · 2025-01-03T02:05:19Z

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

As you say, I think support for fetch is the top priority.

The Runtime compatibility document that I often cite seems to check whether stream classes are supported using the following criteria:

// reproduction.js
const judgement = (function () {
    if (!("fetch" in self)) {
        return { result: false, message: "fetch is not defined" };
    }
    var streamPromise = fetch("/favicon/favicon.ico")
        .then(function (response) {
            return response.body;
        })
        .catch(function () {
            return fetch(
                "https://mdn-bcd-collector.gooborg.com/favicon/favicon.ico"
            ).then(function (response) {
                return response.body;
            });
        });
    if (!streamPromise) {
        return { result: false, message: "streamPromise is falsy" };
    }
    var promise = streamPromise.then(function (stream) {
        return stream.getReader();
    });
    if (!promise) {
        return { result: false, message: "Promise variable is falsy" };
    }
    return promise.then(function (instance) {
        return !!instance;
        // To check if a method exists, use the following:
        // return !!instance && "cancel" in instance;
    });
})();

console.log(await judgement);

When I checked it on my laptop, it seemed that some classes were determined to be "unsupported" because this code produced an error.

% llrt reproduction.js
TypeError: cannot read property 'getReader' of null
  at <anonymous> (/Users/shinya/Workspaces/llrt-test/reproduction.js:19:38)

% bun reproduction.js 
true

Sytten · 2025-01-03T03:06:01Z

I will have to do a full review, but something that I saw that bugged me was the eval for creating errors. I am pretty sure we don't need to do that. We should be able to create them in Rust.

jackkleeman · 2025-01-03T09:56:06Z

@nabetti1720 i think your code is testing whether fetch returns a stream as response.body, but it returns null:

  //FIXME return readable stream when implemented
    #[qjs(get)]
    pub fn body(&self) -> Null {
        Null
    }

lets get this merged first and i can look to wire it up with fetch

jackkleeman · 2025-01-03T09:56:27Z

@Sytten yes, laziness on my part. will fix all the ctx evals today

richarddavison · 2025-01-03T11:40:43Z

Thanks @jackkleeman for this fantastic contribution 🎉

Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar

Like you mentioned, I see some ctx evals for error creation. Best approach here is to use primordials:
BasePrimordials::get(ctx)? where you have access to the type error constructor.
If this is something frequently done you can add a method to BasePrimodials:
fn new_type_error(...)

You can also create them for rust, but then they'll lack stack traces (sometimes, depending on how they are being used). I opened up an issue regarding this:
https://github.com/quickjs-ng/quickjs/blob/master/quickjs.c#L6862-L6865
quickjs-ng/quickjs#782

Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection

Can you elaborate a bit more on this?

jackkleeman · 2025-01-03T13:47:29Z

I have now refactored all the ctx.global.get and ctx.eval into primordials. Currently new_type and new_range are just helper methods in my crate, if you want them to go into BasePrimordials let me know

re the promises. the spec has a lot of 'upon fulfillment of promise, do X'. i have a helper method upon_promise for this. the spec also asks that we convert promises into other promises by adding a fulfillment or rejection step. and the spec also asks that we set particular promises as 'handled' ie so they don't complain about unhandled rejections. given that the spec is quite specific, ive implemented things as faithfully as possible. this inevitably means calling .then and .catch a lot with anonymous Function::new() objects.

what i have not done (at all) is use rust async functions, convert promises into futures, etc. as far as i can tell, converting promises into rust futures still calls then and catch with a Function::new() under the hood, but just with a handle to a waker, allowing the previous async code to resume. so maybe performance wise this very similar, but the argument to use more rust async code might be that its easier to follow or more idiomatic. however, it will depart from the spec somewhat which could make it harder to debug, and imo it is a lot easier to accidentally hold a reference to a user object across an await point than it is to accidentally move one into a FnOnce that we execute on promise resolution. and as a reminder, holding references to user objects when we release to user code can lead to panics.

one special case s the pipeTo implementation, which is not actually defined in the spec (instead ive copied the reference implementation). i imagine that we could make that implementation simpler and more concise if we used rust futures more there, without the concern that it will depart from the spec.

richarddavison

Some observations:

First of all, I can't thank you enough for this colossal effort! Huge work 🥇
Great that we avoid accessing globals and use primordials instead (also safer as globals can be modified in user space). However, for perf reasons we should try to clone and store the primordial value in the struct when used frequently and when possible to avoid lookup. Even though primordial lookup is faster than global lookup, it's still expensive for frequent calls.
If possible, avoid using JS Function::new() if not providable from user space. This saves a ton of indication and we can pass fn pointers to rust functions instead. If accessible/definable from user space and from Rust, use an enum to hold either a JS Function, or a Rust function.
Maybe some macros would reduce duplication or simplify instanciantion where code is very similar but can't be refactored into a shared function LMKWYT?

modules/llrt_stream_web/src/readable/reader.rs

modules/llrt_stream_web/src/lib.rs

modules/llrt_stream_web/src/queueing_strategy.rs

modules/llrt_stream_web/src/readable/byob_reader.rs

modules/llrt_stream_web/src/readable/byte_controller.rs

modules/llrt_stream_web/src/readable/iterator.rs

modules/llrt_stream_web/src/readable/tee.rs

jackkleeman · 2025-01-05T12:04:48Z

Re macros. I'll do a pass and think about what can be done. If you have any particular ideas let me know

jackkleeman · 2025-01-06T16:50:03Z

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

richarddavison · 2025-01-06T20:32:09Z

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

Fantastic, I'll take a second look!

Sytten

This is a first pass on the review, I didn't yet get to readable and writable modules. I should be able to do that tomorrow.
General comment is to please over-explain everything for future maintainers and split unrelated items into different files.

Sytten · 2025-01-07T02:11:49Z

libs/llrt_utils/src/bytes.rs

@@ -22,6 +26,52 @@ pub enum ObjectBytes<'js> {
    Vec(Vec<u8>),
 }

+// Requires manual implementation because rquickjs hasn't implemented JsLifetime for f32 or f64


We should push a PR to rquickjs to fix that

Sytten · 2025-01-07T02:12:48Z

libs/llrt_utils/src/clone.rs

@@ -2,18 +2,17 @@ use std::collections::HashSet;

 // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.


Is that worth splitting in a crate? @richarddavison

Sytten · 2025-01-07T02:13:03Z

libs/llrt_utils/src/hash.rs

Same question here @richarddavison

Sytten · 2025-01-07T02:19:24Z

modules/llrt_abort/src/lib.rs

    AbortSignal::add_event_emitter_prototype(ctx)?;
+    AbortSignal::add_event_target_prototype(ctx)?;


IDK if there is a better way about doing this, but I am kinda of annoyed that we have to call those methods for all the Emitter. Also this needs to be added to the typescript typing.

the typescript typing already has it as an eventtarget, it jsut doesn't have it as an eventemitter (im not sure if it even should?

Sytten · 2025-01-07T02:23:51Z

modules/llrt_exceptions/src/lib.rs

@@ -24,9 +25,37 @@ impl DOMException {
        let message = message.0.unwrap_or(String::from(""));
        let name = name.0.unwrap_or(String::from("Error"));

+        // https://webidl.spec.whatwg.org/#dfn-error-names-table
+        let code = match name.as_str() {


AFAIK quickjs doesnt implement that spec? So realistically we need some constants somewhere if we want to use them correctly from rust.

i am not sure what you mean, i agree domexception isnt native to quickjs, but the wpt tests for streams expect that domexceptions on abort will have the code set correctly

What I mean is that if we want to throw those errors from rust we will need those names as constants somewhere

yes, i suppose so. currently it seems we are just using string literals to do so

Maybe an enum for this? The enum can have unit variants which would then mean the discriminant can be equal
to the code. Something like:

let exception_as_value = DomException::IndexSizeError.value(&ctx) //or throw exception Err(DomException::IndexSizeError.throw(&ctx)

modules/llrt_stream_web/src/lib.rs

modules/llrt_stream_web/src/queueing_strategy.rs

Sytten · 2025-01-07T03:22:06Z

Also please write the typescript typing, I did write a simplified version of it since we didnt have most of the API but now that we do we should mostly probably take back the node typing and put it in there.

jackkleeman · 2025-01-07T12:36:56Z

@Sytten re typing, this should be part of the browser globals types? this isnt node streams, which have their own types

Sytten · 2025-01-07T13:44:50Z

Right, it looks like they put it under the stream/web.ts file in the DefinitelyTyped so I would follow the same direction @jackkleeman

Sytten

Some further comments

modules/llrt_stream_web/src/readable/mod.rs

Sytten · 2025-01-07T15:10:59Z

modules/llrt_stream_web/src/readable/mod.rs

+        Self::readable_stream_from_iterable(&ctx, async_iterable)
+    }
+
+    // readonly attribute boolean locked;


My problem with OwnedBorrowMut is that it can easily bite you with a panic if you try to borrow mutably in two different places or if you try to borrow mut and there is already a read-only reference. Same thing happens on the inverse.
It's not obvious because the DerefMut/Deref calls the method that can panic.
I am wondering if we should wrap it in LLRT to force users to do a try_borrow_mut/try_borrow since a panic is always kinda bad and we could instead throw an exception most of the time.

My problem with OwnedBorrowMut is that it can easily bite you with a panic if you try to borrow mutably in two different places or if you try to borrow mut and there is already a read-only reference.

Yes, I am very well acquainted with this!

These panics imply an implementation issue (generally something re-entrant where we call out to user code and they call back in), not a user error, so I don't know if an exception is valid. An exception can leave the stream in an invalid state as these functions are generally not supposed to throw exceptions. IMO given this is a new feature its correct to panic on implementation issues.

modules/llrt_stream_web/src/readable/mod.rs

modules/llrt_stream_web/src/readable/iterator.rs

Sytten · 2025-01-07T16:29:04Z

modules/llrt_stream_web/src/readable/iterator.rs

+            .clone())
+    }
+}
+
+#[derive(Clone, JsLifetime, Trace)]
+struct IteratorPrimordials<'js> {
+    end_of_iteration: Symbol<'js>,


We might want to create an API in quickjs-ng for this.

@richarddavison can you weight in here, same question for the generator below. Can we ask the guys of quickjs to add that?

This is referring to the eval calls

yeah, without a new quickjs api i see no other option here

Agree and I'll ask their opinion. Today its very clunky to work with JS iterators

modules/llrt_stream_web/src/readable/iterator.rs

Sytten · 2025-01-07T22:42:45Z

Sorry the lines of the comments are off due to the force push, I did the review this morning.
I can help on the typing once we get there, we will want to remove references to node and double check that we support everything in there. The typing of node is also sometimes convoluted for not real reason so I do simplify it when need be.

jackkleeman · 2025-01-08T11:06:46Z

I believe we do support everything in there but good to double check. SOunds good

Web streams

748f62f

Set up global ReadableStream etc classes

43d03d2

jackkleeman force-pushed the streams branch from 88eb7d4 to 43d03d2 Compare January 2, 2025 11:18

jackkleeman added 2 commits January 2, 2025 22:20

Avoid Option for readable stream controller

fe6e226

Bundle stored_error into state

da836c9

Avoid usage of ctx.eval and ctx.globals

8e370a4

richarddavison requested changes Jan 3, 2025

View reviewed changes

jackkleeman added 5 commits January 4, 2025 18:28

Use primordials more sparingly

5ea3137

Use Function::new less

a3f67a1

Use primordials for the array constructor lookup

6dc019f

Rename ObjectExt

64afdd8

Move structured_clone into llrt_utils and use it in tee

cd066bf

jackkleeman requested a review from richarddavison January 4, 2025 21:46

Assorted cleanup

36d5e2b

Sytten suggested changes Jan 7, 2025

View reviewed changes

queuing strategy comments

513dfed

jackkleeman added 2 commits January 7, 2025 14:35

Move option helpers into utils

f23d14c

Move helpers from lib into our own utils

329c5b0

jackkleeman force-pushed the streams branch from 450e70f to 329c5b0 Compare January 7, 2025 15:01

jackkleeman added 2 commits January 7, 2025 15:19

Add docs for StreamWebModule

234c245

Add stream/web types

c70633e

Sytten reviewed Jan 7, 2025

View reviewed changes

jackkleeman added 7 commits January 8, 2025 12:04

Review comments

e4f892b

Review comments

a85e623

Split readablestream out into its own module

363b4ad

Move writablestream into its own module

3cde6bf

Check signal matches abortsignal earlier

823c27d

Use Acquire/Release for atomic store/load

52e56a5

Avoid use of super

0b5172e

		@@ -2,18 +2,17 @@ use std::collections::HashSet;

		// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

		AbortSignal::add_event_emitter_prototype(ctx)?;
		AbortSignal::add_event_target_prototype(ctx)?;

Add streams/web module #761

Are you sure you want to change the base?

Add streams/web module #761

Conversation

jackkleeman commented Dec 31, 2024 • edited Loading

Issue # (if available)

Description of changes

Checklist

nabetti1720 commented Jan 1, 2025 • edited Loading

jackkleeman commented Jan 1, 2025

nabetti1720 commented Jan 3, 2025 • edited Loading

Sytten commented Jan 3, 2025

jackkleeman commented Jan 3, 2025

jackkleeman commented Jan 3, 2025

richarddavison commented Jan 3, 2025 • edited Loading

jackkleeman commented Jan 3, 2025

richarddavison left a comment

Choose a reason for hiding this comment

jackkleeman commented Jan 5, 2025

jackkleeman commented Jan 6, 2025

richarddavison commented Jan 6, 2025

Sytten left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sytten commented Jan 7, 2025

jackkleeman commented Jan 7, 2025

Sytten commented Jan 7, 2025

Sytten left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackkleeman Jan 8, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sytten commented Jan 7, 2025 • edited Loading

jackkleeman commented Jan 8, 2025

jackkleeman commented Dec 31, 2024 •

edited

Loading

nabetti1720 commented Jan 1, 2025 •

edited

Loading

nabetti1720 commented Jan 3, 2025 •

edited

Loading

richarddavison commented Jan 3, 2025 •

edited

Loading

Sytten left a comment •

edited

Loading

jackkleeman Jan 8, 2025 •

edited

Loading

Sytten commented Jan 7, 2025 •

edited

Loading