Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streams/web module #761

Open
wants to merge 23 commits into
base: main
Choose a base branch
from
Open

Add streams/web module #761

wants to merge 23 commits into from

Conversation

jackkleeman
Copy link

@jackkleeman jackkleeman commented Dec 31, 2024

Issue # (if available)

#544

Description of changes

This PR adds ReadableStream and WritableStream. TransformStream is still left to do. Almost all the WPT tests for these two objects are passing, with a couple commented out and explained.

Yes, this PR is very large. It is hard to avoid this as simply getting a working readable stream requires many functions to be implemented. If useful I can try to split out readable and writable although I don't think the line count is going to be much better - most of it is in readable anyway.

I have tried to stay as faithful as possible to the spec (https://streams.spec.whatwg.org/) in the implementation. I have also borrowed occasionally from the reference implementation https://github.com/whatwg/streams/tree/main/reference-implementation for error messages and also for the pipeTo implementation which is not tightly defined by the spec.

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Great care is needed throughout to ensure that user objects are not held when we release control to the user, as they can do something re-entrant (ie, call one of those objects) and we will fail to obtain an owned borrow. The wpt tests do a very good job of finding these cases but there could be more and they create a panic. Maybe fuzzing over the public API could work here. It seems ok if we put an experimental label on the streams/web package for the time being. The other thing to be aware of when releasing control is that the reader/writer may have changed from the one you had a handle on before which isn't necessarily a panic but can lead to bugs. Again, the wpt should catch a lot of these.

Key things for review:

  1. Any changes outside of llrt_stream_web
  2. Try it out

Possible optimizations:

  1. pipeTo could have a more native impl instead of copying the JS reference implementation
  2. Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection
  3. Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar

Checklist

  • Created unit tests in tests/unit and/or in Rust for my feature if needed
  • Ran make fix to format JS and apply Clippy auto fixes
  • Made sure my code didn't add any additional warnings: make check
  • Added relevant type info in types/ directory
  • Updated documentation if needed (API.md/README.md/Other)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@nabetti1720
Copy link
Contributor

nabetti1720 commented Jan 1, 2025

I think it's a great job. Just a comment. :)

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Can you expose these new classes from globalThis? That would make it more web standards compliant.
https://min-common-api.proposal.wintercg.org/

@jackkleeman
Copy link
Author

sure, thats easy to do

@nabetti1720
Copy link
Contributor

nabetti1720 commented Jan 3, 2025

I have yet not wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

As you say, I think support for fetch is the top priority.

The Runtime compatibility document that I often cite seems to check whether stream classes are supported using the following criteria:

// reproduction.js
const judgement = (function () {
    if (!("fetch" in self)) {
        return { result: false, message: "fetch is not defined" };
    }
    var streamPromise = fetch("/favicon/favicon.ico")
        .then(function (response) {
            return response.body;
        })
        .catch(function () {
            return fetch(
                "https://mdn-bcd-collector.gooborg.com/favicon/favicon.ico"
            ).then(function (response) {
                return response.body;
            });
        });
    if (!streamPromise) {
        return { result: false, message: "streamPromise is falsy" };
    }
    var promise = streamPromise.then(function (stream) {
        return stream.getReader();
    });
    if (!promise) {
        return { result: false, message: "Promise variable is falsy" };
    }
    return promise.then(function (instance) {
        return !!instance;
        // To check if a method exists, use the following:
        // return !!instance && "cancel" in instance;
    });
})();

console.log(await judgement);

When I checked it on my laptop, it seemed that some classes were determined to be "unsupported" because this code produced an error.

% llrt reproduction.js
TypeError: cannot read property 'getReader' of null
  at <anonymous> (/Users/shinya/Workspaces/llrt-test/reproduction.js:19:38)

% bun reproduction.js 
true

@Sytten
Copy link
Collaborator

Sytten commented Jan 3, 2025

I will have to do a full review, but something that I saw that bugged me was the eval for creating errors. I am pretty sure we don't need to do that. We should be able to create them in Rust.

@jackkleeman
Copy link
Author

@nabetti1720 i think your code is testing whether fetch returns a stream as response.body, but it returns null:

  //FIXME return readable stream when implemented
    #[qjs(get)]
    pub fn body(&self) -> Null {
        Null
    }

lets get this merged first and i can look to wire it up with fetch

@jackkleeman
Copy link
Author

@Sytten yes, laziness on my part. will fix all the ctx evals today

@richarddavison
Copy link
Contributor

richarddavison commented Jan 3, 2025

Thanks @jackkleeman for this fantastic contribution 🎉

Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar

Like you mentioned, I see some ctx evals for error creation. Best approach here is to use primordials:
BasePrimordials::get(ctx)? where you have access to the type error constructor.
If this is something frequently done you can add a method to BasePrimodials:
fn new_type_error(...)

You can also create them for rust, but then they'll lack stack traces (sometimes, depending on how they are being used). I opened up an issue regarding this:
https://github.com/quickjs-ng/quickjs/blob/master/quickjs.c#L6862-L6865
quickjs-ng/quickjs#782

Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection

Can you elaborate a bit more on this?

@jackkleeman
Copy link
Author

I have now refactored all the ctx.global.get and ctx.eval into primordials. Currently new_type and new_range are just helper methods in my crate, if you want them to go into BasePrimordials let me know

re the promises. the spec has a lot of 'upon fulfillment of promise, do X'. i have a helper method upon_promise for this. the spec also asks that we convert promises into other promises by adding a fulfillment or rejection step. and the spec also asks that we set particular promises as 'handled' ie so they don't complain about unhandled rejections. given that the spec is quite specific, ive implemented things as faithfully as possible. this inevitably means calling .then and .catch a lot with anonymous Function::new() objects.

what i have not done (at all) is use rust async functions, convert promises into futures, etc. as far as i can tell, converting promises into rust futures still calls then and catch with a Function::new() under the hood, but just with a handle to a waker, allowing the previous async code to resume. so maybe performance wise this very similar, but the argument to use more rust async code might be that its easier to follow or more idiomatic. however, it will depart from the spec somewhat which could make it harder to debug, and imo it is a lot easier to accidentally hold a reference to a user object across an await point than it is to accidentally move one into a FnOnce that we execute on promise resolution. and as a reminder, holding references to user objects when we release to user code can lead to panics.

one special case s the pipeTo implementation, which is not actually defined in the spec (instead ive copied the reference implementation). i imagine that we could make that implementation simpler and more concise if we used rust futures more there, without the concern that it will depart from the spec.

Copy link
Contributor

@richarddavison richarddavison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some observations:

  1. First of all, I can't thank you enough for this colossal effort! Huge work 🥇
  2. Great that we avoid accessing globals and use primordials instead (also safer as globals can be modified in user space). However, for perf reasons we should try to clone and store the primordial value in the struct when used frequently and when possible to avoid lookup. Even though primordial lookup is faster than global lookup, it's still expensive for frequent calls.
  3. If possible, avoid using JS Function::new() if not providable from user space. This saves a ton of indication and we can pass fn pointers to rust functions instead. If accessible/definable from user space and from Rust, use an enum to hold either a JS Function, or a Rust function.
  4. Maybe some macros would reduce duplication or simplify instanciantion where code is very similar but can't be refactored into a shared function LMKWYT?

modules/llrt_stream_web/src/readable/reader.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/lib.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/queueing_strategy.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/byob_reader.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/byob_reader.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/byte_controller.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/byte_controller.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/iterator.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/tee.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/tee.rs Outdated Show resolved Hide resolved
@jackkleeman
Copy link
Author

Re macros. I'll do a pass and think about what can be done. If you have any particular ideas let me know

@jackkleeman
Copy link
Author

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

@richarddavison
Copy link
Contributor

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

Fantastic, I'll take a second look!

Copy link
Collaborator

@Sytten Sytten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first pass on the review, I didn't yet get to readable and writable modules. I should be able to do that tomorrow.
General comment is to please over-explain everything for future maintainers and split unrelated items into different files.

@@ -22,6 +26,52 @@ pub enum ObjectBytes<'js> {
Vec(Vec<u8>),
}

// Requires manual implementation because rquickjs hasn't implemented JsLifetime for f32 or f64
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should push a PR to rquickjs to fix that

@@ -2,18 +2,17 @@ use std::collections::HashSet;

// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that worth splitting in a crate? @richarddavison

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here @richarddavison

Comment on lines 18 to +19
AbortSignal::add_event_emitter_prototype(ctx)?;
AbortSignal::add_event_target_prototype(ctx)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if there is a better way about doing this, but I am kinda of annoyed that we have to call those methods for all the Emitter. Also this needs to be added to the typescript typing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the typescript typing already has it as an eventtarget, it jsut doesn't have it as an eventemitter (im not sure if it even should?

@@ -24,9 +25,37 @@ impl DOMException {
let message = message.0.unwrap_or(String::from(""));
let name = name.0.unwrap_or(String::from("Error"));

// https://webidl.spec.whatwg.org/#dfn-error-names-table
let code = match name.as_str() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK quickjs doesnt implement that spec? So realistically we need some constants somewhere if we want to use them correctly from rust.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure what you mean, i agree domexception isnt native to quickjs, but the wpt tests for streams expect that domexceptions on abort will have the code set correctly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that if we want to throw those errors from rust we will need those names as constants somewhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i suppose so. currently it seems we are just using string literals to do so

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe an enum for this? The enum can have unit variants which would then mean the discriminant can be equal
to the code. Something like:

let exception_as_value = DomException::IndexSizeError.value(&ctx)
//or throw exception
Err(DomException::IndexSizeError.throw(&ctx)

modules/llrt_stream_web/src/lib.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/queueing_strategy.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/queueing_strategy.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/queueing_strategy.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/queueing_strategy.rs Outdated Show resolved Hide resolved
@Sytten
Copy link
Collaborator

Sytten commented Jan 7, 2025

Also please write the typescript typing, I did write a simplified version of it since we didnt have most of the API but now that we do we should mostly probably take back the node typing and put it in there.

@jackkleeman
Copy link
Author

@Sytten re typing, this should be part of the browser globals types? this isnt node streams, which have their own types

@Sytten
Copy link
Collaborator

Sytten commented Jan 7, 2025

Right, it looks like they put it under the stream/web.ts file in the DefinitelyTyped so I would follow the same direction @jackkleeman

Copy link
Collaborator

@Sytten Sytten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some further comments

modules/llrt_stream_web/src/readable/mod.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/mod.rs Outdated Show resolved Hide resolved
Self::readable_stream_from_iterable(&ctx, async_iterable)
}

// readonly attribute boolean locked;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem with OwnedBorrowMut is that it can easily bite you with a panic if you try to borrow mutably in two different places or if you try to borrow mut and there is already a read-only reference. Same thing happens on the inverse.
It's not obvious because the DerefMut/Deref calls the method that can panic.
I am wondering if we should wrap it in LLRT to force users to do a try_borrow_mut/try_borrow since a panic is always kinda bad and we could instead throw an exception most of the time.

Copy link
Author

@jackkleeman jackkleeman Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem with OwnedBorrowMut is that it can easily bite you with a panic if you try to borrow mutably in two different places or if you try to borrow mut and there is already a read-only reference.

Yes, I am very well acquainted with this!

These panics imply an implementation issue (generally something re-entrant where we call out to user code and they call back in), not a user error, so I don't know if an exception is valid. An exception can leave the stream in an invalid state as these functions are generally not supposed to throw exceptions. IMO given this is a new feature its correct to panic on implementation issues.

modules/llrt_stream_web/src/readable/mod.rs Outdated Show resolved Hide resolved
modules/llrt_stream_web/src/readable/mod.rs Outdated Show resolved Hide resolved
Comment on lines +543 to +549
.clone())
}
}

#[derive(Clone, JsLifetime, Trace)]
struct IteratorPrimordials<'js> {
end_of_iteration: Symbol<'js>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to create an API in quickjs-ng for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richarddavison can you weight in here, same question for the generator below. Can we ask the guys of quickjs to add that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is referring to the eval calls

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, without a new quickjs api i see no other option here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree and I'll ask their opinion. Today its very clunky to work with JS iterators

@Sytten
Copy link
Collaborator

Sytten commented Jan 7, 2025

Sorry the lines of the comments are off due to the force push, I did the review this morning.
I can help on the typing once we get there, we will want to remove references to node and double check that we support everything in there. The typing of node is also sometimes convoluted for not real reason so I do simplify it when need be.

@jackkleeman
Copy link
Author

I believe we do support everything in there but good to double check. SOunds good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants