Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add File interface #181

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Conversation

andreiltd
Copy link
Contributor

@andreiltd andreiltd commented Dec 4, 2024

@andreiltd andreiltd marked this pull request as draft December 4, 2024 14:35
@andreiltd andreiltd force-pushed the file-builtin branch 4 times, most recently from 94de20c to 72b5f33 Compare December 9, 2024 14:25
@andreiltd andreiltd marked this pull request as ready for review December 9, 2024 17:02
Copy link
Member

@tschneidereit tschneidereit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I left a few inline comments on the code as written, after getting far enough in the review I realized that I find it very difficult to reason through how the code implements the specification.

For example, when trying to understand the File constructor, I'm not sure how it implements the log of (simplifying here!) iterating through the list of fileBits and appending them to the blob underlying the File. If I understand the code correctly, it has a special case for a sequence whose first entry is a File instance. If that case is hit, it copies that File's contents and ignores the rest of the sequence?

But I might misunderstand the code here—that's very plausible!

Which is why I'd like to ask for something that's unfortunately a bunch of additional work: could you go through the code and add comments for which parts of the spec are implemented where? You can see examples of that in the various other builtins, in particular the ones for the fetch API. One that I think is quite decent is Response::redirect.

I used to have a script to turn the spec's source into C++ comments, but unfortunately seem to have lost it :( In the meantime, perhaps the easiest way to quickly get decent comments to use is roughly this:

  1. find the spec algorithm you want to implement/get comments for
  2. select the algorithms text in your browser
  3. select "view source" (this works well at least in Firefox, I haven't tried other browsers)
  4. copy the source
  5. paste it into an html-to-markdown converter, such as https://htmlmarkdown.com/
  6. prefix all lines in the output with //

That should give you something that's at least not too far away from usable.

As I said, I know that I'm asking for a whole bunch of extra work here, and I'm sorry for that. I don't really know of a better way to ensure that we're actually implementing the spec as written: the WPT suite gives us decent assurances, but it's not complete and doesn't cover all corner cases.

builtins/web/file.cpp Outdated Show resolved Hide resolved
builtins/web/blob.cpp Show resolved Hide resolved
builtins/web/file.cpp Outdated Show resolved Hide resolved
@andreiltd
Copy link
Contributor Author

andreiltd commented Dec 11, 2024

Hey @tschneidereit ! Thanks for the feedback! I can definitely add comments with the steps from the spec - that’s a good point.

I've been thinking about this implementation, and I feel like part of the issue here is that I don’t have a clear understanding of how to model inheritance using jsapi. According to the spec, the File API extends Blob. This patch models it using composition: embedding the Blob into a File slot and delegating methods and properties. Maybe that's a correct way, but...

I noticed that the JS_InitClass function allows you to pass a prototype of the parent class, which seems like something I should use here. But I couldn’t find any documentation on how to properly implement this. For example, how do I call the parent constructor from a subclass? How do I control which constructor parameters are passed to the parent?

We shouldn’t need to add a special case for something like File([new File(["TEST"]). This should already be covered by the Blob implementation since it’s equivalent to File([new Blob(["TEST"]). In other words, Blob::is_instance() should return true for both Blob and File objects.

BTW, I don't mind redoing this PR from scratch as long as we can come up with better and more idiomatic code :)

@andreiltd
Copy link
Contributor Author

andreiltd commented Dec 12, 2024

Hey @tschneidereit I refactored the code so that File is now a subclass of Blob. This simplifies the implementation significantly and, hopefully doesn’t rely on too much "black magic." I’ve also added the steps from the spec in the comments.

I will prepare a separate PR for Blob, adding steps from spec to the documentation.

@@ -51,13 +51,19 @@ class Blob : public TraceableBuiltinImpl<Blob> {
static const JSPropertySpec properties[];

static constexpr unsigned ctor_length = 0;
enum Slots { Data, Type, Endings, Readers, Count };
enum Slots { Data, Type, Endings, Readers, Reserved1, Reserved2, Count };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are being reserved by the FIle API, it may just make sense to directly call these MaybeFileName and MaybeFileModified, since those are their only meanings. Alternatively explicitly note they are reserved by the File API - FileAPISlot1 and FileAPISlot2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I'm curious why not just have a File slot and have it point to a file instance here, where the File builtin has the Name and Modified slots?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original goal was to use construct

JS::Construct(cx, blob_ctor_val, file_ctor_val, blob_args, &self);

so that internally it behaved like:

Reflect.construct(Blob, blobArgs, File);

and automatically allocated an object with the File class (allowing to set up extra slots such as name and lastModified). But for some reason JS_NewObjectForConstructor in the Blob constructor always produced a Blob object regardless of the newTarget.

I now switched to a simpler manual approach:

  • Create a File-class object directly,
  • Call Blob’s initialization routine on it (for shared functionality),
  • Then add the File-specific slots in the File constructor.

This keeps the “subclass” logic clean, avoids forcing Blob to know about File, and ensures we get a proper File object with all its extra slots.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly it also looks like using subclassing is now discouraged:

Warning: The standard committee now holds the position that the built-in subclassing mechanism in previous spec versions is over-engineered and causes non-negligible performance and security impacts. New built-in methods consider less about subclasses, and engine implementers are investigating whether to remove certain subclassing mechanisms. Consider using composition instead of inheritance when enhancing built-ins.

Comment on lines 39 to 41
auto now = std::chrono::system_clock::now();
auto ms_since_epoch =
std::chrono::duration_cast<std::chrono::milliseconds>(now.time_since_epoch()).count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the hostAPI wrapper function for these - MonotonicClock::now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the WallClock simple wrapper. The File implementation expects the timestamp to be the time in ms since Unix Epoch. This is what wasi_wall_clock API provides.

Should we also revisit the Performance implementation? It looks like it is using std::chrono.


bool File::name_get(JSContext *cx, unsigned argc, JS::Value *vp) {
METHOD_HEADER(0);
// TODO: Change this class so that its prototype isn't an instance of the class
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what is meant by this? Is the instance prototype not supposed to be an instance of the class, or the File.prototype constructor "prototype" property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, no idea. I've been carrying this comment from other classes without giving it much thought.

wasi_clocks_wall_clock_datetime_t time_now;
wasi_clocks_wall_clock_now(&time_now);

uint64_t total_nanoseconds = time_now.seconds * 1000000000ULL + time_now.nanoseconds;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will overflow around year 2554. ¯\_(ツ)_/¯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File API
3 participants