Option to avoid per-function allocation of fast interpreter #3921
Comments
In my understanding, by design, both the slow and fast interpreters are capable of executing Wasm opcodes from read-only flash memory. This is managed by the flag [...]. At least, the fast interpreter uses more RAM than the slow one to store the processed Wasm code. If the slow interpreter is unable to execute a .wasm file from read-only flash, I believe this is a problem and we should definitely investigate it.
No. That undocumented flag is always set to [...].
It's hard to understand exactly what happens when the binary is loaded, but it seems that the fast interpreter produces new opcodes optimized for speed, while the slow interpreter patches the existing opcodes in place. So the fast interpreter works with the wasm binary in read-only memory because it doesn't need to patch existing opcodes; it produces new ones in RAM instead.
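To make the distinction in the comment above concrete, here is a minimal toy sketch in C. It is not WAMR's loader code: the opcode values and the function names `patch_in_place` and `translate_to_ram` are invented for illustration only. It shows why a loader that rewrites opcodes in place needs a writable buffer, while one that emits new opcodes into a separate buffer can read its input through a `const` pointer, at the cost of extra RAM for the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy opcodes invented for this sketch; not real Wasm or WAMR opcodes. */
enum { OP_ORIGINAL = 0x10, OP_REWRITTEN = 0x90 };

/* In-place style: the bytecode buffer itself is modified, so it must be
 * writable and therefore cannot stay in read-only flash. */
static void patch_in_place(uint8_t *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == OP_ORIGINAL)
            code[i] = OP_REWRITTEN;
    }
}

/* Translate style: the source is only read (const pointer) and the
 * rewritten opcodes go to a separate RAM buffer.
 * Returns the number of bytes written to `out`. */
static size_t translate_to_ram(const uint8_t *code, size_t len,
                               uint8_t *out, size_t out_cap)
{
    assert(out_cap >= len); /* caller must provide enough RAM */
    for (size_t i = 0; i < len; i++)
        out[i] = (code[i] == OP_ORIGINAL) ? OP_REWRITTEN : code[i];
    return len;
}
```

In this toy model the trade-off discussed in the thread is visible directly: `translate_to_ram` leaves the source untouched but needs an output buffer as large as the code, while `patch_in_place` needs no extra buffer but requires the code to be in writable memory.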
It doesn't seem to match our design; we'll take a closer look and keep you updated.
@wenyongh @xujuntwt95329 @loganek @TianlongLiang @yamt I've noticed that there are two functions that will write back to the binary file, which does not comply with the read-only flash requirement.
To address this issue, we have two options:
(As the author mentioned above.) I used to think that not altering the binary content was one of our design principles. It appears that it's not, but I still believe this principle is important for certain scenarios, like streaming and this particular embedded hardware. I'm inclined to choose the first option even though it's slower (we all know the classic interpreter is the slowest, and this won't make it much worse). Please share your thoughts.
Agree to first enable the classic interpreter to run without modifying the binary. For the fast interpreter, I guess there is little room to reduce the size of the pre-compiled code.
Feature
Option to make the fast interpreter allocate less memory.
Right now, the fast interpreter allocates close to 1000 bytes of memory for each function in the WebAssembly module, which is a lot of RAM for an embedded environment.
I would like to be able to avoid this, even if it means letting the code run slower.
This matters because the fast interpreter can run from read-only (flash) memory, while the slow interpreter cannot: for the slow interpreter, the wasm binary must be copied into RAM.
Benefit
Reducing RAM usage.
Implementation
The big culprit is in `wasm_loader_ctx_init()`, where the two structs `BranchBlock` and `Const` constitute allocations of 192 + 704 = 896 bytes of memory per function.
Alternatives
An alternative could be to let the slow interpreter work from read-only memory, which would avoid the memory needed to copy the binary into RAM.
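For a rough sense of the two RAM costs weighed in this issue, the following back-of-the-envelope sketch in C may help. The 192 and 704 byte figures are the ones reported above for `wasm_loader_ctx_init()`; everything else is an assumption made for illustration: the helper names `loader_alloc_bytes` and `ram_copy_bytes` are hypothetical, and `live_contexts` stands for however many of these per-function allocations are held at the same time, which this issue does not establish.

```c
#include <assert.h>
#include <stddef.h>

/* Figures reported in this issue for wasm_loader_ctx_init(). */
enum {
    BRANCH_BLOCK_ALLOC_BYTES = 192,
    CONST_ALLOC_BYTES = 704,
    PER_FUNCTION_ALLOC_BYTES = BRANCH_BLOCK_ALLOC_BYTES + CONST_ALLOC_BYTES
};

/* Compile-time check of the arithmetic quoted in the issue. */
static_assert(PER_FUNCTION_ALLOC_BYTES == 896, "192 + 704 should be 896");

/* Hypothetical helper: RAM taken by the loader allocations if
 * `live_contexts` of them exist at the same time (an assumption). */
static size_t loader_alloc_bytes(size_t live_contexts)
{
    return live_contexts * (size_t)PER_FUNCTION_ALLOC_BYTES;
}

/* Hypothetical helper: RAM needed to copy the whole wasm binary out of
 * flash so that an interpreter which patches opcodes can modify it. */
static size_t ram_copy_bytes(size_t wasm_binary_size)
{
    return wasm_binary_size;
}
```

For example, under these assumptions `loader_alloc_bytes(1)` is 896 bytes, while `ram_copy_bytes(4096)` for a hypothetical 4 KiB module is 4096 bytes, so which approach costs more depends on the module size and on how many loader allocations are alive at once.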