Please note that this post expresses my personal opinion and mine alone, not that of my employer or my teammates.
I am writing it because I can’t hold my opinion inside anymore. My head was starting to feel like a balloon that is going to burst any minute. I am writing my thoughts and feelings down to free myself from this pressure.
What is asm.js?
Subset, unfortunately, is a very opaque word. I am not planning to describe asm.js in detail, but I am going to give a very short overview of its important aspects. All the details are available in the specification.
That’s exactly what asm.js does. It fixes the language by saying: fancy features are hard, let us stick to arithmetic. In practice right now asm.js essentially limits you to:
- arithmetic operations;
- loads and stores into `ArrayBufferView`, all views sharing a single underlying buffer;
- calls to functions that take numbers as arguments and return a number as the result; possible call targets are limited, so you can’t, for example, create a closure and pass it somewhere.
Additionally, asm.js attaches static typing rules to the various permitted syntactic constructs. Only code that is consistently typed is considered valid. The typing rules are straightforward but not completely trivial because, for example, they capture the distinction between real 32-bit integers (`signed`) and things that can merely be coerced to them. For example, `((x|0) + (y|0))|0` is typed as a 32-bit integer because it computes `ToInt32(ToInt32(x) + ToInt32(y))`. If you have ever programmed in assembly or another not-so-high-level language, you will notice that this exactly matches the semantics of overflowing 32-bit integer addition. I’ll return to this expression later, so keep it in mind.
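To make the addition example concrete, here is a small sketch in plain JavaScript. The helper name `addInt32` is an assumption made for illustration; only the `|0` coercion pattern comes from asm.js. Note the inner parentheses: `+` binds tighter than `|`, so leaving them out changes the meaning.

```javascript
// Overflowing 32-bit addition built from ToInt32 coercions (`|0`).
// `addInt32` is a hypothetical helper name used only in this sketch.
function addInt32(x, y) {
  return ((x | 0) + (y | 0)) | 0;
}

const x = 0x7fffffff; // largest signed 32-bit integer
const y = 1;

// Wraps around exactly like a 32-bit machine add.
const wrapped = addInt32(x, y);

// Without the inner parentheses this parses as x | (0 + y) | 0,
// which is a bitwise OR, not an addition.
const withoutParens = (x | 0 + y | 0) | 0;

console.log(wrapped, withoutParens);
```

Here `wrapped` is `-2147483648`, while `withoutParens` is `2147483647`.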
The last part of asm.js is the notion of module. All asm.js must be packed into a function that looks like this:
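A minimal sketch of such a module is shown here. The parameter names `stdlib`, `foreign` and `heap` follow the usual asm.js convention; the function `sumOfSquares` and the foreign callback `report` are hypothetical names chosen for this illustration. The code is intended to follow the asm.js rules, and even if an engine declines to validate it, it still runs as ordinary JavaScript.

```javascript
function M(stdlib, foreign, heap) {
  "use asm";

  // Everything the module uses is imported into locals at module entry.
  var imul = stdlib.Math.imul;
  var report = foreign.report;              // hypothetical external function
  var HEAP32 = new stdlib.Int32Array(heap); // view onto the single heap buffer

  function sumOfSquares(n) {
    n = n | 0;   // parameter annotation: n is an int
    var i = 0;
    var a = 0;   // the initializer 0 marks a as an int
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      a = (a + (imul(i, i) | 0)) | 0;
    }
    HEAP32[0] = a; // store the result into the heap
    report(a | 0); // call out through the foreign import
    return a | 0;
  }

  return { sumOfSquares: sumOfSquares };
}

// Usage: the caller supplies the standard library, the imports and the heap.
const asmHeap = new ArrayBuffer(0x10000);
const reported = [];
const instance = M(globalThis, { report: function (v) { reported.push(v); } }, asmHeap);
const asmResult = instance.sumOfSquares(4); // 0 + 1 + 4 + 9
```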
There are some important things to notice here:
- the `"use asm"` hint tells the VM to attempt to validate function `M` as an asm.js module and possibly compile it through a separate compilation pipeline if validation succeeds;
- an `ArrayBuffer` is going to be the data storage for all the code inside the module, as asm.js rules do not allow any allocation besides creating a number of `ArrayBufferView` objects upon module entry;
- `foreign` is an object that contains external functions that you can call from inside your module; you have to explicitly “import” them into local variables upon module entry;
- asm.js is very strict: if you remove a single `|`, the code above will be rejected by the validator;
- asm.js puts the compiler’s interest before the human’s, e.g. you have to initialize `a` with `0` to tell the validator/compiler that `a` is a variable of type int. In practice this is already a no-op for any serious compiler;
- the `Math.imul` function performs overflowing 32-bit integer multiplication.
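To see why a dedicated `Math.imul` matters, the following sketch compares it with ordinary multiplication followed by `|0`. The variable names are illustrative only. Ordinary `*` works on doubles, so a large product is rounded before it is truncated to 32 bits, and the low bits are lost.

```javascript
const big = 0x7fffffff; // 2^31 - 1

// Math.imul keeps the exact low 32 bits of the integer product.
const viaImul = Math.imul(big, big);

// The double product (2^62 - 2^32 + 1) does not fit into 53 bits of
// mantissa, so it is rounded before `|0` truncates it.
const viaDouble = (big * big) | 0;

console.log(viaImul, viaDouble);
```

Here `viaImul` is `1`, the correct wrapped result, while `viaDouble` is `0`.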
What is OdinMonkey?
OdinMonkey is a different beast. It’s a module built on top of IonMonkey that takes asm.js code (detected by the `"use asm"` annotation), verifies that it is consistently typed, and compiles it ahead of time into optimized native code.
This ahead-of-time static typing is exactly what makes OdinMonkey different from a normal JS engine and closer to, say, a C++ compiler.
also known as “It would be great to use asm.js to speed jQuery up!”
`|` and don’t use normal JS objects or strings, only numbers and a single typed array, to help GC.
I believe that once you take a step on this path it is very hard to abandon it. You will go further and further away from the full language itself, yet you will continue to add things to the host language that your subset desires for performance reasons (see
Which brings me to two questions that bother me and inflate my head.
But why are we taking this step now?
This is the first one. Honestly I don’t know the answer and I am not going to guess.
When I sit down and think about the performance gains that an OdinMonkey-style asm.js implementation brings to the table, I don’t see anything that would not be possible to achieve within a normal JIT compilation framework, which would simultaneously make both human-written and compiler-generated code faster.
When I take the code above and look at it as a V8 engineer would, I can clearly see ways to generate C++-quality native code without actually relying on AOT or static typing.
[to be completely honest: there certainly is a tricky bit with moving out-of-bounds access handling to a protection violation handler, and a not-so-tricky one with actually having `imul` in the ES standard to allow efficient integer multiplication, but neither really requires AOT]
"use hungarian"; and consistently prefix your variable names with their “class”:
Another thing that makes me think is the addition of things like `Math.imul` to the language: it is essentially like adding a bytecode instruction.
Let’s be honest and make it explicit: if you want to ship performance-critical cross-platform code across the wire, define a set of bytecode instructions. You don’t need anything fancy: arithmetic, access to a typed heap, calls to named functions (API), calls to local functions, etc.
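As a rough illustration of how small such an instruction set could be, here is a sketch of a stack-based bytecode with arithmetic, typed heap access and calls into an API table. Everything in it is hypothetical: the opcode names, their numbering, the encoding and the `run` function are assumptions made for this sketch, not a proposal from any specification.

```javascript
// Hypothetical opcodes; the numbering and encoding are arbitrary.
const OP = { PUSH: 0, ADD: 1, MUL: 2, LOAD: 3, STORE: 4, CALL: 5, HALT: 6 };

// Interprets `code` against a typed heap and a table of API functions.
function run(code, heap, api) {
  const stack = [];
  let pc = 0;
  for (;;) {
    const op = code[pc++];
    switch (op) {
      case OP.PUSH: stack.push(code[pc++] | 0); break;
      case OP.ADD: { const b = stack.pop(), a = stack.pop(); stack.push((a + b) | 0); break; }
      case OP.MUL: { const b = stack.pop(), a = stack.pop(); stack.push(Math.imul(a, b)); break; }
      case OP.LOAD: stack.push(heap[stack.pop()] | 0); break;
      case OP.STORE: { const v = stack.pop(), addr = stack.pop(); heap[addr] = v; break; }
      case OP.CALL: { const fn = api[code[pc++]]; stack.push(fn(stack.pop()) | 0); break; }
      case OP.HALT: return stack.pop();
      default: throw new Error("unknown opcode " + op);
    }
  }
}

// heap[0] = 6 * 7; then return api[0](heap[0] + 1).
const program = [
  OP.PUSH, 0, OP.PUSH, 6, OP.PUSH, 7, OP.MUL, OP.STORE,
  OP.PUSH, 0, OP.LOAD, OP.PUSH, 1, OP.ADD, OP.CALL, 0, OP.HALT,
];
const vmHeap = new Int32Array(16);
const vmResult = run(program, vmHeap, [function (v) { return v * 2; }]);
```

With the doubling function in the API table, `vmResult` is `86` and `vmHeap[0]` is `42`. The point is only that an explicit instruction set of this kind is small and easy to specify.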
It has multiple advantages:
Layers of polyfills will start piling up in any case as soon as asm.js proceeds to add more “bytecodes” targeted at more efficient execution, e.g. `BinaryData` objects. All these features will not be available in older browsers anyway.
Another example of bytecode-like functionality that the host language itself has no use for is `FunctionFuture`, which asm.js needs to solve its startup issues, as type-checking and generating native code for the whole module takes noticeable time if its source is several megabytes. This API is geared towards off-thread compilation and the ability to cache generated native code.
[Here it should be noted that normal JIT does not have issues with off-thread and lazy compilation and solves these issues transparently]
Grau ist alle Theorie (All theory is gray)
Summarizing my concerns:
`"use asm"` annotation. I don’t believe that anything like asm.js is needed to generate highly efficient native code; it’s a leaky abstraction. Neither do I want developers to be penalized because they forgot a single `+` sign somewhere when cranking out asm.js-style code manually.
[At this point I have spent several hours writing these things down and I ate all the food that I stashed for Easter holidays. I am starting to think that I failed to clearly convey my thoughts and my feelings so it will be appropriate to starve to death. Thanks for reading anyway.]
At the same time I strongly believe that JIT compilers should strive to achieve continuous performance, in the sense that reasonable perturbations to the input source should cause reasonable changes in performance. A programmer must not be forced to write a meaningless `e = +e` to tell the VM that `e` is a double. In the right world there should be no difference in performance between the following two ways of doing the same thing:
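As a hypothetical illustration of such a pair (the function names and the computation are assumptions made for this sketch), consider the same loop written once plainly and once with asm.js-style coercions:

```javascript
// Plain version: no coercions, types are left for the JIT to discover.
function scalePlain(values, e) {
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i] * e;
  }
  return sum;
}

// Annotated version: asm.js-style coercions spell out the types.
function scaleAnnotated(values, e) {
  e = +e; // "e is a double"
  let sum = 0.0;
  for (let i = 0; i < values.length; i = (i + 1) | 0) {
    sum = +(sum + +values[i] * e);
  }
  return +sum;
}

const samples = [1, 2, 3];
const plainResult = scalePlain(samples, 0.5);
const annotatedResult = scaleAnnotated(samples, 0.5);
```

Both functions compute the same value for numeric input; the annotations add no information that a type-feedback-driven JIT could not observe at run time.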