Something something jsonDecode

by Slava Egorov

Language shapes the way we think, and determines what we can think about.

— Benjamin Lee Whorf

Sapir–Whorf hypothesis

aka linguistic relativity

I speak Spanish to God,
Italian to women,
French to men,
and German to my horse.

— Charles V (allegedly)

I make my mobile apps in Dart, and my backends in JavaScript (Java, C#, C++, Rust, etc)

— modern developer

🤬 jsonDecode

What are we comparing?

                    // JavaScript
                    let obj = JSON.parse(str);
                    // String into an Object/Array/...
                    // Implemented in C++, e.g. in V8:
                    //    src/json/json-parser.{h,cc}

                    final map = jsonDecode(str);
                    // String into an Map/List/...
                    // Implemented in Dart, e.g.
                    // sdk/lib/_internal/
                    //   vm/lib/convert_patch.dart
                    //   wasm/lib/convert_patch.dart
                    //   wasm_js_compatibility/lib/convert_patch.dart
                    //   js_runtime/lib/convert_patch.dart
                    //   js_dev_runtime/patch/convert_patch.dart

Web is complicated

  • Comes with fast JSON.parse
  • but that makes JS Object, not a Map
  • does not go into Wasm structs directly
  • takes a JS string not bytes

I will focus on native (aka Dart VM, Dart Native Runtime)

Optimize anything in 3 steps:

  1. Measure performance, if it is okay - you are done.
  2. Identify and remove some unnecessary work.
  3. Go to step 1.

                    // Measures JSON decoding speed in ns per byte
                    double measureSpeed(int N, String input, int byteLength) {
                      final sw = Stopwatch()..start();
                      for (var i = 0; i < N; i++) {
                        jsonDecode(input);
                      }
                      final usPerIteration = sw.elapsedMicroseconds / N;
                      final nsPerByte = (usPerIteration * 1000) / byteLength;
                      return nsPerByte;
                    }

                    // Measures JSON decoding speed in ns per byte
                    function measureSpeed(N, input, byteLength) {
                      let start = performance.now();  // milliseconds
                      for (let i = 0; i < N; i++) {
                        JSON.parse(input);
                      }
                      let end = performance.now();
                      let usPerIteration = (end - start) * 1000 / N;
                      let nsPerByte = (usPerIteration * 1000) / byteLength;
                      return nsPerByte;
                    }

                    $ du -h github_events.json
                    64K github_events.json
                    $ v8 json-benchmark.js -- github_events.json
                    JSON.parse: 2.28 ns/byte
                    $ dart compile exe json-benchmark.dart
                    $ json-benchmark.exe github_events.json
                    JsonDecoder: 8.79 ns/byte

should we rewrite JsonDecoder in C++?

maybe not yet

« I want to speak Dart to my horse! »


but where does String come from?


usually it arrives from the network in the form of UTF-8 bytes


JSON.parse can only parse from string, but in Dart...

                        import 'dart:convert';

                        print(const JsonDecoder());
                        // Instance of 'JsonDecoder'
                        print(const Utf8Decoder().fuse(const JsonDecoder()));
                        // Instance of '_JsonUtf8Decoder' (surprise!)
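
A minimal runnable sketch of the fused decoder in use. The helper name `decodeJsonBytes` and the sample document are made up for illustration; only `Utf8Decoder`, `JsonDecoder` and `fuse` come from `dart:convert`. On the VM the fused converter is the `_JsonUtf8Decoder` printed above, which parses the bytes directly.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Decodes UTF-8 encoded JSON through the fused converter.
/// (Hypothetical helper, not part of dart:convert.)
Object? decodeJsonBytes(Uint8List bytes) {
  final decoder = const Utf8Decoder().fuse(const JsonDecoder());
  return decoder.convert(bytes);
}

void main() {
  final bytes =
      Uint8List.fromList(utf8.encode('{"id": 42, "tags": ["a", "b"]}'));

  // Bytes straight into Map/List/...
  final direct = decodeJsonBytes(bytes) as Map<String, dynamic>;

  // The two-step route: bytes -> String -> Map/List/...
  final viaString = jsonDecode(utf8.decode(bytes)) as Map<String, dynamic>;

  print(direct['id']); // 42
  print(jsonEncode(direct) == jsonEncode(viaString)); // true
}
```

Both routes produce the same value; the difference is only in how much intermediate work happens on the way.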

                    // Measures JSON decoding speed in ns per byte
                    double measureSpeed(int N, Uint8List input) {
                      final decoder =
                          const Utf8Decoder().fuse(const JsonDecoder());
                      // ...
                      for (var i = 0; i < N; i++) {
                        decoder.convert(input);
                      }
                      // ...
                    }

                    // Measures JSON decoding speed in ns per byte
                    function measureSpeed(N, input /* ArrayBuffer */) {
                      // ...
                      for (let i = 0; i < N; i++) {
                        // decodeUtf8 just calls V8's String::NewFromUtf8
                        JSON.parse(decodeUtf8(input));
                      }
                      // ...
                    }

                    $ v8 json-benchmark.js -- github_events.json
                    UTF8Decode+JSON.parse: 2.78 ns/byte
                    $ dart compile exe json-benchmark.dart
                    $ json-benchmark.exe github_events.json
                    JsonUtf8Decoder: 8.53 ns/byte

time to look closer

                    $ dart compile aot-snapshot json-benchmark.dart
                    $ perf record -g dartaotruntime json-benchmark.aot \
                        github_events.json
                    $ perf report

41.82% _ChunkedJsonParser.parseString
19.27% _ChunkedJsonParser.parse
 6.84% Uint8List.[]
 5.44% _ChunkedJsonParser.parseStringToBuffer
 3.01% String.hashCode
 2.83% _LinkedHashMapMixin._insert
 2.40% _StringBase._createOneByteString
 2.10% _LinkedHashMapMixin._set
 1.90% _LinkedHashMapMixin._findValueOrInsertPoint
 1.77% _StringBase.createFromCharCodes
 1.46% allocateOneByteString
 1.13% _LinkedHashMapMixin.[]=
 1.02% _LinkedHashMapMixin._init
 0.95% _ChunkedJsonParser.parseNumber

                        Uint8List bytes;

                        bytes[i];
                        // The expectation is that this is just a few
                        // CPU instructions: check index in bounds,
                        // load memory. Not a call!

                        import 'dart:typed_data';

                        int foo(Uint8List bytes) => bytes[0];

                        void main() {
                          // ...
                        }

                    $ dart compile exe                                         \
                        --extra-gen-snapshot-options                           \
                          --disassemble-optimized                              \
                        --extra-gen-snapshot-options                           \
                          --code-comments                                      \
                        --extra-gen-snapshot-options                           \
                          --print-flow-graph-filter=foo                        \
                        -v test.dart

                        mov r2, r1         ;; r1: bytes
                        ldr r3, [r2, #15]  ;; load length
                        asr r0, r3, #1
                        movz r1, #0x0
                        cmp r1, r0         ;; length ≤ 0?
                        bcs ->oob
                        ldrb r0, [r2, #23] ;; load byte
                        ret                ;; return

                   oob: stp fp, lr, [sp, #-16]!
                        mov fp, sp
                        bl 0x101232568

                    $ dart compile aot-snapshot                                \
                        --extra-gen-snapshot-options                           \
                          --dwarf-stack-traces                                 \
                        --extra-gen-snapshot-options                           \
                          --code-comments                                      \
                        --extra-gen-snapshot-options                           \

       │     B61
       │     Loop 0
       │     v183 <- LoadField(v2 T{_JsonUtf8Parser} . chunk)
  1.49%│       mov  r12, QWORD PTR [rdx+0x3f]
       │     v323 <- BoxInt64(v6)
       │       mov  rax, rcx
  0.99%│       add  rax, rax
  2.36%│     ↓ jno  96
       │     → call stub _iso_stub_AllocateMintSharedWithoutFPURegsStub
       │       mov  QWORD PTR [rax+0x7],rcx
       │     v359 <- LoadClassId(v183) int64
  3.59%│ 96:   mov  ecx, DWORD PTR [r12-0x1]
  4.16%│       shr  ecx, 0xc
       │     MoveArgument(v183, SP+1)
  0.12%│       mov  QWORD PTR [rsp+0x8],r12
       │     MoveArgument(v323, SP+0)
  0.16%│       mov  QWORD PTR [rsp],rax
       │     v184 <- DispatchTableCall(cid=v359 List.[], v183, v323)
  0.16%│       mov  rax, QWORD PTR [r14+0x58]
 25.07%│     → call QWORD PTR [rax+rcx*8]
       │     v324 <- UnboxInt64([non-speculative], v184)
  4.11%│       sar  rax,1
       │     goto:14 B67
  0.44%│     ↓ jmp  e8

                        // `T` is the type of the character container.
                        abstract class _ChunkedJsonParser<T> {
                          int getChar(int index);

                          // Various parsing methods written in terms of [getChar].
                        }

                        class _JsonStringParser extends _ChunkedJsonParser<String> {
                          String chunk = '';
                          int getChar(int position) => chunk.codeUnitAt(position);
                        }

                        class _JsonUtf8Parser extends _ChunkedJsonParser<List<int>> {
                          static final Uint8List emptyChunk = Uint8List(0);
                          List<int> chunk = emptyChunk;

                          int getChar(int position) => chunk[position];
                        }

Performance antipattern #1

A base class provides a generic implementation which relies on a small operation (getChar) overridden by subclasses.

Why: If more than one subclass is instantiated in the program, the compiler will not be able to fully specialize the code.
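
A self-contained sketch of the alternative shape: the generic logic lives in a mixin, and each concrete class applies it to its own container type. All names here (`ScannerBase`, `WhitespaceSkipper`, `StringScanner`, `BytesScanner`) are hypothetical and only mirror the structure of the SDK parser. Whether this actually removes the polymorphic access depends on the compiler; the commit message in this talk reports that it does for the Dart VM.

```dart
import 'dart:typed_data';

/// Shared state that does not depend on the character container.
abstract class ScannerBase {
  int position = 0;
}

/// Generic scanning logic, written in terms of [getChar] and [length].
mixin WhitespaceSkipper on ScannerBase {
  int get length;
  int getChar(int index);

  /// Advances [position] past SPACE, TAB, LF and CR.
  void skipWhitespace() {
    while (position < length) {
      final char = getChar(position);
      if (char != 0x20 && char != 0x09 && char != 0x0A && char != 0x0D) {
        break;
      }
      position++;
    }
  }
}

class StringScanner extends ScannerBase with WhitespaceSkipper {
  StringScanner(this.chunk);
  final String chunk;

  @override
  int get length => chunk.length;

  @override
  int getChar(int index) => chunk.codeUnitAt(index);
}

class BytesScanner extends ScannerBase with WhitespaceSkipper {
  BytesScanner(this.chunk);
  final Uint8List chunk;

  @override
  int get length => chunk.length;

  @override
  int getChar(int index) => chunk[index];
}

void main() {
  final s = StringScanner(' \t\n{}')..skipWhitespace();
  final b = BytesScanner(Uint8List.fromList([0x20, 0x0D, 0x7B]))
    ..skipWhitespace();
  print(s.position); // 3
  print(b.position); // 2
}
```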

Performance antipattern #2

Code which works with bytes is written in terms of List<int> instead of Uint8List.

Why: List<T> has many different implementations. If the compiler can't narrow it down to a specific implementation, it is forced to generate virtual calls for things like l[i] and l.length.
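
A small sketch of the usual workaround: keep `List<int>` at the API boundary, normalize to `Uint8List` once, and write the hot loop against the concrete type. The function names are invented for the example, and the sketch assumes every element already fits in a byte (`Uint8List.fromList` truncates larger values).

```dart
import 'dart:typed_data';

/// Hot loop typed as List<int>: `bytes[i]` and `bytes.length` may need
/// virtual calls when the concrete list class is not known.
int sumGeneric(List<int> bytes) {
  var sum = 0;
  for (var i = 0; i < bytes.length; i++) {
    sum += bytes[i];
  }
  return sum;
}

/// Same loop typed as Uint8List: the receiver type is concrete.
int sumTyped(Uint8List bytes) {
  var sum = 0;
  for (var i = 0; i < bytes.length; i++) {
    sum += bytes[i];
  }
  return sum;
}

/// API-compatible entry point: accepts any List<int>, copies only on the
/// slow path. Assumes all values are in the range 0..255.
int sum(List<int> value) =>
    sumTyped(value is Uint8List ? value : Uint8List.fromList(value));

void main() {
  print(sum([1, 2, 3])); // 6
  print(sum(Uint8List.fromList([250, 10]))); // 260
  print(sumGeneric([1, 2, 3]) == sum([1, 2, 3])); // true
}
```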

commit df80cf91404e8e3b0f0a4eb271467448d126199e
Author: Slava Egorov
Date:   Thu Mar 21 11:10:10 2024 +0000

[core] Improve JSON decoding performance

Avoid polymorphic character access by turning _ChunkedJsonParser
into a mixin instead of the base class.


Change-Id: Id2080724e07d16e96734a80629c8bd8906dc590b
Reviewed-by: Daco Harkes
Commit-Queue: Slava Egorov

                        mixin _ChunkedJsonParser<T> on _JsonParserWithListener {
                          // Generic implementation in terms of [getChar]
                        }

                        class _JsonUtf8Parser
                          extends _JsonParserWithListener
                          with _ChunkedJsonParser<Uint8List> {
                          Uint8List chunk = emptyChunk;
                          // ...
                        }

                        // To maintain API compatibility we still need to
                        // accept arbitrary List<int>
                        void parseChunk(List<int> value, int start, int end) {
                          if (value is Uint8List) {
                            chunk = value;
                          } else {
                            // Slow-path: copy value into fresh Uint8List.
                            // We assume this never happens.
                            chunk = Uint8List(end - start);
                            // ...
                          }
                        }

                    $ dart compile exe json-benchmark.dart
                    $ json-benchmark.exe github_events.json
                    JsonUtf8Decoder: 5.95 ns/byte (~30% faster)

Performance antipattern #3

Parsing from String rather than bytes

Why: To save space, String has two different implementations in Dart (one-byte and two-byte). The compiler usually does not know which one reaches a particular place, so s.codeUnitAt(i) is either a call or a (s is _OneByteString ? ... : ...) pattern.

                        int foo(String bytes) => bytes.codeUnitAt(0);

                        void main() {
                          // ... calls foo with a constant string
                        }

                        movz r0, #0x61 ;; 😂

                        int foo(String bytes, int i) => bytes.codeUnitAt(i);

                        void main(List<String> args) {
                          foo("abcd", args.length);
                        }

     ldr r2, [pp, #5960] ;; constant "abcd"
     mov r3, r1
     movz r0, #0x4
     cmp r1, r0
     bcs -> oob
     add tmp, r2, r3
     ldrb r0, [tmp, #15]

oob: stp fp, lr, [sp, #-16]!
     mov fp, sp
     bl 0x105232578

                        int foo(String bytes, int i) => bytes.codeUnitAt(i);

                        void main(List<String> args) {
                          foo(args.first, args.length);
                        }

                        ;; load class id from string object
                        ldr r1, [r3, #-1]
                        ubfm r1, r1, #12, #31
                        lsl r1, r1, #1
                        cmp r1, #0xba ;; compare cid == kOneByteStringCid
                        bne ->2
                        ;; Load from _OneByteString
                     1: add tmp, r3, r2
                        ldrb r1, [tmp, #15]
                        mov r0, r1
                        b ->done
                        ;; Load from _TwoByteString
                     2: add tmp, r3, r2 lsl #1
                        ldrh r1, [tmp, #15]
                        mov r0, r1

parse bytes not strings!

Dart probably needs UTF-8 backed String,
but aaaaaanyway back to JSON.

                        while (...) {                      ┌->███<-┐
                          if (...) {                       |  ███  |
                            continue;                      |  ███--┘
                          }                                |  ███
                          if (...) {                       |┌-███
                            return /* ... */;              || ███--> return
                          }                                |└>███
                          if (...) {                       |  ███-┐
                            throw /* ... */;               |  ... |
                          }                                |  ███ |
                        }                                  └--███ |
                                                              ███--> throw

(AOT code emission block ordering was suboptimal)

                        while (...) {                      ┌->███<-┐
                          if (...) {                       |  ███  |
                            continue;                      |  ███--┘
                          }                                |  ███
                          if (...) {                       |┌-███
                            return /* ... */;              || ░░░--> return
                          }                                |└>███
                          if (...) {                       |  ███-┐
                            throw /* ... */;               |  ... |
                          }                                |  ███ |
                        }                                  └--███ |
                                                              ███--> throw

[vm] Use codegen block order in regalloc in AOT.
[vm/compiler] Improve AOT block scheduler
[vm/compiler] Move reorder_blocks onto the graph.
did not move the needle much (maybe ~5%)

                      // In _ChunkedJsonParser.parse
                      while (position < length) {
                        int char = getChar(position);
                        switch (char) {
                          case SPACE:
                          case CARRIAGE_RETURN:
                          case NEWLINE:
                          case TAB:
                            // ...
                          // other characters
                        }
                      }

                        // In _ChunkedJsonParser.parseString
                        while (position < end) {
                          int char = getChar(position++);
                          bits |= char;
                          if (char > BACKSLASH) continue;
                          // Escape sequence? Use more complex parsing.
                          if (char == BACKSLASH) return handleEscapes(...);
                          if (char == QUOTE) return ...;  // end of string
                          if (char < SPACE) fail(...);  // invalid json
                        }

Similar pattern of code:

  1. Read a character;
  2. Use a chain of ifs to categorize it.

Idea: could use lookup table instead!

                          do {
                            char = getChar(position++);
                            bits |= char;
                            final attrs = _characterAttributes.codeUnitAt(char);
                            if ((attrs & simpleStringEndBit) != 0) break;
                          } while (position < end);
                          if (char == QUOTE) return ...; // end of string
                          if (char == BACKSLASH) return handleEscapes(...);
                          if (char < SPACE) fail(...); // invalid json

                      static const String _characterAttributes =
                        '!!!!!!!!!##!!#!!!!!!!!!!!!!!!!!!" !                             '
                        '                            !                                   '
                        '                                                                '
                        '                                                                ';

                      static const String _characterAttributes =
                        // This string has length 256. If `ch` ends
                        // a simple string (i.e. `ch` is QUOTE, BACKSLASH or a
                        // control character), then `_characterAttributes.codeUnitAt(ch)`
                        // will have [simpleStringEndBit] set.
                        // Similarly, if `ch` is whitespace (SPACE, CR, LF, TAB),
                        // then it has [whiteSpaceBit] set.
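
A runnable sketch of the lookup-table idea, under a few assumptions: the table is a `Uint8List` built at startup rather than a const String, the bit values 1 and 2 are picked for the example, and the helpers `findSimpleStringEnd` and `skipWhitespace` are hypothetical names, not the SDK's.

```dart
import 'dart:convert';
import 'dart:typed_data';

// Bit assignments chosen for this sketch.
const int simpleStringEndBit = 1;
const int whiteSpaceBit = 2;

/// 256-entry table: one attribute byte per possible input byte.
final Uint8List characterAttributes = _buildTable();

Uint8List _buildTable() {
  final table = Uint8List(256);
  // Control characters end a simple string (they are invalid inside one).
  for (var ch = 0; ch < 0x20; ch++) {
    table[ch] |= simpleStringEndBit;
  }
  table[0x22] |= simpleStringEndBit; // QUOTE
  table[0x5C] |= simpleStringEndBit; // BACKSLASH
  for (final ch in const [0x20, 0x09, 0x0A, 0x0D]) {
    table[ch] |= whiteSpaceBit; // SPACE, TAB, LF, CR
  }
  return table;
}

/// Index of the first byte at or after [start] that ends a simple string,
/// or `bytes.length` if there is none.
int findSimpleStringEnd(Uint8List bytes, int start) {
  var position = start;
  while (position < bytes.length) {
    final attrs = characterAttributes[bytes[position]];
    if ((attrs & simpleStringEndBit) != 0) break;
    position++;
  }
  return position;
}

/// Index of the first non-whitespace byte at or after [start].
int skipWhitespace(Uint8List bytes, int start) {
  var position = start;
  while (position < bytes.length &&
      (characterAttributes[bytes[position]] & whiteSpaceBit) != 0) {
    position++;
  }
  return position;
}

void main() {
  final bytes = Uint8List.fromList(utf8.encode('  "hello" : 1'));
  final open = skipWhitespace(bytes, 0); // index of the opening quote
  final close = findSimpleStringEnd(bytes, open + 1);
  print(utf8.decode(bytes.sublist(open + 1, close))); // hello
}
```

One table load and one bit test replace the chain of comparisons; the caller then inspects the byte it stopped on to decide between end of string, escape, or error.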

This helped a bit, but:

  1. if ((attrs & simpleStringEndBit) != 0) was producing bad code;
  2. Bounds checks on getChar(position) were causing code quality issues;
  3. Interrupt checks in loop headers were eating time.

so I landed some changes first

[vm] Fix pragma vm:unsafe:no-interrupts
[vm] Add pragma vm:unsafe:no-bounds-checks
[vm] Enable test pattern (a&b == 0) fusion in AOT on X64/ARM64
[vm/libs] Improve JsonUtf8Decoder performance.
(This improved the benchmark by another ~30-40%)

8.41% _ChunkedJsonParser.parseString
8.36% _ChunkedJsonParser.parse
8.07% String.hashCode
7.14% _LinkedHashMapMixin._insert
6.26% _ChunkedJsonParser.parseStringToBuffer
4.56% _LinkedHashMapMixin._set
4.36% _Utf8Decoder.convertChunked
3.99% _JsonUtf8Parser.getString
3.76% _LinkedHashMapMixin._findValueOrInsertPoint
3.13% _LinkedHashMapMixin._init
2.30% StringBuffer._addPart
2.21% _ChunkedJsonParser.parseStringEscape
2.19% allocateOneByteString
1.65% dart::Instance::CheckedHandle(dart::Zone*, dart::ObjectPtr)
1.60% dart::BootstrapNatives::DN_StringBuffer_createStringFromUint16Array(dart::Thread*, dart::Zone*, dart::NativeArguments*)
1.58% _OperatorEqualsAndHashCode._hashCode
1.55% _LinkedHashMapMixin.[]=

commit d91679987930f4fd6b0a5b3a3f328b30841ceea1
Author: Slava Egorov
Date:   Tue Aug 13 12:00:48 2024 +0000

[vm/corelib] Optimize building of Maps in JSON decoder.

Instead of gradually adding key-value pairs into the Map as JSON
is being parsed collect all key values first and then allocate
the map with appropriate capacity.



Commit-Queue: Slava Egorov
Reviewed-by: Lasse Nielsen

                        $ dart compile exe json-benchmark.dart
                        $ json-benchmark.exe github_events.json
                        JsonUtf8Decoder: 3.5 ns/byte (~30% slower than V8)

[Table: Input JSON | ns/byte | vs V8]

More improvements are possible:

  1. Strings with escapes hit bad performance in StringBuffer;
  2. Can intern strings and speculate about the next possible Map key.
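
A rough sketch of what interning Map keys could look like. This is not the SDK implementation: `KeyInterner`, its 256-slot direct-mapped cache and the FNV-1a hash are all assumptions made for the example. The point is only that a key seen before can be returned without decoding and allocating a new String.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical direct-mapped cache from key bytes to decoded Strings.
class KeyInterner {
  final List<Uint8List?> _keys = List.filled(256, null);
  final List<String?> _strings = List.filled(256, null);
  int hits = 0;

  /// Returns the String for `bytes[start..end)`, reusing a cached instance
  /// when the same byte sequence was interned before.
  String intern(Uint8List bytes, int start, int end) {
    // FNV-1a over the key bytes, kept within 32 bits.
    var hash = 0x811C9DC5;
    for (var i = start; i < end; i++) {
      hash = ((hash ^ bytes[i]) * 0x01000193) & 0xFFFFFFFF;
    }
    final slot = hash & 0xFF;

    final cached = _keys[slot];
    if (cached != null && _sameBytes(cached, bytes, start, end)) {
      hits++;
      return _strings[slot]!;
    }

    // Miss: copy the key bytes, decode once, remember both.
    final key = bytes.sublist(start, end);
    final string = utf8.decode(key);
    _keys[slot] = key;
    _strings[slot] = string;
    return string;
  }

  static bool _sameBytes(Uint8List key, Uint8List bytes, int start, int end) {
    if (key.length != end - start) return false;
    for (var i = 0; i < key.length; i++) {
      if (key[i] != bytes[start + i]) return false;
    }
    return true;
  }
}

void main() {
  final doc = Uint8List.fromList(utf8.encode('"id":1,"id":2'));
  final interner = KeyInterner();
  final first = interner.intern(doc, 1, 3); // first "id"
  final second = interner.intern(doc, 8, 10); // second "id"
  print(first); // id
  print(identical(first, second)); // true
  print(interner.hits); // 1
}
```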

but what if we rethink this?

Fastest piece of code is
the one which does not need to run.

—  zen of optimization

                        #include "simdjson.h"

                        // On-Demand JSON: A Better Way to Parse Documents?
                        // This library uses SIMD to blaze through the
                        // document many characters at a time.

                        ondemand::parser parser;
                        ondemand::document doc = parser.iterate(json);
                        // Extract "created_at" from 100th status.
                        // This does not actually parse most of the document.

                        $ dart create -t console dart_simdjson
                        $ cd dart_simdjson
                        $ mkdir -p src/third_party/simdjson
                        $ curl ... # pull simdjson.{h,cpp}
                        $ mkdir hook
                        $ vi hook/build.dart

Native Assets!

// src/simdjson_api.cpp
extern "C" void* simdjson_parse(
    const uint8_t* data, size_t size) {
  ondemand::parser parser;
  ondemand::document doc = parser.iterate(data, size);

  // ...
}

                    // lib/simdjson_capi.dart
                    @Native<
                      Pointer<Void> Function(Pointer<Uint8>, Size)
                    >()
                    external Pointer<Void> simdjson_parse(
                        Pointer<Uint8> buf, int len);

                        // hook/build.dart
                          name: packageName,
                          assetName: 'simdjson_capi.dart',
                          language: Language.cpp,
                          sources: [
                          defines: {'SIMDJSON_EXCEPTIONS': '0'},
                          flags: ['--std=c++20', '-O3'],

                        # Run in JIT mode. Will invoke build hook to build
                        # necessary native dependencies.

                        $ dart --enable-experiment=native-assets run bin/benchmark.dart

                        # Build in AOT mode. Will invoke build hook to build
                        # necessary native dependencies.

                        $ dart --enable-experiment=native-assets build bin/benchmark.dart
                        $ tree bin/benchmark/
                        ├── benchmark.exe
                        └── lib

One possible way to use simdjson:

  1. Take a schema and generate a layout descriptor.
  2. Use simdjson to parse JSON and inflate it into a native object based on the descriptor.
  3. Dart can then take data from the native object by fixed offsets.



we removed the intermediate String


Maps are often also just intermediary objects

Uint8List → Map → X

So we would like to eliminate them

                    class Tweet {
                      final String? created_at;
                      final int id;
                      // ...
                    }

                    // binary layout:
                    //   created_at  const char* at offset  0
                    //   id          int64_t     at offset 16

                    class Tweet {
                      final Pointer<Void> ptr;

                      Tweet._(this.ptr);

                      String? get created_at => util.loadOptionalString(ptr, 0);
                      int get id => util.loadInt(ptr, 16);

                      factory Tweet.fromJsonBytes(Uint8List bytes) =>
                        Tweet._(simdjson_parse(bytes, bytes.length, descriptor));
                      static final descriptor = 'created_at,s?,id,i!';
                    }
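
The "read fields by fixed offsets" idea can be tried without any native code. The sketch below swaps the native object for a plain `Uint8List` and uses a made-up layout (different from the one on the slide): `id` as int64 at offset 0, then a uint32 offset and uint32 length for `created_at`, with length 0xFFFFFFFF meaning null. `TweetView` and `encodeTweet` are hypothetical names; `encodeTweet` stands in for the native side that would fill the buffer.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// A view over an already "inflated" record. No Map, no eager decoding:
/// each getter reads directly from the buffer at a fixed offset.
class TweetView {
  TweetView(this._bytes) : _data = ByteData.sublistView(_bytes);

  final Uint8List _bytes;
  final ByteData _data;

  static const int absent = 0xFFFFFFFF;

  int get id => _data.getInt64(0, Endian.little);

  String? get createdAt {
    final length = _data.getUint32(12, Endian.little);
    if (length == absent) return null;
    final offset = _data.getUint32(8, Endian.little);
    return utf8.decode(
        Uint8List.sublistView(_bytes, offset, offset + length));
  }
}

/// Stand-in for the native parser: writes a record in the assumed layout.
Uint8List encodeTweet({required int id, String? createdAt}) {
  final text = createdAt == null ? Uint8List(0) : utf8.encode(createdAt);
  final bytes = Uint8List(16 + text.length);
  final data = ByteData.sublistView(bytes);
  data.setInt64(0, id, Endian.little);
  data.setUint32(8, 16, Endian.little); // string payload starts at 16
  data.setUint32(12, createdAt == null ? TweetView.absent : text.length,
      Endian.little);
  bytes.setRange(16, 16 + text.length, text);
  return bytes;
}

void main() {
  final tweet = TweetView(encodeTweet(id: 7, createdAt: 'Mon Sep 24'));
  print(tweet.id); // 7
  print(tweet.createdAt); // Mon Sep 24

  final bare = TweetView(encodeTweet(id: 8));
  print(bare.createdAt); // null
}
```

Fields that are never read cost nothing on the Dart side, which is the property the rest of the talk builds on.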

Would be nice to generate with a macro!

... but alas: you can't yet augment fields

                        'Tweet': {
                          'created_at': Field.required(PrimitiveType('String')),
                          'id': Field.required(PrimitiveType('int')),
                          'text': Field.required(PrimitiveType('String')),
                          'user': Field.required(CompoundType('User')),
                          // ... 15 more fields
                        },
                        'User': {
                          'id': Field.required(PrimitiveType('int')),
                          'name': Field.required(PrimitiveType('String')),
                          // ... 30 more fields
                        },

                        $ dart --enable-experiment=native-assets build bin/benchmark.dart
                        $ build/benchmark.exe twitter_timeline.json
                        1.62 ns/byte

The program loads data which it does not access.

Tree shaker knows!

... but it can't prune descriptors

                        // runtime/docs/compiler/
                        T Function()? weakRef<T>(T Function()? x) => x;

                        // tldr: `weakRef(f)` will become `null` in AOT
                        // if compiler figures out that `f` is not referenced
                        // from anywhere else.

                        class Tweet {
                            static final String _descriptors = [
                              if (_weakRef(_offsetOf$created_at) != null)
                            static int _offsetOf$created_at() => 0;
                            String get created_at =>
                              utils.loadString(_data, _offsetOf$created_at());

                        $ dart --enable-experiment=native-assets build bin/benchmark.dart
                        $ build/benchmark.exe twitter_timeline.json
                        0.70 ns/byte

but not parsing much now: we don't access any fields :)

have a mental model

Can't improve performance if you don't understand the cost.

look for things to stop doing

That's the simplest way to optimize things.

Maybe jsonDecode is doing alright?

3 ns/byte is ~300 MB/s. Give server-side Dart a try :)
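
The conversion is just a reciprocal: 1 byte per 3 ns is 10^9 / 3 bytes per second, about 333 MB/s (taking 1 MB = 10^6 bytes). A tiny helper, with a name invented for the example, applies it to the numbers measured earlier in the talk:

```dart
/// Converts a cost in ns per byte into throughput in MB/s (1 MB = 10^6 bytes).
double megabytesPerSecond(double nsPerByte) => 1e9 / nsPerByte / 1e6;

void main() {
  const measurements = <String, double>{
    'round number': 3.0,
    'JsonUtf8Decoder, before': 8.53,
    'JsonUtf8Decoder, mixin + Uint8List': 5.95,
    'JsonUtf8Decoder, after all changes': 3.5,
    'V8 UTF8Decode+JSON.parse': 2.78,
  };
  for (final entry in measurements.entries) {
    final mbps = megabytesPerSecond(entry.value).toStringAsFixed(0);
    print('${entry.key}: ${entry.value} ns/byte = $mbps MB/s');
  }
}
```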

pragmatic amalgamation

I want to write Dart but I don't want to reinvent the wheel.