Making base64 fast enough to disappear

The problem

Base64 is the kind of thing you never think about until it’s the slowest line in a flamegraph.

I kept hitting it in a React Native image pipeline: every encode and decode of a large payload ran on the JS thread, and react-native-quick-base64 was already the fastest JavaScript-reachable option. So the ceiling wasn’t just the library’s algorithm. It was also everything happening between JavaScript and native on every single call. That’s where I went looking.

The decision

There were three separate costs, and I wanted to be honest about which ones were actually worth touching. The rule I gave myself: a speedup has to be worth the complexity it adds, and the complexity has to stay contained — no architecture-specific code leaking into the public API, no behaviour changes that force a migration.

That ruled out a rewrite. What it ruled in was a sequence of targeted changes, each one measurable on its own: first stop copying data needlessly across the JSI boundary, then make the actual codec use the CPU’s SIMD units, then delete an encoding round-trip I shouldn’t have been paying for at all.

What I did

Eliminating ArrayBuffer copies at the JSI boundary (#49)

Every call was copying the payload as it crossed between JavaScript and C++. For small strings that’s noise; for image-sized buffers it’s most of the cost. This PR reworked the JSI boundary to operate on the underlying buffer directly instead of duplicating it on the way in and out.
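The copy-versus-view distinction is easiest to see from the JavaScript side. The sketch below is only an illustration with made-up variable names, not the C++ code from the PR: it contrasts ArrayBuffer's slice(), which duplicates the bytes, with a Uint8Array view, which shares them.

```typescript
// Illustrative only: these names are hypothetical, not the library's API.
const payloadBuffer = new ArrayBuffer(8);
const sourceBytes = new Uint8Array(payloadBuffer);
sourceBytes.set([10, 20, 30, 40, 50, 60, 70, 80]);

// Copy: slice() allocates a second buffer and duplicates bytes 2..5.
const copiedBytes = new Uint8Array(payloadBuffer.slice(2, 6));

// View: same memory, just an offset and a length. Nothing is duplicated.
const viewedBytes = new Uint8Array(payloadBuffer, 2, 4);

// A later write to the source shows which of the two shares storage.
sourceBytes[2] = 99;

const copySeesWrite = copiedBytes[0] === 99; // false: the copy owns its bytes
const viewSeesWrite = viewedBytes[0] === 99; // true: the view reads the same memory
const viewSharesBuffer = viewedBytes.buffer === payloadBuffer; // true
```

The copy costs time and memory proportional to the payload; the view costs the same no matter how large the buffer is, which is why the difference only shows up at image sizes.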

A SIMD-accelerated codec via simdutf (#50)

The core change: replace the scalar base64.h implementation with simdutf, which encodes and decodes using SIMD instructions — processing 16 or 32 bytes per instruction instead of one. simdutf picks the best instruction set available at runtime, so the same binary stays fast across the architectures React Native actually ships on.

Skipping the UTF-16 → UTF-8 round-trip on decode (#51)

JavaScript strings are UTF-16; the decoder was re-encoding them to UTF-8 before doing any work. Base64 text is pure ASCII, so that conversion bought nothing. Using JSI’s getStringData to read the string data directly removed the round-trip entirely. It’s a smaller win than the SIMD codec, but essentially free, and it applies to every decode.
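Why the round-trip is avoidable can be sketched in a few lines. Every valid base64 character is ASCII, so a UTF-16 code unit can index a lookup table directly. This is a hypothetical scalar decoder written for illustration (decodeFromCodeUnits and the table names are mine, and it is not the library's decoder); it assumes padded input and does not check for stray trailing bits.

```typescript
// Hypothetical sketch, not the library's code.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Lookup table indexed by UTF-16 code unit; -1 marks "not a base64 character".
const SEXTET_TABLE = new Int8Array(128).fill(-1);
for (let i = 0; i < BASE64_ALPHABET.length; i++) {
  SEXTET_TABLE[BASE64_ALPHABET.charCodeAt(i)] = i;
}

function decodeFromCodeUnits(text: string): Uint8Array {
  if (text.length % 4 !== 0) {
    throw new Error("padded base64 length must be a multiple of 4");
  }
  let padding = 0;
  if (text.endsWith("==")) padding = 2;
  else if (text.endsWith("=")) padding = 1;

  const dataLength = text.length - padding;
  const out = new Uint8Array((text.length / 4) * 3 - padding);
  let bits = 0; // accumulator for bits not yet written out
  let bitCount = 0; // how many bits the accumulator currently holds
  let written = 0;

  for (let i = 0; i < dataLength; i++) {
    // Read the code unit as-is: no TextEncoder, no UTF-8 pass.
    const unit = text.charCodeAt(i);
    const sextet = unit < 128 ? (SEXTET_TABLE[unit] ?? -1) : -1;
    if (sextet < 0) {
      throw new Error(`invalid base64 character at index ${i}`);
    }
    bits = (bits << 6) | sextet;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      out[written++] = (bits >> bitCount) & 0xff;
      bits &= (1 << bitCount) - 1; // keep only the leftover bits
    }
  }
  return out;
}
```

For example, decodeFromCodeUnits("TWFu") yields the bytes 77, 97, 110 ("Man"), and anything outside the ASCII range is rejected by the range check rather than transcoded.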

The proof

On a SIMD micro-benchmark, large-payload decode came out roughly 150× faster than the scalar path. That number is the headline, but it’s also the least honest one on its own — micro-benchmarks measure the codec in isolation. End to end, inside the real image pipeline where base64 is one step among many, the improvement landed at a still-decisive 2–3×. Both numbers matter; neither means anything without the other.
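One way to see how 150× and 2–3× can both be true is Amdahl's law. The sketch below is a back-of-the-envelope model, not a measurement: it assumes the entire end-to-end gain comes from a single step getting 150× faster, which ignores the other two changes, and the function names are my own.

```typescript
// Amdahl's law: if a step takes `fraction` of total runtime and gets
// `stepSpeedup` times faster, the whole pipeline speeds up by this factor.
function overallSpeedup(fraction: number, stepSpeedup: number): number {
  return 1 / ((1 - fraction) + fraction / stepSpeedup);
}

// The same formula solved for the fraction: what share of the original
// runtime the step must have had to produce a given overall speedup.
function impliedFraction(overall: number, stepSpeedup: number): number {
  return (1 - 1 / overall) / (1 - 1 / stepSpeedup);
}

// Assumption: the codec alone is 150x faster, as in the micro-benchmark.
const codecSpeedup = 150;
const shareAt2x = impliedFraction(2, codecSpeedup); // about 0.50
const shareAt3x = impliedFraction(3, codecSpeedup); // about 0.67
```

Under that simplified model, a 2–3× end-to-end result corresponds to base64 having taken roughly half to two thirds of the original pipeline time. The rest of the pipeline was untouched, which is why the headline number shrinks so much.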

All three changes landed in v3.0.0 — a major release of a library I don’t maintain. The external sign-off is the part I can’t give myself.

One honest moment

The first version of the SIMD path looked great in the benchmark and was subtly wrong on a payload-length edge case — the padding handling diverged from the scalar path on certain sizes. The benchmark was happy; the test suite was not. That’s exactly the trade-off I’d flagged going in: SIMD buys speed and charges correctness risk, so the path is only worth shipping behind tests that can catch it lying.
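The kind of test that catches this is a length sweep: run the trusted path and the fast path over every payload size in a range and compare the outputs. What follows is a minimal sketch of the idea, not the library's test suite. Both encoders are hypothetical stand-ins, and the "fast" one is broken on purpose so the sweep has something to find.

```typescript
// Hypothetical sketch: neither encoder here is the library's code.
type Encoder = (input: Uint8Array) => string;

const ENCODE_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Straightforward reference encoder: three bytes in, four characters out.
function referenceEncode(input: Uint8Array): string {
  let out = "";
  for (let i = 0; i < input.length; i += 3) {
    const remaining = input.length - i;
    const b0 = input[i] ?? 0;
    const b1 = input[i + 1] ?? 0; // missing bytes count as zero
    const b2 = input[i + 2] ?? 0;
    const group = (b0 << 16) | (b1 << 8) | b2;
    out += ENCODE_ALPHABET.charAt((group >> 18) & 63);
    out += ENCODE_ALPHABET.charAt((group >> 12) & 63);
    out += remaining > 1 ? ENCODE_ALPHABET.charAt((group >> 6) & 63) : "=";
    out += remaining > 2 ? ENCODE_ALPHABET.charAt(group & 63) : "=";
  }
  return out;
}

// Deliberately broken stand-in for a fast path: it drops the "==" padding
// whenever exactly one byte is left over at the end.
function buggyFastEncode(input: Uint8Array): string {
  const encoded = referenceEncode(input);
  return input.length % 3 === 1 ? encoded.slice(0, -2) : encoded;
}

// Returns every payload length in [0, maxLength] where the two outputs differ.
function divergentLengths(
  reference: Encoder,
  candidate: Encoder,
  maxLength: number,
): number[] {
  const found: number[] = [];
  for (let length = 0; length <= maxLength; length++) {
    const input = new Uint8Array(length);
    for (let i = 0; i < length; i++) {
      input[i] = (i * 31 + length) & 0xff; // deterministic filler bytes
    }
    if (reference(input) !== candidate(input)) {
      found.push(length);
    }
  }
  return found;
}

const sweepFailures = divergentLengths(referenceEncode, buggyFastEncode, 8); // [1, 4, 7]
const sweepSelfCheck = divergentLengths(referenceEncode, referenceEncode, 64); // []
```

A benchmark that only uses payload sizes divisible by three would never touch the broken branch, which is how a fast path can look perfect and still be wrong.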
