Getting Fancy with Rust and WebAssembly (Part 4) - Reducing the Size of WASM Files

Oct 27, 2023

I. Introduction

In the previous article, "Getting Fancy with Rust and WebAssembly (Part 3) - DOM Manipulation and Type Conversion", we discussed how Rust and JS interact, including calling functions across the boundary in both directions. A particularly mind-blowing feature was calling Rust structs from JS. JS itself doesn't even have structs, yet it can work with ones defined in Rust. This is great for the development experience of Rust developers!

Based on the previous articles in this series, you can already build complete functionality in Rust.

However, when importing wasm files on the frontend, there can still be issues, such as large wasm files leading to longer page load times and a poor user experience. This article explores several ways to reduce the size of Rust-compiled wasm files, and thereby the time the frontend spends loading them.

For a while I was using Go to develop WebAssembly, and the compiled wasm files were quite large. Shrinking them was painstaking work, and wasm files in the 1xMB range were really painful...... But size optimization often points to one very tempting way out - switching to Rust 😆! If you're already using Rust to develop WebAssembly, congratulations: when it comes to wasm size, you're already winning at the starting line compared to Go.

II. Environment

  • Rust 1.70.0
  • wasm-bindgen 0.2.87

III. Checking the Size

Before actually reducing the size, we need to first look at what the current size is, to facilitate later comparison of before and after sizes.

There are multiple ways to check the size. Here are a few options (for Linux and macOS); any one of them will do.

ls

You can use ls -l or ll:

$ ll pkg/hello_wasm_bg.wasm
-rw-r--r--  1 hunter  staff    23K Jul 20 21:52 pkg/hello_wasm_bg.wasm

stat

$ stat pkg/hello_wasm_bg.wasm
16777222 142141572 -rw-r--r-- 1 hunter staff 0 23347 "Jul 20 21:52:53 2023" "Jul 20 21:52:01 2023" "Jul 20 21:52:01 2023" "Jul 20 21:52:01 2023" 4096 48 0 pkg/hello_wasm_bg.wasm

wc

$ wc -c pkg/hello_wasm_bg.wasm
   23347 pkg/hello_wasm_bg.wasm

Taking wc as the reference, the current wasm file is 23347 bytes.

IV. Code Level

Optimization at the code level mainly relies on Link-Time Optimization (LTO).

LTO is an interprocedural optimization performed while the program is being linked: the compiler gets to optimize across multiple compilation units during the linking phase, which can improve performance and shrink the resulting binary.

1. Within the Code

Enable LTO in Cargo.toml:

[profile.release]
lto = true

Although enabling LTO can reduce the compiled size, it also increases compilation time.

With LTO enabled, the default release settings still optimize primarily for speed, which only shrinks the output to a certain extent. If what you care about is a smaller binary rather than a faster one, you can additionally set the optimization level (opt-level) to target size.

The following size-oriented levels can be used:

  • s: optimize for binary size
  • z: optimize for binary size even more aggressively (this also turns off loop vectorization)

So you can configure it in Cargo.toml like this:

[profile.release]
lto = true
opt-level = 'z'
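
With the profile configured, rebuild the wasm in release mode. For reference, the rebuild might look like this (assuming the project is built with wasm-pack, as in the earlier parts of this series; adjust if your build command differs):

wasm-pack build --release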

The original file size was 23347 bytes; now let's look at the size after recompiling:

$ wc -c pkg/hello_wasm_bg.wasm
   19879 pkg/hello_wasm_bg.wasm

It's clearly smaller! Note, however, that using the z level doesn't guarantee a smaller result than s; sometimes s ends up smaller than z, so which one wins depends on the actual code and is worth measuring.

2. Outside the Code

Outside the code, you can use wasm-opt for optimization. It can perform all kinds of optimizations on WebAssembly modules; this article only looks at the size-related ones (I'll leave the rest as a topic for a future post 😏). And since wasm-opt works on any wasm file that conforms to the WebAssembly specification, you can use it even if the module wasn't written in Rust. (Thinking back to the wasm I used to write in Go, that's one more way it could have been optimized......)
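
wasm-opt ships as part of the Binaryen toolkit. How you install it depends on your platform; as one example (an assumption on my part, not something covered in this series), on macOS it can be installed via Homebrew:

brew install binaryen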

First, let's look at the basic optimization parameters of wasm-opt:

  • -o: specify the output file for the optimized module
  • -O: run the default optimization passes (equivalent to -Os)
  • -O0: no optimization
  • -O1: quick, basic optimizations, useful for fast iteration builds
  • -O2: most optimization passes; generally gets most of the performance benefit
  • -O3: the most thorough optimizations, which can take considerably more time
  • -O4: like -O3, but also flattens the IR, trading even more time and memory for potentially better results
  • -Os: optimize with a focus on code size, accepting some cost in performance
  • -Oz: like -Os, but focuses on code size even more aggressively

Given the theme of this article, we'll use the -Os and -Oz parameters here, which correspond to the s and z levels in the "Within the Code" section above.

Here, let's run it with the -Oz parameter on the original wasm file and see how it compares.
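
The invocation would look something like this (the output file name is only an example, chosen to match the wc check below; the input is the wasm from the earlier sections):

wasm-opt -Oz pkg/hello_wasm_bg.wasm -o pkg/output.wasm

Then check the size of the result: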

$ wc -c pkg/output.wasm
   23194 pkg/output.wasm

Next, let's run wasm-opt on the wasm file that was compiled with the "Within the Code" LTO settings enabled, and compare again:

$ wc -c pkg/output.wasm
   19871 pkg/output.wasm

Overall, the wasm file keeps getting smaller. It's just that the example here uses the code accumulated over this series of articles, which has no real complexity and is already quite small to begin with, so the effect isn't dramatic.

V. Network Level

At the network level, this is the familiar compression of network transmission: the client and server agree on a compression algorithm, the server compresses the response before sending it, and the client decompresses it on arrival. Compression at this level shrinks the transferred payload without losing any information.

The best-known example is gzip, but there are several such compression algorithms:

  • gzip
  • compress
  • deflate
  • br

Among them, gzip is the most widely supported, so we'll take gzip as the example here.

At the network level, compressing the wasm file with gzip reduces its size during transmission. It only reduces the transfer size, though: the browser still spends some CPU decompressing the data once it arrives.

1. Enabling GZIP

Enabling gzip is actually simple: the frontend and backend just need to agree that both sides use gzip.

First, when the frontend requests the wasm file, the browser lists the compression methods it supports in the request header:

Accept-Encoding: gzip, deflate

Then, when the server receives this request, it picks a compression method that it also supports and tells the browser which one it used.

It communicates this to the browser via the response header:

Content-Encoding: gzip

That's all it takes to enable gzip.

The browser then receives the response headers and body, sees that the backend used gzip, and automatically decompresses the data to recover the original content.
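
In other words, the frontend code itself doesn't change at all. As a minimal sketch (the URL is just an example, and the import object depends on your module; with wasm-bindgen you would normally call its generated init function instead), loading works exactly as before, because the browser strips the Content-Encoding before handing the bytes over:

async function loadWasm() {
  // The browser gunzips the response transparently; nothing gzip-specific here.
  const { instance } = await WebAssembly.instantiateStreaming(
    fetch("pkg/hello_wasm_bg.wasm"),
    {} // import object, if the module needs one
  )
  return instance
}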

2. Server Support

You might wonder: the browser can decompress automatically, but how does the server compress automatically? Does the backend need to write code to compress the file?

Of course not: just let the HTTP server take care of it. Here we'll take the familiar Nginx as an example.

The simplest setup enables gzip with a single line of configuration:

gzip on;

You can also tune gzip with a few parameters, such as which MIME types get compressed, the minimum response length worth compressing, and so on:

gzip on;
gzip_types      text/plain application/xml;
gzip_proxied    no-cache no-store private expired auth;
gzip_min_length 1000;
...
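
One detail worth adding for this particular use case (my own note, not part of the snippet above): for Nginx to actually compress the wasm response, the wasm MIME type needs to appear in gzip_types, and the .wasm extension must be mapped to that type (recent Nginx versions include it in mime.types; older ones may need it added). For example:

gzip_types      text/plain application/xml application/wasm;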

For other HTTP servers, refer to their respective official documentation.

VI. Physical Level

You might be surprised: what physical level?!

That's right, it really is the physical level - just gzip the wasm file itself, on disk! This trick is honestly great; I discovered it while looking for ways to shrink wasm back when I was developing it in Go. If you've run every other optimization into the ground, it's worth boldly giving this one a try. 😆

Remember the question in the "Network Level" section above about whether the file needs to be compressed by hand? Well, here it's manual compression and decompression all the way, haha.

1. Physical Compression

First, gzip the wasm file directly. We'll use the original wasm (23347 bytes) here:

gzip -c pkg/hello_wasm_bg.wasm > pkg/output.gz

Then, look at its size:

$ wc -c pkg/output.gz
   10403 pkg/output.gz

The effect is excellent: reduced from 23347 bytes to 10403 bytes!

Now let's do the same with the wasm that already went through the "Code Level" optimizations above, and check the final size:

$ wc -c pkg/output.gz
    9237 pkg/output.gz

Even better: reduced from 19871 bytes to 9237 bytes!

So with this approach, we compress the wasm file ahead of time and store the .gz, and the browser requests the .gz file directly.

2. Physical Decompression

After the browser fetches the .gz file, it has to decompress it itself.

Here we recommend using the frontend library pako to decompress the .gz file:

async function gunzipWasm() {
  const res = await fetch("target.gz")
  let buffer = pako.ungzip(await res.arrayBuffer())
  // A fetched response might be decompressed twice on Firefox.
  // See https://bugzilla.mozilla.org/show_bug.cgi?id=610679
  if (buffer[0] === 0x1f && buffer[1] === 0x8b) {
    buffer = pako.ungzip(buffer)
  }
  return buffer
}

The decompressed bytes can then be passed on to instantiate the module.
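
A rough sketch of that step (the module path and init function are the wasm-bindgen generated glue assumed from the earlier parts of this series; treat the exact file names as placeholders for your own project):

import init from "./pkg/hello_wasm.js"

async function run() {
  const bytes = await gunzipWasm()
  // wasm-bindgen's init accepts raw bytes (a BufferSource) as well as a URL,
  // so the decompressed wasm can be passed to it directly.
  await init(bytes)
  // The Rust exports are ready to call from here on.
}

run()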

VII. BUFF Stacking

Now let's apply all of the methods above at once, stack every buff, and see how small the wasm from this case (accumulated over this series of articles) can get. The "Physical Level" section already had every buff applied except the "Network Level" one, so we can start from its result. For the "Network Level", which compresses the transfer with gzip, we'll estimate a compression rate of 40%.

That puts the final size of this case at roughly 9237 × (1 − 0.4) ≈ 5542 bytes, a reduction of about 77% from the original 23347 bytes!

Of course, we should also count the initial language buff: Rust itself already keeps the compiled wasm small to begin with.

VIII. Conclusion

In this article, we introduced wasm file size optimizations at three levels (code, network, and physical), covering four approaches in total.

Finally, with all the buffs stacked, the current case (accumulated over this series of articles) shrinks by about 77%, which really feels great, haha.

Hope this can be helpful to everyone.