Rust is the perfect language. Many bugs are caught at compile-time and therefore you never need to debug weird runtime issues. JavaScript is the language of the web. Combined, Rust and JavaScript are the perfect pairing and there are no issues at all.
This is a pre-release, unlisted draft article. The presented information might be inaccurate. Once ready, it is going to be part 5 of the Wasm in the Wild series.
The article is also available as a talk. TODO: link to talk on YouTube
The demo application
To demonstrate how flawlessly the two languages combine, I will show you a toy project that detects whether the user is a human or a bot. The idea is quite simple, the user clicks a button and we track the mouse movements until the button is clicked. JavaScript code forwards the movements to Rust code.
We then use the power of Rust to check if the mouse moved in robotically straight lines, or if there is enough irregular jitter in the movement that it could be from a human. Foolproof bot detection right here!
You might ask, why we even need JavaScript for this. Can we not do it all in Rust?
The answer is, yes, you could modify the DOM from Rust and even set up event handlers. However, that kind of code gets too verbose for my taste, especially for an introductory article. Hence, I prefer to use JavaScript for anything that accesses the browser API.
If you think of it in terms of the Model-view-controller design pattern, JavaScript code handles the view and the controller, while Rust code represents the model.
Now then, let us start with the model, using Rust. The struct MyBotDetection
is going to encapsulate all of the Rust state and expose public methods to the
JavaScript using wasm-bindgen.
// lib.rs
#[wasm_bindgen]
pub struct MyBotDetection {
events: Vec<Event>,
}
The events stored inside contain extracted information about mouse movements recorded on the JavaScript side.
// lib.rs
pub type Timestamp = u32;
#[derive(Debug, Clone)]
#[wasm_bindgen]
pub struct Event {
pub timestamp: Timestamp,
pub coordinate: Coordinate,
}
#[derive(Debug, Clone, Copy)]
#[wasm_bindgen]
pub struct Coordinate {
pub x: i32,
pub y: i32,
}
To add a new event, MyBotDetection exposes a method called add_event which accepts a MouseEvent.
// lib.rs
#[wasm_bindgen]
impl MyBotDetection {
#[wasm_bindgen(js_name = addEvent)]
pub fn add_event(
&mut self,
timestamp: Timestamp,
event: web_sys::MouseEvent,
) -> Result<(), JsValue> {
let new_event = Event {
timestamp,
coordinate: Coordinate {
x: event.client_x(),
y: event.client_y(),
},
};
self.events.push(new_event);
Ok(())
}
}
MouseEvent is a type generated by the web-sys crate. It allows passing the event argument of a JavaScript-side mouse event handler directly to Rust. Feeding the mouse movements from JavaScript to Rust therefore becomes as easy as demonstrated by the next code sample.
// main.js
window.addEventListener(
"mousemove",
(event) => {
detection.addEvent(Date.now(), event);
}
);
This was the first part of the controller code in the MVC pattern.
Now, to evaluate the recorded events, we expose another function, eloquently named is_bot, which returns an output object.
// lib.rs
#[wasm_bindgen]
impl MyBotDetection {
#[wasm_bindgen(js_name = isBot)]
pub fn is_bot(&self) -> BotDetectionOutput {
let human_score = self.jitter();
let result_text =
if human_score < 0.5 { "Robot" }
else { "Human" }
.to_owned();
let timestamp =
self.events.last()
.map(|e| e.timestamp)
.unwrap_or_default();
BotDetectionOutput {
timestamp,
human_score,
result_text,
}
}
}
#[derive(Debug, Clone)]
#[wasm_bindgen]
pub struct BotDetectionOutput {
#[wasm_bindgen(js_name = humanScore)]
pub human_score: f32,
pub timestamp: Timestamp,
result_text: String,
}
#[wasm_bindgen]
impl BotDetectionOutput {
pub fn text(&self) -> String {
self.result_text.clone()
}
}
I am afraid I need to keep the insides of self.jitter() a secret, as I would not want to leak my perfect bot detection algorithm. Just trust me on that. All you need to know is that it returns a number between 0 and 1, where a higher number means it is more likely that the input came from a human. This number is stored in the field human_score, alongside the timestamp of the last input, and a string representation of the bot detection result in the result_text field.
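If you want something to play with in the meantime, here is a deliberately naive stand-in. To be clear, this is not the real jitter implementation, only a hypothetical sketch: the name jitter_score and the collinearity heuristic are assumptions made up for illustration. It returns the fraction of interior points where three consecutive points do not lie on one line, so a perfectly straight path scores 0 and a zigzag scores 1.

```rust
/// Simplified coordinate, mirroring the struct shown earlier.
#[derive(Debug, Clone, Copy)]
pub struct Coordinate {
    pub x: i32,
    pub y: i32,
}

/// Hypothetical stand-in for `jitter()`, NOT the real algorithm.
/// Returns the fraction of interior points where the path leaves
/// the straight line, which is always between 0.0 and 1.0.
pub fn jitter_score(points: &[Coordinate]) -> f32 {
    if points.len() < 3 {
        // Not enough data to observe any turn: treat it as robotic.
        return 0.0;
    }
    let mut turns = 0usize;
    for w in points.windows(3) {
        // Widen to i128 so neither the differences nor the products overflow.
        let (ax, ay) = (w[1].x as i128 - w[0].x as i128, w[1].y as i128 - w[0].y as i128);
        let (bx, by) = (w[2].x as i128 - w[1].x as i128, w[2].y as i128 - w[1].y as i128);
        // The cross product is zero when the three points are collinear.
        if ax * by - ay * bx != 0 {
            turns += 1;
        }
    }
    turns as f32 / (points.len() - 2) as f32
}
```

A real detector would at least need a tolerance for near-collinear points and would probably look at timing as well, which this sketch ignores entirely.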
The fields human_score and timestamp are marked as public, hence they are directly accessible from JavaScript code. For result_text, a string value, JavaScript code has to use a method that clones the underlying string. This is a common pattern for wasm-bindgen users because direct field access is only allowed for types that implement the Copy marker trait.
So far so good, we only lack a way to create the initial MyBotDetection
and
then we are ready to write the JavaScript wrapper code around our perfect Rust module.
For this, we can mark a Rust method as a constructor, which will be used when JavaScript
creates the object with the new
keyword.
// lib.rs
#[wasm_bindgen]
impl MyBotDetection {
#[wasm_bindgen(constructor)]
pub fn new() -> Self {
Self {
events: vec![],
}
}
}
With all this Rust code ready, we can now look at the JavaScript code to handle user interactions.
To start, here are the state variables on the JavaScript side. Event handlers will write to these state variables, as a kind of controller code that glues the model and the view together.
// main.js
// Create a Rust object
let detection = new MyBotDetection();
// Other state displayed in the user interface
/** @type { import("./pkg/cursed_rust.js").Event[] } */
let allEvents = [];
/** @type { BotDetectionOutput | null } */
let latestResult = null;
Mouse movement events are handled as shown earlier. But we also need to update
the allEvents
state variable every time we add an event.
// main.js
window.addEventListener(
"mousemove",
(event) => {
detection.addEvent(Date.now(), event);
allEvents = detection.events;
}
);
Additionally, we need a button the user can press to start the evaluation.
function amIBot() {
latestResult = detection.isBot();
}
document.getElementById('button1').onclick = amIBot;
The view code consists of more JavaScript and, of course, some HTML.
<!-- index.html -->
<body>
<h1>Demo: Bot detection</h1>
<div id="screen">
<p id="human-score">Human Score: 0.0</p>
<p id="bot-result">No data yet</p>
<p class="small">Results:</p>
<div id="previous-results"></div>
</div>
<p>Move your mouse to generate events.</p>
<p id="counter">Events generated: 0</p>
<div id="buttons">
<button id="button1">Am I Bot?</button>
</div>
<script src="./main.js" type="module"></script>
</body>
The content of the HTML above is kept in sync with the state variables by the
JavaScript function update
which is called in a requestAnimationFrame
loop.
// main.js
// DOM references
const counterElement = document.getElementById('counter');
const humanScoreElement = document.getElementById('human-score');
const botResultElement = document.getElementById('bot-result');
/** Keep DOM up to date with state. */
function update() {
if (latestResult) {
humanScoreElement.innerText =
`Human Score: ${latestResult.humanScore.toFixed(3)}`;
botResultElement.innerText =
latestResult.text();
}
counterElement.innerText =
`Events generated: ${allEvents.length}`;
requestAnimationFrame(update);
}
requestAnimationFrame(update);
Alright, we got everything in place. In a typical JavaScript project, we would now start manually testing it to see if I messed something up. But here, I simply run it through the Rust compiler and if it works, I know there are no bugs in the model. Maybe the controller or view code is bugged but that should be easy to check at first glance.
Can you see just how easy things are with wasm-bindgen? This was less setup than most JavaScript frameworks require. But we get the blazingly fast speed of Rust, as well as the type-safety, for anything that is not related to the user interface. I am telling you, write everything in Rust + JS from now on and thank me later.
No issues at all
To get the application up in my browser, I compile the Rust code to a Wasm module in two steps.
cargo build --release --target=wasm32-unknown-unknown
wasm-bindgen ${CARGO_TARGET_DIR}/wasm32-unknown-unknown/release/cursed_rust.wasm --target web --out-dir pkg
Then I load the module in JavaScript…
// main.js
import init, { Coordinate, MyBotDetection } from "./pkg/cursed_rust.js"
await init();
…and I am ready to display the website in my browser. Here is how that looks.
As I move the cursor around, the count of recorded events should increase and when I click the “Am I Bot?” button, I should see a result. Let’s try…
Wait a second, I’m moving the cursor but it’s not changing anything. Weird. Let me check the web console real quick…
Aha! There is an undefined JavaScript variable! In the absence of a compiler with type checks, this naturally results in a runtime error. Well, I suppose the Rust compiler cannot stop me from writing bad JavaScript code. My bad.
You see, I assign allEvents = detection.events; while the field events is not even exposed to JavaScript. Hence, its value is undefined.
Simply making the field public should yield an exposed field as desired.
// lib.rs
#[wasm_bindgen]
pub struct MyBotDetection {
// Change this to public
// events: Vec<Event>,
pub events: Vec<Event>,
}
Except, now I am facing a compiler error.
error[E0277]: the trait bound `Vec<Event>: std::marker::Copy` is not satisfied
--> src/lib.rs:26:26
|
26 | pub events: Vec<Event>,
| ^ the trait `std::marker::Copy` is not implemented for `Vec<Event>`
|
note: required by a bound in `__wbg_get_mybotdetection_events::assert_copy`
--> src/lib.rs:23:1
|
23 | #[wasm_bindgen]
| ^^^^^^^^^^^^^^^ required by this bound in `assert_copy`
= note: this error originates in the attribute macro `wasm_bindgen` (in Nightly builds, run with -Z macro-backtrace for more info)
This is the Rust I learned to love! At compile time, it tells me that wasm-bindgen requires the type of a public field to implement Copy. I guess this makes sense since we are returning a value from Rust to JavaScript space. Safe borrowing does not work across the language boundary, so the only safe option is to copy the data over.
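The assert_copy helper named in the error message is easy to reproduce in plain Rust. The following is a minimal sketch of the idea, not the code wasm-bindgen actually generates; read_field and compile_time_checks are hypothetical names chosen for this illustration.

```rust
/// Compiles only when `T` implements `Copy`. The body is empty on purpose:
/// the trait bound alone is the check.
pub fn assert_copy<T: Copy>() {}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Coordinate {
    pub x: i32,
    pub y: i32,
}

/// Returns a copy of the value behind the reference,
/// which is all a getter for a `Copy` field has to do.
pub fn read_field<T: Copy>(field: &T) -> T {
    *field
}

pub fn compile_time_checks() {
    assert_copy::<Coordinate>();
    assert_copy::<u32>();
    // Uncommenting the next line fails to compile with error E0277,
    // because a `Vec` owns heap memory and therefore cannot be `Copy`.
    // assert_copy::<Vec<Coordinate>>();
}
```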
I am happy. Rustc, the compiler, told me about this copy happening rather than making it implicitly happen. Nevertheless, I want the copy to happen, so let me add a new method with an explicit clone of the data.
// lib.rs
#[wasm_bindgen]
impl MyBotDetection {
// No extra attribute needed: a plain method is exported under its Rust name
pub fn events(&self, start: usize, end: usize) -> Vec<Event> {
self.events[start..end].to_vec()
}
}
This will return a Rust Vec
of the requested range. Thanks to wasm-bindgen,
the Vec
will automatically look like a JavaScript array in the browser.
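One caveat worth a sketch: indexing a slice with [start..end] panics when the range is out of bounds, and a panic inside the Wasm module shows up in the browser as a runtime error. A more forgiving variant could use the standard slice method get, which returns None instead of panicking. The free function events_in_range and the flattened Event struct below are assumptions for illustration, not code from the demo project.

```rust
/// Simplified event for this sketch (the article's `Event` nests a `Coordinate`).
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub timestamp: f64,
    pub x: i32,
    pub y: i32,
}

/// Returns a cloned copy of `events[start..end]`,
/// or an empty vector if the range is invalid or out of bounds.
pub fn events_in_range(events: &[Event], start: usize, end: usize) -> Vec<Event> {
    events
        .get(start..end) // `None` for out-of-bounds or reversed ranges
        .map(|range| range.to_vec())
        .unwrap_or_default()
}
```

Whether an empty result or an explicit error is the better answer to a bad range depends on the caller. Returning a Result would make the mistake visible on the JavaScript side instead of hiding it.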
A quick recompilation and a reload of the page later, I now see the number increase as I move the cursor. But somehow, it is stuck at 2 events. This time, there is no error in the console. What is the issue?
Luckily, we are in the browser, so debugging this is straightforward. I simply print the value of allEvents to see what it is.
// main.js
allEvents = detection.events;
console.log("allEvents is", allEvents);
Silly me! I created the function on the Rust side but I forgot to update the JavaScript code to call the function. So, now I was setting the variable allEvents to the function itself.
Coincidentally, looking at allEvents.length still yielded a number, just not the one I expected. Instead, it used the length field of a JavaScript function, which is the number of expected arguments to the function. 2 in this case.
Forgive me, my Rust-trained brain sometimes relies too much on the compiler to catch simple mistakes like this.
Now just for science, what happens if I call the function but provide no arguments to it? Presumably, it will be a runtime error message in the console, right?
// main.js
allEvents = detection.events();
console.log("allEvents is", allEvents);
Interestingly, the Rust side does not complain about this call missing two arguments. It seems to set both start and end to 0, returning an empty vector as a result. Not very Rust-like if you ask me but I guess we can blame JavaScript for doing an implicit conversion from undefined to 0.
Enough science now, let’s fix this up real quick.
allEvents = detection.events(0, detection.num_events());
Finally, this works and I see the number of events moving up as I move around
the cursor! (num_events
is another Rust function I added, I am sure you can
figure out what it does on your own. Or check it out
here.)
Now I can also press the button and it immediately verifies that I am human, based on my erratic mouse movements.
Cursed Rust + JS interactions
Probably, by now, you have realized that combining Rust + JavaScript comes with a few foot guns attached. You might even have started to wonder how the ownership and borrowing model of Rust can be brought to terms with the JavaScript garbage collector. All of this was a setup and foreshadowing for what follows. Follow me a step deeper into the rabbit hole.
Null pointers in Rust
My next goal is to show a list of results on the website, rather than just the one. We need a small update to the Rust model for that, to store previous results.
The changes on the Rust side are pretty simple. I add a field to
MyBotDetection
and two methods to access it.
#[wasm_bindgen]
pub struct MyBotDetection {
events: Vec<Event>,
// +++ new field
saved_results: Vec<BotDetectionOutput>,
}
#[wasm_bindgen]
impl MyBotDetection {
#[wasm_bindgen(js_name = saveResult)]
pub fn save_result(&mut self, result: BotDetectionOutput) {
self.saved_results.push(result);
}
#[wasm_bindgen(getter)]
pub fn results(&self) -> Vec<BotDetectionOutput> {
self.saved_results.clone()
}
}
Simple enough, right? Likewise, the JavaScript changes are pretty simple.
function amIBot() {
latestResult = detection.isBot();
// +++ one new line to save the result
detection.saveResult(latestResult);
}
To display the result, I add a loop to the view code.
/** Keep DOM up to date with state. */
function update() {
// [...] old code above
// new code, reading results from `detection.results`
previousResultsElement.innerHTML = "";
let results = detection.results;
for (let i = 0; i < results.length; i++) {
// create DOM elements
const row = document.createElement("div")
const firstDiv = document.createElement("div")
const secondDiv = document.createElement("div")
// set CSS classes
row.classList.add("result-row");
firstDiv.classList.add("small");
secondDiv.classList.add("small");
// set content
secondDiv.innerText = results[i].text();
firstDiv.innerText = (new Date(results[i].timestamp)).toUTCString();
row.appendChild(firstDiv);
row.appendChild(secondDiv);
previousResultsElement.appendChild(row);
}
// [...]
}
Now let’s test this. Moving around the cursor is no issue. But when I click the “Am I Bot?” button, I instantly see an error.
Great. A null pointer exception from inside safe Rust. Did you spot the problem?
The bug has to be in these two JavaScript lines and the called Rust functions.
latestResult = detection.isBot();
detection.saveResult(latestResult);
The problem is a kind of move-semantic in JavaScript that wasm-bindgen
introduces. You see, when we return an instance of BotDetectionOutput
from
Rust to JavaScript, the data is not serialized and copied to JavaScript. No, the
object is still stored in the linear Wasm memory. JavaScript merely gets a
pointer to that Rust object, which it can use to call Wasm methods.
Wasm-bindgen has to ensure that all accesses from JavaScript follow the rules of Rust, most importantly no mutable aliasing. For that, it generates a bunch of code to manage the lifetime of the object at runtime.
This means we have to follow the same borrowing rules as usual in Rust but we can no longer catch the errors at compile time. We have to notice errors on our own, or it will blow up in our face as a runtime error.
In this case, what happens is that latestResult
in JavaScript holds ownership
of the produced result. Calling detection.saveResult(latestResult)
takes that
ownership back to Rust, away from JavaScript code. Therefore, after the call,
latestResult
is left holding a Wasm null pointer. I can demonstrate it by
printing the value before and after the call.
console.log("latestResult before is", latestResult);
detection.saveResult(latestResult);
console.log("latestResult after is", latestResult);
As you can see, the first log entry shows an object holding a wasm-bindgen pointer with the address 1144816, whereas the second log shows the pointer set to zero.
Wasm-bindgen is doing the right thing here, it prevents JavaScript from holding a dangling pointer to an object which Rust assumes ownership of. This means replacing the wasm pointer with a null pointer, which is fine, until the next update call tries to read latestResult. Then, the runtime safety checks by wasm-bindgen recognize that the value of latestResult was moved previously and it rightfully panics when we want to access it again.
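The hand-over described above can be modeled in a few lines of plain Rust. This is a toy model under my own assumptions, not the code wasm-bindgen generates: JsHandle, pass_by_value and borrow_value are hypothetical names, and an Option stands in for the pointer that gets zeroed.

```rust
/// Toy stand-in for a JavaScript wrapper object around a Rust value.
pub struct JsHandle<T> {
    // `None` plays the role of the null pointer left behind after a move.
    ptr: Option<Box<T>>,
}

impl<T> JsHandle<T> {
    pub fn new(value: T) -> Self {
        Self { ptr: Some(Box::new(value)) }
    }

    /// Models a by-value parameter: ownership goes to Rust,
    /// and the handle is left holding "null".
    pub fn pass_by_value(&mut self) -> Result<T, &'static str> {
        self.ptr.take().map(|boxed| *boxed).ok_or("null pointer passed to rust")
    }

    /// Models a by-reference parameter: the handle keeps ownership.
    pub fn borrow_value(&self) -> Result<&T, &'static str> {
        self.ptr.as_deref().ok_or("null pointer passed to rust")
    }

    pub fn is_null(&self) -> bool {
        self.ptr.is_none()
    }
}
```

In this model, the first pass_by_value call succeeds and every later access through the same handle fails, which mirrors the before and after logs of latestResult.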
Luckily, the fix is quite simple in this case. Simply borrow the parameter on the Rust side.
// Takes ownership of BotDetectionOutput and leaves behind a null pointer
#[wasm_bindgen(js_name = saveResult)]
pub fn save_result(&mut self, result: BotDetectionOutput) {
self.saved_results.push(result);
}
// Only borrows BotDetectionOutput, keeping ownership on the side of the JS binding
#[wasm_bindgen(js_name = saveBorrowedResult)]
pub fn save_borrowed_result(&mut self, result: &BotDetectionOutput) {
self.saved_results.push(result.clone());
}
By calling saveBorrowedResult
instead of saveResult
on the JavaScript side, the
issue is fixed now. Here is a screenshot listing a couple of results.
Looks good, right? Except, hm, what is wrong with the date? I certainly did not test this in 1970!
JavaScript has no integers
To fix the wrong date issue, we have to go all the way back to the original code for tracking the timestamp. Specifically, this line.
pub type Timestamp = u32;
Using a 32-bit unsigned integer to store the millisecond UNIX timestamp is simply not big enough. A nasty integer truncation happens silently when the JavaScript number is converted to a u32. Pretty annoying that neither a compile-time warning nor a runtime error caught this!
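To see why the date lands in 1970, here is a small sketch of the arithmetic in plain Rust. It assumes the conversion keeps only the low 32 bits of the millisecond value, which matches what I observed. The helper names and the example timestamp 1_700_000_000_000 (a moment in late 2023) are made up for illustration.

```rust
/// Keeps only the low 32 bits of a millisecond timestamp,
/// modeling the silent truncation at the JS/Wasm boundary.
pub fn truncate_to_u32(ms: u64) -> u32 {
    ms as u32 // integer-to-integer `as` wraps modulo 2^32
}

/// Converts a millisecond count into days since the UNIX epoch.
pub fn as_days(ms: u32) -> f64 {
    f64::from(ms) / 86_400_000.0
}
```

Since u32::MAX milliseconds is a bit less than 50 days, every truncated timestamp ends up somewhere in January or February of 1970, no matter when the event was recorded.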
Anyway, let’s just use u64, easy, right? Unfortunately, this yields a runtime error.
pub type Timestamp = u64;
Wait, so my browser happily converts a JavaScript number to a u32, truncating the value in the process, but it refuses to convert it to a u64? Can you start to see why I am somewhat cynical about the greatness of Rust’s type-safety in the browser?
Whelp, that is how it is. Wasm-bindgen has no mapping from u64 to number because number uses a floating point representation and cannot represent all values of a u64. The more you think about it, the more you recognize that the people behind wasm-bindgen had good reasons to avoid such implicit conversions.
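The precision limit is easy to check in plain Rust, since f64 is the same IEEE 754 double that JavaScript uses for number. The helper below is a hypothetical sketch for illustration. It round-trips a u64 through f64 and compares in u128, so that the saturating float-to-integer cast cannot hide a mismatch at the top of the range.

```rust
/// Returns true if `value` can be stored in an `f64` without loss.
pub fn fits_in_f64(value: u64) -> bool {
    // u64 -> f64 rounds to the nearest representable double.
    let as_float = value as f64;
    // Compare in u128: a rounded-up u64::MAX becomes 2^64,
    // which does not fit back into a u64.
    as_float as u128 == u128::from(value)
}
```

Millisecond timestamps are far below 2^53, so they survive the trip. The first integer that does not is 2^53 + 1.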
Anyway, I will not beat around the bush, there are two solutions here. Either, we make an explicit conversion on the JavaScript side, e.g. BigInt(Date.now()), or we use f64 on the Rust side. I chose the latter since the JavaScript number type conceptually maps best to an f64.
pub type Timestamp = f64;
This isn’t too bad, looking back. But I feel a bit betrayed by the safety
guarantees I expected from bringing Rust to the browser. I would have expected
some kind of a warning from the dev tooling, at least. But okay, I will remember
to use f64
for JavaScript numbers from now on, just like I learned to use the right
number types in C code all those years ago.
The next and last bug, however, is the pinnacle of cursed programming that I have experienced with Rust so far. The same mistake I am about to reproduce on this toy project has collectively cost me around a full week of coding to finally track down the bug in a complex setup. What I present to you is a simplified case to highlight the core issue, avoiding the chain of indirect problems it had caused in my project. Knowing how this works might save you some time in the future.
JavaScript accessing Rust fields works in mysterious ways
In this last section, I want to extend the website with a bit of dev-tooling to tune the bot detection. As it turns out, the jitter function I used so far does not reliably distinguish humans from bots.
I extend the page with a debug info section including two buttons. It ends up looking like this. (Full code available on my GitHub.)
Clicking on the Jitter function evaluates the last 100 events and then prints the score in the field where you can currently see a question mark. Pressing the button to the right labeled “Set x = 0” does the same thing but first sets all x coordinates of the last 100 events to zero. This enables easy comparisons between the current algorithm and one that ignores the x-axis jitter.
Here are the relevant code changes.
// lib.rs
#[wasm_bindgen]
impl MyBotDetection {
#[wasm_bindgen(js_name = fromEvents)]
pub fn from_events(events: Vec<Event>) -> Self {
Self {
events,
..Self::new()
}
}
}
// main.js
function printWindowedJitter() {
let windowSize = 100;
let window;
if (allEvents.length <= windowSize) {
window = allEvents;
} else {
window = allEvents.slice(-windowSize);
}
const tmp = MyBotDetection.fromEvents(window);
debugInfo.innerText = tmp.isBot().humanScore.toFixed(3);
}
function setXtoZero() {
for (let i = 0; i < allEvents.length; i++) {
allEvents[i].coordinate.x = 0;
}
printWindowedJitter();
}
The Rust function MyBotDetection::from_events
accepts a list of events and
creates a new MyBotDetection
object from it. This allows me to take a list of
events and create a temporary MyBotDetection
instance to call isBot()
on it.
The main point of doing it this way is to not modify the original
MyBotDetection
when I set the x-coordinate values to 0.
And yet, the code I just gave you is utterly cursed. There are two major bugs in it. Can you guess them? If you did guess them, please leave a comment telling me how you could see it. Because I had to do quite a bit of debugging and had to read through the code generated by wasm-bindgen to understand what was going on.
Now, when you just run this, at first things will look fine. However, there seems to be no noticeable change in the jitter evaluation after setting the x values to 0. But if I click on the same button twice without moving the mouse in between, I get a curious “array contains a value of the wrong type” error.
Back to debugging with simple console logs! What is the content of window?
// main.js
function printWindowedJitter() {
let windowSize = 100;
let window;
if (allEvents.length <= windowSize) {
window = allEvents;
} else {
window = allEvents.slice(-windowSize);
}
// + LOG window
console.log("window", window);
const tmp = MyBotDetection.fromEvents(window);
debugInfo.innerText = tmp.isBot().humanScore.toFixed(3);
}
What is going on here? It looks like window is filled with null pointers. Even on the first click, when things are fine. Then why does it only crash on the second click?
Let me start by untangling the easier bug. It is a curious combination of a problem I have already shown you before, alongside an unfortunate timing issue with the console.
In fact, it is the same move-semantics into a null pointer problem we already
saw earlier. This time, calling MyBotDetection.fromEvents(window)
moves
ownership of all the objects inside the window
array to Rust. The previous
assignment window = allEvents.slice(-windowSize)
copies all the values in the
given range of allEvents
to a new JavaScript array. At this point, we already
have aliasing problems as far as Rust is concerned since both allEvents
and
window
reference the same values. Hence, wasm-bindgen marks the values inside
allEvents
to be null.
Okay, so the array allEvents
itself is still well and alive on the JavaScript
side. But all the objects inside are now null-pointers. When we pass this array
to Rust, it will complain that this is not an array of the expected type, as it
contains unexpected null pointers.
For the first click on the button, this problem is hidden, as we are not using
the values of allEvents
again after values are moved out. We only look at the
length to display the number of events, which remains unaffected by the
replacement of the values with null pointers.
On the second click, however, we derive a new window
from the defunct
allEvents
array and send it to Rust again. If, however, I move the cursor
between the clicks, the array is overwritten with a new clone from the Rust
side, so things are fine then.
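The aliasing between allEvents and window can be modeled in plain Rust, too. The following is a toy model under my own assumptions, not wasm-bindgen internals: Wrapper stands in for a generated JavaScript wrapper object, a shared Rc<Cell<u32>> plays the role of the JavaScript object reference, and the functions slice_last and from_events are simplified stand-ins for the calls in main.js.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy stand-in for a JS wrapper object holding a Wasm pointer.
/// Cloning it copies the reference, not the pointer cell, like in JS arrays.
#[derive(Debug, Clone)]
pub struct Wrapper {
    ptr: Rc<Cell<u32>>,
}

impl Wrapper {
    pub fn new(ptr: u32) -> Self {
        Self { ptr: Rc::new(Cell::new(ptr)) }
    }

    pub fn ptr(&self) -> u32 {
        self.ptr.get()
    }

    /// Hands the pointer over to "Rust" and leaves 0 (null) behind.
    pub fn take_ptr(&self) -> u32 {
        self.ptr.replace(0)
    }
}

/// Models `array.slice(-n)`: a new array that shares the same wrappers.
pub fn slice_last(all: &[Wrapper], n: usize) -> Vec<Wrapper> {
    let start = all.len().saturating_sub(n);
    all[start..].to_vec()
}

/// Models passing an array of wrappers by value into Rust.
pub fn from_events(window: &[Wrapper]) -> Result<Vec<u32>, &'static str> {
    if window.iter().any(|w| w.ptr() == 0) {
        return Err("array contains a value of the wrong type");
    }
    Ok(window.iter().map(Wrapper::take_ptr).collect())
}
```

After one call to from_events, the original array still has its full length, but every wrapper that was part of the window now reads 0. A second call on a fresh slice of the same array fails, just like the second click.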
We can solve the null pointer problem by restoring allEvents at the end of the function.
// main.js
function printWindowedJitter() {
let windowSize = 100;
let window;
if (allEvents.length <= windowSize) {
window = allEvents;
} else {
window = allEvents.slice(-windowSize);
}
const tmp = MyBotDetection.fromEvents(window);
debugInfo.innerText = tmp.isBot().humanScore.toFixed(3);
// + add this line to restore `allEvents` with freshly cloned values
allEvents = detection.events(0, detection.num_events());
}
But this does not explain the console logs. Something is still weird.
Here is what is going on. At first, the browser just shows a summary of the array. Then, when we expand it with a click, it reads all the contained values. But at that time, the underlying objects are no longer owned by JavaScript, hence we get null-pointers. If we wanted to see the field values at the time when the logging happened, we would need to write logging code more like this.
for (let el of window) {
// This shows a pointer, which when expanded still points to nothing.
console.log(el);
// This shows the values for real.
console.log(el, el.coordinate.x, el.coordinate.y);
}
This shows how the values are valid on the first click but are invalid on the second click.
In the video above, we see valid wasm pointers and valid x & y values logged. Amusingly, we can still expand the object and see it being replaced with a null pointer by that time.
With all that out of the way, let us tackle the final problem. I suspected the x value changes do not affect the jitter. Using the browser’s debugger to step through the execution line by line, I was able to observe how different values change and I could confirm that the x values passed to Rust are simply not set to 0. How could that be?
To demonstrate it very clearly, consider this code and its output.
function setXtoZero() {
for (let i = 0; i < allEvents.length; i++) {
console.log("x was", allEvents[i].coordinate.x);
allEvents[i].coordinate.x = 0;
console.log("x should be 0 now but actually is", allEvents[i].coordinate.x);
}
printWindowedJitter();
}
We set the value to 0 and read it right back as the old value! Why did the writing operation not work?
To understand this, let me show you the relevant data structure again.
#[derive(Debug, Clone)]
#[wasm_bindgen]
pub struct Event {
pub timestamp: Timestamp,
pub coordinate: Coordinate,
}
#[derive(Debug, Clone, Copy)]
#[wasm_bindgen]
pub struct Coordinate {
pub x: i32,
pub y: i32,
}
Notice that coordinate
is a public field with an auto-generated getter from
wasm-bindgen. This is possible because Coordinate
is marked with the Copy
trait. Naive as I was, I expected this meant that reading and writing this field
from JavaScript therefore works as if I did it from Rust. Boy, was I wrong.
You see, to make the nice JavaScript syntax even possible, wasm-bindgen generates JavaScript as follows.
// cursed_rust.js (generated by wasm-bindgen)
export class Event {
/**
* @returns {Coordinate}
*/
get coordinate() {
const ret = wasm.__wbg_get_event_coordinate(this.__wbg_ptr);
return Coordinate.__wrap(ret);
}
}
This means that any access to the coordinate field is funneled through this getter function. This is standard JavaScript syntax and has existed basically forever.
Wasm-bindgen uses this feature to make something rather complicated that is happening in the background look like a normal field access.
First, it executes a generated wasm function __wbg_get_event_coordinate
and
passes the raw wasm pointer this.__wbg_ptr
stored in the wrapper class
Event
. Then it takes the result, itself a raw wasm pointer, and wraps it in a
Coordinate
class. So far so good, we get a shim of type-safety on top of the
raw pointer. This feels typical for Rust, it was not surprising to me when I saw
it the first time, just a nice detail to know and understand.
The generated code inside __wbg_get_event_coordinate is quite a bit more interesting, however. Using cargo expand, we can see its content in plain Rust. Let me also add a few comments to it to make it easier to understand.
// (proc-macro generated code)
pub unsafe extern "C" fn __wbg_get_event_coordinate(
js: u32,
) -> wasm_bindgen::convert::WasmRet<
<Coordinate as wasm_bindgen::convert::IntoWasmAbi>::Abi,
> {
use wasm_bindgen::__rt::{WasmRefCell, assert_not_null};
use wasm_bindgen::convert::IntoWasmAbi;
// nice hack to fail compilation if Coordinate does not implement Copy
fn assert_copy<T: Copy>() {}
assert_copy::<Coordinate>();
// unsafe pointer re-interpretation
// relies on JS accesses to only go through the safe type wrappers
let js = js as *mut WasmRefCell<Event>;
// check that the js object ownership has not moved, marked by a null pointer
assert_not_null(js);
// now read the actual value
let val = (*js).borrow().coordinate;
// Create a new Coordinate object and return a raw pointer to it
<Coordinate as IntoWasmAbi>::into_abi(val).into()
}
The last line, which creates a Coordinate, does the sneaky bit. It allocates a
new object, as seen by the IntoWasmAbi
code.
// (proc-macro generated code)
impl wasm_bindgen::convert::IntoWasmAbi for Coordinate {
type Abi = u32;
fn into_abi(self) -> u32 {
use wasm_bindgen::__rt::alloc::rc::Rc;
use wasm_bindgen::__rt::WasmRefCell;
Rc::into_raw(Rc::new(WasmRefCell::new(self))) as u32
}
}
As you can see, the new object is put in a new WasmRefCell which itself is put in an Rc. WasmRefCell is, for all intents and purposes, equivalent to the standard RefCell. So really, what is happening here is the well-known pattern of interior mutability of a shared object.
This makes sense, since in JavaScript developers will be cloning around objects left and right all the time, relying on the garbage collector to eventually clean up all copies of it. Without access to the garbage collector’s inner workings, the only sensible way to represent that on the Rust side is to have explicit reference counting of the object. Hence, an Rc. Interior mutability, provided by RefCell, is also a must, since JavaScript might want to modify the data.
Now, take a moment to appreciate what that means for our Copy field. Accessing the field from JavaScript will automatically create a new copy and allocate it on the heap if it is Copy, simply because the assumed shared ownership on the JavaScript side requires reference counting on the Rust side.
To put it simply, when we have this JavaScript code, …
allEvents[i].coordinate.x = 0;
… under the hood we have this, expressed in Rust syntax.
// reading the field makes an implicit copy because Coordinate is Copy
let coordinate = allEvents[i].coordinate;
let rcCoordinate = Rc::new(RefCell::new(coordinate));
rcCoordinate.borrow_mut().x = 0;
// rcCoordinate is now dropped and forgotten
// allEvents[i].coordinate is unchanged, lol
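The pseudo-code above glosses over the details, so here is a version that compiles as plain Rust without wasm-bindgen. The methods coordinate_handle and set_coordinate are hypothetical stand-ins for the generated getter and setter. Treat it as a sketch under that assumption rather than the real generated code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Coordinate {
    pub x: i32,
    pub y: i32,
}

#[derive(Debug, Clone)]
pub struct Event {
    pub coordinate: Coordinate,
}

impl Event {
    /// Mimics the generated getter: copies the field into a fresh,
    /// reference-counted cell that has no link back to `self`.
    pub fn coordinate_handle(&self) -> Rc<RefCell<Coordinate>> {
        Rc::new(RefCell::new(self.coordinate))
    }

    /// Mimics the generated setter: overwrites the whole field.
    pub fn set_coordinate(&mut self, value: Coordinate) {
        self.coordinate = value;
    }
}
```

Writing through the handle only changes the detached copy, while calling set_coordinate with a complete new value changes the event itself.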
Usually, when I use a Copy field, I only think about the read path. For the write path, I rarely think about what Copy’s semantics are. After all, Rust left to its own devices does the intuitive thing. When assigning to a field of a field in Rust, even with types that have Copy, it borrows mutably through as many layers as necessary.
let mut a = ...;
// This is equal to `(&mut a.b).c = 0`
a.b.c = 0;
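For comparison, here are both write paths side by side as compilable Rust. The types Outer and Inner and the two function names are placeholders invented for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Inner {
    pub c: i32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Outer {
    pub b: Inner,
}

/// Native Rust: the nested assignment writes through to `a` itself.
pub fn write_in_place(a: &mut Outer) {
    a.b.c = 0;
}

/// Getter-style access: `a.b` is copied out first,
/// so the write only affects the returned copy.
pub fn write_to_copy(a: &Outer) -> Inner {
    let mut copy = a.b;
    copy.c = 0;
    copy
}
```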
But in this case, JavaScript does a non-borrowed getter call first, followed by a setter call on the copy. This is because wasm-bindgen does not support borrowed Rust values being passed to JavaScript. (How would that even work?)
So, in order to set the x values to 0 from JavaScript, we cannot use a nested field access. Instead, we could overwrite the first field with an entirely new value, like this.
// main.js
// modifies a copy, not modifying coordinate.x
// allEvents[i].coordinate.x = 0;
// solution: overwrites coordinate
allEvents[i].coordinate = new Coordinate(0, allEvents[i].coordinate.y);
Sure enough, this works as expected.
Now just imagine a deeply nested Rust structure, with something like a.b.c.d = 0; on the JS side. Rust will create a deep copy of b, of which it will then create a deep copy of c, before setting the d field to zero. This innocent-looking statement causes two unwanted reference-counted heap allocations on the Rust side while failing to update the original value.
And this, my fellow Rustaceans, is what I call cursed Rust + JS code.
Closing words
I hope this post was both entertaining and educational. My goal was to show some of the insanity that wasm-bindgen has to solve quietly in the background for us when we use Rust in the browser. These are fundamental problems of clashing Rust with JavaScript. In no way should this be seen as a major critique of wasm-bindgen.
In fact, shout out to everyone who has contributed to wasm-bindgen, I love your work and I appreciate how much better it has gotten over the years! And I want to apologize for the inevitable group of people who will read this post superficially and walk away with the conclusion that we should never use Rust in the browser. However, I trust that the majority of my audience can be more nuanced than that.
The truth is, Rust is a great language and it has its role in the browser, despite the issues shown here. Some articles will describe the good, some describe the bad. My blog post is a source for both sides because I want to provide the full picture. More content arguing both sides will appear on my blog over the next year.
Thank you for your interest! What is your advice to avoid the issues that I faced? Discussion, as usual, on reddit: r/rust
Links
- Broken source code: github.com/jakmeier/cursed-rust-js.
- Working source code: github.com/jakmeier/cursed-rust-js/tree/solution.
- Comments: TODO.
- Talk: TODO.