|
1 | 1 | # Why the Component Model? |
2 | 2 |
|
3 | | -If you've tried out WebAssembly, you'll be familiar with the concept of a _module_. Roughly speaking, a module corresponds to a single `.wasm` file, with functions, memory, imports and exports, and so on. These "core" modules can run in the browser, or via a separate runtime such as Wasmtime or WAMR. A module is defined by the [WebAssembly Core Specification](https://webassembly.github.io/spec/core/), and if you compile a program written in Rust, C, Go or whatever to WebAssembly, then a core module is what you'll get. |
| 3 | +At a high level, the component model builds upon WebAssembly _core modules_ |
| 4 | +to enhance interoperability between languages and libraries, |
| 5 | +both by enriching the type system |
| 6 | +used for checking the safety of interactions between modules, |
| 7 | +and by clearly defining and enforcing |
| 8 | +the low-level calling contract between separately-compiled modules. |
| 9 | +To understand what the limitations of core modules are, |
| 10 | +we start by defining them. |
4 | 11 |
|
5 | | -Core modules are, however, limited in how they expose their functionality to the outside world to functions that take and return only a small number of core WebAssembly types (essentially only integers and floating-point numbers). Richer types, such as strings, lists, records (a.k.a. structs), etc. have to be represented in terms of integers and floating point numbers, for example by the use of pointers and offsets. Those representations are often times not interchangeable across languages. For example, a string in C might be represented entirely differently from a string in Rust or in JavaScript. |
| 12 | +## WebAssembly core modules |
6 | 13 |
|
7 | | -For Wasm modules to interoperate, therefore, there needs to be an agreed-upon way for exposing those richer types across module boundaries. |
| 14 | +A module is defined by the [WebAssembly Core Specification](https://webassembly.github.io/spec/core/). |
8 | 15 |
|
9 | | -In the component model, these type definitions are written in a language called [WIT (Wasm Interface Type)](./wit.md), and the way they translate into bits and bytes is called the [Canonical ABI (Application Binary Interface)](./../advanced/canonical-abi.md). A Wasm [component](./components.md) is thus a wrapper around a core module that specifies its imports and exports using such [Interfaces](./interfaces.md). |
| 16 | +WebAssembly programs can be written by hand, |
| 17 | +but it's more likely that you will use a higher-level programming language |
| 18 | +such as Rust, C, Go, JavaScript, or Python to build WebAssembly programs. |
| 19 | +Many existing toolchains currently produce a |
| 20 | +[WebAssembly core module](https://webassembly.github.io/spec/core/syntax/modules.html)—a single |
| 21 | +binary `.wasm` file. |
10 | 22 |
|
11 | | -The agreement of an interface adds a new dimension to Wasm portability. Not only are components portable across architectures and operating systems, but they are now portable across languages. A Go component can communicate directly and safely with a C or Rust component. It need not even know which language another component was written in - it needs only the component interface, expressed in WIT. Additionally, components can be linked into larger graphs, with one component's exports satisfying another's imports. |
| 23 | +A core module usually corresponds to a single binary `.wasm` file. |
| 24 | +Here's what the `file` command outputs for a sample `.wasm` file: |
| 25 | +```console |
| 26 | +$ file adder.wasm |
| 27 | +adder.wasm: WebAssembly (wasm) binary module version 0x1 (MVP) |
| 28 | +``` |
12 | 29 |
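The check that `file` performs can be approximated in a few lines. The sketch below is an illustration only: the helper name `is_core_module` and the hand-built `sample` bytes are assumptions made for this example, not part of any toolchain. It relies on the fact that a core module's binary encoding opens with the four-byte magic `\0asm` followed by a 32-bit little-endian version field, which is 1 for core modules.

```python
import struct

WASM_MAGIC = b"\x00asm"  # the four bytes that open every WebAssembly binary


def is_core_module(data: bytes) -> bool:
    """Return True if `data` starts with the core-module preamble (magic + version 1)."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        return False
    # The version is a 32-bit little-endian integer; core modules use version 1.
    (version,) = struct.unpack("<I", data[4:8])
    return version == 1


# A hand-built preamble, standing in for the first 8 bytes of a real file.
sample = WASM_MAGIC + struct.pack("<I", 1)
print(is_core_module(sample))
print(is_core_module(b"not wasm"))
```

To try this on a real file, pass the first eight bytes read from it (for example from `adder.wasm`) instead of `sample`.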
|
13 | | -Combined with Wasm's strong sandboxing, this opens the door to yet further benefits. By expressing higher-level semantics than integers and floats, it becomes possible to statically analyse and reason about a component's behaviour - to enforce and guarantee properties just by looking at the surface of the component. The relationships within a graph of components can be analysed, for example to verify that a component containing business logic has no access to a component containing personally identifiable information. |
| 30 | +A core module is a set of definitions. |
| 31 | +Kinds of definitions include: |
| 32 | +* _Functions_ define executable units of code |
| 33 | + (sequences of instructions along with declarations |
| 34 | + for the names of arguments |
| 35 | + and the types of arguments and return values). |
| 36 | +* [_Linear memories_](https://webassembly.github.io/spec/core/syntax/modules.html#syntax-mem) |
| 37 | + define buffers of uninterpreted bytes that can be read from |
| 38 | + and written to by instructions. |
| 39 | +* _Imports_ define the names of other modules |
| 40 | + that are required to be available to execute |
| 41 | + the functions in the module, |
| 42 | + along with type signatures for required functions |
| 43 | + in the imported module. |
| 44 | +* _Exports_ define the names of functions within |
| 45 | + the module that should be accessible externally. |
| 46 | +* And others; see [the Core Specification](https://webassembly.github.io/spec/core/syntax/modules.html) |
| 47 | + for the complete list. |
14 | 48 |
|
15 | | -Moreover, a component interacts with a runtime or other components _only_ by calling its imports and having its exports called. Specifically, unlike core modules, a component may not export Wasm memory, and thus it cannot indirectly communicate to others by writing to its memory and having others read from that memory. This not only reinforces sandboxing, but enables interoperation between languages that make different assumptions about memory - for example, allowing a component that relies on Wasm GC (garbage collected) memory to collaborate with one that uses conventional linear memory. |
| 49 | +Core modules can be run in the browser, |
| 50 | +or via a separate runtime such as [Wasmtime](https://wasmtime.dev/) |
| 51 | +or [WAMR](https://github.com/bytecodealliance/wasm-micro-runtime). |
| 52 | + |
| 53 | +### Limitations of core modules |
| 54 | + |
| 55 | +Core modules are limited in the types of values they can work with directly and in |
| 56 | +how they expose their functionality to the outside world. |
| 57 | +In WebAssembly core modules, functions are restricted, essentially, |
| 58 | +to using integer (`i32` or `i64`) or floating-point (`f32` or `f64`) types. |
| 59 | +Only these types can be passed as arguments to functions, |
| 60 | +and only these types can be returned from functions as results. |
| 61 | +Compound types common in higher-level programming languages, |
| 62 | +such as strings, lists, arrays, enums (enumerations), or structs (records), |
| 63 | +have to be represented in terms of integers and floating-point numbers. |
| 64 | + |
| 65 | +For example, for a function to accept a string, the string argument |
| 66 | +might be represented as two separate arguments: |
| 67 | +an integer offset into a memory |
| 68 | +and an integer representing the length of the string. |
| 69 | +Recall that a (linear) memory is a buffer of uninterpreted bytes |
| 70 | +declared within a module. |
| 71 | + |
| 72 | +In pseudocode, a type signature for a string-manipulating function |
| 73 | +might look like: |
| 74 | + |
| 75 | +``` |
| 76 | +remove-duplicates: func(offset: i32, length: i32) -> [i32, i32] |
| 77 | +``` |
| 78 | + |
| 79 | +supposing that `remove-duplicates` is a function |
| 80 | +to create a new string consisting of the unique characters |
| 81 | +in its argument. |
| 82 | +The function returns a pair of 32-bit integers. |
| 83 | +The first integer is an offset into one of the linear memories |
| 84 | +declared by the module—where the newly allocated string starts—and |
| 85 | +the second integer is the length of the string. |
| 86 | +After calling the function, |
| 87 | +the caller has to reach into the appropriate linear memory |
| 88 | +and read the output string, using the returned offset and length. |
| 89 | + |
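The calling convention described above can be simulated outside of WebAssembly. The following is a minimal sketch, assuming a `bytearray` as a stand-in for a linear memory and a toy bump allocator; the names `memory`, `alloc`, and `write_string` are hypothetical helpers invented for this illustration. Only integers cross the function boundary, and the caller has to read the result back out of memory itself.

```python
# Hypothetical stand-in for one linear memory: a flat, byte-addressed buffer.
memory = bytearray(64)
next_free = 0  # toy bump allocator; real modules manage allocation themselves


def alloc(size: int) -> int:
    """Reserve `size` bytes in `memory` and return the starting offset."""
    global next_free
    offset = next_free
    if offset + size > len(memory):
        raise MemoryError("out of linear memory")
    next_free += size
    return offset


def write_string(s: str) -> tuple[int, int]:
    """Copy the UTF-8 bytes of `s` into memory; return (offset, length)."""
    data = s.encode("utf-8")
    offset = alloc(len(data))
    memory[offset:offset + len(data)] = data
    return offset, len(data)


def remove_duplicates(offset: int, length: int) -> tuple[int, int]:
    """Module side: only integers cross the boundary, in and out."""
    text = memory[offset:offset + length].decode("utf-8")
    unique = "".join(dict.fromkeys(text))  # keeps the first occurrence of each character
    return write_string(unique)


# Caller side: write the argument into memory, call, then read the result back.
arg_offset, arg_length = write_string("mississippi")
res_offset, res_length = remove_duplicates(arg_offset, arg_length)
result = memory[res_offset:res_offset + res_length].decode("utf-8")
print(result)  # misp
```

Both sides have to agree, outside the type system, on which memory is used, on the text encoding, and on who allocates the result.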
| 90 | +For this to work, the module defining the `remove-duplicates` function |
| 91 | +would also need to include |
| 92 | +an export declaration that exports a memory to be used |
| 93 | +for the argument and result strings. Pseudocode: |
| 94 | + |
| 95 | +``` |
| 96 | +export "string_mem" (mem 1) |
| 97 | +``` |
| 98 | + |
| 99 | +The module using the `remove-duplicates` function |
| 100 | +would, in turn, need to import this memory. Pseudocode: |
| 101 | + |
| 102 | +``` |
| 103 | +import "strings" "string_mem" |
| 104 | +``` |
| 105 | + |
| 106 | +(This pseudocode is still simplified, since the importer |
| 107 | +also needs to declare the size of the memory being |
| 108 | +imported.) |
| 109 | + |
| 110 | +Note that there is nothing in the type system to prevent |
| 111 | +the returned length from being confused with the returned offset, |
| 112 | +since both are integers. |
| 113 | +Also, the name of the memory used for the input and output strings |
| 114 | +must be established by convention, |
| 115 | +and there is also nothing in the type system to stop client code |
| 116 | +from indexing into a different memory |
| 117 | +(as long as the sum of the offset and length is within bounds). |
| 118 | + |
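The offset/length confusion is easy to reproduce. In this sketch (again using a `bytearray` as an assumed stand-in for a linear memory, with made-up contents and a hypothetical `read_string` helper), swapping the two integers raises no error; the caller simply reads the wrong bytes.

```python
memory = bytearray(32)
memory[0:5] = b"hello"   # unrelated data that happens to sit at offset 0
memory[8:12] = b"misp"   # the string the callee actually returned


def read_string(offset: int, length: int) -> str:
    """Caller-side helper: interpret `length` bytes at `offset` as UTF-8."""
    return memory[offset:offset + length].decode("utf-8")


offset, length = 8, 4                    # what the callee returned
correct = read_string(offset, length)
swapped = read_string(length, offset)    # arguments accidentally reversed
print(repr(correct))
print(repr(swapped))
```

Because both values are plain integers, the mistake only shows up as corrupted data at run time, if it shows up at all.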
| 119 | +We would prefer to write a pseudocode type signature like this: |
| 120 | + |
| 121 | +``` |
| 122 | +remove-duplicates: func(s: string) -> string |
| 123 | +``` |
| 124 | + |
| 125 | +and dispense with the memory exports and imports altogether. |
| 126 | + |
| 127 | +The complexity doesn't stop there! |
| 128 | +Data representations are frequently specific to each programming language. |
| 129 | +For example, a string in C is represented entirely differently |
| 130 | +from a string in Rust or in JavaScript. |
| 131 | +Moreover, to make this approach work, modules must import and export memories, |
| 132 | +which can be error-prone, as different languages |
| 133 | +make different assumptions about memory layout. |
| 134 | + |
| 135 | +For WebAssembly modules written in different languages to interoperate smoothly, |
| 136 | +there needs to be an agreed-upon way to expose these richer types across module boundaries. |
| 137 | + |
| 138 | +## Components |
| 139 | + |
| 140 | +Components solve the two problems that we've seen so far: |
| 141 | +the limited type system of core module functions, |
| 142 | +and cross-language interoperability. |
| 143 | +Conceptually, a component is a WebAssembly binary |
| 144 | +(which may or may not contain modules) |
| 145 | +that is restricted to interact |
| 146 | +only through its imported and exported functions. |
| 147 | +Components use a different binary format. |
| 148 | +Compared to core modules, components also use a richer |
| 149 | +mechanism by default for expressing the types of functions: _interfaces_. |
| 150 | + |
| 151 | +### Interfaces |
| 152 | + |
| 153 | +Interfaces are expressed in a separate language called [Wasm Interface Type (WIT)](./wit.md). |
| 154 | +[Interfaces](./wit.md#interfaces) contain definitions of _types_ |
| 155 | +and type signatures for [_functions_](./wit.md#functions). |
| 156 | +The bit-level representations of types are specified by |
| 157 | +the [Canonical ABI (Application Binary Interface)](./../advanced/canonical-abi.md). |
| 158 | +Together, interfaces and the Canonical ABI |
| 159 | +achieve the goal of clearly defining and enforcing |
| 160 | +the low-level calling contract between modules. |
| 161 | + |
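To give a feel for what "bit-level representation" means for a string, here is a deliberately simplified sketch. It is not the Canonical ABI itself, which also covers alignment, alternative encodings, and allocation through a function supplied by the module. The `Memory` class and the `lower_string` and `lift_string` helpers are hypothetical names chosen for this illustration. The idea shown is that a string is lowered to an (offset, byte length) pair in one component's memory and lifted back into a value before being copied into another component's separate memory.

```python
class Memory:
    """Hypothetical stand-in for a component's private linear memory."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.next_free = 0

    def alloc(self, size: int) -> int:
        """Toy bump allocator: reserve `size` bytes and return their offset."""
        offset = self.next_free
        if offset + size > len(self.data):
            raise MemoryError("out of linear memory")
        self.next_free += size
        return offset


def lower_string(s: str, mem: Memory) -> tuple[int, int]:
    """Encode `s` as UTF-8 into `mem`; return the (offset, byte length) pair."""
    encoded = s.encode("utf-8")
    offset = mem.alloc(len(encoded))
    mem.data[offset:offset + len(encoded)] = encoded
    return offset, len(encoded)


def lift_string(offset: int, length: int, mem: Memory) -> str:
    """Decode the bytes at (offset, length) in `mem` back into a string."""
    if offset + length > len(mem.data):
        raise ValueError("string out of bounds")
    return bytes(mem.data[offset:offset + length]).decode("utf-8")


# Two components with separate memories; the "runtime" copies the value across.
caller_mem, callee_mem = Memory(64), Memory(64)
src = lower_string("héllo", caller_mem)
value = lift_string(*src, caller_mem)
dst = lower_string(value, callee_mem)
print(lift_string(*dst, callee_mem))  # héllo
```

Neither side reads the other's memory directly, so the two components never need to agree on a shared memory name or layout.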
| 162 | +### Interoperability |
| 163 | + |
| 164 | +WebAssembly core modules are already portable across different architectures |
| 165 | +and operating systems; |
| 166 | +components retain these benefits and, using the Canonical ABI, |
| 167 | +add portability across different programming languages. |
| 168 | +A component implemented in Go can communicate directly and safely |
| 169 | +with a C or Rust component, by relying on the shared conventions of the Canonical ABI. |
| 170 | +Writing a component doesn't even require knowledge |
| 171 | +of which language its dependent components are implemented in, |
| 172 | +only the component interface expressed in WIT. |
| 173 | +Additionally, components can be [composed](../composing-and-distributing.md) into larger graphs, |
| 174 | +with one component's exports satisfying another's imports. |
| 175 | + |
| 176 | +### Benefits of the component model |
| 177 | + |
| 178 | +Putting all of the pieces together: |
| 179 | +the component model introduces a binary WebAssembly format |
| 180 | +that encapsulates WebAssembly modules. |
| 181 | +This format enables the construction of WebAssembly modules |
| 182 | +that interact with each other only through exports and imports of functions |
| 183 | +whose types are expressed using WIT. |
| 184 | + |
| 185 | +Building upon Wasm's strong [sandboxing](https://webassembly.org/docs/security/), |
| 186 | +the component model has further benefits. |
| 187 | +Rich types make it easier to see at a glance |
| 188 | +what a component or interface does, |
| 189 | +and to guarantee that certain kinds of misbehavior cannot happen. |
| 190 | +Richer type signatures express richer semantic properties |
| 191 | +than type signatures made up only of integers and floats. |
| 192 | +The relationships within a graph of components can be statically analysed: |
| 193 | +for example, to verify that a component containing business logic |
| 194 | +has no access to a component containing personally identifiable information. |
| 195 | + |
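The kind of static check mentioned above can be sketched as a reachability query over the composition graph. The component names and the `graph` structure below are invented for illustration. Each component maps to the components whose exports satisfy its imports, and `can_reach` is a plain breadth-first search rather than a feature of any particular tool.

```python
from collections import deque

# Hypothetical composition: each component maps to the components whose
# exports satisfy its imports.
graph = {
    "http-handler": ["business-logic", "logging"],
    "business-logic": ["logging"],
    "user-store": ["pii-vault", "logging"],
    "pii-vault": [],
    "logging": [],
}


def can_reach(graph: dict[str, list[str]], start: str, target: str) -> bool:
    """Breadth-first search over import edges from `start` to `target`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return False


print(can_reach(graph, "business-logic", "pii-vault"))  # False
print(can_reach(graph, "user-store", "pii-vault"))      # True
```

Such a check is meaningful only because imports and exports are the sole channels between components; with shared memories, the graph alone would not capture every path by which data could flow.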
| 196 | +Moreover, a component interacts with a runtime or other components |
| 197 | +_only_ by calling its imports and having its exports called. |
| 198 | +Specifically, unlike core modules, a component may not export a memory, |
| 199 | +and thus it cannot indirectly communicate with others |
| 200 | +by writing to its memory and having others read from that memory. |
| 201 | +This not only reinforces sandboxing, but enables interoperation |
| 202 | +between languages that make different assumptions about memory: |
| 203 | +for example, allowing a component that relies on garbage-collected memory |
| 204 | +to interoperate with one that uses conventional linear memory. |
| 205 | + |
| 206 | +## Using components |
16 | 207 |
|
17 | 208 | Now that you have a better idea about how the component model can help you, take a look at [how to build components](../language-support.md) in your favorite language! |
18 | 209 |
|
19 | | -> For more background on why the component model was created, take a look at the specification's [goals](https://github.com/WebAssembly/component-model/blob/main/design/high-level/Goals.md), [use cases](https://github.com/WebAssembly/component-model/blob/main/design/high-level/UseCases.md) and [design choices](https://github.com/WebAssembly/component-model/blob/main/design/high-level/Choices.md). |
| 210 | +## Further reading |
| 211 | + |
| 212 | +For more background on why the component model was created, |
| 213 | +take a look at the specification's [goals](https://github.com/WebAssembly/component-model/blob/main/design/high-level/Goals.md), |
| 214 | +[use cases](https://github.com/WebAssembly/component-model/blob/main/design/high-level/UseCases.md) |
| 215 | +and [design choices](https://github.com/WebAssembly/component-model/blob/main/design/high-level/Choices.md). |