Commit f998c87

Editing pass
1 parent da8ccaf commit f998c87

1 file changed

+83
-63
lines changed

_posts/2015-04-24-FFI-and-Rust.md

Lines changed: 83 additions & 63 deletions
Original file line number | Diff line number | Diff line change
@@ -5,37 +5,39 @@ author: Alex Crichton
55
description: "Zero-cost and safe FFI in Rust"
66
---
77

8-
98
Rust's quest for world domination was never destined to happen overnight, so
10-
Rust needs to be able to interoperate with the existing world just as easily
11-
as it talks to itself. To solve this problem, **Rust lets you communicate with C
12-
APIs at no extra cost while providing strong safety guarantees**.
13-
14-
This is also referred to as Rust's foreign function interface (FFI) and is the
15-
method by which Rust communicates with other programming languages. Following
16-
Rust's design principles, this is a **zero cost abstraction** where function
17-
calls between Rust and C have identical performance to C function calls. FFI
18-
bindings can also leverage language features such as ownership and borrowing to
19-
provide a **safe interface**.
9+
Rust needs to be able to interoperate with the existing world just as easily as
10+
it talks to itself. In particular, **Rust makes it easy to communicate with C
11+
APIs without overhead, and to leverage its ownership system to provide much
12+
stronger safety guarantees for those APIs at the same time**.
13+
14+
In more detail, Rust's *foreign function interface* (FFI) is the way that it
15+
communicates with other languages. Following Rust's design principles, the FFI
16+
provides a **zero cost abstraction** where function calls between Rust and C
17+
have identical performance to C function calls. FFI bindings can also leverage
18+
language features such as ownership and borrowing to provide a **safe
19+
interface** that enforces protocols around pointers and other resources. These
20+
protocols usually appear only in the documentation for C APIs -- at best -- but
21+
Rust makes them explicit.
2022

2123
In this post we'll explore how to encapsulate unsafe FFI calls to C in safe,
22-
zero-cost abstractions by looking at some examples of interacting with C.
23-
Working with C is, however, just an example, as we'll also see how Rust can
24-
easily talk to languages like Python and Ruby just as seamlessly as C.
24+
zero-cost abstractions. Working with C is, however, just an example; we'll also
25+
see how Rust can easily talk to languages like Python and Ruby just as
26+
seamlessly as with C.
2527

26-
### Talking to C
28+
### Rust talking to C
2729

28-
First, let's start with an example of calling C code from Rust and then
29-
demonstrate that Rust imposes no additional overhead. Starting off simple,
30-
here's a C program which will simply double all the input it's given:
30+
Let's start with a simple example of calling C code from Rust and then
31+
demonstrate that Rust imposes no additional overhead. Here's a C program which
32+
will simply double all the input it's given:
3133

3234
```c
3335
int double_input(int input) {
3436
return input * 2;
3537
}
3638
```
3739
38-
To call this from Rust, one would write this program:
40+
To call this from Rust, you might write a program like this:
3941
4042
```rust
4143
extern crate libc;
@@ -51,18 +53,18 @@ fn main() {
5153
}
5254
```
5355

54-
And that's it! You can try this out for yourself by [checking out the code on
55-
GitHub][rust2c] and running `cargo run` from that directory. At the source level
56-
we can see that there's no burden in calling an external function, and we'll see
57-
soon that the generated code indeed has no overhead. There are, however, a few
58-
subtle aspects of this Rust program so let's cover each piece in detail.
56+
And that's it! You can try this out for yourself by
57+
[checking out the code on GitHub][rust2c] and running `cargo run` from that
58+
directory. **At the source level we can see that there's no burden in calling an
59+
external function beyond stating its signature, and we'll see soon that the
60+
generated code indeed has no overhead, either.** There are, however, a few
61+
subtle aspects of this Rust program, so let's cover each piece in detail.
5962

6063
[rust2c]: https://github.com/alexcrichton/rust-ffi-examples/tree/master/rust-to-c
6164

62-
First up we see `extern crate libc`. [This crate][libc] provides many useful
63-
type definitions for FFI bindings when talking with C, and it is necessary
64-
to ensure that both C and Rust agree on the types crossing the language
65-
boundary.
65+
First up we see `extern crate libc`. [The libc crate][libc] provides many useful
66+
type definitions for FFI bindings when talking with C, and it makes it easy to
67+
ensure that both C and Rust agree on the types crossing the language boundary.
6668

6769
[libc]: https://crates.io/crates/libc
6870

@@ -89,9 +91,12 @@ fn main() {
8991

9092
We see one of the crucial aspects of FFI in Rust here, the `unsafe` block. The
9193
compiler knows nothing about the implementation of `double_input`, so it must
92-
assume that memory unsafety *could* happen in this scenario. This may seem
93-
limiting, but Rust has just the right set of tools to allow consumers to not
94-
worry about `unsafe` (more on this in a moment).
94+
assume that memory unsafety *could* happen whenever you call a foreign function.
95+
The `unsafe` block is how the programmer takes responsibility for ensuring
96+
safety -- you are promising that the actual call you make will not, in fact,
97+
violate memory safety, and thus that Rust's basic guarantees are upheld. This
98+
may seem limiting, but Rust has just the right set of tools to allow consumers
99+
to not worry about `unsafe` (more on this in a moment).
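
As a rough illustration of what "taking responsibility" can look like, here is a minimal sketch that is not part of the original post. It binds `abs` from the C standard library instead of the post's `double_input`, so that it links without any extra C code. The wrapper name `checked_c_abs` is hypothetical. The `unsafe extern` spelling assumes Rust 1.82 or newer; older toolchains write plain `extern "C"`.

```rust
use std::os::raw::c_int;

// Declare a function that lives in the C standard library. The compiler
// cannot inspect the C side, so every call to it must sit in `unsafe`.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

/// Hypothetical safe wrapper: callers need no `unsafe` block.
/// C leaves `abs(INT_MIN)` undefined, so that input is rejected up front.
pub fn checked_c_abs(input: c_int) -> Option<c_int> {
    if input == c_int::MIN {
        return None;
    }
    // SAFETY: `abs` has no preconditions other than the one checked above.
    Some(unsafe { abs(input) })
}

fn main() {
    assert_eq!(checked_c_abs(-21), Some(21));
    assert_eq!(checked_c_abs(c_int::MIN), None);
    println!("{:?}", checked_c_abs(-21));
}
```

The promise made by the `unsafe` block is only honest because the wrapper first rules out the one input the C function cannot handle.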
95100

96101
Now that we've seen how to call a C function from Rust, let's see if we can
97102
verify this claim of zero overhead. Almost all programming languages can call
@@ -111,11 +116,11 @@ exactly the same cost as it would be in C.
111116

112117
### Safe Abstractions
113118

114-
One of Rust's core design principles is its emphasis on ownership, and FFI is no
115-
exception here. When binding a C library in Rust you not only have the benefit
116-
of 0 overhead, but you are also able to make it *safer* than C can! Bindings
117-
can leverage the ownership and borrowing principles in Rust to codify comments
118-
typically found in a C header about how its API should be used.
119+
Most features in Rust tie into its core concept of ownership, and the FFI is no
120+
exception. When binding a C library in Rust you not only have the benefit of zero
121+
overhead, but you are also able to make it *safer* than C can! **Bindings can
122+
leverage the ownership and borrowing principles in Rust to codify comments
123+
typically found in a C header about how its API should be used.**
119124

120125
For example, consider a C library for parsing a tarball. This library will
121126
expose functions to read the contents of each file in the tarball, probably
@@ -151,25 +156,35 @@ impl Tarball {
151156
}
152157
```
153158

154-
Here the `*mut tarball_t` pointer is *owned by* a `Tarball`, so we already have
155-
rich knowledge about the lifetime of the resource. Additionally, the `file`
156-
method returns a **borrowed slice** whose lifetime is connected to the same
157-
lifetime as the source tarball itself. This is Rust's way of indicating that the
158-
returned data cannot outlive the tarball, statically preventing bugs that may be
159-
encountered when just using C.
160-
161-
A key aspect of the Rust binding here is that it is a safe function! Although it
162-
has an `unsafe` implementation (due to calling an FFI function), this interface
163-
is safe to call and will not cause tough-to-track-down segfaults. And don't
164-
forget, all of this is coming at 0 cost as the raw types in C are representable
165-
in Rust with no extra allocations or overhead.
166-
167-
### Talking to Rust
168-
169-
A major feature of Rust is that it does not have a garbage collector or
170-
runtime, and one of the benefits of this is that Rust can be called from C with
171-
no setup at all. This means that the zero overhead FFI not only applies when
172-
Rust calls into C, but also when C calls into Rust!
159+
Here the `*mut tarball_t` pointer is *owned by* a `Tarball`, which is
160+
responsible for any destruction and cleanup. So we already have rich knowledge
161+
about the lifetime of the resource: if you have access to a `Tarball`, you know
162+
that the pointer inside must still be valid. Additionally, the `file` method
163+
returns a **borrowed slice** whose lifetime is implicitly connected to the
164+
lifetime of the source tarball itself (the `&self` argument). This is Rust's way
165+
of indicating that the returned slice can only be used within the lifetime of
166+
the tarball, which in turn means that the slice will always point to valid
167+
memory. Thus, Rust statically prevents dangling pointer bugs that are easy to
168+
make when working directly with C. (If you're not familiar with this kind of
169+
borrowing in Rust, have a look at Yehuda Katz's [blog post] on ownership.)
170+
171+
[blog post]: http://blog.skylight.io/rust-means-never-having-to-close-a-socket/
172+
173+
A key aspect of the Rust binding here is that it is a safe function, meaning
174+
that callers do not have to use `unsafe` blocks to invoke it! Although it has an
175+
`unsafe` *implementation* (due to calling an FFI function), the *interface* uses
176+
borrowing to guarantee that no memory unsafety can occur in any Rust code that
177+
uses it. That is, due to Rust's static checking, it's simply not possible to
178+
cause a segfault using the API on the Rust side. And don't forget, all of this
179+
is coming at zero cost: the raw types in C are representable in Rust with no
180+
extra allocations or overhead.
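
The ownership and borrowing pattern described above can be sketched as follows. This is an illustrative stand-in, not the post's actual binding. The "C library" (`RawTarball`, `tarball_open`, `tarball_data`, `tarball_close`) is hypothetical and written in Rust here only so that the example runs on its own. In a real binding those three functions would be declared in an `extern` block.

```rust
use std::slice;

// Hypothetical stand-in for an opaque C type.
struct RawTarball {
    data: Vec<u8>,
}

// Hypothetical stand-ins for the C API.
fn tarball_open(bytes: &[u8]) -> *mut RawTarball {
    Box::into_raw(Box::new(RawTarball { data: bytes.to_vec() }))
}

unsafe fn tarball_data(raw: *const RawTarball, len: *mut usize) -> *const u8 {
    // The caller guarantees `raw` and `len` are valid pointers.
    let tarball: &RawTarball = unsafe { &*raw };
    unsafe { *len = tarball.data.len() };
    tarball.data.as_ptr()
}

unsafe fn tarball_close(raw: *mut RawTarball) {
    // Reclaims the allocation made in `tarball_open`.
    drop(unsafe { Box::from_raw(raw) });
}

/// Safe wrapper that owns the raw pointer.
pub struct Tarball {
    raw: *mut RawTarball,
}

impl Tarball {
    pub fn open(bytes: &[u8]) -> Tarball {
        Tarball { raw: tarball_open(bytes) }
    }

    /// The returned slice borrows from `&self`, so it cannot outlive the tarball.
    pub fn contents(&self) -> &[u8] {
        let mut len = 0usize;
        // SAFETY: `self.raw` stays valid until `drop` runs.
        unsafe {
            let ptr = tarball_data(self.raw, &mut len);
            slice::from_raw_parts(ptr, len)
        }
    }
}

impl Drop for Tarball {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `tarball_open` and is closed exactly once.
        unsafe { tarball_close(self.raw) }
    }
}

fn main() {
    let tarball = Tarball::open(b"hello");
    let bytes = tarball.contents();
    assert_eq!(bytes, &b"hello"[..]);
    // Moving `drop(tarball)` above the assertion would be rejected by the
    // borrow checker, because `bytes` still borrows from `tarball`.
}
```

All of the `unsafe` code sits inside `Tarball`. Code that uses `open` and `contents` never writes `unsafe` and cannot obtain a dangling slice.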
181+
182+
### C talking to Rust
183+
184+
**Despite guaranteeing memory safety, Rust does not have a garbage collector or
185+
runtime, and one of the benefits of this is that Rust code can be called from C
186+
with no setup at all.** This means that the zero overhead FFI not only applies
187+
when Rust calls into C, but also when C calls into Rust!
173188

174189
Let's take the example above, but reverse the roles of each language. As before,
175190
all the code below is [available on GitHub][c2rust]. First we'll start off with
@@ -185,10 +200,10 @@ pub extern fn double_input(input: i32) -> i32 {
185200
```
186201

187202
As with the Rust code before, there's not a whole lot here but there are some
188-
subtle aspects in play. First off we've got our function definition with a
203+
subtle aspects in play. First off, we've labeled our function definition with a
189204
`#[no_mangle]` attribute. This instructs the compiler to not mangle the symbol
190205
name for the function `double_input`. Rust employs name mangling similar to C++
191-
to ensure that libraries do not clash with one another, and this attributes
206+
to ensure that libraries do not clash with one another, and this attribute
192207
means that you don't have to guess a symbol name like
193208
`double_input::h485dee7f568bebafeaa` from C.
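
For reference, a small self-contained sketch of an exported function in current Rust. It is an assumption-laden variant of the post's example, not a copy of it: the `#[unsafe(no_mangle)]` spelling assumes Rust 1.82 or newer (older toolchains write `#[no_mangle]`), the ABI is spelled out as `extern "C"`, and `wrapping_mul` is swapped in so that an overflow cannot panic at the language boundary.

```rust
// Export the function under the plain symbol name `double_input`
// with the C calling convention.
#[unsafe(no_mangle)]
pub extern "C" fn double_input(input: i32) -> i32 {
    // Wrapping arithmetic: overflow wraps around instead of panicking.
    input.wrapping_mul(2)
}

fn main() {
    // The function can be used as a C-ABI function pointer, which is
    // the form in which a C caller would see it.
    let callback: extern "C" fn(i32) -> i32 = double_input;
    assert_eq!(callback(4), 8);
    println!("4 * 2 = {}", callback(4));
}
```

Built as a library, a function like this could be declared on the C side with a prototype such as `int32_t double_input(int32_t input);`.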
194209

@@ -245,19 +260,24 @@ more languages.
245260
[rb2rust]: https://github.com/alexcrichton/rust-ffi-examples/tree/master/ruby-to-rust
246261
[js2rust]: https://github.com/alexcrichton/rust-ffi-examples/tree/master/node-to-rust
247262
248-
A common desire for writing C code in these languages is to speed up some
249-
component of a library or application that's performance critical. With the
250-
features of Rust we've seen here, however, Rust is just as suitable for this
251-
sort of usage. One of Rust's first production users,
263+
When writing code in these languages, you sometimes want to speed up some
264+
component that's performance critical, but in the past this often required
265+
dropping all the way to C, and thereby giving up the memory safety, high-level
266+
abstractions, and ergonomics of these languages.
267+
268+
The fact that Rust can talk so easily with C, however, means that it is also
269+
viable for this sort of usage. One of Rust's first production users,
252270
[Skylight](https://www.skylight.io), was able to improve the performance and
253271
memory usage of their data collection agent almost instantly by just using Rust,
254272
and the Rust code is all published as a Ruby gem.
255273
256274
Moving from a language like Python or Ruby down to C to optimize performance is
257275
often quite difficult as it's tough to ensure that the program won't crash in a
258276
difficult-to-debug way. Rust, however, not only brings zero cost FFI, but *also*
259-
the same safety guarantees the original source language, enabling this sort of
260-
optimization to happen even more frequently!
277+
makes it possible to retain the same safety guarantees as the original source
278+
language. In the long run, this should make it much easier for programmers in
279+
these languages to drop down and do some systems programming to squeeze out
280+
critical performance when they need it.
261281
262282
FFI is just one of many tools in the toolbox of Rust, but it's a key component
263283
to Rust's adoption as it allows Rust to seamlessly integrate with existing code
