3
3
# Introduction
4
4
5
5
References are one of the more flexible and powerful tools available in
6
- Rust. A reference can point anywhere: into the managed or exchange
7
- heap, into the stack, and even into the interior of another data structure. A
8
- reference is as flexible as a C pointer or C++ reference. However,
9
- unlike C and C++ compilers, the Rust compiler includes special static checks
10
- that ensure that programs use references safely. Another advantage of
11
- references is that they are invisible to the garbage collector, so
12
- working with references helps reduce the overhead of automatic memory
13
- management.
6
+ Rust. They can point anywhere: into the heap, stack, and even into the
7
+ interior of another data structure. A reference is as flexible as a C pointer
8
+ or C++ reference.
9
+
10
+ Unlike C and C++ compilers, the Rust compiler includes special static
11
+ checks that ensure that programs use references safely.
14
12
15
13
Despite their complete safety, a reference's representation at runtime
16
14
is the same as that of an ordinary pointer in a C program. They introduce zero
@@ -26,7 +24,7 @@ through several examples.
26
24
27
25
References, sometimes known as *borrowed pointers*, are only valid for
28
26
a limited duration. References never claim any kind of ownership
29
- over the data that they point to: instead, they are used for cases
27
+ over the data that they point to; instead, they are used for cases
30
28
where you would like to use data for a short time.
31
29
32
30
As an example, consider a simple struct type `Point`:
@@ -36,27 +34,23 @@ struct Point {x: f64, y: f64}
36
34
~~~
37
35
38
36
We can use this simple definition to allocate points in many different ways. For
39
- example, in this code, each of these three local variables contains a
40
- point, but allocated in a different place:
37
+ example, in this code, each of these local variables contains a point,
38
+ but allocated in a different place:
41
39
42
40
~~~
43
41
# struct Point {x: f64, y: f64}
44
- let on_the_stack : Point = Point {x: 3.0, y: 4.0};
45
- let managed_box : @Point = @Point {x: 5.0, y: 1.0};
46
- let owned_box : Box<Point> = box Point {x: 7.0, y: 9.0};
42
+ let on_the_stack : Point = Point {x: 3.0, y: 4.0};
43
+ let on_the_heap : Box<Point> = box Point {x: 7.0, y: 9.0};
47
44
~~~
48
45
49
46
Suppose we wanted to write a procedure that computed the distance between any
50
- two points, no matter where they were stored. For example, we might like to
51
- compute the distance between `on_the_stack` and `managed_box`, or between
52
- `managed_box` and `owned_box`. One option is to define a function that takes
53
- two arguments of type `Point` (that is, it takes the points by value). But if we
54
- define it this way, calling the function will cause the points to be
55
- copied. For points, this is probably not so bad, but often copies are
47
+ two points, no matter where they were stored. One option is to define a function
48
+ that takes two arguments of type `Point` (that is, it takes the points __by value__).
49
+ But if we define it this way, calling the function will cause the points __to be
50
+ copied__. For points, this is probably not so bad, but often copies are
56
51
expensive. Worse, if the data type contains mutable fields, copying can change
57
- the semantics of your program in unexpected ways. So we'd like to define a
58
- function that takes the points by pointer. We can use references to do
59
- this:
52
+ the semantics of your program in unexpected ways. So we'd like to define
53
+ a function that takes the points just as a __reference__/__borrowed pointer__.
60
54
61
55
~~~
62
56
# struct Point {x: f64, y: f64}
@@ -68,30 +62,27 @@ fn compute_distance(p1: &Point, p2: &Point) -> f64 {
68
62
}
69
63
~~~
70
64
71
- Now we can call `compute_distance()` in various ways:
65
+ Now we can call `compute_distance()`:
72
66
73
67
~~~
74
68
# struct Point {x: f64, y: f64}
75
69
# let on_the_stack : Point = Point{x: 3.0, y: 4.0};
76
- # let managed_box : @Point = @Point{x: 5.0, y: 1.0};
77
- # let owned_box : Box<Point> = box Point{x: 7.0, y: 9.0};
70
+ # let on_the_heap : Box<Point> = box Point{x: 7.0, y: 9.0};
78
71
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
79
- compute_distance(&on_the_stack, managed_box);
80
- compute_distance(managed_box, owned_box);
72
+ compute_distance(&on_the_stack, on_the_heap);
81
73
~~~
82
74
83
75
Here, the `&` operator takes the address of the variable
84
76
`on_the_stack`; this is because `on_the_stack` has the type `Point`
85
77
(that is, a struct value) and we have to take its address to get a
86
78
reference. We also call this _borrowing_ the local variable
87
- `on_the_stack`, because we have created an alias: that is, another
79
+ `on_the_stack`, because we have created __an alias__: that is, another
88
80
name for the same data.
89
81
90
- In contrast, we can pass the boxes `managed_box` and `owned_box` to
91
- `compute_distance` directly. The compiler automatically converts a box like
92
- `@Point` or `~Point` to a reference like `&Point`. This is another form
93
- of borrowing: in this case, the caller lends the contents of the managed or
94
- owned box to the callee.
82
+ In contrast, we can pass `on_the_heap` to `compute_distance` directly.
83
+ The compiler automatically converts a box like `Box<Point>` to a reference like
84
+ `&Point`. This is another form of borrowing: in this case, the caller lends
85
+ the contents of the box to the callee.
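The two kinds of borrow described above can be sketched in current Rust. This is an illustration rather than the guide's own code: the body of `compute_distance` is elided in this diff and is assumed here, `Box::new` stands in for the older `box` keyword, and the helper `demo_distance` is hypothetical. Current Rust requires an explicit `&` on the box; deref coercion then turns `&Box<Point>` into `&Point`.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Assumed body: Euclidean distance between the two borrowed points.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper that borrows one stack value and one boxed value.
fn demo_distance() -> f64 {
    let on_the_stack = Point { x: 3.0, y: 4.0 };
    let on_the_heap: Box<Point> = Box::new(Point { x: 7.0, y: 9.0 });

    // `&on_the_stack` borrows the local variable directly.
    // `&on_the_heap` is a `&Box<Point>`, coerced to `&Point` at the call.
    compute_distance(&on_the_stack, &on_the_heap)
}
```

Neither point is copied or moved by the call, so both remain usable by the caller afterwards.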
95
86
96
87
Whenever a caller lends data to a callee, there are some limitations on what
97
88
the caller can do with the original. For example, if the contents of a
@@ -134,10 +125,10 @@ let on_the_stack2 : &Point = &tmp;
134
125
135
126
# Taking the address of fields
136
127
137
- As in C, the `&` operator is not limited to taking the address of
128
+ The `&` operator is not limited to taking the address of
138
129
local variables. It can also take the address of fields or
139
130
individual array elements. For example, consider this type definition
140
- for `rectangle`:
131
+ for `Rectangle`:
141
132
142
133
~~~
143
134
struct Point {x: f64, y: f64} // as before
@@ -153,9 +144,7 @@ Now, as before, we can define rectangles in a few different ways:
153
144
# struct Rectangle {origin: Point, size: Size}
154
145
let rect_stack = &Rectangle {origin: Point {x: 1.0, y: 2.0},
155
146
size: Size {w: 3.0, h: 4.0}};
156
- let rect_managed = @Rectangle {origin: Point {x: 3.0, y: 4.0},
157
- size: Size {w: 3.0, h: 4.0}};
158
- let rect_owned = box Rectangle {origin: Point {x: 5.0, y: 6.0},
147
+ let rect_heap = box Rectangle {origin: Point {x: 5.0, y: 6.0},
159
148
size: Size {w: 3.0, h: 4.0}};
160
149
~~~
161
150
@@ -167,109 +156,29 @@ operator. For example, I could write:
167
156
# struct Size {w: f64, h: f64} // as before
168
157
# struct Rectangle {origin: Point, size: Size}
169
158
# let rect_stack = &Rectangle {origin: Point {x: 1.0, y: 2.0}, size: Size {w: 3.0, h: 4.0}};
170
- # let rect_managed = @Rectangle {origin: Point {x: 3.0, y: 4.0}, size: Size {w: 3.0, h: 4.0}};
171
- # let rect_owned = box Rectangle {origin: Point {x: 5.0, y: 6.0}, size: Size {w: 3.0, h: 4.0}};
159
+ # let rect_heap = box Rectangle {origin: Point {x: 5.0, y: 6.0}, size: Size {w: 3.0, h: 4.0}};
172
160
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
173
- compute_distance(&rect_stack.origin, &rect_managed.origin);
161
+ compute_distance(&rect_stack.origin, &rect_heap.origin);
174
162
~~~
175
163
176
164
which would borrow the field `origin` from the rectangle on the stack
177
- as well as from the managed box, and then compute the distance between them.
165
+ as well as from the owned box, and then compute the distance between them.
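As a rough sketch of the field borrow just described, written in current Rust (so `Box::new` replaces `box`), the following borrows only the `origin` field of each rectangle. The body of `compute_distance` and the helper name `origin_distance` are assumptions for illustration.

```rust
struct Point {
    x: f64,
    y: f64,
}

#[allow(dead_code)]
struct Size {
    w: f64,
    h: f64,
}

#[allow(dead_code)]
struct Rectangle {
    origin: Point,
    size: Size,
}

// Assumed body: Euclidean distance between the two borrowed points.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper: borrows a field of a stack value and of a boxed value.
fn origin_distance() -> f64 {
    let rect_stack = Rectangle {
        origin: Point { x: 1.0, y: 2.0 },
        size: Size { w: 3.0, h: 4.0 },
    };
    let rect_heap = Box::new(Rectangle {
        origin: Point { x: 5.0, y: 6.0 },
        size: Size { w: 3.0, h: 4.0 },
    });

    // Each `&….origin` points into the interior of its rectangle;
    // the field access on `rect_heap` dereferences the box automatically.
    compute_distance(&rect_stack.origin, &rect_heap.origin)
}
```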
178
166
179
- # Borrowing managed boxes and rooting
167
+ # Lifetimes
180
168
181
- We’ve seen a few examples so far of borrowing heap boxes, both managed
182
- and owned. Up till this point, we’ve glossed over issues of
183
- safety. As stated in the introduction, at runtime a reference
184
- is simply a pointer, nothing more. Therefore, avoiding C's problems
185
- with dangling pointers requires a compile-time safety check.
169
+ We’ve seen a few examples of borrowing data. Up till this point, we’ve glossed
170
+ over issues of safety. As stated in the introduction, at runtime a reference
171
+ is simply a pointer, nothing more. Therefore, avoiding C's problems with
172
+ dangling pointers requires a compile-time safety check.
186
173
187
- The basis for the check is the notion of _lifetimes_. A lifetime is a
174
+ The basis for the check is the notion of __lifetimes__. A lifetime is a
188
175
static approximation of the span of execution during which the pointer
189
176
is valid: it always corresponds to some expression or block within the
190
- program. Code inside that expression can use the pointer without
191
- restrictions. But if the pointer escapes from that expression (for
192
- example, if the expression contains an assignment expression that
193
- assigns the pointer to a mutable field of a data structure with a
194
- broader scope than the pointer itself), the compiler reports an
195
- error. We'll be discussing lifetimes more in the examples to come, and
196
- a more thorough introduction is also available.
197
-
198
- When the `&` operator creates a reference, the compiler must
199
- ensure that the pointer remains valid for its entire
200
- lifetime. Sometimes this is relatively easy, such as when taking the
201
- address of a local variable or a field that is stored on the stack:
202
-
203
- ~~~
204
- struct X { f: int }
205
- fn example1() {
206
- let mut x = X { f: 3 };
207
- let y = &mut x.f; // -+ L
208
- // ... // |
209
- } // -+
210
- ~~~
211
-
212
- Here, the lifetime of the reference `y` is simply L, the
213
- remainder of the function body. The compiler need not do any other
214
- work to prove that code will not free `x.f`. This is true even if the
215
- code mutates `x`.
216
-
217
- The situation gets more complex when borrowing data inside heap boxes:
218
-
219
- ~~~
220
- # struct X { f: int }
221
- fn example2() {
222
- let mut x = @X { f: 3 };
223
- let y = &x.f; // -+ L
224
- // ... // |
225
- } // -+
226
- ~~~
227
-
228
- In this example, the value `x` is a heap box, and `y` is therefore a
229
- pointer into that heap box. Again the lifetime of `y` is L, the
230
- remainder of the function body. But there is a crucial difference:
231
- suppose `x` were to be reassigned during the lifetime L? If the
232
- compiler isn't careful, the managed box could become *unrooted*, and
233
- would therefore be subject to garbage collection. A heap box that is
234
- unrooted is one such that no pointer values in the heap point to
235
- it. It would violate memory safety for the box that was originally
236
- assigned to `x` to be garbage-collected, since a non-heap
237
- pointer *`y`* still points into it.
238
-
239
- > *Note:* Our current implementation implements the garbage collector
240
- > using reference counting and cycle detection.
241
-
242
- For this reason, whenever an `&` expression borrows the interior of a
243
- managed box stored in a mutable location, the compiler inserts a
244
- temporary that ensures that the managed box remains live for the
245
- entire lifetime. So, the above example would be compiled as if it were
246
- written
247
-
248
- ~~~
249
- # struct X { f: int }
250
- fn example2() {
251
- let mut x = @X {f: 3};
252
- let x1 = x;
253
- let y = &x1.f; // -+ L
254
- // ... // |
255
- } // -+
256
- ~~~
257
-
258
- Now if `x` is reassigned, the pointer `y` will still remain valid. This
259
- process is called *rooting*.
260
-
261
- # Borrowing owned boxes
262
-
263
- The previous example demonstrated *rooting*, the process by which the
264
- compiler ensures that managed boxes remain live for the duration of a
265
- borrow. Unfortunately, rooting does not work for borrows of owned
266
- boxes, because it is not possible to have two references to an owned
267
- box.
268
-
269
- For owned boxes, therefore, the compiler will only allow a borrow *if
270
- the compiler can guarantee that the owned box will not be reassigned
271
- or moved for the lifetime of the pointer*. This does not necessarily
272
- mean that the owned box is stored in immutable memory. For example,
177
+ program.
178
+
179
+ The compiler will only allow a borrow *if it can guarantee that the data will
180
+ not be reassigned or moved for the lifetime of the pointer*. This does not
181
+ necessarily mean that the data is stored in immutable memory. For example,
273
182
the following function is legal:
274
183
275
184
~~~
@@ -294,7 +203,7 @@ and `x` is declared as mutable. However, the compiler can prove that
294
203
and in fact is mutated later in the function.
295
204
296
205
It may not be clear why we are so concerned about mutating a borrowed
297
- variable. The reason is that the runtime system frees any owned box
206
+ variable. The reason is that the runtime system frees any box
298
207
_as soon as its owning reference changes or goes out of
299
208
scope_. Therefore, a program like this is illegal (and would be
300
209
rejected by the compiler):
@@ -337,31 +246,34 @@ Once the reassignment occurs, the memory will look like this:
337
246
+---------+
338
247
~~~
339
248
340
- Here you can see that the variable `y` still points at the old box,
341
- which has been freed.
249
+ Here you can see that the variable `y` still points at the old `f`
250
+ field of `Foo`, which has been freed.
342
251
343
252
In fact, the compiler can apply the same kind of reasoning to any
344
- memory that is _(uniquely) owned by the stack frame_. So we could
253
+ memory that is (uniquely) owned by the stack frame. So we could
345
254
modify the previous example to introduce additional owned pointers
346
255
and structs, and the compiler will still be able to detect possible
347
- mutations:
256
+ mutations. This time, we'll use an analogy to illustrate the concept.
348
257
349
258
~~~ {.ignore}
350
259
fn example3() -> int {
351
- struct R { g: int }
352
- struct S { f: Box<R> }
260
+ struct House { owner: Box<Person> }
261
+ struct Person { age: int }
353
262
354
- let mut x = box S {f: box R {g: 3}};
355
- let y = &x.f.g;
356
- x = box S {f: box R {g: 4}}; // Error reported here.
357
- x.f = box R {g: 5}; // Error reported here.
358
- *y
263
+ let mut house = box House {
264
+ owner: box Person {age: 30}
265
+ };
266
+
267
+ let owner_age = &house.owner.age;
268
+ house = box House {owner: box Person {age: 40}}; // Error reported here.
269
+ house.owner = box Person {age: 50}; // Error reported here.
270
+ *owner_age
359
271
}
360
272
~~~
361
273
362
- In this case, two errors are reported, one when the variable `x` is
363
- modified and another when `x.f` is modified. Either modification would
364
- invalidate the pointer `y`.
274
+ In this case, two errors are reported, one when the variable `house` is
275
+ modified and another when `house.owner` is modified. Either modification would
276
+ invalidate the pointer `owner_age`.
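For contrast, here is a minimal sketch of a variant that current Rust should accept, assuming the borrow is finished before any mutation happens. The function name `example3_fixed` is hypothetical, and the code uses today's syntax (`i32` and `Box::new`) rather than the `int` and `box` forms shown in the diff.

```rust
struct Person {
    age: i32,
}

struct House {
    owner: Box<Person>,
}

// Hypothetical variant of `example3` in which the borrow ends before mutation.
fn example3_fixed() -> i32 {
    let mut house = Box::new(House {
        owner: Box::new(Person { age: 30 }),
    });

    // The reference lives only inside this block; the value is copied out.
    let age_before = {
        let owner_age = &house.owner.age;
        *owner_age
    };

    // No reference into `house` is live any more, so replacing the owner
    // cannot leave a dangling pointer behind.
    house.owner = Box::new(Person { age: 50 });

    age_before + house.owner.age
}
```

The point of the sketch is ordering: the mutation is only rejected while a reference into the affected data is still in use.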
365
277
366
278
# Borrowing and enums
367
279
@@ -412,7 +324,7 @@ circle constant][tau] and not that dreadfully outdated notion of pi).
412
324
413
325
The second match is more interesting. Here we match against a
414
326
rectangle and extract its size: but rather than copy the `size`
415
- struct, we use a by-reference binding to create a pointer to it. In
327
+ struct, we use a __by-reference binding__ to create a pointer to it. In
416
328
other words, a pattern binding like `ref size` binds the name `size`
417
329
to a pointer of type `&Size` into the _interior of the enum_.
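A by-reference binding of this kind might look as follows in current Rust. The enum definition is not visible in this diff, so the `Shape` type, its variants and the `compute_area` function below are assumptions made for illustration, and the circle arm uses `PI` from the standard library.

```rust
#[allow(dead_code)]
struct Point {
    x: f64,
    y: f64,
}

struct Size {
    w: f64,
    h: f64,
}

// Assumed enum shape: the diff does not show the real definition.
#[allow(dead_code)]
enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Size),
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        // `radius` is an `f64`, which is cheap to copy out of the enum.
        Shape::Circle(_, radius) => std::f64::consts::PI * radius * radius,
        // `ref size` binds `size` as a `&Size` pointing into the enum,
        // so the `Size` struct is neither copied nor moved.
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}
```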
418
330
@@ -526,12 +438,12 @@ time one that does not compile:
526
438
527
439
~~~ {.ignore}
528
440
struct Point {x: f64, y: f64}
529
- fn get_x_sh(p: @Point) -> &f64 {
441
+ fn get_x_sh(p: &Point) -> &f64 {
530
442
&p.x // Error reported here
531
443
}
532
444
~~~
533
445
534
- Here, the function `get_x_sh()` takes a managed box as input and
446
+ Here, the function `get_x_sh()` takes a reference as input and
535
447
returns a reference. As before, the lifetime of the reference
536
448
that will be returned is a parameter (specified by the
537
449
caller). That means that `get_x_sh()` promises to return a reference
@@ -540,17 +452,18 @@ subtly different from the first example, which promised to return a
540
452
pointer that was valid for as long as its pointer argument was valid.
541
453
542
454
Within `get_x_sh()`, we see the expression `&p.x` which takes the
543
- address of a field of a managed box. The presence of this expression
544
- implies that the compiler must guarantee that, so long as the
545
- resulting pointer is valid, the managed box will not be reclaimed by
546
- the garbage collector. But recall that `get_x_sh()` also promised to
455
+ address of a field of a Point. The presence of this expression
456
+ implies that the compiler must guarantee that, so long as the
457
+ resulting pointer is valid, the original Point won't be moved or changed.
458
+
459
+ But recall that `get_x_sh()` also promised to
547
460
return a pointer that was valid for as long as the caller wanted it to
548
461
be. Clearly, `get_x_sh()` is not in a position to make both of these
549
462
guarantees; in fact, it cannot guarantee that the pointer will remain
550
463
valid at all once it returns, as the parameter `p` may or may not be
551
464
live in the caller. Therefore, the compiler will report an error here.
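The usual way out is to tie the returned reference to the argument with a named lifetime, so the function only promises validity for as long as its input is valid. The sketch below assumes a function named `get_x` and a lifetime named `'r`; it is written for current Rust, where lifetime elision would infer the same link if the annotation were left out of a one-argument signature like this one.

```rust
#[allow(dead_code)]
struct Point {
    x: f64,
    y: f64,
}

// The returned reference carries the same lifetime `'r` as the argument,
// so it cannot outlive the `Point` it points into.
fn get_x<'r>(p: &'r Point) -> &'r f64 {
    &p.x
}
```

A caller can then write `let x = get_x(&point);` and use `x` for as long as `point` itself stays alive and unmodified.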
552
465
553
- In general, if you borrow a managed (or owned) box to create a
466
+ In general, if you borrow a struct or box to create a
554
467
reference, it will only be valid within the function
555
468
and cannot be returned. This is why the typical way to return references
556
469
is to take references as input (the only other case in