Commit bc7ff3f

TRPL copyedits: strings
1 parent a54e93c commit bc7ff3f

1 file changed

+91
-23
lines changed

src/doc/trpl/strings.md

Lines changed: 91 additions & 23 deletions
@@ -1,36 +1,34 @@
11
% Strings
22

3-
Strings are an important concept for any programmer to master. Rust's string
3+
Strings are an important concept for any programmer to master. Rust’s string
44
handling system is a bit different from other languages, due to its systems
55
focus. Any time you have a data structure of variable size, things can get
6-
tricky, and strings are a re-sizable data structure. That being said, Rust's
6+
tricky, and strings are a re-sizable data structure. That being said, Rust’s
77
strings also work differently than in some other systems languages, such as C.
88

9-
Let's dig into the details. A *string* is a sequence of Unicode scalar values
10-
encoded as a stream of UTF-8 bytes. All strings are guaranteed to be
11-
validly encoded UTF-8 sequences. Additionally, strings are not null-terminated
12-
and can contain null bytes.
9+
Let’s dig into the details. A string is a sequence of Unicode scalar values
10+
encoded as a stream of UTF-8 bytes. All strings are guaranteed to be a valid
11+
encoding of UTF-8 sequences. Additionally, unlike some systems languages,
12+
strings are not null-terminated and can contain null bytes.
1313

14-
Rust has two main types of strings: `&str` and `String`.
14+
Rust has two main types of strings: `&str` and `String`. Let’s talk about
15+
`&str` first. These are called ‘string slices’. String literals are of the type
16+
`&'static str`:
1517

16-
The first kind is a `&str`. These are called *string slices*. String literals
17-
are of the type `&str`:
18-
19-
```{rust}
20-
let string = "Hello there."; // string: &str
18+
```rust
19+
let string = "Hello there."; // string: &'static str
2120
```
2221

23-
This string is statically allocated, meaning that it's saved inside our
22+
This string is statically allocated, meaning that it’s saved inside our
2423
compiled program, and exists for the entire duration it runs. The `string`
2524
binding is a reference to this statically allocated string. String slices
2625
have a fixed size, and cannot be mutated.
2726

28-
A `String`, on the other hand, is a heap-allocated string. This string
29-
is growable, and is also guaranteed to be UTF-8. `String`s are
30-
commonly created by converting from a string slice using the
31-
`to_string` method.
27+
A `String`, on the other hand, is a heap-allocated string. This string is
28+
growable, and is also guaranteed to be UTF-8. `String`s are commonly created by
29+
converting from a string slice using the `to_string` method.
3230

33-
```{rust}
31+
```rust
3432
let mut s = "Hello".to_string(); // mut s: String
3533
println!("{}", s);
3634

@@ -54,8 +52,78 @@ fn main() {
5452
Viewing a `String` as a `&str` is cheap, but converting the `&str` to a
5553
`String` involves allocating memory. No reason to do that unless you have to!
5654

57-
That's the basics of strings in Rust! They're probably a bit more complicated
58-
than you are used to, if you come from a scripting language, but when the
59-
low-level details matter, they really matter. Just remember that `String`s
60-
allocate memory and control their data, while `&str`s are a reference to
61-
another string, and you'll be all set.
55+
## Indexing
56+
57+
Because strings are valid UTF-8, strings do not support indexing:
58+
59+
```rust,ignore
60+
let s = "hello";
61+
62+
println!("The first letter of s is {}", s[0]); // ERROR!!!
63+
```
64+
65+
Usually, access to a vector with `[]` is very fast. But, because each character
66+
in a UTF-8 encoded string can be multiple bytes, you have to walk over the
67+
string to find the nᵗʰ letter of a string. This is a significantly more
68+
expensive operation, and we don’t want to be misleading. Furthermore, ‘letter’
69+
isn’t something defined in Unicode, exactly. We can choose to look at a string as
70+
individual bytes, or as codepoints:
71+
72+
```rust
73+
let hachiko = "忠犬ハチ公";
74+
75+
for b in hachiko.as_bytes() {
76+
print!("{}, ", b);
77+
}
78+
79+
println!("");
80+
81+
for c in hachiko.chars() {
82+
print!("{}, ", c);
83+
}
84+
85+
println!("");
86+
```
87+
88+
This prints:
89+
90+
```text
91+
229, 191, 160, 231, 138, 172, 227, 131, 143, 227, 131, 129, 229, 133, 172,
92+
忠, 犬, ハ, チ, 公,
93+
```
94+
95+
As you can see, there are more bytes than `char`s.
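
To make the gap concrete, here is a minimal sketch that counts both units for the same string. The helper name `byte_and_char_counts` is made up for illustration; `len()` reports the length in bytes, while `chars().count()` has to walk the string to count Unicode scalar values.

```rust
// Hypothetical helper for illustration: returns (bytes, chars) for a string slice.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    // `len()` is the length in bytes, not in characters.
    let bytes = s.len();
    // `chars().count()` iterates over the whole string, one scalar value at a time.
    let chars = s.chars().count();
    (bytes, chars)
}

fn main() {
    let hachiko = "忠犬ハチ公";
    let (bytes, chars) = byte_and_char_counts(hachiko);

    // Each of these five characters takes three bytes in UTF-8.
    println!("{} bytes, {} chars", bytes, chars); // prints "15 bytes, 5 chars"
}
```

For pure ASCII input such as `"hello"` the two counts are equal, which is why byte indexing can look harmless until non-ASCII text shows up.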
96+
97+
You can get something similar to an index like this:
98+
99+
```rust
100+
# let hachiko = "忠犬ハチ公";
101+
let dog = hachiko.chars().nth(1); // kinda like hachiko[1]
102+
```
103+
104+
This emphasizes that we have to go through the whole list of `chars`.
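
One detail worth spelling out: `nth` returns an `Option<char>`, because the position asked for may be past the end of the string. A small sketch, where the wrapper name `nth_char` is an assumption made for this example:

```rust
// Hypothetical wrapper for illustration: the n-th char of `s`, if there is one.
fn nth_char(s: &str, n: usize) -> Option<char> {
    // Walks the string from the start; the cost grows with `n`.
    s.chars().nth(n)
}

fn main() {
    let hachiko = "忠犬ハチ公";

    match nth_char(hachiko, 1) {
        Some(c) => println!("second char: {}", c), // prints "second char: 犬"
        None => println!("the string is too short"),
    }

    // Asking past the end yields `None` instead of panicking.
    assert_eq!(nth_char(hachiko, 10), None);
}
```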
105+
106+
## Concatenation
107+
108+
If you have a `String`, you can concatenate a `&str` to the end of it:
109+
110+
```rust
111+
let hello = "Hello ".to_string();
112+
let world = "world!";
113+
114+
let hello_world = hello + world;
115+
```
116+
117+
But if you have two `String`s, you need an `&`:
118+
119+
```rust
120+
let hello = "Hello ".to_string();
121+
let world = "world!".to_string();
122+
123+
let hello_world = hello + &world;
124+
```
125+
126+
This is because `&String` can automatically coerce to a `&str`. This is a
127+
feature called ‘[`Deref` coercions][dc]’.
128+
129+
[dc]: deref-coercions.html
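
The same coercion applies at function calls, not only with `+`. As a rough sketch (the function `greet` is hypothetical, introduced only for this example), a parameter typed `&str` accepts both a string literal and a borrowed `String`:

```rust
// Hypothetical function for illustration: takes a string slice, returns a new String.
fn greet(name: &str) -> String {
    "Hello ".to_string() + name
}

fn main() {
    let world = "world!".to_string();

    // `&world` is a `&String`; it coerces to `&str` at the call site.
    let hello_world = greet(&world);
    println!("{}", hello_world); // prints "Hello world!"

    // `world` was only borrowed, so it is still usable here.
    println!("{}", world); // prints "world!"

    // A string literal is already a `&str`, so no coercion is needed.
    println!("{}", greet("there.")); // prints "Hello there."
}
```

Note the contrast with the left-hand side of `+`, which is taken by value: in `hello + &world`, `hello` is moved into the result, while `world` is only borrowed.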
