Commit e364b6c

lexical structure: move the description of shebang-removal
This takes place after CRLF normalization. It's better not to list the shebang in a Lexer block, as it isn't a token that can be fed to a macro.
1 parent 5f51269 commit e364b6c

2 files changed

+25
-30
lines changed

src/crates-and-source-files.md

Lines changed: 3 additions & 30 deletions
@@ -2,14 +2,9 @@
22

33
> **<sup>Syntax</sup>**\
44
> _Crate_ :\
5-
> &nbsp;&nbsp; SHEBANG<sup>?</sup>\
65
> &nbsp;&nbsp; [_InnerAttribute_]<sup>\*</sup>\
76
> &nbsp;&nbsp; [_Item_]<sup>\*</sup>
87
9-
> **<sup>Lexer</sup>**\
10-
> SHEBANG : `#!` \~`\n`<sup>\+</sup>[](#shebang)
11-
12-
138
> Note: Although Rust, like any other language, can be implemented by an
149
> interpreter as well as a compiler, the only existing implementation is a
1510
> compiler, and the language has always been designed to be compiled. For these
@@ -51,6 +46,8 @@ that apply to the containing module, most of which influence the behavior of
5146
the compiler. The anonymous crate module can have additional attributes that
5247
apply to the crate as a whole.
5348

49+
> **Note**: The file's contents may be preceded by a [shebang].
50+
5451
```rust
5552
// Specify the crate name.
5653
#![crate_name = "projx"]
@@ -63,28 +60,6 @@ apply to the crate as a whole.
6360
#![warn(non_camel_case_types)]
6461
```
6562

66-
## Shebang
67-
68-
A source file can have a [_shebang_] (SHEBANG production), which indicates
69-
to the operating system what program to use to execute this file. It serves
70-
essentially to treat the source file as an executable script. The shebang
71-
can only occur at the beginning of the file.
72-
It is ignored by the compiler. For example:
73-
74-
<!-- ignore: tests don't like shebang -->
75-
```rust,ignore
76-
#!/usr/bin/env rustx
77-
78-
fn main() {
79-
println!("Hello!");
80-
}
81-
```
82-
83-
A restriction is imposed on the shebang syntax to avoid confusion with an
84-
[attribute]. The `#!` characters must not be followed by a `[` token, ignoring
85-
intervening [comments] or [whitespace]. If this restriction fails, then it is
86-
not treated as a shebang, but instead as the start of an attribute.
87-
8863
## Preludes and `no_std`
8964

9065
This section has been moved to the [Preludes chapter](names/preludes.md).
@@ -153,19 +128,17 @@ or `_` (U+005F) characters.
153128
[_InnerAttribute_]: attributes.md
154129
[_Item_]: items.md
155130
[_MetaNameValueStr_]: attributes.md#meta-item-attribute-syntax
156-
[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix)
157131
[`ExitCode`]: ../std/process/struct.ExitCode.html
158132
[`Infallible`]: ../std/convert/enum.Infallible.html
159133
[`Termination`]: ../std/process/trait.Termination.html
160134
[attribute]: attributes.md
161135
[attributes]: attributes.md
162-
[comments]: comments.md
163136
[function]: items/functions.md
164137
[module]: items/modules.md
165138
[module path]: paths.md
139+
[shebang]: input-format.md#shebang-removal
166140
[trait or lifetime bounds]: trait-bounds.md
167141
[where clauses]: items/generics.md#where-clauses
168-
[whitespace]: whitespace.md
169142

170143
<script>
171144
(function() {

src/input-format.md

Lines changed: 22 additions & 0 deletions
@@ -19,9 +19,31 @@ Each pair of characters `U+000D` (CR) immediately followed by `U+000A` (LF) is r
1919

2020
Other occurrences of the character `U+000D` (CR) are left in place (they are treated as [whitespace]).
2121

22+
## Shebang removal
23+
24+
If the remaining sequence begins with the characters `#!`, the characters up to and including the first `U+000A` (LF) are removed from the sequence.
25+
26+
For example, the first line of the following file would be ignored:
27+
28+
<!-- ignore: tests don't like shebang -->
29+
```rust,ignore
30+
#!/usr/bin/env rustx
31+
32+
fn main() {
33+
println!("Hello!");
34+
}
35+
```
36+
37+
As an exception, if the `#!` characters are followed (ignoring intervening [comments] or [whitespace]) by a `[` token, nothing is removed.
38+
This prevents an [inner attribute] at the start of a source file being removed.
39+
2240
## Tokenization
2341

2442
The resulting sequence of characters is then converted into tokens as described in the remainder of this chapter.
2543

44+
[inner attribute]: attributes.md
2645
[BYTE ORDER MARK]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
46+
[comments]: comments.md
2747
[Crates and source files]: crates-and-source-files.md
48+
[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix)
49+
[whitespace]: whitespace.md
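
The shebang-removal rule added to `src/input-format.md` can be illustrated with a minimal sketch. This is not the compiler's implementation: the function names `strip_shebang` and `skip_trivia` are hypothetical, `str::trim_start` is used as an approximation of Rust's whitespace set, all comment forms are treated alike, and the behavior when the input contains no LF at all (everything is removed) is an assumption, since the text does not cover that case.

```rust
/// Skips leading whitespace and comments (hypothetical helper).
/// `trim_start` only approximates the whitespace set defined by the language.
fn skip_trivia(mut s: &str) -> &str {
    loop {
        let trimmed = s.trim_start();
        if let Some(rest) = trimmed.strip_prefix("//") {
            // Line comment: continue after the end of the line.
            s = match rest.find('\n') {
                Some(i) => &rest[i + 1..],
                None => "",
            };
        } else if let Some(rest) = trimmed.strip_prefix("/*") {
            // Block comment: track nesting depth until it closes.
            let mut depth = 1usize;
            let mut rest = rest;
            while depth > 0 {
                if let Some(r) = rest.strip_prefix("/*") {
                    depth += 1;
                    rest = r;
                } else if let Some(r) = rest.strip_prefix("*/") {
                    depth -= 1;
                    rest = r;
                } else {
                    let mut chars = rest.chars();
                    if chars.next().is_none() {
                        // Unterminated comment: nothing is left to inspect.
                        return "";
                    }
                    rest = chars.as_str();
                }
            }
            s = rest;
        } else {
            return trimmed;
        }
    }
}

/// Returns `input` with a leading shebang line removed, following the rule
/// described above (hypothetical function, not the compiler's API).
fn strip_shebang(input: &str) -> &str {
    // Only inputs that begin with `#!` are candidates.
    let Some(after) = input.strip_prefix("#!") else {
        return input;
    };
    // Exception: `#!` followed by `[` (ignoring whitespace and comments)
    // is the start of an inner attribute, so nothing is removed.
    if skip_trivia(after).starts_with('[') {
        return input;
    }
    // Remove everything up to and including the first LF.
    // Assumption: with no LF present, the whole input is removed.
    match input.find('\n') {
        Some(i) => &input[i + 1..],
        None => "",
    }
}
```

Under these assumptions, `strip_shebang("#!/usr/bin/env rustx\nfn main() {}\n")` yields `"fn main() {}\n"`, while an input starting with `#![allow(unused)]` is returned unchanged. The sketch expects CRLF normalization to have been applied already, matching the order stated in the commit message.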
