Commit 36900eb

xtask: Add code generator for device path nodes
See xtask/src/device_path/README.md for details of the design. The actual generated code is added in the following commit.
1 parent 3a42bd1 commit 36900eb

10 files changed, +2662 -0 lines changed

xtask/Cargo.toml

Lines changed: 4 additions & 0 deletions
@@ -10,8 +10,12 @@ clap = { version = "4.0.4", features = ["derive"] }
 # The latest fatfs release (0.3.5) is old, use git instead to pick up some fixes.
 fatfs = { git = "https://github.com/rafalh/rust-fatfs.git", rev = "87fc1ed5074a32b4e0344fcdde77359ef9e75432" }
 fs-err = "2.6.0"
+heck = "0.4.0"
 mbrman = "0.5.0"
 nix = "0.25.0"
+proc-macro2 = "1.0.46"
+quote = "1.0.21"
 regex = "1.5.4"
 serde_json = "1.0.73"
+syn = { version = "1.0.101", features = ["full"] }
 tempfile = "3.2.0"

xtask/src/device_path/README.md

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# Device path code generation

This module of `xtask` generates code for reading and building UEFI
device paths. The command is `cargo xtask gen-code`.

There are a large number of device path nodes; the UEFI Specification
devotes some 40 pages to describing them all. These definitions are
specified in Rust-like code in [`spec.rs`], and the code generator
produces [`src/proto/device_path/device_path_gen.rs`] containing the
final Rust code. We check this generated file into the git repo, so
there's no need for a `build.rs`.

For each device path node, we generate a packed struct and a builder
struct. The packed struct corresponds almost exactly to the node
structure in the UEFI Specification and is used for read-only access to
a node. The builder struct is used to create new nodes.

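As a rough illustration of the packed/builder split, here is a minimal
hand-written sketch for a fixed-size ACPI node (type 0x02, sub-type
0x01, 12 bytes in total). The names `Header`, `Acpi`, `AcpiBuilder`,
`from_bytes`, and `build` are assumptions made for this example; they
are not taken from the generated code, which may look quite different.

```rust
use std::mem::size_of;

/// Four-byte header that starts every device path node.
#[allow(dead_code)]
#[repr(C, packed)]
struct Header {
    device_type: u8,
    sub_type: u8,
    length: u16,
}

/// Packed, read-only view of an ACPI node (illustrative only).
#[allow(dead_code)]
#[repr(C, packed)]
struct Acpi {
    header: Header,
    hid: u32,
    uid: u32,
}

impl Acpi {
    /// Reinterpret a byte slice as a packed node if it is long enough.
    fn from_bytes(bytes: &[u8]) -> Option<&Acpi> {
        if bytes.len() < size_of::<Acpi>() {
            return None;
        }
        // SAFETY: `Acpi` is `repr(C, packed)` (alignment 1), contains only
        // integers, and the length was checked above.
        Some(unsafe { &*bytes.as_ptr().cast::<Acpi>() })
    }

    // Packed fields are copied out by value; taking a reference to them
    // is not allowed.
    fn hid(&self) -> u32 {
        u32::from_le(self.hid)
    }

    fn uid(&self) -> u32 {
        u32::from_le(self.uid)
    }
}

/// Builder with ordinary named fields (illustrative only).
struct AcpiBuilder {
    hid: u32,
    uid: u32,
}

impl AcpiBuilder {
    /// Write the node out in its packed, little-endian form.
    fn build(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(size_of::<Acpi>());
        out.extend_from_slice(&[0x02, 0x01]);
        out.extend_from_slice(&(size_of::<Acpi>() as u16).to_le_bytes());
        out.extend_from_slice(&self.hid.to_le_bytes());
        out.extend_from_slice(&self.uid.to_le_bytes());
        out
    }
}
```

A caller would fill in `AcpiBuilder { hid, uid }`, call `build()`, and
read the resulting bytes back through `Acpi::from_bytes`.
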
## `spec.rs`

The `spec.rs` file is the input that describes each node. The code in
this file is syntactically valid Rust code, but it's not included
directly with a `mod` statement anywhere. Instead, the code is parsed
with [`syn`] and processed in various ways.

The file is organized with modules, one for each [`DeviceType`]. Within
each module are the node definitions. Each node is a `struct` marked
with a `#[node(...)]` attribute, which can contain the following
properties:
* `static_size = <N>` (required): Specifies the expected static size (in
  bytes) of the node. This excludes dynamically-sized fields. This is
  compared against the internally-calculated size of the struct to help
  validate that the node definition is correct. The UEFI Specification
  usually says what this value is when describing the node, although a
  few are missing or incorrect.
* `sub_type` (optional): Sets the [`DeviceSubType`]. This is usually
  inferred from the node's name and the module it's in, but there are a
  few edge cases where it needs to be manually specified.

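The `static_size` check can be pictured with the following sketch. It
is a simplified model, not the generator's actual code: the `Field`
enum and the functions `static_size` and `check_static_size` are
hypothetical names, and real field sizes would be derived from the
parsed types rather than written by hand.

```rust
/// Size in bytes of the header that starts every node:
/// type (u8), sub-type (u8), length (u16).
const HEADER_SIZE: usize = 4;

/// Hypothetical, simplified model of a node field.
enum Field {
    /// Fixed-size field occupying this many bytes.
    Fixed(usize),
    /// Dynamically-sized trailing field, excluded from the static size.
    Dst,
}

/// Static size of a node: the header plus all fixed-size fields.
fn static_size(fields: &[Field]) -> usize {
    HEADER_SIZE
        + fields
            .iter()
            .map(|field| match field {
                Field::Fixed(size) => *size,
                Field::Dst => 0,
            })
            .sum::<usize>()
}

/// Compare the declared `static_size` with the calculated one.
fn check_static_size(node: &str, declared: usize, fields: &[Field]) -> Result<(), String> {
    let calculated = static_size(fields);
    if declared == calculated {
        Ok(())
    } else {
        Err(format!(
            "{node}: static_size = {declared}, but fields add up to {calculated}"
        ))
    }
}
```

For example, a node with two `u32` fields passes with a declared size
of 12, and a node with a 16-byte field followed by a DST passes with a
declared size of 20.
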
A node struct can be a unit struct, or contain some number of fields. By
default, fields are used unchanged in both the packed and builder
structs. Fields can optionally be marked with a `#[node(...)]` attribute
to alter the code generation, with the following optional properties:
* `no_get_func`: No getter will be generated for this field.
* `custom_get_impl`: A getter will be generated for this field, but the
  autogenerated implementation will be replaced with a call to
  `self.get_<field_name>`.
* `build_type = <false|"string">`: If set to `false`, no field will be
  generated in the builder struct. If set to a string, the contents of
  the string will be parsed as a type to use for the build field.
* `custom_build_impl`: When building a node, the autogenerated
  implementation for this field will be replaced with a call to
  `self.build_<field_name>`. If the field is a DST, the destination
  buffer will be passed in. Otherwise, the type is copyable and the
  function will just return the value directly.
* `custom_build_size_impl`: When calculating the size of a node before
  building it, the autogenerated implementation for this field will be
  replaced with a call to `self.build_size_<field_name>`.

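To make the custom build hooks more concrete, the sketch below shows
how generated code might delegate to hand-written
`build_size_<field_name>` and `build_<field_name>` functions for a DST
field whose builder type differs from its packed representation. The
node itself is hypothetical: `NameBuilder`, its `name` field, and the
placeholder type and sub-type values are invented for this example.

```rust
// Placeholder values; they do not correspond to a real node.
const PLACEHOLDER_DEVICE_TYPE: u8 = 0xAA;
const PLACEHOLDER_SUB_TYPE: u8 = 0xBB;
const HEADER_SIZE: usize = 4;

/// Hypothetical builder: the packed node ends in UTF-16 data, but the
/// builder field is a `&str` (as a `build_type` override might specify).
struct NameBuilder<'a> {
    name: &'a str,
}

impl NameBuilder<'_> {
    /// Hand-written hook in the style of `custom_build_size_impl`.
    fn build_size_name(&self) -> usize {
        self.name.encode_utf16().count() * 2
    }

    /// Hand-written hook in the style of `custom_build_impl`. The field
    /// is a DST, so the destination buffer is passed in.
    fn build_name(&self, out: &mut [u8]) {
        for (chunk, unit) in out.chunks_exact_mut(2).zip(self.name.encode_utf16()) {
            chunk.copy_from_slice(&unit.to_le_bytes());
        }
    }

    /// What the generated size calculation might look like.
    fn build_size(&self) -> usize {
        HEADER_SIZE + self.build_size_name()
    }

    /// What the generated build function might look like.
    fn build(&self) -> Vec<u8> {
        let size = self.build_size();
        let mut out = vec![0u8; size];
        out[0] = PLACEHOLDER_DEVICE_TYPE;
        out[1] = PLACEHOLDER_SUB_TYPE;
        out[2..4].copy_from_slice(&(size as u16).to_le_bytes());
        self.build_name(&mut out[HEADER_SIZE..]);
        out
    }
}
```
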
Any items in a module that are not node structs will be passed through
unmodified to the generated output file. An item can be annotated with a
`#[build]` attribute to put it in the corresponding build module,
otherwise it will go in the corresponding packed module.

## Design notes

### Why have two structs for each node type?

Having two structs for each node type, a packed struct and a builder
struct, is motivated primarily by DST nodes. Many nodes end in a
dynamically-sized slice, which prevents the normal struct construction
syntax from being used. One option would be to generate a construction
function that takes an argument for each field, but that can negatively
impact readability since there's no named-argument syntax. Having a
separate builder struct allows us to use the normal struct construction
syntax. DST fields in the builder are replaced with slice references.

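The readability argument can be seen in a small sketch of a vendor
node (hardware type 0x01, sub-type 0x04: a 16-byte GUID followed by
variable-length data). `VendorBuilder` and `vendor_node` are
illustrative names chosen for this example, and a plain `[u8; 16]`
stands in for a proper GUID type.

```rust
/// Illustrative builder: the trailing `[u8]` of the packed node becomes
/// a `&[u8]` slice reference here.
struct VendorBuilder<'a> {
    vendor_guid: [u8; 16],
    vendor_defined_data: &'a [u8],
}

impl VendorBuilder<'_> {
    /// Write header, GUID, and data into a freshly allocated buffer.
    fn build(&self) -> Vec<u8> {
        let length = 4 + self.vendor_guid.len() + self.vendor_defined_data.len();
        let mut out = Vec::with_capacity(length);
        out.extend_from_slice(&[0x01, 0x04]);
        out.extend_from_slice(&(length as u16).to_le_bytes());
        out.extend_from_slice(&self.vendor_guid);
        out.extend_from_slice(self.vendor_defined_data);
        out
    }
}

/// Positional alternative: a call such as `vendor_node(guid, data)` does
/// not label its arguments, which gets harder to read as fields are added.
fn vendor_node(vendor_guid: [u8; 16], vendor_defined_data: &[u8]) -> Vec<u8> {
    VendorBuilder {
        vendor_guid,
        vendor_defined_data,
    }
    .build()
}
```

With the builder, a call site names every field, for example
`VendorBuilder { vendor_guid: guid, vendor_defined_data: &[1, 2, 3] }.build()`.
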
### Why code generation?

With the need for two structs per node type established, the need for
some kind of code generation becomes clear: having to actually write
everything out by hand twice would be a huge pain and bug-prone.

Code generation is also very helpful for all the code to write out
builder nodes into the packed form, and for generating other functions
such as debug and conversion impls.

### Why this type of code generation?

Rust offers a few built-in options for code generation: declarative
macros, proc macros, and `build.rs`.

Declarative macros can get quite hard to read for anything too
complicated. There are a fair number of idiosyncratic node types, so a
declarative macro would almost certainly be quite complicated and
therefore hard to read.

A proc macro, on the other hand, would certainly work for this use
case. It can read arbitrary Rust syntax and produce arbitrary Rust
code. The macro itself is fairly normal Rust code and hence can be quite
readable. However, there are a couple of drawbacks. First, it makes it
harder for the `uefi` crate to ever stop having a hard dependency on
`uefi-macros` in the future. Many crates try to make proc macros
optional to improve compilation time, so it would be nice to keep that
option open. Second, the generated code is invisible without special
compilation flags. For a big complicated macro, that makes it more
challenging to get the code generation correct in the first place, and
also makes it harder to provide good errors to end-user code that uses
the generated items, since the error message will point to the input to
the macro, not the implicit generated output.

Next up there's `build.rs`, which is run automatically as part of the
build and can generate arbitrary output files. Using `build.rs` we could
use [`syn`] and [`quote`] just like a proc macro to specify nodes in a
convenient format and generate code in a real file. That would solve the
"invisible generated code" problem of proc macros, but it has the same
compilation-time drawbacks. It also introduces a new problem: `build.rs`
may not integrate well with non-cargo build systems.

That brings us to the solution actually implemented here, which is to
use [`syn`] and [`quote`] like a proc macro, but to do so "offline" with
an `xtask` command, and store the result in the git repo. This solves
all the previous problems, and the only drawback is that it's possible
to forget to run the command to update the generated code. However, a CI
job verifies that the generated code is up to date so such mistakes
won't make it into `main`.

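One way such an up-to-date check could work is to regenerate the code
in memory and compare it with the checked-in file. The sketch below
assumes that approach; `Freshness`, `compare`, and `check_file` are
hypothetical names, and the real CI job may be implemented differently.

```rust
use std::{fs, io, path::Path};

/// Outcome of comparing freshly generated code with the checked-in file.
#[derive(Debug, PartialEq)]
enum Freshness {
    UpToDate,
    Stale,
}

/// Compare the checked-in contents with the generator's current output.
fn compare(checked_in: &str, generated: &str) -> Freshness {
    if checked_in == generated {
        Freshness::UpToDate
    } else {
        Freshness::Stale
    }
}

/// Read the checked-in file from disk and compare it with `generated`.
fn check_file(path: &Path, generated: &str) -> io::Result<Freshness> {
    let checked_in = fs::read_to_string(path)?;
    Ok(compare(&checked_in, generated))
}
```

A CI step built on this idea would fail the job whenever the result is
`Freshness::Stale`.
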
[`quote`]: https://docs.rs/quote
[`spec.rs`]: ./spec.rs
[`syn`]: https://docs.rs/syn
[`src/proto/device_path/device_path_gen.rs`]: ../../../src/proto/device_path/device_path_gen.rs
[`DeviceType`]: https://docs.rs/uefi/latest/uefi/proto/device_path/struct.DeviceType.html
[`DeviceSubType`]: https://docs.rs/uefi/latest/uefi/proto/device_path/struct.DeviceSubType.html
