modularize the config module bootstrap #141272
This PR modifies If appropriate, please update
6a1e868 to b8374d6
I like the general direction of this. Could you perhaps split the changes into multiple commits though? It would be easier to review. Thank you :)
… with the logic for deserializing and merging them
…level methods that interact across different config sections
b8374d6 to ad21211
Kind of crazy just how much code there is in config definition and parsing. The changes LGTM, but since this is a large-ish change, I'd also like to hear from other members of t-bootstrap whether they are OK with it. CC @onur-ozkan @jieyouxu.
/// Since we use `#[serde(deny_unknown_fields)]` on `TomlConfig`, we need a wrapper type
/// for the "change-id" field to parse it even if other fields are invalid. This ensures
/// that if deserialization fails due to other fields, we can still provide the changelogs
/// to allow developers to potentially find the reason for the failure in the logs.
#[derive(Deserialize, Default)]
pub(crate) struct ChangeIdWrapper {
We have a `types` module, but this module also includes a bunch of types. Right now there seems to be no clear distinction between these modules, and people will start putting things all over the place without a clear idea of where they belong. If we want to have a module per task, we have to make that super explicit and have it well documented.
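For readers skimming the thread, the idea in the `ChangeIdWrapper` doc comment can be sketched without `serde`: a strict pass rejects the whole file on the first unknown key, while a separate lenient pass extracts only `change-id`. This is a minimal std-only illustration, not the bootstrap implementation; `KNOWN_KEYS`, `parse_strict`, `parse_change_id_only`, and the flat `key = value` input format are all hypothetical.

```rust
/// Keys the strict parser accepts (hypothetical subset, for illustration only).
const KNOWN_KEYS: [&str; 2] = ["change-id", "profile"];

/// Strict pass: fails on the first unknown key, in the spirit of
/// `#[serde(deny_unknown_fields)]`.
fn parse_strict(input: &str) -> Result<Vec<(String, String)>, String> {
    let mut fields = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("malformed line: {line}"))?;
        let key = key.trim();
        if !KNOWN_KEYS.contains(&key) {
            return Err(format!("unknown field `{key}`"));
        }
        fields.push((key.to_string(), value.trim().to_string()));
    }
    Ok(fields)
}

/// Lenient pass: looks only at `change-id` and ignores everything else,
/// so it can still succeed when the strict pass rejects the file.
fn parse_change_id_only(input: &str) -> Option<u64> {
    input.lines().find_map(|line| {
        let (key, value) = line.split_once('=')?;
        if key.trim() == "change-id" {
            value.trim().parse().ok()
        } else {
            None
        }
    })
}
```

With an input that contains a misspelled key, `parse_strict` returns an error while `parse_change_id_only` still recovers the id, which is what lets the changelog hint be shown even when the full parse fails.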
// NOTE: can't derive(Deserialize) because the intermediate trip through toml::Value only
// deserializes i64, and derive() only generates visit_u64
impl<'de> Deserialize<'de> for DebuginfoLevel {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        use serde::de::Error;

        Ok(match Deserialize::deserialize(deserializer)? {
            StringOrInt::String(s) if s == "none" => DebuginfoLevel::None,
            StringOrInt::Int(0) => DebuginfoLevel::None,
            StringOrInt::String(s) if s == "line-directives-only" => {
                DebuginfoLevel::LineDirectivesOnly
            }
            StringOrInt::String(s) if s == "line-tables-only" => DebuginfoLevel::LineTablesOnly,
            StringOrInt::String(s) if s == "limited" => DebuginfoLevel::Limited,
            StringOrInt::Int(1) => DebuginfoLevel::Limited,
            StringOrInt::String(s) if s == "full" => DebuginfoLevel::Full,
            StringOrInt::Int(2) => DebuginfoLevel::Full,
            StringOrInt::Int(n) => {
                let other = serde::de::Unexpected::Signed(n);
                return Err(D::Error::invalid_value(other, &"expected 0, 1, or 2"));
            }
            StringOrInt::String(s) => {
                let other = serde::de::Unexpected::Str(&s);
                return Err(D::Error::invalid_value(
                    other,
                    &"expected none, line-tables-only, limited, or full",
                ));
            }
        })
    }
}
This is parsing logic.
@rustbot author
Reminder, once the PR becomes ready for a review, use
cc @clubby789
@rustbot review
Vibes-wise, organization seems fine to me.
I still have this concern (also see this as an additional ref). With the current approach, adding new things will always be unclear for some people. Even now, for example, we have What I would suggest instead is organizing the modules based on the configuration field. For example, With this structure it would make it very explicit where things belong and anything that doesn't fit this rule can be easily caught during PR reviews.
…uctures

This commit introduces a new `toml` submodule within the `src/bootstrap/config` directory, fundamentally restructuring how `bootstrap.toml`'s on-disk representation is handled.

Key changes introduced in this commit:

1. Dedicated `toml` Module: A new `src/bootstrap/config/toml` directory is created. This module is now solely responsible for defining the Rust types that directly mirror the structure of the `bootstrap.toml` file. These types serve as the initial layer for `serde` deserialization before further processing and merging into the final `Config`.

2. Field-Based Segregation within `toml`: To further enhance clarity and ownership, the raw TOML configuration types are segregated into individual files corresponding to the top-level sections of `bootstrap.toml`:
   * `build.rs`: For `[build]` related TOML structures.
   * `change_id.rs`: For the `change-id` field.
   * `dist.rs`: For `[dist]` options.
   * `gcc.rs`: For `[gcc]` options.
   * `install.rs`: For `[install]` options.
   * `llvm.rs`: For `[llvm]` options.
   * `rust.rs`: For `[rust]` options.
   * `target.rs`: For `[target.<triple>]` options.

3. Encapsulated Core Utilities: The `Merge` trait (now defined in `toml/merge.rs`) and the `define_config!` macro (in `toml/macros.rs`) have been moved into dedicated modules within the `toml` subdirectory. As these utilities are integral to how the raw TOML structures are processed (deserialized, merged, and constructed), centralizing them here enhances the self-contained nature of the `toml` module. A `common.rs` is also introduced within `toml` for types shared among its sub-modules.
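As a rough illustration of why a merge utility sits next to the raw TOML types, here is a minimal std-only sketch of layering two partially filled sections. The trait below is a simplified stand-in and should not be read as the signature of the `Merge` trait in `toml/merge.rs`; `LlvmSection` and its fields are hypothetical.

```rust
/// Simplified stand-in for a merge trait (assumption: the real trait's
/// signature and replacement rules may differ).
trait Merge {
    /// Fill any unset field of `self` from `other`; values already set win.
    fn merge(&mut self, other: Self);
}

/// Hypothetical raw section: every field is optional because the user may
/// omit it from the file.
#[derive(Debug, Default, PartialEq)]
struct LlvmSection {
    assertions: Option<bool>,
    targets: Option<String>,
}

impl Merge for LlvmSection {
    fn merge(&mut self, other: Self) {
        // `Option::or` keeps the existing value and falls back to `other`.
        self.assertions = self.assertions.take().or(other.assertions);
        self.targets = self.targets.take().or(other.targets);
    }
}
```

Merging a user-provided section with a defaults section this way keeps the user's explicit choices and only fills the gaps, which is one plausible policy among several.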
This commit moves the `Config` struct's methods responsible for reading and deserializing `bootstrap.toml` files into `TomlConfig` instances to `src/bootstrap/config/toml/mod.rs`. Specifically, `Config::get_builder_toml`, `Config::get_toml`, and `Config::get_toml_inner` are now located within the `toml` submodule.
Moves `check_incompatible_options_for_ci_llvm` to `src/bootstrap/config/toml/llvm.rs` and `check_incompatible_options_for_ci_rustc` to `src/bootstrap/config/toml/rust.rs`. The generic `set` and `threads_from_config` helper functions are moved to `common.rs`.
…t.rs respectively
- With this, all parsing and deserialization logic has now moved to `toml`.
2dc25e6 to 877aa18
Currently, our `config` module is quite large, at over 3,000 lines, and handles a wide range of responsibilities. This PR aims to break it down into smaller, more focused submodules to improve readability and maintainability:

- `toml`: Introduces a dedicated `toml` submodule within the `config` module. Its sole purpose is to define configuration-related structs along with their corresponding deserialization logic. It also contains the `parse_inner` method, which serves as the central function for extracting relevant information from the TOML structs and constructing the final configuration.
- `rust`, `dist`, `install`, `llvm`, `build`, `gcc`, and others: Each of these modules contains TOML subsections specific to their domain, along with the logic necessary to convert them into parts of the final configuration struct.
- `common`: Contains shared types and enums used across multiple TOML subsections.
- `parsing`: Houses the logic that integrates all the TOML subsections into the complete configuration struct.

r? @Kobzol
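The layout described above can be pictured with a small std-only sketch: one submodule per TOML section holding the raw, all-optional struct, and a single function that folds the sections into the final configuration. Everything here is hypothetical and heavily simplified (field names, defaults, and the free-function form of `parse_inner`); it only illustrates the shape of the split, not the actual bootstrap code.

```rust
mod toml {
    /// Raw `[build]` section (hypothetical field).
    pub mod build {
        #[derive(Default)]
        pub struct Build {
            pub jobs: Option<u32>,
        }
    }

    /// Raw `[llvm]` section (hypothetical field).
    pub mod llvm {
        #[derive(Default)]
        pub struct Llvm {
            pub assertions: Option<bool>,
        }
    }

    /// Mirrors the on-disk file: every section may be absent.
    #[derive(Default)]
    pub struct TomlConfig {
        pub build: Option<build::Build>,
        pub llvm: Option<llvm::Llvm>,
    }
}

/// Final, fully resolved configuration (hypothetical subset).
#[derive(Debug, PartialEq)]
struct Config {
    jobs: u32,
    llvm_assertions: bool,
}

/// Simplified stand-in for the central conversion step: take the raw
/// sections and resolve every optional field to a concrete value.
fn parse_inner(toml: toml::TomlConfig) -> Config {
    let build = toml.build.unwrap_or_default();
    let llvm = toml.llvm.unwrap_or_default();
    Config {
        jobs: build.jobs.unwrap_or(1),
        llvm_assertions: llvm.assertions.unwrap_or(false),
    }
}
```

Under this shape, a new `[build]` option would touch only the `build` submodule and the corresponding line in the conversion step, which is the kind of locality the field-based organization is aiming for.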