Skip to content

Make NonZero<char> possible #141001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

hkBst
Copy link
Member

@hkBst hkBst commented May 14, 2025

I'd like to use NonZero<char> for representing units of CStr in https://github.com/rust-lang/literal-escaper

@rustbot
Copy link
Collaborator

rustbot commented May 14, 2025

r? @joboet

rustbot has assigned @joboet.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 14, 2025
@hanna-kruppe
Copy link
Contributor

It’s not clear from the diff, but I guess since NonZero<T> works by trait impls on T, this can’t be feature-gated and would immediately allow NonZero<char> on stable?

@jieyouxu
Copy link
Member

r? libs-api

@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 14, 2025
@rustbot rustbot assigned joshtriplett and unassigned joboet May 14, 2025
@hkBst
Copy link
Member Author

hkBst commented May 14, 2025

It’s not clear from the diff, but I guess since NonZero<T> works by trait impls on T, this can’t be feature-gated and would immediately allow NonZero<char> on stable?

That's a good point I hadn't considered. I'll look into whether a feature gate is possible.

@tgross35
Copy link
Contributor

Could you add a test so we don't accidentally regress this? library/coretests/tests/num/niche_types.rs doesn't exist yet but probably would be the place.

@hkBst
Copy link
Member Author

hkBst commented May 15, 2025

Could you add a test so we don't accidentally regress this? library/coretests/tests/num/niche_types.rs doesn't exist yet but probably would be the place.

Do you mean a test that just uses all the possible NonZero-able types as NonZeros? Meaning they are tested for being possible. Or is there another thing you meant?

@tgross35
Copy link
Contributor

I did only mean something to check basic usage of the type, and was only thinking of char since the other types are covered by the NonZeroUxx doctests. It wouldn't be bad to just add the same kind of test for all the types, though.

@hkBst
Copy link
Member Author

hkBst commented May 21, 2025

@tgross35 I added a simple test for NonZero::<char>. Let me know if it is okay, or if you have any more ideas to make it better.

@tgross35 tgross35 added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label May 21, 2025
@dtolnay dtolnay removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 27, 2025
@Amanieu Amanieu removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label May 27, 2025
@Amanieu
Copy link
Member

Amanieu commented May 27, 2025

I'd like to use NonZero<char> for representing units of CStr in rust-lang/literal-escaper

Could give some examples of how this is going to be used in literal-escaper?

@hkBst
Copy link
Member Author

hkBst commented May 27, 2025

Could give some examples of how this is going to be used in literal-escaper?

The characters of a C string are checked for being non-null as a matter of correctness anyway, and it may be a good idea to preserve this in the type.

I have a PR that did this (as a side quest) and the relevant file is here: https://github.com/rust-lang/rust/blob/30822ec0ec25723f36f9e73c42d91a83dc121388/compiler/rustc_lexer/src/unescape.rs

/// Used for mixed utf8 string literals, i.e. those that allow both unicode
/// chars and high bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MixedUnit {
    /// Used for ASCII chars (written directly or via `\x01`..`\x7f` escapes)
    /// and Unicode chars (written directly or via `\u` escapes).
    ///
    /// For example, if '¥' appears in a string it is represented here as
    /// `MixedUnit::Char('¥')`, and it will be appended to the relevant byte
    /// string as the two-byte UTF-8 sequence `[0xc2, 0xa5]`
    Char(NonZero<char>),

    /// Used for high bytes (`\x80`..`\xff`).
    ///
    /// For example, if `\xa5` appears in a string it is represented here as
    /// `MixedUnit::HighByte(0xa5)`, and it will be appended to the relevant
    /// byte string as the single byte `0xa5`.
    HighByte(NonZero<u8>),
}

impl From<NonZero<char>> for MixedUnit {
    fn from(c: NonZero<char>) -> Self {
        MixedUnit::Char(c)
    }
}

impl From<NonZero<u8>> for MixedUnit {
    fn from(byte: NonZero<u8>) -> Self {
        if byte.get().is_ascii() {
            MixedUnit::Char(NonZero::new(byte.get() as char).unwrap())
        } else {
            MixedUnit::HighByte(byte)
        }
    }
}

impl TryFrom<char> for MixedUnit {
    type Error = EscapeError;

    fn try_from(c: char) -> Result<Self, EscapeError> {
        NonZero::new(c).map(MixedUnit::Char).ok_or(EscapeError::NulInCStr)
    }
}

impl TryFrom<u8> for MixedUnit {
    type Error = EscapeError;

    fn try_from(byte: u8) -> Result<Self, EscapeError> {
        NonZero::<u8>::new(byte).map(From::from).ok_or(EscapeError::NulInCStr)
    }
}

@Amanieu
Copy link
Member

Amanieu commented May 27, 2025

Seems reasonable.

@rfcbot merge

@rfcbot
Copy link
Collaborator

rfcbot commented May 27, 2025

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels May 27, 2025
@rfcbot
Copy link
Collaborator

rfcbot commented May 27, 2025

🔔 This is now entering its final comment period, as per the review above. 🔔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants