Skip to content

WIP: ~15% Faster str to int parsing (when base 10) #83088

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 74 additions & 29 deletions library/core/src/num/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -814,6 +814,9 @@ macro_rules! doit {
}
doit! { i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize }

const MULTIPLIER: &[u32] =
&[1_000_000_000, 100_000_000, 10_000_000, 1_000_000, 100_000, 10_000, 1_000, 100, 10, 1];

fn from_str_radix<T: FromStrRadixHelper>(src: &str, radix: u32) -> Result<T, ParseIntError> {
use self::IntErrorKind::*;
use self::ParseIntError as PIE;
Expand Down Expand Up @@ -846,37 +849,79 @@ fn from_str_radix<T: FromStrRadixHelper>(src: &str, radix: u32) -> Result<T, Par
};

let mut result = T::from_u32(0);
if is_positive {
// The number is positive
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
result = match result.checked_mul(radix) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
result = match result.checked_add(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
if mem::size_of::<T>() <= mem::size_of::<u32>() && radix == 10 {
// The ALU can reorder these adds and do more in parallel
// as each mul isn't dependent on the previous answer.
let factors = &MULTIPLIER[MULTIPLIER.len() - digits.len()..];
let mut idx = 0;
if is_positive {
// The number is positive
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
Comment on lines +860 to +863
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These matches can all be written a bit shorter like this:

Suggested change
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
let x = (c as char).to_digit(radix).ok_or(PIE { kind: InvalidDigit })?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall we used to do these but in performance sensitive parts using ok_or is slower than match counterparts, from what I recall.

Copy link
Contributor

@LingMan LingMan Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? The optimizer should have all information it needs and in my quick tests the generated assembly is identical. Is there an open issue or do you have a test case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's what I recall but not sure if the same goes for this case, if the assembly is the same then I guess it's good to have it.

let x = match factors[idx].checked_mul(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
result = match result.checked_add(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
idx += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to manage an index manually if you loop over the factors directly:

for (&c, &factor) in digits.zip(factors) { ... }

Assuming zip doesn't inhibit any optimizations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both zip and enumerate inhibited optimisations - I like that euphamism.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen some weird interactions between zip and rev, which currently(?) cause suboptimal assembly, but never with enumerate. Is this another case of missing -Copt-level=3 on godbolt?

PS: No euphemism intended.

}
} else {
// The number is negative
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
let x = match factors[idx].checked_mul(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
result = match result.checked_sub(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
idx += 1;
}
Comment on lines +857 to +890
Copy link
Contributor

@pickfire pickfire Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to have so many duplicated code here? Also, why not put the extra condition branch in the error flow rather the default flow?

Maybe we can use iterator for idx here?

Suggested change
if is_positive {
// The number is positive
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
let x = match factors[idx].checked_mul(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
result = match result.checked_add(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
idx += 1;
}
} else {
// The number is negative
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
let x = match factors[idx].checked_mul(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
result = match result.checked_sub(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
idx += 1;
}
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
let x = match factors[idx].checked_mul(x) {
Some(result) => result,
None => return Err(PIE { kind: if is_positive { PosOverflow } else { NegOverflow } }),
};
result = match result.checked_add(x) {
Some(result) => result,
None => return Err(PIE { kind: if is_positive { PosOverflow } else { NegOverflow } }),
};
idx += 1;

Copy link
Contributor Author

@gilescope gilescope Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you missed there was a checked_sub in there? Having an if in the loop to choose between checked_add and checked_sub kills performance - I can attest to that! I did wonder if we could do better too. You can't add and then negate as the negative ranges are slightly bigger. So you could always subtract and then !+1 to make positive, which would work great for signed types (with a check to see if val == T::Min) but I couldn't get my head around what would happen if I was doing subtractions to unsigned types - if I have to do unsigned types a different way then I've not made anything any simpler. So I can't think of a way to simplify the code, but am willing to try anything if you have any other ideas.

Copy link
Contributor

@pickfire pickfire Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I did missed that.

But what if we put the if branch only for checked_add and checked_sub but keep the to_digit and checked_mul?

Another idea, can you also try using if it's negative after the for loop to do checked_neg there? Remove the for loop and add a branch to convert the positive number to negative using checked_neg and return NegOverflow if there are issues so we don't need another loop for negative, a single branch.

While trying that, maybe can also try a hand crafted negation instead of checked_add? Like this

    for &c in digits {
        ... // checked_add ...
    }
    if !is_positive {
        const UNSIGNED_BITS: i8 = (i8::MAX as u8 * 2) as i8 + 1; // don't just use i8, can use whatever type it is
        result = (i ^ UNSIGNED_BITS).checked_add(1);
    }

One more thing, I don't think we need the whole bunch of negative calculation in unsigned code do we?

If would be better if you have a repo to try out stuff on the benchmarking, like the one I did https://github.com/pickfire/parseint so that I can also try to help out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For i8 if we're parsing "-128" we run into representation trouble if we start positive and then negate, that's why one would want to do subtractions and then neg. But we'd need a different code path for unsigned...

I don't think we need the whole bunch of negative calculation in unsigned code do we?

Hmm, we could do if is_positive || !is_signed_ty { rather than if is_positive - that way the compiler should remove the else block for unsigned types (I don't think currently the compiler can realise that is_positive is always true for unsigned types, but it's pretty crafty so it might have figured that out anyhows).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fork your parseint and add some benchmarks. (The trick is very cool you've got there btw.)

I need to double check my benchmarks anyhow because I reaslised that my blackbox was not quite in the right place and the compiler was optimising for the data (rookie mistake!).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you link your godbolt experiments?

Copy link
Contributor Author

@gilescope gilescope Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in the godbolt is an exploration of expanding to larger type sizes than u32 but it's slow due to an if in the loop. Nevertheless it demostrates the is_signed_ty evaluation nicely.

Sure, on the left it's a const that evaluates to false and on the right it's an expression. https://godbolt.org/z/od3ajTh3v
Interestingly on the left side if I replace
const is_signed_ty: bool = false; with
let is_signed_ty: bool = false;
it can't tell that it should drop the else expression which is surprising. I would have thought it should have proporgated the constant into the is_signed_ty given it was assigned by a constant and is not mutable.

Copy link
Contributor

@LingMan LingMan Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That godbolt link shows unoptimized assembly. You forgot to set -Copt-level=3. With that the assembly is identical between the two.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or -O

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pickfire I took a stab at building in all the safeguards to your trick as an alternative route for radix 10. Have submitted a PR for us to talk around: pickfire/parseint#2

}
} else {
// The number is negative
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
result = match result.checked_mul(radix) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
result = match result.checked_sub(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
if is_positive {
// The number is positive
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
result = match result.checked_mul(radix) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
result = match result.checked_add(x) {
Some(result) => result,
None => return Err(PIE { kind: PosOverflow }),
};
}
} else {
// The number is negative
for &c in digits {
let x = match (c as char).to_digit(radix) {
Some(x) => x,
None => return Err(PIE { kind: InvalidDigit }),
};
result = match result.checked_mul(radix) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
result = match result.checked_sub(x) {
Some(result) => result,
None => return Err(PIE { kind: NegOverflow }),
};
}
}
}
Ok(result)
Expand Down