Description
Part of #1557
While looking at the flamegraph posted to that ticket, one obvious source of improvement is to stop copying each token so much.
Functions like peek_token()
, and expect_token()
copy the Token
(which often includes a string)
impl Parser {
// returns an OWNED TokenWithLocation
pub fn peek_token(&self) -> TokenWithLocation {
...
}
}
In the above code, the non_whitespace.cloned()
call actually copies the token (and string) even when many callsites only need to check what it is and could get by with a &
Suggestions for improvements
The biggest suggestion is to stop copying each token unless it actually needed a clone. For example instead of this
impl Parser {
// returns an OWNED TokenWithLocation
pub fn peek_token(&self) -> TokenWithLocation {
...
}
}
Make it like this
impl Parser {
// returns an reference to TokenWithLocation
// (the & means no copying!)
pub fn peek_token_ref(&self) -> &TokenWithLocation {
...
}
}
I think we could do this without massive breaking changes by adding new functions like peek_token_ref()
that returned a reference to the token rather than a clone
Ideas
I played around with it a bit and here is what I came up with. I think a similar pattern could be applied to other places
impl Parser {
...
/// Return the first non-whitespace token that has not yet been processed
/// (or None if reached end-of-file) and mark it as processed. OK to call
/// repeatedly after reaching EOF.
pub fn next_token(&mut self) -> TokenWithLocation {
self.next_token_ref().clone()
}
/// Return the first non-whitespace token that has not yet been processed
/// (or None if reached end-of-file) and mark it as processed. OK to call
/// repeatedly after reaching EOF.
pub fn next_token_ref(&mut self) -> &TokenWithLocation {
loop {
self.index += 1;
// skip whitespace
if let Some(TokenWithLocation {
token: Token::Whitespace(_),
location: _,
}) = self.tokens.get(self.index - 1) {
continue
}
break;
}
if (self.index - 1) < self.tokens.len() {
&self.tokens[self.index - 1]
}
else {
eof_token()
}
}
...
}
static EOF_TOKEN: OnceLock<TokenWithLocation> = OnceLock::new();
fn eof_token() -> &'static TokenWithLocation {
EOF_TOKEN.get_or_init(|| {
TokenWithLocation::wrap(Token::EOF)
})
}