ACP: add slice::split_once #102

Closed
@olivia-fl

Proposal

Add slice::split_once and slice::rsplit_once methods, analogous to the existing str::split_once and str::rsplit_once methods.

Problem statement

When doing ad-hoc parsing of a format that isn't guaranteed to be valid Unicode, it's often useful to split a byte slice on the first occurrence of a specific delimiter. There isn't currently an API that expresses this directly for byte slices, although there is one for strings.

Motivation, use-cases

There are some examples in the aprs-parser-rs crate, which is being refactored from treating its input data as strings to using byte slices. APRS packets consist of a header and a body, separated by a b':' byte. This is currently being parsed like this:

let header_delimiter = s
    .iter()
    .position(|x| *x == b':')
    .ok_or_else(|| AprsError::InvalidPacket(s.to_owned()))?;
let (header, rest) = s.split_at(header_delimiter);
let body = &rest[1..];

Solution sketches

There are currently two ways to do this in stable Rust.

Using position to find the delimiter's index, then splitting at that index and explicitly discarding the first byte of the second slice (which is the delimiter):

let v = b"first:second";
let split_index = v.iter().position(|&x| x == b':')?;
let (first, second) = v.split_at(split_index);
let second = &second[1..];

Using splitn:

let v = b"first:second";
let mut split = v.splitn(2, |&x| x == b':');
let first = split.next()?;
let second = split.next()?;

These options are okay, but not great. They're both relatively verbose and don't express the actual intention very directly. They also have the issue that mistakes aren't necessarily going to show up in the type system: forgetting to skip the delimiter byte, for example, still compiles.
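To make the last point concrete, here is a minimal sketch of the position/split_at approach written twice. The function names are hypothetical and chosen only for illustration. Both versions have the same signature and compile cleanly, but the first one silently leaves the delimiter at the start of the second half.

```rust
/// Splits at the first `:` but forgets to skip the delimiter.
/// The types are identical to the correct version, so the compiler
/// cannot flag the mistake: the second half still starts with `:`.
fn split_keeping_delimiter(v: &[u8]) -> Option<(&[u8], &[u8])> {
    let split_index = v.iter().position(|&x| x == b':')?;
    Some(v.split_at(split_index))
}

/// The intended behaviour: the delimiter belongs to neither half.
fn split_dropping_delimiter(v: &[u8]) -> Option<(&[u8], &[u8])> {
    let split_index = v.iter().position(|&x| x == b':')?;
    let (first, rest) = v.split_at(split_index);
    // `rest` is non-empty here because `split_index` is a valid index,
    // so slicing from 1 cannot panic.
    Some((first, &rest[1..]))
}
```

A dedicated split_once method would remove the manual index arithmetic where this kind of slip tends to happen.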

With strings, there is currently a split_once method, that handles this exact use case:

let v = "first:second";
let (first, second) = v.split_once(':')?;

A similar method could be added for slices:

pub fn split_once<F>(&self, pred: F) -> Option<(&[T], &[T])>
    where F: FnMut(&T) -> bool
{
    let index = self.iter().position(pred)?;
    Some((&self[..index], &self[index+1..]))
}
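For experimenting on stable Rust, the sketch above can be written as a free function. This is only an illustration: the free-function form and the header_and_body helper are assumptions made here, not part of the proposal, and the packet layout is simplified to "everything before the first `:` is the header".

```rust
/// Free-function version of the proposed method (hypothetical helper,
/// not a standard library API).
fn split_once<T, F>(slice: &[T], pred: F) -> Option<(&[T], &[T])>
where
    F: FnMut(&T) -> bool,
{
    // Index of the first element matching the predicate, or return None.
    let index = slice.iter().position(pred)?;
    // The matching element itself is excluded from both halves.
    Some((&slice[..index], &slice[index + 1..]))
}

/// Hypothetical use: split a packet into header and body at the first `:`.
fn header_and_body(packet: &[u8]) -> Option<(&[u8], &[u8])> {
    split_once(packet, |&b| b == b':')
}
```

With this helper, a missing delimiter shows up as None, and a delimiter in the last position yields an empty second half rather than a panic.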

An rsplit_once equivalent would be added along with it. I also think it might make sense to add split_once_mut and rsplit_once_mut; however, those don't currently exist for str.
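As a rough sketch of what these variants could look like, here are free-function versions of rsplit_once and split_once_mut. The bodies are assumptions made for illustration (the proposal does not specify them); they rely only on Iterator::rposition and slice::split_at_mut, both of which are stable.

```rust
/// Hypothetical free-function form of `rsplit_once`: splits on the
/// last element matching the predicate.
fn rsplit_once<T, F>(slice: &[T], pred: F) -> Option<(&[T], &[T])>
where
    F: FnMut(&T) -> bool,
{
    // `rposition` searches from the back but returns a front-based index.
    let index = slice.iter().rposition(pred)?;
    Some((&slice[..index], &slice[index + 1..]))
}

/// Hypothetical free-function form of `split_once_mut`: like `split_once`,
/// but both halves are returned as mutable slices.
fn split_once_mut<T, F>(slice: &mut [T], pred: F) -> Option<(&mut [T], &mut [T])>
where
    F: FnMut(&T) -> bool,
{
    let index = slice.iter().position(pred)?;
    // `split_at_mut` yields two non-overlapping mutable halves.
    let (left, right) = slice.split_at_mut(index);
    // `right` starts with the matching element, so drop it.
    Some((left, &mut right[1..]))
}
```

The mutable variant needs split_at_mut because two overlapping mutable borrows of the same slice would be rejected by the borrow checker.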

Links and related work

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed, the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

Metadata

Labels

ACP-accepted: API Change Proposal is accepted (seconded with no objections)
T-libs-api
api-change-proposal: A proposal to add or alter unstable APIs in the standard libraries
