Skip to content

Add std::io::Seek instance for std::io::Take<T> when T: Seek #555

Closed
rust-lang/rust
#138023
@melrief

Description

@melrief

Proposal

Problem statement

std::io::Read::take currently returns a new instance of Read but not Seek regardless of whether self implements Seek or not. This results in code that may work with self but not with the instance returned by .take because the former implements Seek while the latter doesn't. Moreover there is not reason for this to be the case; Take is just a wrapper around self that can read at most limit bytes. If self implements Seek then Take can also naturally implement Seek.

This proposal is to add a Seek instance for Take.

Motivating examples or use cases

For example, binary files parsers could benefits from this feature. Let's say that we have file format that is a list of records where each record is composed of the length of the record followed by the bytes of the record. It could be useful to read the length of the record and then use .take to delimit the bytes that the record parser should process. This removes the need to deal with trailing bytes from the record parser resulting in simpler code.

Now let's say that the parser needs to skip certain parts of the record because e.g. obsolete or not needed. seek is a natural fit for this task but currently Take doesn't support it even if File supports seek. The parser would have to read the bytes that it wants to skip and write them to a noop sink.

There is also a reusability reason: any generic code that works with bytes sequences should work with the result of .take too. For instance, let's write the interface of the parser above in a generic way:

fn record_parser<I: Read + Seek>(i: I) -> Result<Record> { ... }

this function should work for Take if it wraps an instance of Seek.

Solution sketch

seek can be expensive and therefore a requirement for the solution is that a call to Take.seek results in a single call to seek on the inner object of Take.

Take currently keeps track only of the remaining limit bytes that can be read from the inner object. This info alone is not enough to implement seek because e.g. we don't know when the Take "started" so seek(SeekFrom::Start) cannot be computed.
This solution proposes to keep track of the initial limit of the Take in a new field called len. This can then be used to compute the position of the cursor inside the Takeas len - limit. The reason to not track position directly is to avoid adding one more mutating variable that changes together with limit. len is constant (unless set_limit is called).

Once we know the position of the cursor inside Take, we can compute the offset from the current position based on the argument of seek. If the argument is SeekFrom::Start then we seek from -position, if the argument is SeekFrom::End then we seek from position+limit, if the argument is SeekFrom::Current then we seek from position.

API

Add a position method that returns the position within the Take

impl Take {
    fn position(&self) -> u64 {
       self.len - self.limit
    }
}

Add the Seekimplementation for Take

impl<T: Seek> Seek for Take<T> {
  ...
}

Alternatives

The solution proposed in 97227 keeps track of the position instead of the original len. It is equivalent to the solution proposed here. The reason why I prefer the above solution is because it's simpler to maintain given that len never changes while cursor in 97227 must be updated together with limit leading to code like:

        self.cursor += amt as u64;
        self.limit -= amt as u64;

Links and related work

  1. 97227
  2. My implementation of the solution proposed here PR #138023

Metadata

Metadata

Assignees

No one assigned

    Labels

    ACP-acceptedAPI Change Proposal is accepted (seconded with no objections)T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions