Support reading random rows in `read_csv` #14285

It is very common to read random rows from a large CSV file, typically to test with a small dataset or to stay within memory limits. The `nrows` parameter reads the first n lines, but I couldn't find any feature for reading random lines. Such a parameter might be named `keeprows` (the opposite of `skiprows`) and could support:
- int, e.g. `keeprows=100` means keep 100 random lines (sampled uniformly)
- float in (0, 1), e.g. `keeprows=0.05` means keep 5% of all lines
- list of int (or iterable), e.g. `keeprows=[1, 3, 8]` means keep lines 1, 3, and 8
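
To make the three cases concrete, here is a rough sketch of how such a parameter could be resolved into row indices. This is not existing pandas code: `resolve_keeprows`, `total_rows`, and `seed` are names I made up for illustration, and I'm assuming the number of data rows is already known (which in practice would need a first pass over the file) and that rows are numbered from 0.

```python
import random


def resolve_keeprows(keeprows, total_rows, seed=None):
    """Return sorted 0-based data-row indices for a hypothetical `keeprows`.

    Illustrative only: `keeprows` is the proposed parameter, not a real
    `read_csv` argument, and `total_rows` is assumed to be known up front.
    """
    rng = random.Random(seed)

    # bool is a subclass of int, so reject it explicitly.
    if isinstance(keeprows, bool):
        raise TypeError("keeprows must be an int, a float, or an iterable of ints")

    if isinstance(keeprows, int):
        # int: keep exactly that many rows, sampled uniformly without replacement.
        if not 0 <= keeprows <= total_rows:
            raise ValueError("keeprows must be between 0 and total_rows")
        return sorted(rng.sample(range(total_rows), keeprows))

    if isinstance(keeprows, float):
        # float in (0, 1): keep that fraction of all rows (rounded to a count).
        if not 0 < keeprows < 1:
            raise ValueError("a float keeprows must lie strictly between 0 and 1")
        count = round(keeprows * total_rows)
        return sorted(rng.sample(range(total_rows), count))

    # Anything else is treated as an iterable of explicit row indices.
    rows = sorted(set(keeprows))
    if rows and (rows[0] < 0 or rows[-1] >= total_rows):
        raise ValueError("row index out of range")
    return rows


if __name__ == "__main__":
    print(len(resolve_keeprows(100, 1000, seed=0)))   # int case
    print(len(resolve_keeprows(0.05, 1000, seed=0)))  # float case
    print(resolve_keeprows([1, 3, 8], 10))            # explicit list case
```

In pandas versions where `skiprows` accepts a callable, the result could probably be used as a workaround today: put the indices in a set `keep` and pass something like `skiprows=lambda i: i > 0 and (i - 1) not in keep`, assuming a single header line at file line 0.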

