ENH: Add argument "multiprocessing" to pd.read_csv() method

#### Is your feature request related to a problem?

Calling pd.read_csv() does not take advantage of the multiprocessing module, which makes reading multiple datasets inefficient, especially when several cores are available.

Other libraries, such as modin and dask, already implement this, but I think pandas should support it itself when requested.

#### Describe the solution you'd like

As an initial enhancement, it should work with the multiprocessing module out of the box. Other backends, such as joblib, could be supported in the future.

A list of filenames should be passed, for example:

`pd.read_csv(list_of_filenames, multiprocessing=True)`


`pd.read_csv(glob.glob('table_*.csv'), multiprocessing=True)`
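For discussion, here is a minimal sketch of how the proposed behavior could be emulated today outside of pandas. The helper name `read_csv_many` and its `processes` parameter are hypothetical and only for illustration; they are not part of pandas, and this is not necessarily how the proof-of-concept works.

```python
import functools
from multiprocessing import Pool

import pandas as pd


def read_csv_many(paths, multiprocessing=None, processes=None, **kwargs):
    """Hypothetical helper (not a pandas API): read several CSV files
    and concatenate them into one DataFrame.

    `multiprocessing` mirrors the flag proposed in this issue.
    `processes` is an extra assumption that sets the pool size.
    """
    paths = list(paths)
    # Forward any extra keyword arguments to pd.read_csv for every file.
    reader = functools.partial(pd.read_csv, **kwargs)
    if multiprocessing and len(paths) > 1:
        # One file per task; Pool.map returns results in input order.
        with Pool(processes=processes) as pool:
            frames = pool.map(reader, paths)
    else:
        # Default path: plain sequential reads, same as calling
        # pd.read_csv in a loop.
        frames = [reader(path) for path in paths]
    # Each file keeps its own index, so index labels repeat across files.
    return pd.concat(frames)
```

It would be called as `read_csv_many(glob.glob('table_*.csv'), multiprocessing=True)`. On platforms that start worker processes with spawn (for example Windows), that call needs to sit under an `if __name__ == "__main__":` guard.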

#### API breaking implications

This should not change established behavior, since the "multiprocessing" argument would default to "None".

The total memory consumption should be the same; the memory is just consumed much faster.

With this option, each file keeps its own index, but the files are likely to be combined in a different order. The user can call reset_index() afterwards if needed.
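As a small illustration of the index behavior, the sketch below uses two in-memory frames as stand-ins for two CSV files (the column name `value` is made up) and shows what reset_index() does after they are combined out of order.

```python
import pandas as pd

# Two small frames standing in for the contents of two CSV files.
first = pd.DataFrame({"value": [10, 20]})
second = pd.DataFrame({"value": [30, 40]})

# Suppose the files were combined in a different order than expected.
combined = pd.concat([second, first])
print(combined.index.tolist())    # index labels repeat: [0, 1, 0, 1]

# reset_index(drop=True) discards the per-file labels and renumbers rows.
renumbered = combined.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Passing `ignore_index=True` to `pd.concat` gives the same renumbering in one step.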


#### Describe alternatives you've considered

[this should provide a description of any alternative solutions or features you've considered]

#### Additional context

I have also considered extra backend options for future enhancements of this implementation, such as joblib, ray, and dask.

NOTE: I already have a proof-of-concept for the solution, so I can work on it a bit further, commit it, and open a pull request.


ENH: Add argument "multiprocessing" to pd.read_csv() method #37955
