Description
Problem
In pd.read_sql and pd.read_sql_table, when the chunksize parameter is set, pandas builds each chunk's DataFrame with dtypes inferred from the data in that chunk alone. This is a problem when an INTEGER column contains null values in some chunks but not in others: the same column comes back as int64 in some chunks and as float64 in others. A similar problem occurs with strings.
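A minimal sketch reproducing the drift with an in-memory SQLite database (the table and column names are made up for illustration):

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, score INTEGER)")
# The first two rows contain no NULLs, the next two do, so with
# chunksize=2 the two chunks infer different dtypes for the same column.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 10), (2, 20), (3, None), (4, 40)],
)

for chunk in pd.read_sql("SELECT * FROM events", conn, chunksize=2):
    print(chunk.dtypes["score"])  # int64 for the first chunk, float64 for the second
```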
In ETL processes, or simply when dumping large queries to disk in HDF5 format, the user currently bears the burden of explicitly handling the type conversions of potentially many columns.
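Continuing the toy SQLite setup above, the manual workaround looks roughly like this (assuming PyTables is installed for HDFStore, and assuming the user already knows the correct target dtypes):

```python
import pandas as pd

forced_dtypes = {"score": "float64"}  # dtypes the user must know in advance

with pd.HDFStore("dump.h5") as store:
    for chunk in pd.read_sql("SELECT * FROM events", conn, chunksize=2):
        # Without the explicit cast, appending chunks with mismatched
        # dtypes to the same HDF5 table fails; casting every chunk
        # keeps the column dtypes stable.
        store.append("events", chunk.astype(forced_dtypes))
```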
Solution?
Instead of guessing the type from a subset of the data, it should be possible to obtain the type information from the database and map it to the appropriate dtypes.
When querying a full table, column information can be obtained from SQLAlchemy by inspecting the table's metadata, but I was unsuccessful in finding a way to do the same for a general query.
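For reference, this is roughly what the full-table case looks like via SQLAlchemy reflection (a sketch against a toy in-memory database; there is no equivalent for an arbitrary query):

```python
from sqlalchemy import MetaData, Table, create_engine, text

engine = create_engine("sqlite://")  # toy in-memory database
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE events (id INTEGER, score INTEGER NOT NULL)"))

# Reflection exposes each column's SQL type and nullability.
events = Table("events", MetaData(), autoload_with=engine)
for col in events.columns:
    print(col.name, col.type, "nullable=%s" % col.nullable)
```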
Although I am unaware of all the possible type problems that can arise, the DB-API specification does include a null_ok field in cursor.description that indicates whether each result column is nullable (though drivers are allowed to leave it as None).
Pandas could (optionally) use this information to always interpret nullable numeric columns as floats and string columns as object.
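A sketch of what that mapping could look like. The helper below is hypothetical, and it assumes a DB-API driver that actually populates null_ok (per PEP 249 it is the seventh item of each description tuple, but drivers such as sqlite3 and psycopg2 leave it as None); determining which columns are numeric is also driver-specific, so the sketch takes that set as an input:

```python
import numpy as np

def nullable_safe_dtypes(cursor, numeric_columns):
    """Hypothetical helper: map every nullable numeric result column to
    float64 so NULLs never flip the dtype between chunks; map everything
    else to object."""
    dtypes = {}
    for desc in cursor.description:
        name, null_ok = desc[0], desc[6]
        if name in numeric_columns:
            # Treat "unknown" (None) the same as nullable: float64 is
            # the safe choice because it can represent NULL as NaN.
            dtypes[name] = np.int64 if null_ok is False else np.float64
        else:
            dtypes[name] = object
    return dtypes
```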