Skip to content

"too many SQL variables" Error with pandas 0.23 - enable multivalues insert #19664 issue #21103

Closed
@tripkane

Description

@tripkane
#let's import packages to use
import numpy as np
import pandas as pd
from sqlalchemy import create_engine


# #use pandas to import data
df = pd.DataFrame(np.arange(0,20000,1))
#create the engine to connect pandas with sqlite3
engine = create_engine('sqlite://')
#create connection
conn = engine.connect()
#convert df to sql table
df.to_sql('test',engine, if_exists='replace',chunksize=1000)
#print results
result = conn.execute("select * from test")
for row in result:
    print(row['index'])
conn.close()

Problem description

In pandas 0.22 I could write a dataframe to sql of reasonable size without error. Now I receive this error "OperationalError: (sqlite3.OperationalError) too many SQL variables". I am converting a dataframe with ~20k+ rows to sql. After looking around I suspect the problem lies in the limit set by sqlite3: SQLITE_MAX_VARIABLE_NUMBER which is set to 999 by default (based on their docs). This can apparently be changed by recompiling sqlite and adjusting this variable accordingly. I can confirm that for a df of length (rows) 499 this works. I can also confirm that this test version works with a row length of 20k and a chunksize of 499 inputted with df_to_sql works. In my real case the limit is 76. These numbers are clearly dependent on data size of each row so a method is required to estimate this based on data type and number of columns. #19664

Expected #Output

runfile('H:/Tests/Pandas_0.23_test.py', wdir='H:/Tests')
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

####Trace###########

runfile('H:/Tests/Pandas_0.23_test.py', wdir='H:/Tests')
Traceback (most recent call last):

File "", line 1, in
runfile('H:/Tests/Pandas_0.23_test.py', wdir='H:/Tests')

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "H:/Tests/Pandas_0.23_test.py", line 19, in
df.to_sql('test',engine, if_exists='fail',chunksize=500)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 2127, in to_sql
dtype=dtype)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py", line 450, in to_sql
chunksize=chunksize, dtype=dtype)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py", line 1149, in to_sql
table.insert(chunksize)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py", line 663, in insert
self._execute_insert(conn, keys, chunk_iter)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py", line 638, in _execute_insert
conn.execute(*self.insert_statement(data, conn))

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 948, in execute
return meth(self, multiparams, params)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\sql\elements.py", line 269, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1060, in _execute_clauseelement
compiled_sql, distilled_params

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1200, in _execute_context
context)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1413, in _handle_dbapi_exception
exc_info

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\util\compat.py", line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\util\compat.py", line 186, in reraise
raise value.with_traceback(tb)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py", line 1193, in _execute_context
context)

File "C:\Users\kane.hill\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 507, in do_execute
cursor.execute(statement, parameters)

OperationalError: (sqlite3.OperationalError) too many SQL variables [SQL: 'INSERT INTO test ("index", "0") VALUES (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?, ?), (?,

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    IO SQLto_sql, read_sql, read_sql_queryRegressionFunctionality that used to work in a prior pandas version

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions