Closed
Description
When using the read_gbq() function on a BigQuery table, incorrect results are returned.
I compared the output of read_gbq() with a CSV export of the same query taken directly from BigQuery. Interestingly, both outputs have the same number of rows. However, the read_gbq() output contains many duplicate rows.
I'm using pandas '0.13.0rc1-125-g4952858' and NumPy '1.8.0' with Python 2.7 on Mac OS X 10.9.
The code I run to load the data into pandas:
from pandas.io import gbq
churn_data = gbq.read_gbq(train_query, project_id=projectid)
I can't share the underlying data. What additional data/info would be useful for root causing?
The output is ~400k rows.
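Since the data itself can't be shared, one way to describe the problem is with summary counts. Below is a minimal sketch of how the duplication could be quantified. The function name duplicate_report, the column names user_id and churned, and the two small frames are all hypothetical stand-ins: gbq_df plays the role of the read_gbq() output and csv_df the role of the CSV export (loaded with pd.read_csv in practice). It assumes both frames have the same columns with matching dtypes; a dtype mismatch would show up as false differences.

```python
import pandas as pd


def duplicate_report(gbq_df, csv_df):
    """Compare row multiplicities between two frames that share the same columns."""
    cols = list(csv_df.columns)
    # Count how often each distinct row occurs in each source.
    # Note: groupby drops rows with NaN in any key column by default.
    gbq_counts = gbq_df.groupby(cols).size().reset_index(name="n_gbq")
    csv_counts = csv_df.groupby(cols).size().reset_index(name="n_csv")
    # An outer merge keeps rows that occur in only one of the two sources.
    merged = gbq_counts.merge(csv_counts, on=cols, how="outer")
    merged[["n_gbq", "n_csv"]] = merged[["n_gbq", "n_csv"]].fillna(0).astype(int)
    return {
        "rows_gbq": len(gbq_df),
        "rows_csv": len(csv_df),
        "duplicated_rows_gbq": int(gbq_df.duplicated().sum()),
        "duplicated_rows_csv": int(csv_df.duplicated().sum()),
        # Distinct rows present in the CSV export but absent from the read_gbq() output.
        "distinct_rows_missing_from_gbq": int(
            ((merged["n_gbq"] == 0) & (merged["n_csv"] > 0)).sum()
        ),
        # Surplus copies in the read_gbq() output relative to the CSV export.
        "extra_copies_in_gbq": int(
            (merged["n_gbq"] - merged["n_csv"]).clip(lower=0).sum()
        ),
    }


# Hypothetical stand-in data: same row count, but two rows replaced by repeats.
csv_df = pd.DataFrame({"user_id": [1, 2, 3, 4], "churned": [0, 1, 0, 1]})
gbq_df = pd.DataFrame({"user_id": [1, 2, 2, 2], "churned": [0, 1, 1, 1]})

report = duplicate_report(gbq_df, csv_df)
print(report)
```

If the number of missing distinct rows matches the number of extra copies, that would suggest some rows are being repeated in place of others rather than appended, which might help narrow down where the duplication happens.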