Error converting DataFrame with duplicate columns to ndarray #2236

Closed
@jpindi

I installed the latest version of pandas (0.9.0) in case this was an error that had already been fixed.
I'm trying to read an Excel file. That part seems OK.
Originally, I was using iteritems() to go through each row of the pandas DataFrame, because id_company had to be verified against a MySQL database (code not included). That gave the same or a similar error message as converting each row to a tuple (code below). The error message follows the code.

Note the .reindex() call: the error also occurred before I added it. The reindex() was a hail mary.

As a workaround, I'll probably just import the data from my target SQL table and do a join. I'm concerned about that approach because of the size of the datasets.
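
The join workaround mentioned above could look roughly like the following. This is only a sketch under assumptions: the frames `sheet` and `known` and their contents are hypothetical stand-ins, and it assumes a current pandas release where `merge` accepts `indicator=True`. In practice `known` would presumably be loaded from the database (for example with `pd.read_sql`) rather than built by hand.

```python
import pandas as pd

# Hypothetical stand-ins: rows parsed from the spreadsheet, and the ids
# already present in the database table.
sheet = pd.DataFrame({"id_company": [1, 2, 3], "company_name": ["a", "b", "c"]})
known = pd.DataFrame({"id_company": [2, 3, 4]})

# A left join keeps every spreadsheet row; the _merge column records whether
# a matching id_company was found on the database side.
checked = sheet.merge(known, on="id_company", how="left", indicator=True)

# Keep only the rows whose id_company exists in both frames.
verified = checked[checked["_merge"] == "both"].drop(columns="_merge")
```

A single merge replaces the per-row lookup, which may matter given the dataset sizes mentioned above.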

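# sg_sql_2.py (reproduction script)
import pandas as pd
def runNow():
    # identify the workbook and parse the first sheet
    # (raw string so the backslashes in the Windows path are taken literally)
    source = r'C:\Users\jlalonde\Desktop\startup_geno\startupgenome_w_id_xl_20121109.xlsx'
    xls_file = pd.ExcelFile(source)
    sd = xls_file.parse('Sheet1')
    # keep the first row for each id_company
    source_u = sd.drop_duplicates(cols = 'id_company', take_last=False)
    # select the columns of interest (note: 'description' is listed twice)
    source_r = source_u[['id_company','id_good','description', 'website','keyword', 'company_name','founded_month', 'founded_year', 'description']]
    source_i = source_r.reindex() #hail mary
    # convert each row to a tuple; this is the line that raises
    tup_r = [tuple(x) for x in source_i.values]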

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    sg_sql_2.runNow()
  File "sg_sql_2.py", line 31, in runNow
    tup_r = [tuple(x) for x in source_r.values]
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1443, in as_matrix
    return self._data.as_matrix(columns).T
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 723, in as_matrix
    mat = self._interleave(self.items)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 743, in _interleave
    indexer = items.get_indexer(block.items)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 748, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
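
The column selection in the script names 'description' twice, which is presumably what leaves the frame with a non-unique column index. The following minimal sketch shows one way to detect and drop repeated column labels before converting rows to tuples. The frame `sd` and its contents are hypothetical stand-ins for the parsed sheet, and the sketch assumes a current pandas release, where `.values` may well succeed even with duplicate labels.

```python
import pandas as pd

# Hypothetical stand-in for the parsed Excel sheet.
sd = pd.DataFrame({
    "id_company": [1, 2, 3],
    "company_name": ["a", "b", "c"],
    "description": ["x", "y", "z"],
})

# Naming 'description' twice in the selection yields two columns
# that share the same label.
source_r = sd[["id_company", "description", "company_name", "description"]]
has_duplicates = not source_r.columns.is_unique

# Keep only the first occurrence of each column label.
deduped = source_r.loc[:, ~source_r.columns.duplicated()]

# With unique column labels, each row converts to a tuple.
tup_r = [tuple(x) for x in deduped.values]
```

Removing the second 'description' from the selection list would have the same effect without the extra step.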
