
pandas.merge issue with duplicated column names #11762

Closed
@wavexx

Description

import pandas as pd

df1 = pd.DataFrame([[1, 1]], columns=['x', 'x'])  # column 'x' appears twice
df2 = pd.DataFrame([[1, 1]], columns=['x', 'y'])
pd.merge(df1, df2, on='x')

Results in:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    m = pd.merge(df1, df2, on='x')
  File "/usr/lib/python3/dist-packages/pandas/tools/merge.py", line 35, in merge
    return op.get_result()
  File "/usr/lib/python3/dist-packages/pandas/tools/merge.py", line 196, in get_result
    join_index, left_indexer, right_indexer = self._get_join_info()
  File "/usr/lib/python3/dist-packages/pandas/tools/merge.py", line 324, in _get_join_info
    sort=self.sort, how=self.how)
  File "/usr/lib/python3/dist-packages/pandas/tools/merge.py", line 516, in _get_join_indexers
    llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))
  File "/usr/lib/python3/dist-packages/pandas/tools/merge.py", line 681, in _factorize_keys
    llab = rizer.factorize(lk)
  File "pandas/hashtable.pyx", line 850, in pandas.hashtable.Int64Factorizer.factorize (pandas/hashtable.c:15601)
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

See #11754
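
A possible explanation for the traceback, offered as a sketch rather than a confirmed diagnosis: when a column label is duplicated, selecting it by name returns a 2-D DataFrame instead of a 1-D Series, which would match the "expected 1, got 2" message from the factorizer. The snippet below shows that behaviour and one way to check for duplicated labels before merging. The helper name `duplicated_columns` is hypothetical (not part of pandas), and dropping the repeated column is only one workaround that assumes the first occurrence is the one wanted.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 1]], columns=['x', 'x'])
df2 = pd.DataFrame([[1, 1]], columns=['x', 'y'])

# With a duplicated label, selecting 'x' yields a 2-D DataFrame, not a 1-D Series.
key = df1['x']
print(type(key).__name__, key.ndim)  # DataFrame 2


def duplicated_columns(df):
    """Return the column labels that occur more than once (hypothetical helper)."""
    return df.columns[df.columns.duplicated()].unique().tolist()


print(duplicated_columns(df1))  # ['x']
print(duplicated_columns(df2))  # []

# Workaround sketch: keep only the first occurrence of each label, then merge.
# Assumption: the first 'x' column is the intended merge key.
df1_unique = df1.loc[:, ~df1.columns.duplicated()]
merged = pd.merge(df1_unique, df2, on='x')
print(merged)
```

Checking for duplicates up front lets the caller fail with a clear message about the ambiguous key instead of hitting the low-level buffer error.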

Metadata

Labels

Error Reporting (Incorrect or improved errors from pandas)
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)