error merging dataframes with unicode and numpy.float64 column names

pandas.merge raises an unexpected error when one of the dataframes has a unicode column name followed by a numpy.float64 column name. The error occurs only for certain numpy.float64 values. The error is: "UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 5: invalid start byte"

The following code reproduces the error:

```
import pandas as pd
import numpy as np

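def my_test(X):
    # Two small frames that share the default integer index.
    t = pd.DataFrame([[1, 2], [3, 4]])
    u = pd.DataFrame([[9, 10], [11, 12]])

    # t gets a unicode label followed by a numpy.float64 label
    # (unicode is a Python 2 builtin).
    t.rename(columns={0: unicode('a'), 1: np.float64(X)}, inplace=True)
    u.rename(columns={0: unicode('x'), 1: unicode('y')}, inplace=True)

    # Merge on the index; this call raises for some values of X.
    pd.merge(u, t, how="inner", left_index=True, right_index=True)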

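# Works fine for 113, but raises UnicodeDecodeError for 114.
my_test(113)
try:
    my_test(114)
except UnicodeDecodeError as exc:
    # Catch the error so the scan below still runs.
    print(exc)

# Collect the numbers below 200 for which this error occurs:
problem_numbers = []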
for i in range(200):
    try:
        my_test(i)
    except UnicodeDecodeError:
        problem_numbers.append(i)

print(problem_numbers)
```
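
One observation that may help narrow this down: the error message points at byte 0x80 in position 5, and that is exactly where 0x80 sits in the little-endian IEEE 754 encoding of 114.0, while the encoding of 113.0 contains only ASCII bytes. The sketch below uses only the standard library (no pandas) to show this. The helper name `float64_bytes_are_utf8` is made up for illustration, and the idea that the raw bytes of the float64 label end up being decoded as UTF-8 somewhere inside the merge is a guess, not something confirmed from the pandas source.

```python
import binascii
import struct


def float64_bytes_are_utf8(value):
    """Return True if the little-endian float64 bytes of value decode as UTF-8.

    Hypothetical helper for illustration; it does not call pandas.
    """
    raw = struct.pack("<d", float(value))
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


for x in (113, 114):
    raw = struct.pack("<d", float(x))
    # Show the eight raw bytes as hex, then whether they are valid UTF-8.
    print(x, binascii.hexlify(raw).decode("ascii"), float64_bytes_are_utf8(x))
```

If the guess is right, the values flagged by this helper should line up with `problem_numbers` from the reproduction above; that comparison has not been run here.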


error merging dataframes with unicode and numpy.float64 column names #13353
