Description
I'm seeing very strange behavior when taking the mean of a DataFrame that contains a uint16 column. Why does the mean of column y change after adding column x?
Python 2.7.7 |Anaconda 2.0.1 (x86_64)| (default, Jun 2 2014, 12:48:16)
Type "copyright", "credits" or "license" for more information.
IPython 2.1.0 -- An enhanced Interactive Python.
Anaconda is brought to you by Continuum Analytics.
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: import random
In [4]: np.__version__
Out[4]: '1.8.1'
In [5]: pd.__version__
Out[5]: '0.14.0'
In [6]: y = np.array([random.randint(1900,2000) for x in range(0,2000)])
In [7]: y.mean()
Out[7]: 1950.0115000000001
In [8]: y.astype(np.uint16).mean()
Out[8]: 1950.0115000000001
In [9]: d1 = pd.DataFrame()
In [10]: d1['y'] = y.astype(np.uint16)
In [11]: d1.mean()
Out[11]:
y 16.6995
dtype: float64
In [12]: d1['x'] = y.astype(np.int16)
In [13]: d1.mean()
Out[13]:
y 1950.0115
x 1950.0115
dtype: float64
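
For what it's worth, 16.6995 looks exactly like an integer overflow in the reduction: the true sum of y is 1950.0115 * 2000 = 3,900,023, and 3,900,023 % 2**16 = 33,399, which divided by 2000 gives 16.6995. So the single-column frame apparently accumulates the sum in the column's own uint16 dtype and wraps, while with the int16 column present the reduction presumably goes through a wider common dtype (int32 can hold both), so the sum no longer overflows. That is just my reading of the numbers, not a confirmed diagnosis. A minimal sketch (the constant-valued array below is hypothetical, only there to make the wrap obvious):

import numpy as np

# The reported mean of y implies its true sum: 1950.0115 * 2000 = 3900023.
true_sum = 3900023

# If that sum is accumulated in uint16, it wraps modulo 2**16 ...
wrapped = true_sum % 2**16       # 33399
print(wrapped / 2000.0)          # 16.6995 -- exactly the Out[11] value

# numpy shows the same wrap when forced to reduce in the original dtype:
y16 = np.full(2000, 1950, dtype=np.uint16)
print(y16.sum(dtype=np.uint16))  # 33376, i.e. (2000 * 1950) % 2**16
print(y16.mean())                # 1950.0 -- np.mean upcasts before summing

If that reading is right, casting up front, e.g. d1['y'].astype('int64').mean(), should give the expected 1950.0115 regardless of how the frame is laid out.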