ERR: better error message on invalid return from .apply #13820

Closed
@mhabets

Description

Code Sample

import numpy as np
import pandas as pd

def apply_list(row):
    return [2*row['A'], 4*row['C'], 3*row['B']]

df = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'))
df['E'] = pd.Timestamp('20130102')

df['L'] = df.apply(apply_list, axis=1)

(Wrong) Error Output

ValueError: Shape of passed values is (6, 3), indices imply (6, 7)

Expected Result & Use case

Expected: a Series containing one list per row. In my real case, each list is the result of a matrix multiplication computed from values in other columns (see the complete example in the next section).
Note: this works with a DataFrame that has no datetime column, but fails as soon as one is present. Output without the datetime column:

0  [0.513057023122, -0.473155481431, -4.51039058299]  
1    [-0.331758428452, 3.92166465759, 2.75920806524]  
2    [-2.07257568656, 1.22070341071, 0.809676040678]  
3       [3.38079201699, 2.77074189984, 1.4351938036]  
4      [2.27838740024, 3.04253558763, 3.57504651688]  
5     [1.55497385554, 8.01021203173, -2.22893240905]  
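
A possible workaround, sketched here as an assumption and not taken from the original report: build the column of lists with a plain loop over `iterrows()` instead of `apply`, so pandas never tries to infer a 2-D result shape from the returned lists. The frame and `apply_list` mirror the code sample above.

```python
import numpy as np
import pandas as pd


def apply_list(row):
    # Same function as in the code sample: returns a plain Python list per row.
    return [2 * row['A'], 4 * row['C'], 3 * row['B']]


df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
df['E'] = pd.Timestamp('20130102')  # the datetime column that triggers the error

# Collect one list per row, then wrap them in an object-dtype Series
# aligned on the frame's index. No shape inference by apply is involved.
df['L'] = pd.Series(
    [apply_list(row) for _, row in df.iterrows()],
    index=df.index,
)
```

This is slower than a vectorised operation, but it keeps each list intact as a single cell value regardless of the other column dtypes.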

Complete example with matrix multiplication

import numpy as np
import pandas as pd
from numpy import pi, mat, cos, sin

def rotation(lon_rad, lat_rad):
    c_lat = cos(lat_rad); s_lat = sin(lat_rad)
    c_lon = cos(lon_rad); s_lon = sin(lon_rad)
    R = mat([[-s_lon, -s_lat*c_lon, c_lat*c_lon],
             [ c_lon, -s_lat*s_lon, c_lat*s_lon],
             [     0,        c_lat,       s_lat]])
    return R.T

def row_rotation(row):
    # Call the rotation() function defined above (angles converted from degrees to radians)
    return rotation(row['A']*pi/180.0, row['B']*pi/180.0).tolist()

df = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'))
# If I don't add the line below, I get what I want
df['E'] = pd.Timestamp('20130102')

df['M'] = df.apply(row_rotation, axis=1)

It works as I want if there is no datetime column in df:

          A         B         C         D  \
0  0.498796  0.209267 -0.551132 -0.066941   
1 -0.576161 -0.160802  0.895587 -0.355315   
2 -0.604028  0.959712 -1.388824 -0.356734   
3 -0.351012  0.242736  0.339068 -0.531425   
4 -1.355900  0.058885  1.458610  1.891502   
5 -0.560297  0.750459  1.288340  0.904650   

                                                   M  
0  [[-0.008705521998618918, 0.9999621062253967, 0...  
1  [[0.010055737814803208, 0.9999494397903326, 0....  
2  [[0.010542084651556644, 0.9999444306816251, 0....  
3  [[0.006126282255403784, 0.9999812341567851, 0....  
4  [[0.023662708414555287, 0.99971999891494, 0.0]...  
5  [[0.00977887456308289, 0.9999521856630343, 0.0...  

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-92-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 22.0.5
Cython: 0.24
numpy: 1.10.4
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.1
xlsxwriter: 0.8.9
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None
