BUG: alignment of MultiIndexed DataFrame to a Series with a common index level #46001

Closed
@johannes-mueller

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

foo_index = pd.Index([1, 2, 3], name='foo')
bar_index = pd.Index([1, 2], name='bar')

series = pd.Series([10, 20], index=bar_index, name='bar_series')

df = pd.DataFrame({'col': np.arange(6)}, index=pd.MultiIndex.from_product([foo_index, bar_index]))

_, series_aligned = df.align(series, axis=0)

print(series_aligned)

# Workaround: put the Series into the DataFrame and pull the column out again
# print(df.align(pd.DataFrame(series), axis=0)[1].iloc[:, 0])  # Works

# Align a Series to a Series
# print(df.col.align(series))  # Works

Issue Description

series_aligned is filled entirely with NaN, even though the Series and the DataFrame share the index level bar, so the Series values could be broadcast along that level.

foo  bar
1    1     NaN
     2     NaN
2    1     NaN
     2     NaN
3    1     NaN
     2     NaN
Name: bar_series, dtype: float64

Maybe related to #43321

Expected Behavior

The Series should be broadcast along the shared bar level, so that each foo group repeats the values 10 and 20. Expected output:

foo  bar
1    1      10
     2      20
2    1      10
     2      20
3    1      10
     2      20
Name: bar_series, dtype: int64
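
The expected result can be built by hand, which may help as a stopgap. The following is a minimal sketch, not the fix adopted in pandas: it looks up each row's bar label in the Series and re-attaches the DataFrame's MultiIndex. The names bar_labels and aligned are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

foo_index = pd.Index([1, 2, 3], name="foo")
bar_index = pd.Index([1, 2], name="bar")

series = pd.Series([10, 20], index=bar_index, name="bar_series")
df = pd.DataFrame(
    {"col": np.arange(6)},
    index=pd.MultiIndex.from_product([foo_index, bar_index]),
)

# The "bar" label of every row in the DataFrame (repeats across "foo" groups).
bar_labels = df.index.get_level_values("bar")

# Look each label up in the Series, then put the values back on the full
# MultiIndex so the result lines up row by row with df.
aligned = pd.Series(
    series.reindex(bar_labels).to_numpy(),
    index=df.index,
    name=series.name,
)

print(aligned)
```

This assumes the Series index is unique and that every bar label in the DataFrame is present in it; labels missing from the Series would come back as NaN.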

Installed Versions

INSTALLED VERSIONS
------------------
commit : cdca67a
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-99-lowlatency
Version : #112-Ubuntu SMP PREEMPT Thu Feb 3 14:52:40 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.UTF-8

pandas : 1.5.0.dev0+360.gcdca67a28c
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.24.6
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.29.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None


Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex
