QST: loc returns matrix with one row when index is not strictly unique

### Research

- [X] I have searched the [[pandas] tag](https://stackoverflow.com/questions/tagged/pandas) on StackOverflow for similar questions.

- [X] I have asked my usage related question on [StackOverflow](https://stackoverflow.com).


### Link to question on StackOverflow

https://stackoverflow.com/questions/78262738/why-does-pandas-loc-with-multiindex-return-a-matrix-with-single-row?noredirect=1#comment137985578_78262738

### Question about pandas

It seems that when a MultiIndex contains at least one duplicate entry, EVERY key returns a DataFrame when queried with `.loc[]`, even if only one row matches, resulting in a 1xN matrix. (I didn't test whether this also applies to a single-level index.) Is this the intended behaviour?

It does seem a bit strange and inconsistent with other functions, especially since you can still retrieve a Series when you query twice with `.loc[].loc[]`. On second glance, it can also be seen as improving consistency, since a DataFrame is always returned for any key, regardless of which key is used. While this did make me aware that a duplicate exists in one dataframe but not the other, the way there was long and tedious. I would prefer an INFO or WARN log over being confused about why two similar dataframes return different objects when indexed in the same way.
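To illustrate the kind of INFO/WARN log meant here, a check along these lines could be run by the user before indexing. This is only a sketch: `duplicated_index_keys` and the small `frame` are hypothetical names made up for this example, not part of pandas; only `Index.is_unique` and `Index.duplicated` are existing pandas API.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def duplicated_index_keys(df: pd.DataFrame) -> list:
    """Hypothetical helper: return the duplicated index keys, logging a warning if any exist."""
    if df.index.is_unique:
        return []
    # duplicated(keep="first") marks every repeat after the first occurrence
    keys = df.index[df.index.duplicated(keep="first")].unique().tolist()
    logger.warning("index has %d duplicated key(s): %s", len(keys), keys)
    return keys


# Made-up frame with the same index layout as data2 in the reproduction
frame = pd.DataFrame(
    {"col1": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [("pdf1", 0), ("pdf2", 0), ("pdf2", 0)], names=["pdf_name", "page"]
    ),
)
dupes = duplicated_index_keys(frame)  # logs a warning naming the duplicated key
```

Calling this once after loading each file would have surfaced the difference between the two dataframes directly.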

Code to reproduce (see also the [SO post](https://stackoverflow.com/questions/78262738/why-does-pandas-loc-with-multiindex-return-a-matrix-with-single-row?noredirect=1#comment137985578_78262738)):

```python
import pandas as pd
from io import StringIO

file1 = StringIO("pdf_name;page;col1;col2\npdf1;0;val1;val2\npdf2;0;asdf;ffff")
file2 = StringIO("pdf_name;page;col1;col2\npdf1;0;;\npdf2;0;;\npdf2;0;;")
data1 = pd.read_csv(file1, sep=";", index_col=['pdf_name', 'page'])
data2 = pd.read_csv(file2, sep=";", index_col=['pdf_name', 'page'])
data2 = data2.sort_index()  # sort the index to avoid a PerformanceWarning
idx = data1.index[0]
print(idx)  # ('pdf1', 0)
print("data1.loc[idx]", type(data1.loc[idx]))  # data1.loc[idx] <class 'pandas.core.series.Series'>
print("data2.loc[idx]", type(data2.loc[idx]))  # data2.loc[idx] <class 'pandas.core.frame.DataFrame'>
print("data2.loc[idx].shape", data2.loc[idx].shape)  # (1, 2)  -- single row
print("data2['col1'].loc[idx]", type(data2['col1'].loc[idx]))  # data2['col1'].loc[idx] <class 'pandas.core.series.Series'>
```
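As a possible workaround while the question is open, passing the key inside a list appears to give the same return type for both frames. The sketch below uses two made-up frames, `uniq` and `dup`, that mirror `data1` and `data2` above. It is an assumption about how one might normalise the result, not a statement of what pandas intends.

```python
import pandas as pd

names = ["pdf_name", "page"]

# Hypothetical frames: `uniq` has a unique index, `dup` repeats ('pdf2', 0)
uniq = pd.DataFrame(
    {"col1": ["val1", "asdf"], "col2": ["val2", "ffff"]},
    index=pd.MultiIndex.from_tuples([("pdf1", 0), ("pdf2", 0)], names=names),
)
dup = pd.DataFrame(
    {"col1": [None, None, None], "col2": [None, None, None]},
    index=pd.MultiIndex.from_tuples(
        [("pdf1", 0), ("pdf2", 0), ("pdf2", 0)], names=names
    ),
).sort_index()

idx = ("pdf1", 0)

# A list of keys asks for "these rows", so both lookups return a DataFrame
rows_uniq = uniq.loc[[idx]]
rows_dup = dup.loc[[idx]]

# If a Series is wanted, take the first matching row explicitly
first_row = rows_dup.iloc[0]
```

With this form the caller decides explicitly whether to keep the one-row DataFrame or collapse it to a Series, independent of whether the index happens to contain duplicates elsewhere.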

QST: loc returns matrix with one row when index is not strictly unique #58142
