Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from tempfile import NamedTemporaryFile
import pandas as pd
XML = '''
<issue>
<type>BUG</type>
<reporter type="newbie">Emanuel</reporter>
</issue>
'''.encode('utf-8')
with NamedTemporaryFile() as tempfile:
    tempfile.write(XML)
    tempfile.flush()
    df = pd.read_xml(tempfile.name, iterparse={'issue': ['type']})
    issue_type = df.iloc[0]['type']
    print(f"issue_type: expecting BUG, got {issue_type}")
Issue Description
When iterparse is used, the parsing code reads attributes from every child element of the row element, rather than only from the row element itself.
This is probably unintentional. In the example above, the user is most likely interested in the type of the issue (the text of the <type> child element) rather than the type attribute on some other child element such as <reporter>.
The iterparse feature is planned for 1.5.0 and is not yet released. See: #45724 (comment)
Expected Behavior
Attributes are parsed only on the top-level element (here, <issue>), not on its child elements.
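
To make the expected behavior concrete, here is a minimal sketch using only the standard library's xml.etree.ElementTree.iterparse. It is not pandas' implementation. The helper name parse_rows and its precedence rule (child element text first, then an attribute on the row element) are assumptions made for illustration.

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

XML = b"""
<issue>
  <type>BUG</type>
  <reporter type="newbie">Emanuel</reporter>
</issue>
"""


def parse_rows(source, row_tag, columns):
    """Hypothetical helper: build one dict per `row_tag` element.

    Values come from the text of direct child elements, or from
    attributes on the row element itself. Attributes on child
    elements are deliberately ignored.
    """
    rows = []
    # Only handle "end" events, so each row element is complete when seen.
    for _event, elem in iterparse(source, events=("end",)):
        if elem.tag != row_tag:
            continue
        row = {}
        for col in columns:
            child = elem.find(col)  # direct children only
            if child is not None and child.text is not None:
                row[col] = child.text.strip()
            elif col in elem.attrib:  # attribute on the row element
                row[col] = elem.attrib[col]
            else:
                row[col] = None
        rows.append(row)
        elem.clear()  # release the finished row's children
    return rows


rows = parse_rows(BytesIO(XML), "issue", ["type"])
print(rows)
```

With this approach the type attribute on <reporter> never reaches the type column, because attributes are looked up only on the <issue> element.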