Closed
Description
An answer on my recent SO post used get_indexer, which worked great, but I soon learned if the indexer doesn't find a match it seems to use the last entry without raising an error:
import pandas as pd

df = pd.DataFrame({"ID": [6, 2, 4],
                   "to ignore": ["foo", "whatever", "idk"],
                   "value": ["A", "B", "asdf"],
                   })
df2 = pd.DataFrame({"ID_number": [1, 2, 3, 4, 5, 6],
                    "A": [0.91, 0.42, 0.85, 0.84, 0.81, 0.88],
                    "B": [0.11, 0.22, 0.45, 0.38, 0.01, 0.18],
                    })
df2 = df2.set_index('ID_number')
# One lookup per row of df: row position from "ID", column position from "value"
df['new_col'] = df2.values[df2.index.get_indexer(df['ID']), df2.columns.get_indexer(df['value'])]
I presumed the row with "asdf" would have raised an error, not returned the value for "B". This was problematic because I unwittingly processed data incorrectly and I enjoy being employed.
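A minimal sketch of what appears to be happening, assuming get_indexer marks a label it cannot find with -1, which NumPy indexing then reads as "last position". The small columns index and the row array are stand-ins for df2.columns and the df2 row for ID_number 4, not part of the original code.

```python
import numpy as np
import pandas as pd

columns = pd.Index(["A", "B"])  # stand-in for df2.columns

# A label without a match comes back as -1 instead of raising
positions = columns.get_indexer(["A", "B", "asdf"])
print(positions)  # [ 0  1 -1]

# NumPy treats -1 as "last element", so the miss silently picks column "B"
row = np.array([0.84, 0.38])  # values of columns A and B for ID_number 4
print(row[positions])  # [0.84 0.38 0.38]
```

So the "asdf" row is not matched to "B" as such; it is reported as a miss, and the miss marker happens to be a valid NumPy index.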
My interpretation of the documentation was that if method was not supplied it would be "default: exact matches only". Supplying method = 'default' and tolerance = 0 was also not accepted when not using another method.
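One possible guard, sketched under the assumption that a miss is always reported as -1 and that the index has unique labels: check the returned positions before using them and raise on any miss. strict_get_indexer is a hypothetical helper written for illustration, not a pandas function.

```python
import pandas as pd


def strict_get_indexer(index, labels):
    """Hypothetical helper: like Index.get_indexer, but raise on any miss."""
    labels = pd.Index(labels)
    positions = index.get_indexer(labels)
    missing = positions == -1  # assumption: -1 marks "no match"
    if missing.any():
        raise KeyError(f"labels not found: {list(labels[missing])}")
    return positions


columns = pd.Index(["A", "B"])
print(strict_get_indexer(columns, ["B", "A"]))  # [1 0]

try:
    strict_get_indexer(columns, ["A", "asdf"])
except KeyError as err:
    print(err)  # the message names the label that was not found
```

With a check like this, the "asdf" row would stop the lookup instead of quietly returning a value from the last column.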
Also, I'm no serious programmer, so I have likely misunderstood something and am only trying to help. Please feel free to correct me or tell me to go away.