Starting point:
http://pandas.pydata.org/pandas-docs/stable/io.html#index-columns-and-trailing-delimiters
If there is one more column of data than there are column names, usecols exhibits some (at least to me) unintuitive behavior:
>>> from io import StringIO
>>> import pandas as pd
>>> data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'
>>> pd.read_csv(StringIO(data))
        a    b     c
4   apple  bat   5.7
8  orange  cow  10.0
>>> pd.read_csv(StringIO(data), usecols=['a', 'b'])
   a       b
0  4   apple
1  8  orange
>>>
I was expecting it to be equal to
>>> pd.read_csv(StringIO(data))[['a', 'b']]
        a    b
4   apple  bat
8  orange  cow
I am not sure whether my expectation is unfounded, though. Is this behavior intentional?
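
For reference, here is a minimal, self-contained sketch of the result I expected, written as a workaround rather than a fix. The helper name `read_then_select` is hypothetical (it is not part of pandas). It parses the whole file first, so the extra leading data column is picked up as the index, and only then keeps the requested columns. This gives up the memory savings usecols is meant to provide.

```python
from io import StringIO

import pandas as pd


def read_then_select(text, columns):
    """Hypothetical helper: parse everything first, then keep only `columns`.

    The implicit index (the extra leading data column) is inferred during the
    full parse, so it survives the column selection afterwards.
    """
    frame = pd.read_csv(StringIO(text))
    return frame[columns]


# Same data as above: three column names, four fields per data row.
data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'
result = read_then_select(data, ['a', 'b'])
print(result)
```

With this, the values 4 and 8 end up as the index and columns a and b hold the fruit and animal names, which is what I expected usecols=['a', 'b'] to return.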