We cannot load arrays with more than 2 dimensions when an axis has duplicate labels:
arr = ndtest("a=a0,a1;b=x,x;c=c0,c1")
arr.to_hdf('test.h5', 'arr')
arr = read_hdf('test.h5', 'arr')
ValueError: cannot reshape array of size 8 into shape (2,1,2)
arr.to_csv('test.csv')
arr = read_csv('test.csv')
ValueError: cannot handle a non-unique multi-index!
arr.to_excel('test.xlsx')
arr = read_excel('test.xlsx')
ValueError: cannot handle a non-unique multi-index!
For HDF, this is clearly a limitation in larray's code. In pandas.py/index_to_labels, I used the following code:
return [unique_list(idx.get_level_values(label)) for label in range(idx.nlevels)]
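where unique_list returns the unique labels for that index "level". That breaks in the presence of duplicate labels: the two "x" labels on axis b collapse into one, so the computed shape is (2, 1, 2) while the data still holds 8 values, hence the reshape error above.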
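To make the HDF failure concrete, here is a minimal sketch that reproduces the shape mismatch with pandas and NumPy only, without larray. The `unique_list` function below is a hypothetical stand-in for larray's helper (I am assuming it does order-preserving de-duplication), and the index is built by hand to mimic the array in the report.

```python
import numpy as np
import pandas as pd


def unique_list(values):
    """Hypothetical stand-in for larray's helper: order-preserving de-duplication."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Same structure as the array in the report: axis "b" carries the label "x" twice.
idx = pd.MultiIndex.from_product(
    [["a0", "a1"], ["x", "x"], ["c0", "c1"]], names=["a", "b", "c"]
)

# Same expression as in index_to_labels: one de-duplicated label list per level.
labels = [unique_list(idx.get_level_values(level)) for level in range(idx.nlevels)]
shape = tuple(len(axis_labels) for axis_labels in labels)

data = np.arange(len(idx))  # one value per row of the index
try:
    data.reshape(shape)
    reshape_error = None
except ValueError as exc:
    reshape_error = str(exc)

print(len(idx), shape, reshape_error)
```

The index has 8 rows but the de-duplicated labels only describe 2 * 1 * 2 = 4 cells, so the reshape cannot succeed. A fix on the larray side would need to recover axis lengths without de-duplicating labels.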
For csv and Excel, this is not so clear-cut. This seems to be a limitation of pandas' reindex, which refuses to work on a non-unique MultiIndex, and I am not sure we can do anything about it (other than not going through pandas to load the data).
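The pandas side can be sketched in isolation too. This is not the code path larray actually takes when reading csv or Excel; it is only an assumption-laden illustration that reindexing a Series whose MultiIndex contains duplicate entries raises a ValueError. The `target` index is a made-up example, and the exact error message may differ between pandas versions, so the sketch only checks the exception type.

```python
import pandas as pd

# MultiIndex with duplicate entries, mimicking the array in the report.
dup_idx = pd.MultiIndex.from_product(
    [["a0", "a1"], ["x", "x"], ["c0", "c1"]], names=["a", "b", "c"]
)
series = pd.Series(range(len(dup_idx)), index=dup_idx)

# Hypothetical target index (unique labels only), different from dup_idx
# so that pandas cannot short-circuit the reindex.
target = pd.MultiIndex.from_product(
    [["a0", "a1"], ["x"], ["c0", "c1"]], names=["a", "b", "c"]
)

try:
    series.reindex(target)
    reindex_error = None
except ValueError as exc:
    reindex_error = exc

print(dup_idx.is_unique, type(reindex_error).__name__, reindex_error)
```

If this holds for the pandas version in use, any loading path that relies on reindex will hit the same wall, which is why avoiding pandas for these readers may be the only real option.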