Simultaneously melt multiple columns

### Melt Enhancement

**Summary**: This is a proposal with a pull request to enhance `melt` to simultaneously melt multiple groups of columns and to add functionality from `wide_to_long` along with better MultiIndexing capabilities. See [this notebook]( https://nbviewer.jupyter.org/github/tdpetrou/Machine-Learning-Books-With-Python/blob/master/melt%20enhancemenet.ipynb) for more examples.
 
- Melts different groups of columns by passing a list of lists into `value_vars`. Each group gets melted into its own column. This feature replaces the need for lreshape.
- When melting different groups of columns, groups do not have to be the same length. The shorter groups are filled with missing values.
- Adds parameters `stubnames`(boolean), `prefix` and `sep` from function `wide_to_long`. It keeps the suffixes in separate columns and does not align them in the same way.
- Can select any number of MultiIndex levels and greatly increase MultiIndex functionality
- Works with repeated column names, which normally show up when selecting a subset of MultiIndex levels
- Performance is ~30-40% faster than original `melt`, slightly slower than `lreshape` and much faster than `wide_to_long`



```python
>>> df = pd.DataFrame({'City': ['Houston', 'Austin', 'Hoover'],
                   'State': ['Texas', 'Texas', 'Alabama'],
                   'Name':['Aria', 'Penelope', 'Niko'],
                   'Mango':[4, 10, 90],
                   'Orange': [10, 8, 14], 
                   'Watermelon':[40, 99, 43],
                   'Gin':[16, 200, 34],
                   'Vodka':[20, 33, 18]},
                 columns=['City', 'State', 'Name', 'Mango', 'Orange', 'Watermelon', 'Gin', 'Vodka'])

      City    State      Name  Mango  Orange  Watermelon  Gin  Vodka
0  Houston    Texas      Aria      4      10          40   16     20
1   Austin    Texas  Penelope     10       8          99  200     33
2   Hoover  Alabama      Niko     90      14          43   34     18
```
Use a list of lists in `value_vars` to melt the fruit and drinks
```
>>> df.melt(id_vars=['City', 'State'], value_vars=[['Mango', 'Orange', 'Watermelon'], ['Gin', 'Vodka']], 
                    var_name=['Fruit', 'Drink'], value_name=['Pounds', 'Ounces'])

      City    State       Fruit  Pounds  Drink  Ounces
0  Houston    Texas       Mango       4    Gin    16.0
1   Austin    Texas       Mango      10    Gin   200.0
2   Hoover  Alabama       Mango      90    Gin    34.0
3  Houston    Texas      Orange      10  Vodka    20.0
4   Austin    Texas      Orange       8  Vodka    33.0
5   Hoover  Alabama      Orange      14  Vodka    18.0
6  Houston    Texas  Watermelon      40    nan     NaN
7   Austin    Texas  Watermelon      99    nan     NaN
8   Hoover  Alabama  Watermelon      43    nan     NaN
```
`wide_to_long` functionality. Added parameters `stubnames`(boolean), `sep` and `suffix`.
```
>>> df1 = pd.DataFrame({'group': ['a', 'b', 'c'],
                   'exp_1':[4, 10, -9],
                   'exp_2': [10, 8, 14], 
                   'res_1':[8, 5, 4],
                   'res_3':[11, 0, 7]}, columns=['group', 'exp_1', 'exp_2', 'res_1', 'res_3'])

  group  exp_1  exp_2  res_1  res_3
0     a      4     10      8     11
1     b     10      8      5      0
2     c     -9     14      4      7

>>> df1.melt(id_vars='group', value_vars=['exp','res'], stubnames=True, sep='_')

  group  variable_exp  exp  variable_res  res
0     a             1    4             1    8
1     b             1   10             1    5
2     c             1   -9             1    4
3     a             2   10             3   11
4     b             2    8             3    0
5     c             2   14             3    7
```
Also adds support for all kinds of multiindexing
```
>>> df2 = df.copy()
>>> df2.columns = pd.MultiIndex.from_arrays([list('aabbcccd'), list('ffffgggg'), df.columns], 
                                       names=[None, None, 'some vars'])

                 a                  b            c                     d
                 f                  f            g                     g
some vars     City    State      Name Mango Orange Watermelon  Gin Vodka
0          Houston    Texas      Aria     4     10         40   16    20
1           Austin    Texas  Penelope    10      8         99  200    33
2           Hoover  Alabama      Niko    90     14         43   34    18

>>> df2.melt(id_vars=[('a', 'f', 'State')], 
           value_vars=[[('b', 'f', 'Name'), ('c', 'g', 'Watermelon')],
                       [('b','f','Mango'), ('c','g', 'Orange'), ('d', 'g', 'Vodka')]],
           var_name=[['myvar1', 'myvar2', 'myvar3'],
                     ['next_myvar1', 'next_myvar2', 'next_myvar3']],
           value_name=['some values', 'more_values'])

 (a, f, State) myvar1 myvar2      myvar3 some values next_myvar1 next_myvar2  \
0         Texas      b      f        Name        Aria           b           f   
1         Texas      b      f        Name    Penelope           b           f   
2       Alabama      b      f        Name        Niko           b           f   
3         Texas      c      g  Watermelon          40           c           g   
4         Texas      c      g  Watermelon          99           c           g   
5       Alabama      c      g  Watermelon          43           c           g   
6         Texas    nan    nan         nan         NaN           d           g   
7         Texas    nan    nan         nan         NaN           d           g   
8       Alabama    nan    nan         nan         NaN           d           g   

  next_myvar3  more_values  
0       Mango            4  
1       Mango           10  
2       Mango           90  
3      Orange           10  
4      Orange            8  
5      Orange           14  
6       Vodka           20  
7       Vodka           33  
8       Vodka           18 
```

#### Problem description

Currently, there is poor support for simultaneous melting of multiple groups of columns. `lreshape` is old and undocumented. `wide_to_long` api does not match `melt` and it's slow.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Simultaneously melt multiple columns #17676

Melt Enhancement

Problem description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Simultaneously melt multiple columns #17676

Description

Melt Enhancement

Problem description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions