Transform input data: groupby, filter

Previously discussed (some lists are from @chriddyp):

A `groupby` transform should split a trace into separate traces, one per unique value or bin of the `groupby` dimension. Example:

```
groupby: ['a', 'a', 'b', 'b']
x: [1, 2, 1, 2]
y: [10, 20, 30, 40]
```

should generate two traces:

```
trace 1:
x: [1, 2]
y: [10, 20]

trace 2:
x: [1, 2]
y: [30, 40]
```
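
To make the intended split concrete, here is a minimal sketch in plain JavaScript. `splitByGroup` is a hypothetical helper, not part of plotly.js; it assumes every array has the same length as `groupby`. With a `groupby` array of `['a', 'a', 'b', 'b']` it yields the two traces listed above.

```javascript
// Hypothetical helper (not plotly.js API): split parallel arrays by group label.
function splitByGroup(groupby, arrays) {
  const groups = new Map(); // a Map keeps labels in first-seen order
  groupby.forEach((label, i) => {
    if (!groups.has(label)) {
      const empty = {};
      for (const key of Object.keys(arrays)) empty[key] = [];
      groups.set(label, empty);
    }
    const trace = groups.get(label);
    // Append the i-th element of every array to the trace of this label.
    for (const key of Object.keys(arrays)) trace[key].push(arrays[key][i]);
  });
  return Array.from(groups, ([name, trace]) => ({ name, ...trace }));
}

const traces = splitByGroup(['a', 'a', 'b', 'b'], {
  x: [1, 2, 1, 2],
  y: [10, 20, 30, 40]
});
console.log(JSON.stringify(traces));
// [{"name":"a","x":[1,2],"y":[10,20]},{"name":"b","x":[1,2],"y":[30,40]}]
```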
### Static `groupby` as a means of splitting spatially and/or aesthetically
- [ ] distinct categorical values: numbers, strings or `datetime` strings
- [ ] evenly spaced bins based on numerical data or time (`datetime` strings) in the `groupby` attribute, reusing logic of the preexisting `plotly` algorithm for histograms
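
For the binned case, a rough sketch of turning numeric or `datetime` values into bin labels that could then serve as the `groupby` array. `binLabels` is a hypothetical helper using naive equal-width bins; it is only a stand-in and does not reproduce the histogram binning logic in `plotly`.

```javascript
// Hypothetical helper: map each value to the index of one of `nbins` equal-width bins.
// Assumption: naive min/max binning, NOT plotly's histogram algorithm.
function binLabels(values, nbins) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const size = (max - min) / nbins || 1; // fall back to 1 when all values are equal
  return values.map((v) =>
    // Clamp so that the maximum value lands in the last bin.
    Math.min(Math.floor((v - min) / size), nbins - 1)
  );
}

const numericLabels = binLabels([0, 1, 4, 5, 9, 10], 2);
console.log(numericLabels); // [0, 0, 0, 1, 1, 1]

// Datetime strings can be converted to milliseconds first.
const dateLabels = binLabels(
  ['2016-01-01', '2016-01-02', '2016-01-10'].map((d) => Date.parse(d)),
  2
);
console.log(dateLabels); // [0, 0, 1]
```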

![image](https://cloud.githubusercontent.com/assets/1548516/18349091/9c9e7040-75cf-11e6-843d-01eb5292d3e9.png)

Functional aspects:
1. `groupby` needs to work across numbers, dates, and categories (@chriddyp in the JS context, meaning strings, correct?)
2. `groupby` needs to split across all of the arrays or array-like specifications in a trace, not just `x` and `y`. For example, `marker.color` or `marker.line.color`. Not all array-like specifications in a trace are actual arrays (consider `colorscale`)
3. There must be a way of specifying distinct styles for the split apart traces so that they're discernible - example:
   
   ```
   transform:
       groupby: ['a', 'b', 'a', 'b']
       marker:
           color:
               a: 'blue'
               b: 'red'
   ```
4. @etpinard found some issues with legend items as he wrote an initial version of transforms: https://github.com/plotly/plotly.js/pull/499#issuecomment-216597436. We'll probably need to modify some of the `transforms` and API. That's OK - `transforms` was made for `groupby`
5. All relevant specifications for `groupby`, and for the related animation split (see below), need to be expressible in JSON for serializability, fitting the current declarative structure
6. The transforms such as `groupby` must work in the `restyle` and `relayout` steps, not just the initial `plot` step
7. `gd.data` is expected to preserve the single trace and the `groupby` spec as the user supplied them, whereas `_fullData` holds the individual (split) traces and no longer has the `groupby` attribute
8. We must ID traces in `_fullData` back to groups or styles in `data`. Styling controls will be populated with the defaults from `_fullData` (e.g. `_fullData[4].marker.color`) but they’ll need to update the attributes in the `data` object (e.g. `data[0].transform.marker.color.d`). That’s because we serialize and save `data`, not `_fullData`.
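
Points 2, 3, 7 and 8 can be illustrated together with a small sketch. Everything here is an assumption made for illustration: `expandTrace`, `styleUpdatePath`, and the `_sourceIndex` / `_group` bookkeeping fields are hypothetical names, not plotly.js internals. The sketch keeps the user's `data` untouched, produces split traces with per-group styles, and derives the path in `data` that a styling control would need to update.

```javascript
// Hypothetical sketch, not plotly.js code.
function expandTrace(trace, sourceIndex) {
  const { groupby, marker: styles = {} } = trace.transform;
  const labels = [...new Set(groupby)]; // unique labels, first-seen order
  return labels.map((label) => {
    const keep = (arr) => arr.filter((_, i) => groupby[i] === label);
    const marker = {};
    // Naive rule: split every array under `marker`, copy scalars as they are.
    // Attributes such as `colorscale` would need metadata to be excluded.
    for (const [key, value] of Object.entries(trace.marker || {})) {
      marker[key] = Array.isArray(value) ? keep(value) : value;
    }
    // Apply the per-group style given in the transform spec.
    for (const [key, byGroup] of Object.entries(styles)) {
      if (label in byGroup) marker[key] = byGroup[label];
    }
    return {
      x: keep(trace.x),
      y: keep(trace.y),
      marker,
      _sourceIndex: sourceIndex, // which entry of `data` this trace came from
      _group: label              // which group it represents
    };
  });
}

// Path in `data` that a style control should write to for a given split trace.
function styleUpdatePath(fullTrace, attr) {
  return `data[${fullTrace._sourceIndex}].transform.${attr}.${fullTrace._group}`;
}

const data = [{
  x: [1, 2, 3, 4],
  y: [10, 20, 30, 40],
  marker: { size: [5, 6, 7, 8] },
  transform: {
    groupby: ['a', 'b', 'a', 'b'],
    marker: { color: { a: 'blue', b: 'red' } }
  }
}];

const fullData = data.flatMap((trace, i) => expandTrace(trace, i));
console.log(fullData[1].marker);                        // { size: [ 6, 8 ], color: 'red' }
console.log(styleUpdatePath(fullData[1], 'marker.color')); // data[0].transform.marker.color.b
```

Only `data` would be serialized; the split traces can always be regenerated from it.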
### Preliminary work

Related PR, containing the initial, analogous `filter` work by @timelyportfolio: https://github.com/plotly/plotly.js/pull/859
`groupby`: https://github.com/plotly/plotly.js/blob/master/test/jasmine/assets/transforms/groupby.js
### Planned `groupby` coverage of the initial sprint
1. It would cover a positive list of attributes for `groupby`, such as `x` and `y`, but not all at once. However, the preferred solution aims for generality: other transforms (e.g. `filter`) will need a similar approach, and future array-like attributes should be covered without coupling code to transformations. Consequences: we'll have to check whether the existing `attribute` metadata is enough to tell if an attribute is array-like, or whether we need further metadata; and whether there's a programmatic way of separating array-like data that's not represented as an array at input (e.g. `colorscale`), since otherwise we need to handle such attributes one by one. We'll have to come back to this topic after a first round of work.
   Initial attributes at least: `x`, `y`, `marker.color`, `marker.size` (scatter, bar, histogram, box)
   Then `lat`, `lon` (maps), `a`, `b`, `c` (ternary), `z` (scatter3d), `error_y.array`
2. It would cover a set of (initially, non-WebGL) traces
3. First goalpost is separation by category (JS number or string)
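
A sketch of the positive-list approach from point 1, assuming attribute paths are written in dot notation. `ARRAY_ATTRS`, `getPath`, `setPath` and `pickGroup` are hypothetical names chosen for illustration; a more general solution would derive the list from attribute metadata rather than hard-coding it.

```javascript
// Assumed positive list for a first sprint, taken from the attributes named above.
const ARRAY_ATTRS = ['x', 'y', 'marker.color', 'marker.size'];

// Read a nested attribute such as 'marker.color'; returns undefined if absent.
function getPath(obj, path) {
  return path.split('.').reduce((o, k) => (o == null ? undefined : o[k]), obj);
}

// Write a nested attribute, copying containers on the way down so that
// objects shared with the input trace are never modified.
function setPath(obj, path, value) {
  const keys = path.split('.');
  const last = keys.pop();
  let target = obj;
  for (const k of keys) {
    target[k] = { ...(target[k] || {}) };
    target = target[k];
  }
  target[last] = value;
}

// Build the trace for one category; only listed attributes holding arrays are split.
function pickGroup(trace, groupby, label) {
  const out = { ...trace };
  for (const path of ARRAY_ATTRS) {
    const value = getPath(trace, path);
    if (Array.isArray(value)) {
      setPath(out, path, value.filter((_, i) => groupby[i] === label));
    }
  }
  return out;
}

const trace = {
  type: 'scatter',
  x: [1, 2, 3],
  y: [4, 5, 6],
  marker: { color: ['r', 'g', 'b'], size: 10, colorscale: [[0, 'red'], [1, 'blue']] }
};
const groupA = pickGroup(trace, ['a', 'b', 'a'], 'a');
console.log(groupA.x, groupA.marker.color); // [ 1, 3 ] [ 'r', 'b' ]
```

Because `marker.colorscale` is not on the list, it passes through unchanged even though it is an array, which is the case the metadata question above is about.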

It is expected that the trace separation (and transformations in general) is performed in the supply defaults step.
### Subsequent goal: splitting data for animations

Instead of generating `n` separate traces as described above, `plotly` would arrive at a temporal sequence of `n` frames.
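
As a sketch of the same split feeding an animation, the function below emits one frame per group instead of one trace per group. `toFrames` is a hypothetical helper, and the `{name, data}` frame shape is an assumption made for illustration rather than a settled API.

```javascript
// Hypothetical helper: one frame per unique group label, in first-seen order.
function toFrames(groupby, arrays) {
  const labels = [...new Set(groupby)];
  return labels.map((label) => {
    const frameTrace = {};
    for (const [key, values] of Object.entries(arrays)) {
      frameTrace[key] = values.filter((_, i) => groupby[i] === label);
    }
    // Assumed frame shape: a name plus the data of the single animated trace.
    return { name: String(label), data: [frameTrace] };
  });
}

const frames = toFrames([2015, 2015, 2016, 2016], {
  x: [1, 2, 1, 2],
  y: [10, 20, 30, 40]
});
console.log(frames.map((f) => f.name)); // [ '2015', '2016' ]
```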
### Possible future items:
1. Incremental recalculation (e.g. of bins, upon newly arriving data points)
2. Combine this with a subplots transform for rendering the traces into separate subplots (as small multiples plots)

