Currently histogram trace events (`click`, `hover`, `selecting`, `selected`) yield the bin number in the `pointNumber` field, and don't do anything to connect this back to the input arrays. That seems awkward for applications like crossfilter. How should we handle this?
- (1) Leave it as is, let the developer work it out.
  - pro: Small event data. Direct correspondence between the event on screen and the event data. Also inertia!
  - con: It can be cumbersome (and slow) to work out the bin contents. Also, `pointNumber` isn't actually a point number (i.e. you can't look up items in `gd.data` from this).
- (2) Report one `point` for each item in the selected bin, and don't report the bin itself at all.
  - pro: Most natural for crossfilter, as we connect directly back to the input arrays. Insofar as this is what you care about, there's no special case to deal with for histogram traces; they work just like other traces.
  - con: Event data could become huge and slow to create. What about empty bins? We wouldn't be reporting anything even though (for click and hover events) there is hover data there. Devs may still want to know which bin is being selected.
- (3) Keep the current structure, but add an array of actual point numbers to each bin data object. Perhaps bin number could move from `pointNumber` to `binNumber` and the associated input data could be an array, `pointNumbers`?
  - pro: report all the things! Doesn't have as much overhead as reporting a whole point for each item, either in object size (an array of numbers vs an array of objects with lots of keys) or in CPU (we could probably construct all these arrays of indices during `histogram.calc` with minimal overhead, and just spit them out on demand).
  - con: histogram becomes a special case for any event-handling code. That said, you won't need to switch on trace type per se but just on the structure of the event data you get. And other traces may follow: for example, looking at https://simonbjohnson.github.io/Ebola-3W-Dashboard/ I notice the pie charts are really aggregated like histograms. Separate issue on the topic forthcoming; I think this is in fact a very common use case.
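
To make the CPU argument for (3) concrete, here is a rough sketch of collecting input indices per bin in the same pass that counts them. This is not the actual `histogram.calc` code: the helper name `binWithPointNumbers`, its parameters, and the uniform-bin assumption are all made up for illustration, and `binNumber` / `pointNumbers` are only the field names proposed above.

```javascript
// Hypothetical helper, NOT plotly.js's real histogram.calc.
// Assumes uniform bins of width `size` starting at `start`.
function binWithPointNumbers(x, start, size, nBins) {
  const bins = [];
  for (let i = 0; i < nBins; i++) {
    // Every bin gets an entry, so empty bins can still be reported.
    bins.push({ binNumber: i, count: 0, pointNumbers: [] });
  }
  x.forEach((value, pointNumber) => {
    const binNumber = Math.floor((value - start) / size);
    if (binNumber < 0 || binNumber >= nBins) return; // outside the binned range
    bins[binNumber].count += 1;
    // Remember which input item landed here; this is the only extra work per item.
    bins[binNumber].pointNumbers.push(pointNumber);
  });
  return bins;
}

const bins = binWithPointNumbers([0.5, 2.1, 0.7, 1.9, 2.5], 0, 1, 4);
console.log(bins[2]); // the bin covering [2, 3)
```

The per-bin arrays hold plain integers, so the memory cost is one number per input item in total, and an event handler could hand back `bins[i].pointNumbers` without recomputing anything.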
I'm leaning (3) but would love to hear opinions particularly from @cpsievert @chriddyp @monfera
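
For reference, a sketch of what "switch on the structure of the event data" could look like in a crossfilter-style handler under option (3). The `binNumber` and `pointNumbers` fields are the proposed names, not an existing API, and `selectedInputIndices` is a hypothetical helper; `points`, `curveNumber` and `pointNumber` follow the existing event data shape.

```javascript
// Hypothetical handler helper: returns the input-array indices behind an event,
// without checking the trace type.
function selectedInputIndices(eventData) {
  const indices = [];
  for (const pt of eventData.points) {
    if (Array.isArray(pt.pointNumbers)) {
      // Aggregated item (e.g. a histogram bin under option 3): many inputs.
      for (const i of pt.pointNumbers) indices.push(i);
    } else if (typeof pt.pointNumber === 'number') {
      // Ordinary trace: one event point maps to one input item.
      indices.push(pt.pointNumber);
    }
  }
  return indices;
}

// Mocked event payloads, assuming the shape proposed in option (3).
const histogramEvent = { points: [{ curveNumber: 0, binNumber: 2, pointNumbers: [1, 4] }] };
const scatterEvent = { points: [{ curveNumber: 1, pointNumber: 7 }] };

console.log(selectedInputIndices(histogramEvent));
console.log(selectedInputIndices(scatterEvent));
```

An empty bin would arrive with `pointNumbers: []`, so the handler still sees the bin but contributes no indices to the filter.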