Adding Scatterplot Matrix to FigureFactory #417

Merged 30 commits on May 12, 2016
Commits
f72beed
Adding Scatterplot Matrix to FigureFactory
Kully Mar 3, 2016
04c8e80
Added Diagonal Choices and implemented comments from @theengineear
Kully Mar 18, 2016
f7ba1c3
Added palette theme and **kwargs
Kully Apr 15, 2016
70c916d
Updated tests
Kully Apr 20, 2016
68c4143
Update plot schema
Kully Apr 20, 2016
7d43513
Updated Version
Kully Apr 20, 2016
89d28f3
Added 'complete function call' test in test_optional (it requires a p…
Kully Apr 20, 2016
c9d8bf7
added math import to test_core and fixed wrapping syntax
Kully Apr 22, 2016
6698fcd
Removed pandas_import check in validate_dataframe that was making the…
Kully Apr 22, 2016
3ba79c2
Moved only core test to optional
Kully Apr 29, 2016
08745cd
Add initialization to dataframe, headers and index_vals in function
Kully Apr 29, 2016
14a4bfd
changed basestring to str
Kully May 2, 2016
bbd02a5
Merge branch 'master' into Scatter_Plot_Matrix
Kully May 2, 2016
a6a91c2
Cleaned up Doc String and updated version.py and CHANGELOG
Kully May 2, 2016
c3e7755
attempt at fixing kwargs_test
Kully May 2, 2016
09d3c5d
More testing...
Kully May 2, 2016
bb48595
Original kwargs_test restored
Kully May 2, 2016
1957cf2
removed .data
Kully May 3, 2016
fbfe495
Changed kwargs test to assert_dict_equal
Kully May 3, 2016
a87a07b
Another test for kwargs_test
Kully May 4, 2016
84fc4c4
stashing some changes
Kully May 5, 2016
f98da6a
Commented out current theme-index-test//created test for no index- no…
Kully May 10, 2016
f4764b4
Updating Schema
Kully May 10, 2016
e25d392
Add a test for scatterplotmatrix with index
Kully May 10, 2016
7a2dffa
Adding stashed changes
Kully May 10, 2016
b3bf96d
Switching 'histogram' to 'scatter'
Kully May 11, 2016
398b8d5
Switching 'scatter' (which worked) to 'box'
Kully May 11, 2016
7afe967
Put 'histogram' back in diag=
Kully May 11, 2016
718dd71
Fixed up String Docs
Kully May 11, 2016
78608a7
Last Tweak: Corrected formatting on expected scatter plot matrix in t…
Kully May 11, 2016
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,14 @@ This project adheres to [Semantic Versioning](http://semver.org/).

## [Unreleased]

## [1.9.11] - 2016-05-02
### Added
- The FigureFactory can now create scatter plot matrices with `.create_scatterplotmatrix`. Check it out with:
```
import plotly.tools as tls
help(tls.FigureFactory.create_scatterplotmatrix)
```

## [1.9.10] - 2016-04-27
### Updated
- Updated plotly.min.js so the offline mode is using plotly.js v1.10.0
5,258 changes: 3,693 additions & 1,565 deletions plotly/graph_reference/default-schema.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion plotly/tests/test_core/test_tools/test_figure_factory.py
@@ -1143,4 +1143,3 @@ def test_table_with_index(self):
# "FigureFactory.create_distplot requires scipy",
# tls.FigureFactory.create_distplot,
# hist_data, group_labels)

306 changes: 306 additions & 0 deletions plotly/tests/test_optional/test_figure_factory.py
@@ -8,6 +8,7 @@
from nose.tools import raises

import numpy as np
import pandas as pd


class TestDistplot(TestCase):
@@ -529,3 +530,308 @@ def test_dendrogram_colorscale(self):
        self.assert_dict_equal(dendro['data'][0], expected_dendro['data'][0])
        self.assert_dict_equal(dendro['data'][1], expected_dendro['data'][1])
        self.assert_dict_equal(dendro['data'][2], expected_dendro['data'][2])


class TestScatterPlotMatrix(NumpyTestUtilsMixin, TestCase):

    def test_dataframe_input(self):

        # check: dataframe is imported
        df = 'foo'

        pattern = (
            "Dataframe not inputed. Please use a pandas dataframe to produce "
            "a scatterplot matrix."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df)

    def test_one_column_dataframe(self):

        # check: dataframe has 1 column or less
        df = pd.DataFrame([1, 2, 3])

        pattern = (
            "Dataframe has only one column. To use the scatterplot matrix, "
            "use at least 2 columns."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df)
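
A note on the `pattern` strings used throughout these tests: `assertRaisesRegexp` reads its pattern as a regular expression and searches for it in the error message, so characters such as `(` are regex syntax rather than literal text. The sketch below shows the effect with the standard library only. The helper name `message_matches` and the sample message are illustrative and not part of this PR.

```python
import re


def message_matches(pattern, message):
    """Return True if `pattern`, read as a regex, is found in `message`."""
    return re.search(pattern, message) is not None


# Illustrative message containing regex metacharacters (hypothetical text).
message = "Colors must look like 'rgb(x,y,z)'."

# Unescaped, the parentheses form a group, so the regex looks for the
# literal text "rgbx,y,z" and fails to match the message.
unescaped = message_matches("rgb(x,y,z)", message)

# re.escape turns every metacharacter into a literal character.
escaped = message_matches(re.escape("rgb(x,y,z)"), message)
```

Wrapping a literal message in `re.escape` is one way to keep the pattern identical to the text a user would actually see.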

    def test_valid_diag_choice(self):

        # make sure that the diagonal param is valid
        df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

        self.assertRaises(PlotlyError,
                          tls.FigureFactory.create_scatterplotmatrix,
                          df, diag='foo')

    def test_forbidden_params(self):

        # check: the forbidden params of 'marker' in **kwargs
        df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

        kwargs = {'marker': {'size': 15}}

        pattern = (
            "Your kwargs dictionary cannot include the 'size', 'color' or "
            "'colorscale' key words inside the marker dict since 'size' is "
            "already an argument of the scatterplot matrix function and both "
            "'color' and 'colorscale are set internally."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, **kwargs)
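
The rule exercised by `test_forbidden_params` can be sketched without plotly. This is a minimal illustration of the check described by the error message, not the library's implementation; `FORBIDDEN_MARKER_KEYS` and `find_forbidden_marker_keys` are hypothetical names.

```python
# Keys the error message says may not appear inside kwargs['marker'].
FORBIDDEN_MARKER_KEYS = ('size', 'color', 'colorscale')


def find_forbidden_marker_keys(kwargs):
    """Return the forbidden keys present in kwargs['marker'], sorted."""
    marker = kwargs.get('marker', {})
    return sorted(key for key in marker if key in FORBIDDEN_MARKER_KEYS)


# The kwargs used in the test above: 'size' is reported.
rejected = find_forbidden_marker_keys({'marker': {'size': 15}})

# A marker dict with only 'symbol', as in the kwargs test further down,
# contains nothing forbidden.
accepted = find_forbidden_marker_keys({'marker': {'symbol': 136}})
```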

    def test_valid_index_choice(self):

        # check: index is a column name
        df = pd.DataFrame([[1, 2], [3, 4]], columns=['apple', 'pear'])

        pattern = (
            "Make sure you set the index input variable to one of the column "
            "names of your dataframe."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, index='grape')

    def test_same_data_in_dataframe_columns(self):

        # check: either all numbers or strings in each dataframe column
        df = pd.DataFrame([['a', 2], [3, 4]])

        pattern = (
            "Error in dataframe. Make sure all entries of each column are "
            "either numbers or strings."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df)

        df = pd.DataFrame([[1, 2], ['a', 4]])

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df)
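
The condition behind `test_same_data_in_dataframe_columns` is that each column holds only numbers or only strings. A rough standard-library sketch of that rule follows; the function name `column_is_homogeneous` is made up for illustration, and the library's own check may differ in detail.

```python
from numbers import Number


def column_is_homogeneous(values):
    """True if every entry is a number, or every entry is a string."""
    all_numbers = all(isinstance(v, Number) for v in values)
    all_strings = all(isinstance(v, str) for v in values)
    return all_numbers or all_strings


# Columns of the first dataframe in the test, built from the rows
# [['a', 2], [3, 4]].
first_column = ['a', 3]   # mixes a string and a number
second_column = [2, 4]    # all numbers

first_ok = column_is_homogeneous(first_column)
second_ok = column_is_homogeneous(second_column)
```

Only the first column is mixed, which is enough for the whole dataframe to be rejected in the test.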

    def test_same_data_in_index(self):

        # check: either all numbers or strings in index column
        df = pd.DataFrame([['a', 2], [3, 4]], columns=['apple', 'pear'])

        pattern = (
            "Error in indexing column. Make sure all entries of each column "
            "are all numbers or all strings."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, index='apple')

        df = pd.DataFrame([[1, 2], ['a', 4]], columns=['apple', 'pear'])

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, index='apple')

    def test_valid_palette(self):

        # check: the palette argument is in an acceptable form
        df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                          columns=['a', 'b', 'c'])

        self.assertRaisesRegexp(PlotlyError, "You must pick a valid "
                                "plotly colorscale name.",
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, use_theme=True, index='a',
                                palette='fake_scale')

        pattern = (
            "The items of 'palette' must be tripets of the form a,b,c or "
            "'rgbx,y,z' where a,b,c belong to the interval 0,1 and x,y,z "
            "belong to 0,255."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, use_theme=True, palette=1, index='c')
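
The palette message describes two accepted color forms. The sketch below assumes these are a triplet `(a, b, c)` with values in the interval 0 to 1 and a string `'rgb(x, y, z)'` with values from 0 to 255; the string form matches the `'rgb(128.0, 0.0, 38.0)'` values in the expected output of the kwargs test in this diff. `is_valid_color` is a hypothetical helper, not a plotly function.

```python
import re

# Matches 'rgb(x, y, z)' where each component is a non-negative decimal.
RGB_PATTERN = re.compile(
    r"^rgb\(\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*\)$"
)


def is_valid_color(item):
    """Accept (a, b, c) within [0, 1] or 'rgb(x, y, z)' within [0, 255]."""
    if isinstance(item, (tuple, list)) and len(item) == 3:
        return all(isinstance(v, (int, float)) and 0 <= v <= 1 for v in item)
    if isinstance(item, str):
        match = RGB_PATTERN.match(item)
        if match is None:
            return False
        return all(0 <= float(v) <= 255 for v in match.groups())
    return False


triplet_ok = is_valid_color((0.5, 0.0, 1.0))
rgb_ok = is_valid_color('rgb(128.0, 0.0, 38.0)')
out_of_range = is_valid_color('rgb(300, 0, 0)')
not_a_color = is_valid_color(1)  # the palette=1 case from the test
```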
Contributor comment:

https://github.com/plotly/plotly.py/pull/417/files#r60284049 (all of these can be made a lot more readable)


    def test_valid_endpts(self):

        # check: the endpts is a list or a tuple
        df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                          columns=['a', 'b', 'c'])

        pattern = (
            "The intervals_endpts argument must be a list or tuple of a "
            "sequence of increasing numbers."
        )

        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, use_theme=True, index='a',
                                palette='Blues', endpts='foo')

        # check: the endpts are a list of numbers
        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, use_theme=True, index='a',
                                palette='Blues', endpts=['a'])

        # check: endpts is a list of INCREASING numbers
        self.assertRaisesRegexp(PlotlyError, pattern,
                                tls.FigureFactory.create_scatterplotmatrix,
                                df, use_theme=True, index='a',
                                palette='Blues', endpts=[2, 1])
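
The three `endpts` cases above map onto three separate checks: container type, element type, and ordering. A minimal sketch under the assumption that "increasing" means strictly increasing; `endpts_are_valid` is an illustrative name, not the library's function.

```python
from numbers import Number


def endpts_are_valid(endpts):
    """True if endpts is a list or tuple of strictly increasing numbers."""
    if not isinstance(endpts, (list, tuple)):
        return False
    if not all(isinstance(v, Number) for v in endpts):
        return False
    # Compare each endpoint with its successor.
    return all(a < b for a, b in zip(endpts, endpts[1:]))


# The three invalid inputs from the test, then a valid one.
results = [
    endpts_are_valid('foo'),      # not a list or tuple
    endpts_are_valid(['a']),      # not numbers
    endpts_are_valid([2, 1]),     # not increasing
    endpts_are_valid([-10, -1]),  # valid
]
```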

    def test_scatter_plot_matrix(self):

        # check if test scatter plot matrix without index or theme matches
        # with the expected output
        df = pd.DataFrame([[2, 'Apple'], [6, 'Pear'],
                           [-15, 'Apple'], [5, 'Pear'],
                           [-2, 'Apple'], [0, 'Apple']],
                          columns=['Numbers', 'Fruit'])

        test_scatter_plot_matrix = tls.FigureFactory.create_scatterplotmatrix(
            df, diag='scatter', height=1000, width=1000, size=13,
            title='Scatterplot Matrix', use_theme=False
        )

        exp_scatter_plot_matrix = {
            'data': [{'marker': {'size': 13},
                      'mode': 'markers',
                      'showlegend': False,
                      'type': 'scatter',
                      'x': [2, 6, -15, 5, -2, 0],
                      'xaxis': 'x1',
                      'y': [2, 6, -15, 5, -2, 0],
                      'yaxis': 'y1'},
                     {'marker': {'size': 13},
                      'mode': 'markers',
                      'showlegend': False,
                      'type': 'scatter',
                      'x': ['Apple',
                            'Pear',
                            'Apple',
                            'Pear',
                            'Apple',
                            'Apple'],
                      'xaxis': 'x2',
                      'y': [2, 6, -15, 5, -2, 0],
                      'yaxis': 'y2'},
                     {'marker': {'size': 13},
                      'mode': 'markers',
                      'showlegend': False,
                      'type': 'scatter',
                      'x': [2, 6, -15, 5, -2, 0],
                      'xaxis': 'x3',
                      'y': ['Apple',
                            'Pear',
                            'Apple',
                            'Pear',
                            'Apple',
                            'Apple'],
                      'yaxis': 'y3'},
                     {'marker': {'size': 13},
                      'mode': 'markers',
                      'showlegend': False,
                      'type': 'scatter',
                      'x': ['Apple',
                            'Pear',
                            'Apple',
                            'Pear',
                            'Apple',
                            'Apple'],
                      'xaxis': 'x4',
                      'y': ['Apple', 'Pear', 'Apple', 'Pear', 'Apple', 'Apple'],
                      'yaxis': 'y4'}],
            'layout': {'height': 1000,
                       'showlegend': True,
                       'title': 'Scatterplot Matrix',
                       'width': 1000,
                       'xaxis1': {'anchor': 'y1',
                                  'domain': [0.0, 0.45]},
                       'xaxis2': {'anchor': 'y2',
                                  'domain': [0.55, 1.0]},
                       'xaxis3': {'anchor': 'y3',
                                  'domain': [0.0, 0.45], 'title': 'Numbers'},
                       'xaxis4': {'anchor': 'y4',
                                  'domain': [0.55, 1.0], 'title': 'Fruit'},
                       'yaxis1': {'anchor': 'x1',
                                  'domain': [0.575, 1.0], 'title': 'Numbers'},
                       'yaxis2': {'anchor': 'x2',
                                  'domain': [0.575, 1.0]},
                       'yaxis3': {'anchor': 'x3',
                                  'domain': [0.0, 0.425], 'title': 'Fruit'},
                       'yaxis4': {'anchor': 'x4',
                                  'domain': [0.0, 0.425]}}
        }

        self.assert_dict_equal(test_scatter_plot_matrix['data'][0],
                               exp_scatter_plot_matrix['data'][0])

        self.assert_dict_equal(test_scatter_plot_matrix['data'][1],
                               exp_scatter_plot_matrix['data'][1])

        self.assert_dict_equal(test_scatter_plot_matrix['layout'],
                               exp_scatter_plot_matrix['layout'])
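
The axis `domain` values in the expected layout follow from splitting the unit interval into equal cells with a gap between them. The sketch below reproduces the numbers for the 2x2 grid. The gap sizes 0.1 (horizontal) and 0.15 (vertical) are inferred from the expected values in this test, not read from the library source, and `subplot_domains` is a hypothetical helper.

```python
def subplot_domains(count, spacing):
    """Split [0, 1] into `count` equal domains separated by `spacing`."""
    size = (1.0 - spacing * (count - 1)) / count
    domains = []
    for i in range(count):
        start = i * (size + spacing)
        # Round to hide floating-point noise such as 0.5500000000000001.
        domains.append([round(start, 3), round(start + size, 3)])
    return domains


x_domains = subplot_domains(2, 0.1)    # [[0.0, 0.45], [0.55, 1.0]]
y_domains = subplot_domains(2, 0.15)   # [[0.0, 0.425], [0.575, 1.0]]
```

In the expected layout the top row (`yaxis1`, `yaxis2`) takes the upper domain `[0.575, 1.0]`, so rows are numbered from the top while the domains are computed from the bottom.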

    def test_scatter_plot_matrix_kwargs(self):
Contributor comment:

💯 this is great! thanks!


        # check if test scatter plot matrix matches with
        # the expected output
        df = pd.DataFrame([[2, 'Apple'], [6, 'Pear'],
                           [-15, 'Apple'], [5, 'Pear'],
                           [-2, 'Apple'], [0, 'Apple']],
                          columns=['Numbers', 'Fruit'])

        test_scatter_plot_matrix = tls.FigureFactory.create_scatterplotmatrix(
            df, index='Fruit', endpts=[-10, -1], diag='histogram',
            height=1000, width=1000, size=13, title='Scatterplot Matrix',
            use_theme=True, palette='YlOrRd', marker=dict(symbol=136)
        )

        exp_scatter_plot_matrix = {
            'data': [{'marker': {'color': 'rgb(128.0, 0.0, 38.0)'},
                      'showlegend': False,
                      'type': 'histogram',
                      'x': [2, -15, -2, 0],
                      'xaxis': 'x1',
                      'yaxis': 'y1'},
                     {'marker': {'color': 'rgb(255.0, 255.0, 204.0)'},
                      'showlegend': False,
                      'type': 'histogram',
                      'x': [6, 5],
                      'xaxis': 'x1',
                      'yaxis': 'y1'}],
            'layout': {'barmode': 'stack',
                       'height': 1000,
                       'showlegend': True,
                       'title': 'Scatterplot Matrix',
                       'width': 1000,
                       'xaxis1': {'anchor': 'y1',
                                  'domain': [0.0, 1.0],
                                  'title': 'Numbers'},
                       'yaxis1': {'anchor': 'x1',
                                  'domain': [0.0, 1.0],
                                  'title': 'Numbers'}}
        }

        self.assert_dict_equal(test_scatter_plot_matrix['data'][0],
                               exp_scatter_plot_matrix['data'][0])

        self.assert_dict_equal(test_scatter_plot_matrix['data'][1],
                               exp_scatter_plot_matrix['data'][1])

        self.assert_dict_equal(test_scatter_plot_matrix['layout'],
                               exp_scatter_plot_matrix['layout'])
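
In the expected output above, the two histogram traces hold the `Numbers` values split by `Fruit`: `[2, -15, -2, 0]` for Apple and `[6, 5]` for Pear. The following sketch shows that grouping with plain Python, as an illustration of what the expected data implies rather than of how the library does it; `group_by_index` is a hypothetical helper.

```python
def group_by_index(rows, value_pos, index_pos):
    """Collect row[value_pos] under each distinct row[index_pos] value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[index_pos], []).append(row[value_pos])
    return groups


# Same rows as the dataframe in the test: [Numbers, Fruit].
rows = [[2, 'Apple'], [6, 'Pear'],
        [-15, 'Apple'], [5, 'Pear'],
        [-2, 'Apple'], [0, 'Apple']]

numbers_by_fruit = group_by_index(rows, value_pos=0, index_pos=1)
# {'Apple': [2, -15, -2, 0], 'Pear': [6, 5]}
```

Judging by the expected traces, the `endpts=[-10, -1]` argument does not affect the grouping here, since the index column holds strings and the split is by fruit name.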