ENH: Styler.bar extended to allow centering about the mean, value or callable #42301

Merged
merged 14 commits into from
Jul 9, 2021
22 changes: 13 additions & 9 deletions doc/source/user_guide/style.ipynb
@@ -1190,9 +1190,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In version 0.20.0 the ability to customize the bar chart further was given. You can now have the `df.style.bar` be centered on zero or midpoint value (in addition to the already existing way of having the min value at the left side of the cell), and you can pass a list of `[color_negative, color_positive]`.\n",
"Additional keyword arguments give more control on centering and positioning, and you can pass a list of `[color_negative, color_positive]` to highlight lower and higher values.\n",
"\n",
"Here's how you can change the above with the new `align='mid'` option:"
"Here's how you can change the above with the new `align` option, combined with setting `vmin` and `vmax` limits, the `width` of the figure, and the underlying CSS `props` of cells, leaving space to display both the text and the bars:"
]
},
{
@@ -1201,7 +1201,8 @@
"metadata": {},
"outputs": [],
"source": [
"df2.style.bar(subset=['A', 'B'], align='mid', color=['#d65f5f', '#5fba7d'])"
"df2.style.bar(align=0, vmin=-2.5, vmax=2.5, color=['#d65f5f', '#5fba7d'],\n",
" width=60, props=\"width: 120px; border-right: 1px solid black;\").format('{:.3f}', na_rep=\"\")"
]
},
{
@@ -1225,28 +1226,31 @@
"\n",
"# Test series\n",
"test1 = pd.Series([-100,-60,-30,-20], name='All Negative')\n",
"test2 = pd.Series([10,20,50,100], name='All Positive')\n",
"test3 = pd.Series([-10,-5,0,90], name='Both Pos and Neg')\n",
"test2 = pd.Series([-10,-5,0,90], name='Both Pos and Neg')\n",
"test3 = pd.Series([10,20,50,100], name='All Positive')\n",
"test4 = pd.Series([100, 103, 101, 102], name='Large Positive')\n",
"\n",
"\n",
"head = \"\"\"\n",
"<table>\n",
" <thead>\n",
" <th>Align</th>\n",
" <th>All Negative</th>\n",
" <th>All Positive</th>\n",
" <th>Both Neg and Pos</th>\n",
" <th>All Positive</th>\n",
" <th>Large Positive</th>\n",
" </thead>\n",
" <tbody>\n",
"\n",
"\"\"\"\n",
"\n",
"aligns = ['left','zero','mid']\n",
"aligns = ['left', 'right', 'zero', 'mid', 'mean', 99]\n",
"for align in aligns:\n",
" row = \"<tr><th>{}</th>\".format(align)\n",
" for series in [test1,test2,test3]:\n",
" for series in [test1,test2,test3, test4]:\n",
" s = series.copy()\n",
" s.name=''\n",
" row += \"<td>{}</td>\".format(s.to_frame().style.bar(align=align, \n",
" row += \"<td>{}</td>\".format(s.to_frame().style.hide_index().bar(align=align, \n",
" color=['#d65f5f', '#5fba7d'], \n",
" width=100).render()) #testn['width']\n",
" row += '</tr>'\n",
Expand Down
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
@@ -30,6 +30,7 @@ enhancement2
Other enhancements
^^^^^^^^^^^^^^^^^^
- :meth:`Series.sample`, :meth:`DataFrame.sample`, and :meth:`.GroupBy.sample` now accept a ``np.random.Generator`` as input to ``random_state``. A generator will be more performant, especially with ``replace=False`` (:issue:`38100`)
- Additional options added to :meth:`.Styler.bar` to control alignment and display (:issue:`26070`)
- :meth:`Series.ewm`, :meth:`DataFrame.ewm`, now support a ``method`` argument with a ``'table'`` option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`Window Overview <window.overview>` for performance and functional benefits (:issue:`42273`)
-

267 changes: 192 additions & 75 deletions pandas/io/formats/style.py
@@ -2042,75 +2042,16 @@ def set_properties(self, subset: Subset | None = None, **kwargs) -> Styler:
values = "".join(f"{p}: {v};" for p, v in kwargs.items())
return self.applymap(lambda x: values, subset=subset)

@staticmethod
def _bar(
s,
align: str,
colors: list[str],
width: float = 100,
vmin: float | None = None,
vmax: float | None = None,
):
"""
Draw bar chart in dataframe cells.
"""
# Get input value range.
smin = np.nanmin(s.to_numpy()) if vmin is None else vmin
smax = np.nanmax(s.to_numpy()) if vmax is None else vmax
if align == "mid":
smin = min(0, smin)
smax = max(0, smax)
elif align == "zero":
# For "zero" mode, we want the range to be symmetrical around zero.
smax = max(abs(smin), abs(smax))
smin = -smax
# Transform to percent-range of linear-gradient
normed = width * (s.to_numpy(dtype=float) - smin) / (smax - smin + 1e-12)
zero = -width * smin / (smax - smin + 1e-12)

def css_bar(start: float, end: float, color: str) -> str:
"""
Generate CSS code to draw a bar from start to end.
"""
css = "width: 10em; height: 80%;"
if end > start:
css += "background: linear-gradient(90deg,"
if start > 0:
css += f" transparent {start:.1f}%, {color} {start:.1f}%, "
e = min(end, width)
css += f"{color} {e:.1f}%, transparent {e:.1f}%)"
return css

def css(x):
if pd.isna(x):
return ""

# avoid deprecated indexing `colors[x > zero]`
color = colors[1] if x > zero else colors[0]

if align == "left":
return css_bar(0, x, color)
else:
return css_bar(min(x, zero), max(x, zero), color)

if s.ndim == 1:
return [css(x) for x in normed]
else:
return DataFrame(
[[css(x) for x in row] for row in normed],
index=s.index,
columns=s.columns,
)

def bar(
self,
subset: Subset | None = None,
axis: Axis | None = 0,
color="#d65f5f",
width: float = 100,
align: str = "left",
align: str | float | int | Callable = "mid",
vmin: float | None = None,
vmax: float | None = None,
props: str = "width: 10em;",
) -> Styler:
"""
Draw bar chart in the cell backgrounds.
@@ -2131,16 +2072,26 @@ def bar(
first element is the color_negative and the second is the
color_positive (eg: ['#d65f5f', '#5fba7d']).
width : float, default 100
A number between 0 or 100. The largest value will cover `width`
percent of the cell's width.
align : {'left', 'zero',' mid'}, default 'left'
How to align the bars with the cells.

- 'left' : the min value starts at the left of the cell.
The percentage of the cell, measured from the left, in which to draw the
bars, in [0, 100].
align : str, int, float, callable, default 'mid'
How to align the bars within the cells relative to a width adjusted center.
If string must be one of:

- 'left' : bars are drawn rightwards from the minimum data value.
- 'right' : bars are drawn leftwards from the maximum data value.
- 'zero' : a value of zero is located at the center of the cell.
- 'mid' : the center of the cell is at (max-min)/2, or
if values are all negative (positive) the zero is aligned
at the right (left) of the cell.
- 'mid' : a value of (max+min)/2 is located at the center of the cell,
or if all values are negative (positive) the zero is
aligned at the right (left) of the cell.
- 'mean' : the mean value of the data is located at the center of the cell.

If a float or integer is given this will indicate the center of the cell.

If a callable, it should take a 1d or 2d array and return a scalar.

.. versionchanged:: 1.4.0

vmin : float, optional
Minimum bar value, defining the left hand limit
of the bar drawing range, lower values are clipped to `vmin`.
@@ -2149,14 +2100,16 @@ def bar(
Maximum bar value, defining the right hand limit
of the bar drawing range, higher values are clipped to `vmax`.
When None (default): the maximum value of the data will be used.
props : str, optional
The base CSS of the cell that is extended to add the bar chart. Defaults to
`"width: 10em;"`

.. versionadded:: 1.4.0

Returns
-------
self : Styler
"""
if align not in ("left", "zero", "mid"):
raise ValueError("`align` must be one of {'left', 'zero',' mid'}")

if not (is_list_like(color)):
color = [color, color]
elif len(color) == 1:
@@ -2172,14 +2125,15 @@ def bar(
subset = self.data.select_dtypes(include=np.number).columns

self.apply(
self._bar,
_bar,
subset=subset,
axis=axis,
align=align,
colors=color,
width=width,
width=width / 100,
vmin=vmin,
vmax=vmax,
base_css=props,
)

return self
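The string alignment modes described in the docstring above can be read as fractions of the cell width. The sketch below is a simplified, standalone restatement of the `css_calc` arithmetic added later in this diff, not the pandas implementation itself. The name `bar_span` is hypothetical, and the window of -10 to 90 is taken from the notebook's 'Both Pos and Neg' test series.

```python
def bar_span(x, left, right, align):
    """Return (start, end) fractions in [0, 1] of the cell covered by the bar.

    Hypothetical helper mirroring the branch structure of ``css_calc``.
    """
    x = min(max(x, left), right)  # clip the value to the drawing window

    if align == "left":
        # bars grow rightwards from the window minimum
        return 0.0, (x - left) / (right - left)
    if align == "right":
        # bars grow leftwards from the window maximum
        return (x - left) / (right - left), 1.0

    z_frac = 0.5  # position of zero inside the cell
    if align == "zero":
        # make the window symmetric so zero lands at the cell center
        limit = max(abs(left), abs(right))
        left, right = -limit, limit
    elif align == "mid":
        # keep the window as is; zero lands wherever it falls in [left, right]
        mid = (left + right) / 2
        z_frac = -mid / (right - left) + 0.5 if mid < 0 else -left / (right - left)
    else:
        raise ValueError(f"unsupported align: {align!r}")

    frac = (x - left) / (right - left)
    return (frac, z_frac) if x < 0 else (z_frac, frac)


for mode in ("left", "right", "zero", "mid"):
    print(mode, [bar_span(v, -10, 90, mode) for v in (-10, -5, 0, 90)])
```

With this window, `"mid"` places zero at 10% of the cell, so the value -10 spans `(0.0, 0.1)` and 90 spans `(0.1, 1.0)`, whereas `"zero"` always draws from the 50% mark.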
@@ -2830,3 +2784,166 @@ def _highlight_between(
else np.full(data.shape, True, dtype=bool)
)
return np.where(g_left & l_right, props, "")


def _bar(
Review comment (Contributor):

Could consider moving the style* modules to a separate directory, e.g. pandas/io/formats/style/*, which would give the flexibility to, for example, move some utilities into their own modules (obviously something for the future).

data: FrameOrSeries,
align: str | float | int | Callable,
colors: list[str],
width: float,
vmin: float | None,
vmax: float | None,
base_css: str,
):
"""
Draw bar chart in data cells using HTML CSS linear gradient.

Parameters
----------
data : Series or DataFrame
Underlying subset of Styler data on which operations are performed.
align : str in {"left", "right", "mid", "zero", "mean"}, int, float, callable
Method for how bars are structured or scalar value of centre point.
colors : list-like of str
Two listed colors as string in valid CSS.
width : float in [0,1]
The fraction of the cell, measured from the left, in which drawn bars will reside.
vmin : float, optional
Overwrite the minimum value of the window.
vmax : float, optional
Overwrite the maximum value of the window.
base_css : str
Additional CSS that is included in the cell before bars are drawn.
"""

def css_bar(start: float, end: float, color: str) -> str:
"""
Generate CSS code to draw a bar from start to end in a table cell.

Uses linear-gradient.

Parameters
----------
start : float
Relative positional start of bar coloring in [0,1]
end : float
Relative positional end of the bar coloring in [0,1]
color : str
CSS valid color to apply.

Returns
-------
str : The CSS applicable to the cell.

Notes
-----
Uses ``base_css`` from outer scope.
"""
cell_css = base_css
if end > start:
cell_css += "background: linear-gradient(90deg,"
if start > 0:
cell_css += f" transparent {start*100:.1f}%, {color} {start*100:.1f}%,"
cell_css += f" {color} {end*100:.1f}%, transparent {end*100:.1f}%)"
return cell_css

def css_calc(x, left: float, right: float, align: str):
"""
Return the correct CSS for bar placement based on calculated values.

Parameters
----------
x : float
Value which determines the bar placement.
left : float
Value marking the left side of calculation, usually minimum of data.
right : float
Value marking the right side of the calculation, usually maximum of data
(left < right).
align : {"left", "right", "zero", "mid"}
How the bars will be positioned.
"left", "right", "zero" can be used with any values for ``left``, ``right``.
"mid" can only be used where ``left <= 0`` and ``right >= 0``.
"zero" is used to specify a center when all values ``x``, ``left``,
``right`` are translated, e.g. by a mean or median.

Returns
-------
str : Resultant CSS with linear gradient.

Notes
-----
Uses ``colors`` and ``width`` from outer scope.
"""
if pd.isna(x):
return base_css

color = colors[0] if x < 0 else colors[1]
x = left if x < left else x
x = right if x > right else x # trim data if outside of the window

start: float = 0
end: float = 1

if align == "left":
# all proportions are measured from the left side between left and right
end = (x - left) / (right - left)

elif align == "right":
# all proportions are measured from the right side between left and right
start = (x - left) / (right - left)

else:
z_frac: float = 0.5 # location of zero based on the left-right range
if align == "zero":
# all proportions are measured from the center at zero
limit: float = max(abs(left), abs(right))
left, right = -limit, limit
elif align == "mid":
# bars drawn from zero either leftwards or rightwards with center at mid
mid: float = (left + right) / 2
z_frac = (
-mid / (right - left) + 0.5 if mid < 0 else -left / (right - left)
)

if x < 0:
start, end = (x - left) / (right - left), z_frac
else:
start, end = z_frac, (x - left) / (right - left)

return css_bar(start * width, end * width, color)

values = data.to_numpy()
left = np.nanmin(values) if vmin is None else vmin
right = np.nanmax(values) if vmax is None else vmax
z: float = 0 # adjustment to translate data

if align == "mid":
if left >= 0: # "mid" is documented to act as "left" if all values positive
align, left = "left", 0 if vmin is None else vmin
elif right <= 0: # "mid" is documented to act as "right" if all values negative
align, right = "right", 0 if vmax is None else vmax
elif align == "mean":
z, align = np.nanmean(values), "zero"
elif callable(align):
z, align = align(values), "zero"
elif isinstance(align, (float, int)):
z, align = float(align), "zero"
elif not (align == "left" or align == "right" or align == "zero"):
raise ValueError(
"`align` should be in {'left', 'right', 'mid', 'mean', 'zero'} or be a "
"value defining the center line or a callable that returns a float"
)

assert isinstance(align, str) # mypy: should now be in [left, right, mid, zero]
if data.ndim == 1:
return [css_calc(x - z, left - z, right - z, align) for x in values]
else:
return DataFrame(
[
[css_calc(x - z, left - z, right - z, align) for x in row]
for row in values
],
index=data.index,
columns=data.columns,
)
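The final dispatch in `_bar` reduces `'mean'`, numeric and callable alignments to the `"zero"` mode by subtracting a center `z` from the data and the window limits. The sketch below restates that idea in plain Python under simplifying assumptions (no NaN handling, no `vmin`/`vmax`, full cell width). The names `resolve_center`, `zero_span` and `gradient` are hypothetical and exist only for illustration; the data is the notebook's 'Large Positive' series.

```python
from statistics import mean, median


def resolve_center(align, values):
    """Return (z, mode): translate the data by z, then draw with string mode."""
    if align == "mean":
        return mean(values), "zero"
    if callable(align):
        return align(values), "zero"
    if isinstance(align, (int, float)):
        return float(align), "zero"
    if align in ("left", "right", "zero", "mid"):
        return 0.0, align
    raise ValueError("unsupported align")


def zero_span(x, left, right):
    """(start, end) cell fractions for the 'zero' mode: symmetric window."""
    limit = max(abs(left), abs(right))
    frac = (x + limit) / (2 * limit)
    return (frac, 0.5) if x < 0 else (0.5, frac)


def gradient(start, end, color):
    """Build the linear-gradient CSS for a bar from start to end."""
    css = ""
    if end > start:
        css += "background: linear-gradient(90deg,"
        if start > 0:
            css += f" transparent {start*100:.1f}%, {color} {start*100:.1f}%,"
        css += f" {color} {end*100:.1f}%, transparent {end*100:.1f}%)"
    return css


values = [100, 103, 101, 102]           # 'Large Positive' test series
z, mode = resolve_center(99, values)    # numeric align: center on 99
left, right = min(values) - z, max(values) - z
start, end = zero_span(values[0] - z, left, right)
css = gradient(start, end, "#5fba7d")
print(mode, (start, end))
print(css)
```

Centering on 99 shifts the data to 1, 4, 2, 3, so the first value fills the cell from 50% to 62.5%. Passing `median` or `"mean"` instead of `99` to `resolve_center` yields a center of 101.5 for this series.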