DataFrame.plot() produces incorrect legend labels when plotting multiple series on the same axis #18222

Closed
@scottlawsonbc

Code Sample, a copy-pastable example if possible

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=['x', 'r', 'g', 'b'])
fig, ax = plt.subplots(nrows=1, ncols=3)
# Left plot
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[0])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[0])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[0])
# Center plot
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[1])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[1])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[1])
# Right plot
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[2])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[2])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[2])
# Produces correct output when uncommented:
# ax[0].legend()
# ax[1].legend()
# ax[2].legend()
plt.show()

Problem description

Note: Possibly related to #14958, #17939, and #14563; however, this issue focuses on how the behaviour depends on the order in which df.plot() is called.

  • In certain situations, df.plot() may generate incorrect legend labels (see example)
  • Incorrect legend labels may appear when df.plot() plots multiple series on the same axis
  • Plotting only one series per axis always appears to produce the correct legend label
  • When multiple series are plotted on the same axis:
    • Only the last df.plot() call will produce the correct legend label
    • Calling ax.legend() after the final df.plot() will produce the correct legend entry
  • Calling df.plot() with markerstyle='' always appears to produce the correct legend label for that series, regardless of the order in which df.plot() is called or the number of other series on the same axis
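
The workaround noted above (calling ax.legend() after the final df.plot()) can be sketched as follows. This is a minimal, hedged example rather than a fix: it assumes the non-interactive "Agg" backend so it runs without a display, and the `labels` variable is introduced here only for illustration. It rebuilds the legend from the lines on the axis and reads back the label text.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=['x', 'r', 'g', 'b'])
fig, ax = plt.subplots()

# Same three calls as the left plot in the sample above
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax)
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax)
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax)

# Workaround: rebuild the legend once, after the last df.plot() call
legend = ax.legend()

# Read back the legend text so it can be compared with the column names
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```

Comparing `labels` against the plotted column names gives a quick check of the label text without inspecting the figure by eye; it does not check how the legend entries are drawn.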

Expected Output

This is the incorrect output produced by the sample code. Note that in each plot, only the last series to be plotted has a correct legend label.
[image: incorrect output]
This is the expected output:
[image: expected output]

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.10.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: None
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
