Hello all,
I need to use plotly as the backend of a microservice that generates charts dynamically.
Unfortunately, after a little benchmarking, I found that plotly.express
is very slow (around 5 seconds to generate a chart from a 500-line dataset).
Here is the script I use to generate a scatter matrix:
import sys
import os
import traceback
import json
import time

# Raw string so the backslashes in the Windows path are not read as escapes.
sys.path.append(r'c:\statwolf\python\packages\Lib\site-packages')

# Request parameters: data file, color column, and scatter matrix dimensions.
input = json.loads('{"file":"/tmp/data.tsv","color":"club_country","dimensions":["nolo","tolo","yolo"]}')

def action():
    def run():
        # Imports are inside run(), so the first iteration also pays the import cost.
        import plotly.express as px
        from pandas import read_csv
        color = None if input['color'] == "" else input['color']
        first = time.time()
        d = read_csv(input['file'], sep='\t')
        second = time.time()
        fig = px.scatter_matrix(d, dimensions=input['dimensions'], color=color)
        third = time.time()
        j = fig.to_json()
        fourth = time.time()
        print('read: ' + str(second - first))
        print('plot: ' + str(third - second))
        print('json: ' + str(fourth - third))
        return j

    # Run three times to compare the first (cold) iteration with the later ones.
    for i in range(0, 3):
        start = time.time()
        result = run()
        end = time.time()
        print('iteration: ' + str(i) + '\ntime: ' + str(end - start))
    return result

result = None
try:
    result = { 'outcome': action() }
except Exception as e:
    traceback.print_exc()
    result = { 'error': str(e) }

# Write the outcome (or the error) next to this script.
resultDir = os.path.dirname(os.path.realpath(__file__))
with open(resultDir + '/result.json', 'w') as resultFile:
    json.dump(result, resultFile)
This is the dataset:
https://www.dropbox.com/s/cm9i3pfv10exbba/data.tsv?dl=1
And this is the timing report:
https://www.dropbox.com/s/l2x3jqzea4i4xqw/report.txt?dl=1
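One caveat about the measurement: in the script, the plotly.express and pandas imports happen inside run(), so the first iteration also includes the import time. A rough sketch of how the stages could be timed separately is below. The helper names `stage` and `benchmark` are hypothetical (they are not part of plotly or pandas), and the arguments are assumed to be the same as in the script.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage(name, timings):
    """Record the wall-clock duration of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)


def benchmark(path, dimensions, color=None, repeats=3):
    """Time import, read, plot and JSON serialization as separate stages.

    Hypothetical helper: it assumes plotly and pandas are installed and
    that `path` points to a tab-separated file, as in the script.
    """
    timings = {}
    # The import is timed once, outside the loop, so it does not
    # inflate the first iteration.
    with stage('import', timings):
        import plotly.express as px
        from pandas import read_csv
    for _ in range(repeats):
        with stage('read', timings):
            d = read_csv(path, sep='\t')
        with stage('plot', timings):
            fig = px.scatter_matrix(d, dimensions=dimensions, color=color)
        with stage('json', timings):
            fig.to_json()
    return timings
```

Calling `benchmark('/tmp/data.tsv', ['nolo', 'tolo', 'yolo'], color='club_country')` would return a dict with one list of durations per stage, which makes it easier to see whether the time goes into the import, the figure construction, or the serialization.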
Now:
- Is there any tweak I can apply to improve performance?
- Do you plan to focus on speed in upcoming releases?