WIP: graph representation of model #1683
Wow, great! I think the layout needs to be prettier, and variable parameters associated with one random variable should be detected. |
I agree with ferrine -- this is a really cool PR! I would prefer the API to be decoupled from the I suspect you could derive all the necessary information from the model (including, probably, the variable parameter names), rather than updating state, but I haven't tried this myself yet. |
Thanks @ferrine and @ColCarroll, I can extract the graph building logic entirely into its own function that takes the model as an argument and exposes all of the Also tests, obviously. |
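A minimal sketch of the decoupled approach discussed above, assuming a toy model object rather than PyMC3's actual classes. `ToyVar`, `ToyModel`, and `build_graph` are hypothetical names, and a plain dict of edges stands in for a networkx.DiGraph so the sketch has no dependencies. Each edge records which parameter of the child the parent feeds, which is one way to detect the variable parameters associated with a random variable:

```python
from dataclasses import dataclass, field


@dataclass
class ToyVar:
    """Hypothetical stand-in for a random variable; not a PyMC3 class."""
    name: str
    # Parameter name -> value; a value may be a constant or another ToyVar.
    params: dict = field(default_factory=dict)


@dataclass
class ToyModel:
    """Hypothetical stand-in for a model that only stores its variables."""
    variables: list = field(default_factory=list)


def build_graph(model):
    """Return {parent_name: {child_name: param_name}} without mutating the model."""
    edges = {var.name: {} for var in model.variables}
    for var in model.variables:
        for param_name, value in var.params.items():
            if isinstance(value, ToyVar):
                # Edge runs from the parent variable to the variable it parameterizes.
                edges.setdefault(value.name, {})[var.name] = param_name
    return edges


mu = ToyVar("mu", {"mu": 0.0, "sd": 5.0})
tau = ToyVar("tau", {"beta": 5.0})
theta = ToyVar("theta", {"mu": mu, "sd": tau})
model = ToyModel([mu, tau, theta])
graph = build_graph(model)
```

Because the function only reads from the model, no graph state has to be kept in sync as variables are added.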
@stevenjkern Does the plot this creates still look like the one above? |
@twiecki, I've added the non-pygraphviz-dependent layouts as optional layouts for the matplotlib plotting that networkx exposes (circular, shell, spectral, spring, force-directed, random). An example below is from the lasso_missing.py example drawn with a circular layout. The example above was a spring layout, for reference. But the plotting function currently makes the drawing of the plot via matplotlib optional and returns the graph object, which can be passed on to whatever networkx-supporting plotting utility the user desires, e.g. Graphcanvas*, Bokeh. I am planning on getting back to this PR later this week to cover distributions that I'd missed and add tests. * Full disclosure: I am the maintainer of Graphcanvas. |
Interesting, I've been searching for a way to get the pymc2 graphing ability to work in pymc3, functionality like this would be much appreciated for things like Bayesian belief networks (like BayesFusion's Genie). I actually just saw something relevant to this on r/python, but with the difference that the Something like this possible/useful? |
@stevenjkern It should be relatively easy to add a method that sets the "positions" of the nodes so that each level of the hierarchy appears in order w.r.t the other levels. Should make reading these graphs much easier (more like a flow chart) instead of the defaults like circular. Would that be helpful? I'll try to mock up an example in the next few days here. |
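One way the level-by-level positioning described above might be sketched, using only the standard library. `layered_positions` is a hypothetical helper, not part of this PR; it assumes the graph is acyclic and is given as a dict mapping each parent name to its children:

```python
def layered_positions(edges):
    """Map each node to (x, y), with y set by the node's depth in the DAG.

    `edges` maps a parent name to an iterable of child names.
    """
    nodes = set(edges)
    for children in edges.values():
        nodes.update(children)

    parents = {n: set() for n in nodes}
    for parent, children in edges.items():
        for child in children:
            parents[child].add(parent)

    depth = {}

    def depth_of(node):
        # Longest path from any root; nodes without parents sit at depth 0.
        if node not in depth:
            depth[node] = 1 + max((depth_of(p) for p in parents[node]), default=-1)
        return depth[node]

    levels = {}
    for node in sorted(nodes):
        levels.setdefault(depth_of(node), []).append(node)

    pos = {}
    for level, members in levels.items():
        for i, node in enumerate(members):
            # Spread each level evenly across (0, 1); deeper levels are drawn lower.
            pos[node] = ((i + 1) / (len(members) + 1), -level)
    return pos


pos = layered_positions({"mu": ["theta"], "tau": ["theta"], "theta": ["obs"]})
```

The result is a node-to-coordinates dict, which is the shape that networkx's drawing functions accept through their `pos` argument.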
I believe Dask uses dot and GraphViz for layouts. We can add a tree layout as a separate function without much trouble. I just didn't expose networkx's tree layout because it is dependent upon pygraphviz and GraphViz and I didn't want to have to add another requirement to the package beyond networkx. |
Dask does use GraphViz, and it's a nightmare. The Python bindings are very poorly maintained, and in fact they have stopped installing it with Dask on Conda (or are in the process of doing so). The graphs are pretty when they work, and there desperately needs to be a project to replace it. In the meantime, I am not sure what the best replacement is. |
I have run into trouble with pydot in the past, but I noticed that version 1.2.3 on PyPI supports py2/3 and is fairly recent. |
@fonnesbeck You're right, yesterday I tried recreating my old pygraphviz setup on a fresh conda install, and it was terrible. The graphviz install itself stopped adding PATH vars to the registry for Windows users, so that in and of itself is an extra layer of difficulty if you were to use it as a dependency. Any solution will probably need to either rely solely on networkX, or maybe output a tikz object of some kind to be rendered in a markdown cell (for jupyter users, anyway). I'm trying my hand at a pure NX solution based somewhat off of this answer on Stack Overflow. |
If all we really need is a tree layout, we probably don't need to bring in any extra libraries that we aren't enthusiastic about. In Graphcanvas there is a non-pygraphviz implementation of a moderately attractive tree layout. I can't speak to its code quality as it predates my involvement in the project by several years, but it works, it's fairly speedy, and something like it could be implemented here without much trouble. |
I spent some time putting together a tree layout using just networkx. How does this look? The model is from the lasso_block_update.ipynb notebook in the docs. |
@stevenjkern Looks much better! |
@stevenjkern Looking great. To recapture some of the functionality that PyMC2 had (as shown here), it would be nice to alter the node shape based on whether a node is stochastic, deterministic (which, as far as I can tell, aren't currently tracked by the tree-builder, nor do I really know how they could be), or observed (perhaps just a check on whether the observed variable is defined?), which would pretty much complete the visualization of the model itself. The only other feature I can think of is perhaps a coloring based on the sampled posterior mean values (for use after sampling, of course). That may or may not be a far-future kind of feature though. |
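A rough sketch of the node-kind check suggested above. The function names and the particular shape choices are assumptions made for illustration, not anything taken from PyMC2 or this PR; the attribute keys (`shape`, `style`, `fillcolor`) are standard Graphviz node attributes:

```python
# Assumed mapping from node kind to Graphviz node attributes (illustrative only).
NODE_STYLES = {
    "stochastic": {"shape": "ellipse"},
    "deterministic": {"shape": "box"},
    "observed": {"shape": "ellipse", "style": "filled", "fillcolor": "lightgrey"},
}


def classify(is_deterministic=False, observed=None):
    """Classify a node; anything carrying observed data counts as observed."""
    if observed is not None:
        return "observed"
    return "deterministic" if is_deterministic else "stochastic"


def node_attrs(is_deterministic=False, observed=None):
    """Return a copy of the attribute dict for the node's kind."""
    return dict(NODE_STYLES[classify(is_deterministic, observed)])
```

For instance, `node_attrs(observed=[1.0, 2.0])` yields the filled-ellipse style, while `node_attrs(is_deterministic=True)` yields a box.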
FYI I believe that the graphviz conda packages were improved yesterday. Graphviz may now be less painful. |
@stevenjkern Any update on this? I'm building a pretty complex model and this would be helpful for me! |
@tbsexton Those are beautiful visualizations! Do you think with this PR we could get something similar in PyMC3? |
@mrocklin had some very slick in-notebook graphs in his https://github.com/mrocklin/streamz library. See the inline examples here: https://streamz.readthedocs.io/en/latest/ , and I think they were implemented with minimal fuss. |
FWIW I use and now recommend graphviz for static node-link diagrams. I
haven't seen users complain of installation issues for a while now, so I
assume that those problems have been sorted out. The only issue I've
encountered recently when working with students directly is that people pip
install graphviz, but don't install the system library, which can confuse
them. We've worked around this with an informative error message.
|
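The informative-error workaround mentioned above could look something like the following. This is a sketch under assumptions, not the message any particular library actually emits; `require_executable` is a hypothetical helper, and `dot` is the Graphviz layout program that must be on the PATH in addition to any Python bindings:

```python
import shutil


def require_executable(name="dot"):
    """Return the path to a system executable, or raise a helpful error."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"Could not find the '{name}' executable on PATH. "
            "Installing the Python bindings with pip does not install the "
            "Graphviz system library; install Graphviz with your system "
            "package manager or conda and try again."
        )
    return path
```

Calling this before any rendering step turns a confusing downstream failure into a single actionable message.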
@mrocklin That's great to hear. I went ahead and used graphviz in the above-mentioned visualizations, just to get reasonable hierarchical tree layouts. It does work much better now than it used to, except for
@twiecki Definitely possible. Just needs some tweaking to the networkX graph representation, and allowing a dependency on graphviz/pydot ( |
Is there any opposition to using Daft (http://daft-pgm.org)? |
@zaxtax I don't think there is opposition to any particular package. My impression with Daft is that it is a little harder to automate than PyDot. |
I did do a bit of playing around with daft (https://github.com/usnistgov/pmml_pymcBN/blob/master/tests/weld_full/daft_net_png.py). As far as my experience with it goes (this is ~6 months ago), I like the package a lot for having only a dependency on matplotlib. However, I believe it currently only supports drawing nodes with the patches.Ellipse object, which for my original use-case wasn't quite enough (often, deterministic nodes are square). Additionally, there's no default layout algorithm implemented, which means one must define the position of each node manually. If something like the dot package's algorithm could be hacked into a daft layout routine, I'd be all for using this as a pure-python solution. |
There is a PR for adding rectangular nodes. The lack of a default layout algorithm is the real bummer. |
If that's the case, what's the overall feeling of combining something like grandalf and Daft, to get pretty nice looking layouts in pure-python? In all honesty, it may be worth submitting a PR to daft with a layout module dependent on grandalf, and then adding this functionality to pymc3. |
@tbsexton That's probably a good idea. I would not be too hung up on dependencies for something like this. I think it's worth it to get nicer diagrams, plus this module (like our plotting modules) would likely be an optional component anyway. |
@tbsexton @fonnesbeck how actively maintained is daft these days? As it's only one file, we could vendor it. |
Closing as superseded by #3049 |
This is an attempt to address #1547, but it is neither clever nor fancy.
I have added an attribute to most of the Distribution classes that lists the names of the distribution's parameters, and a graph attribute (a networkx.DiGraph) to the Model, which gets populated each time a new variable is added to the model by inspecting the values of the added distribution's parameters. I am totally open to suggestions for other approaches to accomplish this.
Below is an example of the graph in pymc3/examples/gelman_schools.py: