-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Text and SQLite backends for PyMC3 (update) #500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Probably need to add mock to the travis build. |
Thanks. It's pre-installed [1] on travis, but I didn't consider that [1] http://docs.travis-ci.com/user/languages/python/#Pre-installed-packages |
This is only needed for python 2 because mock is in stdlib for python 3 (unittest.mock).
This commit contains a new backend for sampling and selecting values. Non-backend files have been changed to work with the new backend. This commit also merges the `sample` and `psample` functions. `sample` now takes a keyword argument `njobs`, and if this is over one, the multiprocessing version is used.
Is this good to merge then? I know @jsalvatier reviewed this and it certainly looks high quality. |
Sorry, I should have indicated that when I opened up the new PR. I think |
I notice that if you try running another chain using an existing backend, the chain variable does not get incremented in the backend (nor does the draw get reset to zero). It would be nice to have the backend check for the highest chain number before sampling, so that we can arbitrarily go back and add chains to a database without confusing which chain the samples came from. Great work, BTW! |
Never mind. Now I see that the |
After running a few tests, I think this looks really good. A couple of things:
|
Thanks for the suggestions. I'll push an update soon. |
OK, I've pushed updates that incorporate your suggestions. |
As the Travis failures show, I only ran a subset of the tests locally. |
The Travis issues should be fixed now. |
I'm moving this under main so that it doesn't run with "test_examples". This could be set up like the other examples, with a run definition that allows for a short version, but I'd prefer not to for a couple of reasons. 1. Everything aside from the SQLite backend is the same as "hierarchical.py", so it isn't testing much more for the time added to the run. 2. This results in an SQLite file, so a cleanup should be added somewhere if it is run with "test_examples".
njobs = 1 | ||
|
||
data = np.random.normal(size=(2, 20)) | ||
model = pm.Model() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just get a stock model from tests/models.py or at least move this model into there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
|
||
def test_multitrace_init_unique_chains(self): | ||
trace0 = mock.Mock() | ||
trace0.chain = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it necessary to set trace0.chain? That seems undesirable. In general it would be nice to mostly test external behavior and internal behavior less.
Kyle, I want to thank you for your hard work and patience. You've put a lot of effort into this and it shows. I really like the external interface to the trace objects. Its simple and functional. I think I might have opinions about how the backends are implemented, but its clear to me that I'm not going to put in the effort to help make those changes right now. So I'm in favor of merging this soon. |
(Since several of your comments address the same thing, I'll just respond in a general comment.) I think there's a trade-off between having isolated/fast tests and relying too much on internal details and mocking. Based on your comments (and because my tests are the first place mocking occurs in the test suite), I'd guess we have different preferences on this. I've tried to also add higher level tests (like those in the dump/load and selection tests). For now, I'd prefer to keep these unit tests and just extend the higher level tests as desired. The unit tests add a very small amount of time to the total test suite, and they can be deleted later if they become problematic. |
I'll incorporate some of your suggestions, and then rebase this (since it will won't apply cleanly to recent changes in master) and open a new PR. |
That's true, about fast tests. Our tests are generally pretty slow. I look forward to seeing your changes. |
These changes are based on the discussion in #449. Please let me know your thoughts.