Update and extend governance [WIP] #4980

Merged
merged 9 commits into from
Nov 2, 2021

Conversation

Member

@OriolAbril OriolAbril commented Sep 3, 2021

As some of you may know, I started working on updating and extending the governance doc a while ago (this will fix #4566). And I think this is now ready for review.

I have tried to not change too many things that didn't need changing and especially to write down what we are currently doing and to be more explicit. I think the changes explain better how to join/get more responsibility and will help attract new people (especially when combined with the new website) as well as attracting a more diverse pool, both in terms of demographics and functionality (this second point being the main driver of the team organization like it was of doc team creation).

Detailed Rationale

I feel that PyMC is now quite a large project that goes way beyond the PyMC3 library, and I love this, but it also poses many challenges related to governance and organization. Moreover, we have a long-lasting issue with diversity at many levels and I think updating the governance doc (which would need to be updated anyway) is one of many actions we can take to improve that. Being a bit reductionist, we can divide these challenges into two main categories: functional diversity and internal diversity.

I have the impression that we are already at a point where none of us are experts in everything related to the project. Some are more or less aware of everything, but there are some things we have to (or would much rather) delegate, be it docs, v3, v4, grants, PyMCon related... We have already created the doc team, and Abhipsha's and Martina's work has already improved the documentation significantly.

I think that making it easier for people to join the project as well as showing that there is much more to do in addition to coding will greatly benefit the project. And I think that waiting until someone has done "enough" to get commit rights to say they are part of the team is a mistake and has probably driven people away and will continue to do so, especially people who have experienced discrimination before and want to be clear they will have some future within the project before investing too much time and effort into PyMC.

I think I am actually an example of the process outlined in the doc as well as an example of having had a significant impact on the project without much direct impact on the library.

I was invited to slack while being a GSoC intern for ArviZ, having done 1 or 2 small PRs to PyMC3 only. Maybe even no PR by then, I don't really remember. Thanks to that I started participating in the lab meetings, where at first I explained my GSoC project at ArviZ, started getting a feel for the project and the community, and started doing other non-code contribution things, basically: answering on Discourse, contributing to docs, being diversity chair at PyMCon, setting up tidelift and getting them to sponsor PyMCon, being Outreachy organizer, helping with GSoD and CZI grant applications... As of today, I have had 15 PRs merged to the pymc3 repo and reviewed some more, but not that many either. And out of those 15, 8 are pure docs, 2 are pure GitHub config (funding button and templates) and another one is moving code from ArviZ to pymc3 (with significant refactor but still). The other 4 PRs together cover barely 100 lines of code. And now this governance PR. I have contributed much more to pymc-examples via reviews.

Obviously not everyone will engage in the same way, and by being explicit and advertising this recurrent contributor role we might add people to slack who end up not contributing to the project in any way. But some of the main hurdles preventing people from joining open source are: lack of time, (perceived) lack of experience and not being sure where to start. I think that showing people they can join by reviewing PRs mostly, or even without coding by advocating for the library and bringing in grants and sponsors, or other things we haven't even considered but could help the project, will be very beneficial in the long run.

Open questions

general feedback on adding/formalizing the 4 level structure: recurrent/core/council/bdfl

As I commented above, I tried to write down what we are currently doing, but if we wanted to change that, now is the time.

From the current governance doc, especially from the paragraph:

During the everyday project activities, council members participate in all discussions, code review and other project activities as peers with all other Contributors and the Community. In these everyday activities, Council Members do not have any special power or privilege through their membership on the Council. However, it is expected that because of the quality and quantity of their contributions and their expert knowledge of the Project Software and Services that Council Members will provide useful guidance, both technical and in terms of project direction, to potentially less experienced contributors.

I understand that the doc defines 3 groups: core devs, steering council and bdfl, and that the main decision making organ is the core devs group. Steering council and bdfl should only "enter the playground" if there is a blockage or conflict within the core devs group. I have renamed these core devs to core contributors to try and be more general and include people who don't develop code. Moreover, the steering council listed there is probably outdated, but it basically falls back to a subset of core devs to accept new people, break blockage... so I think we do have a steering council even if somewhat fluid and not the one listed in the doc.

how should we match tiers (and maybe teams too) to github permissions

I would prefer to discuss that after having merged this, so that we don't condition our organization on GitHub, for a couple of reasons. The most important in my opinion is that, as I have said, we could extend the team and there are many things related and important to PyMC that are unrelated to GitHub (e.g. someone might be a great Discourse moderator but not be interested in contributing to GitHub; they might even rather not have an account).

But if the general opinion is to focus on GitHub related things it's also fine. i.e. triaging permissions to recurrent contribs, admin rights based on teams and/or steering council?… Also if teams should be org-wide or project-specific?

temporal/internship teams

We can also include an extra tier for GSoC/Outreachy interns, GSoD technical writers... however, I don't think we should. In my opinion, if we have selected them, it is because they are great candidates who we'd like to join the team. I think adding them as recurrent contributors is "low risk" enough, while it can also be a symbol of them being part of the team and of our interest in them joining as volunteers after their temporary paid project.

Moreover, people from historically discriminated groups are more likely to not have time to dedicate to open source as a volunteer, so them being recurrent contributors will mean they can ask for numfocus small development grants or similar things so that they can get paid (or at least paid a bit) for contributing. Otherwise we are automatically discarding as valuable contributor anyone who doesn't have the privilege to contribute for free.

long term wise, once the doc team is more established, do we agree with this potential team structure?

Very related to functional diversity point. I see other teams being created and enriching the project if we make things easy for that. In two years time the diagram could look like:

[Figure: diagram of a possible future PyMC team structure]

I would keep the discussion about potential teams generic, and if we do want to create more teams right now, do so in another PR, issue or on Discourse.

Voting: currently the old steering council votes on the new steering council, we could change that to core contributors vote on new members/renewals

Vote of no confidence

I think this is like the code of conduct, we need to have some process for core contributors to remove people from the steering council even if we hope and expect to never have to use it

council membership constraints

I think we should have a minimum number of members, say 5, with at least one member per team and at most 2 institutional contributors of the same company. Or some other numbers but some kind of constraints on those ends. I don't care much about the numbers really, but I think some constraints would be good to have
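For illustration, constraints of this kind could be checked mechanically. The following is a minimal Python sketch, assuming the example numbers from the comment above (a minimum of 5 members, at least one member per team, at most 2 members from the same company). The `CouncilMember` class, the `council_violations` function, and the idea that each member belongs to exactly one team are hypothetical and only serve the example; none of them come from the governance doc.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CouncilMember:
    name: str
    team: str                       # hypothetical team label, e.g. "docs"
    employer: Optional[str] = None  # None for non-institutional contributors


def council_violations(members, teams, min_members=5, max_per_employer=2):
    """Return a list of human-readable constraint violations (empty if valid)."""
    problems = []
    # Constraint 1: minimum council size.
    if len(members) < min_members:
        problems.append(
            f"council has {len(members)} members, minimum is {min_members}"
        )
    # Constraint 2: every team has at least one council member.
    represented = {m.team for m in members}
    for team in sorted(set(teams) - represented):
        problems.append(f"team {team!r} has no council member")
    # Constraint 3: cap on institutional contributors from one employer.
    per_employer = Counter(m.employer for m in members if m.employer is not None)
    for employer, count in sorted(per_employer.items()):
        if count > max_per_employer:
            problems.append(
                f"{count} members from {employer!r}, maximum is {max_per_employer}"
            )
    return problems
```

The thresholds are keyword arguments because the comment above is explicit that the exact numbers are open for discussion.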

becoming recurrent/core contributors

The proposal establishes a nomination based approach. Do we want the nominations to be public (i.e. like in ArviZ) or only public within the team/slack?

TODOs

Some disclaimers. I have tried to distinguish the project from the library by naming the project PyMC and the library PyMC3, but I may have messed up a couple times. I also tried to match the language of the original document with the capitalized "The Project" and so on, and here I am sure I have messed up many times and used PyMC, the PyMC project or combinations of all 3. I think they won't make the document less clear and we can take care of them here while reviewing, but I think it would be best to wait for at least one or a couple rounds of more general review to avoid getting sidetracked by strange language.

Member

@eigenfoo eigenfoo left a comment

@OriolAbril and I have discussed off of GitHub, here's a summary:

It needs to be easier for people to become recurring contributors - while joining the Slack (and thus becoming a recurring contributor, under this document's definition) is a huge milestone in a person's involvement with PyMC, it's a high bar to expect first-time contributors to meet to graduate into recurring-contributor-hood.

I wonder if a better distinction might be for people who are "on their way" to getting invited to the Slack - we usually monitor GitHub and invite people who have done a few PRs, but as far as I know that's the only systematic way that new/first-time contributors get into the Slack (and later, the GitHub org). This is what happened with me, for example! The problem with that is that it heavily favors coders, and usually deeply technical ones, when we're trying to value and look for volunteers across all different kinds of contributions (docs, organizing, moderating, etc)

This also ties in to improving diversity in the team - I have a hypothesis that one reason finding recurring contributors is hard is because of a lack of "early reward": if, for example, early contributors had the chance to join the Slack during/shortly after their first contribution (whether a PR, helping with PyMCon, etc.), I think we'd find more "recurring" contributors - in the sense that they would contribute again! This is just a guess from me, but for me, a synchronous chat fosters community so much more than asynchronous GitHub issues or discourse posts.

Obviously, opening the Slack up poses a lot of logistical problems, but @OriolAbril had some very good suggestions for how we might make the onboarding/recurring-contributor process easier.

@Spaak
Member

Spaak commented Sep 16, 2021

I started going through bits of the governance document, but want to raise a (textual) point that is better here than in line-by-line comments.

In the definition of some of the roles (Core Contributor, Recurring Contributor), you focus on certain "rights" that they have, while I think it might be better to focus on what makes someone e.g. a Recurring Contributor. I.e. the definition of Recurring is not that they are on Slack, but them being on Slack is a consequence of their role; if you understand what I mean.

So as a suggestion: "Recurring Contributors are those individuals who have contributed more than once to the project, e.g. by helping out many users on Discourse or contributing several PRs to the code or documentation. If individuals demonstrate these repeated contributions, then by initiative of one or several Core Contributors they are typically invited to the Project's private communication channels, like Slack. The moment at which this happens is typically diffuse, but it is expected that the bar here is relatively low."

And for Core Contributors: "Core Contributors are those Recurring Contributors who have shown contributions over a sustained period of time and/or quality. Recurring Contributors become Core Contributors by invitation from the Steering Council, and can be suggested by any Core Contributor. It is expected that Core Contributors, if their contributions have consisted of PRs, have write access to one or more PyMC repositories. Nonetheless, even Core Contributors should always have their PRs reviewed by at least one other Core Contributor before merging."

Also, it might be helpful to reverse the order of describing the roles. I.e. Recurring Contributors first, then Core Contributors, then Steering Council, then BDFL. Right now the order works well for BDFL/Steering Council, but it gets jarring for the Contributor roles. Fully reversing the order also fits better with how people typically approach the project: they'll always become Recurring Contributor first, then possibly Core, etc.

(But all in all: great work here btw! Especially good to see steps toward a more diverse project team, definitely needed.)

@OriolAbril
Member Author

Hi! Thanks for reviewing the governance PR! I have to say I was a bit worried there wouldn't be much discussion, so thanks

After reading @Spaak's general comment, with which I agree completely, I think I have an idea on how to better frame the core/recurrent difference and thought I could run it by you to see what you think about it

As I explained on the PR description, I have tried to base that on my journey in joining pymc, what I have generally observed, and my general belief from my experience that being privy to private communication channels is way more power than people generally realize (i.e. it doesn't really matter much who is the one to act if you are the one who convinced everyone to follow a given path), but I wasn't completely clear about how to write that down so it'd be understandable

from the discussion now, also related to @michaelosthege points about voting the steering council, votes of no confidence and so on, I was thinking that maybe we could frame the distinction as having binding or non-binding opinion on project matters

so contributors (as people who are not on slack, one time PR submitters...) are not directly consulted about anything related to the project, they can open/participate in issues and discussions on Discourse, but there are many aspects of the project that never even go close to there (maybe we could be more open about the decision making in general, but I think there will always be things that should be discussed in private channels, a bit similarly to how there are things we discuss in the lab meetings but not in async mode in slack). For example, gsoc ideas are discussed on slack and published directly. Maybe they could be discussed on discourse for example, but that is not the case and in general there will always be this private discussion boundary with some topics as I said, or maybe due to urgency communication more direct than Discourse is needed. I used gsoc because it's one of the main sources of code and driver of development, so being able to suggest a gsoc idea and have it selected has a lot of influence over the project, and doesn't require any commit rights or anything of the sort

recurrent contributors are consulted and have a voice on the project, but that voice is non-binding. That would mean that they would not vote on the steering council (or like some countries do, they could vote in a non-binding way to see if the council/core team is diverging from the recurrent contributor base). Again back to the gsoc example to stay somewhat on topic, no matter how good the idea is, it needs the "active" approval of at least one core contributor who then publishes it on the list

core contributors would then have voting rights/binding opinion while still being within the team and working for consensual decisions. A bit like with the council, I think that should generally not be "used", but core contributors should be people who we trust and would allow to do so. Now a couple new examples.

- I always open PRs and wait for someone else to review and merge, but if the docs were broken I can and would self-merge a PR and publish the fixed docs by myself if waiting for a review meant having to wait for ~1-2 days
- On PRs from people who have never contributed yet, I also always review and if I think they are ready to merge I mark as approved, but wait for someone else to review and merge the PR. However, if after a month or so there have still been no reviews I sometimes merge the PR if I feel comfortable enough doing so. I have done this several times on pymc-examples for instance

@mjhajharia
Member

mjhajharia commented Oct 4, 2021

building on @Spaak's point, i think something we should think about more is -> how do people "become" recurring/core contributors/council etc, the duties discharged by them can be flexible and vary according to situations. but i think we should have a very clear framework of nominating people -> be it informal voting on slack or something(that would work i think?) like i don't think anything like that happened when me and Martina got some permissions, like there was some confusion about which team to add me into when @michaelosthege / @twiecki gave me some permissions, and i think doing this clearly would be great! not really for this anecdote, but something in general that would be a good practice(?)

@michaelosthege
Member

@OriolAbril to me it looks like this PR is already a clear improvement over the status quo. The remaining open threads are about details and at the same time the thread is getting really lengthy with a lot of text.

@OriolAbril @fonnesbeck What do you think about merging and then changing the details in a follow up if we need to?

@OriolAbril
Member Author

I'll try to add some extra changes in 1-3 days and then we can merge and kick off the election?

@codecov

codecov bot commented Oct 15, 2021

Codecov Report

Merging #4980 (13c1485) into main (3b763a1) will decrease coverage by 3.11%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #4980      +/-   ##
==========================================
- Coverage   78.21%   75.10%   -3.12%     
==========================================
  Files         131       87      -44     
  Lines       24523    14142   -10381     
==========================================
- Hits        19181    10621    -8560     
+ Misses       5342     3521    -1821     
| Impacted Files | Coverage Δ |
|---|---|
| pymc/bart/pgbart.py | 91.44% <0.00%> (-4.51%) ⬇️ |
| pymc/bart/bart.py | 96.61% <0.00%> (-3.39%) ⬇️ |
| pymc/bart/tree.py | 96.66% <0.00%> (-3.34%) ⬇️ |
| pymc/backends/report.py | 89.51% <0.00%> (-2.10%) ⬇️ |
| pymc/sampling.py | 86.81% <0.00%> (-0.77%) ⬇️ |
| pymc/smc/smc.py | 98.36% <0.00%> (ø) |
| pymc/tests/test_aesaraf.py | |
| pymc/tests/test_posdef_sym.py | |
| pymc/tests/test_distributions_random.py | |
| pymc/tests/test_mixture.py | |

... and 41 more

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>
@canyon289
Member

Overall this LGTM. One question I have is who proposes and who has final say for monetary decisions

Co-authored-by: Ravin Kumar <ravinsdrive@gmail.com>
@OriolAbril
Member Author

OriolAbril commented Oct 15, 2021

One question I have is who proposes and who has final say for monetary decisions

It should follow the same process as all other decisions when it comes to choosing what to do. IIUC, the numfocus subcommittee is mostly an executive body that doesn't decide what to do, only ensures the decisions taken by the team that require monetary actions are done and properly coordinated.

I think the hiring of Martina as GSoD technical writer is a good example of how these decisions should go. We started with discussions on slack and lab meetings about participating in GSoD, agreed on that and then worked on the proposal, budget included, collaboratively on the wiki pages. Chris ended up doing most of the work, but everyone on slack was able to participate. Once applications started coming in, again Chris organized and led the efforts, but everyone on slack was able, prompted and encouraged to participate, and many of us took part in the final decision about who to hire. After that I lost track of the process, but I assume Chris or the numfocus committee took over the execution of payments and other work if needed

GOVERNANCE.md Outdated
- Raul Maldonado (docs)
- everyone on slack! This may be too long so it could be a good idea to
format as a table instead or to put within a `<details>` tag so
it's hidden by default while still available publicly
Member

Here's a ToDo before merging.

Member Author

yes, I am also a bit unclear about how to better address that, should I again tag everyone with a "make sure you are listed here if you want to continue being part of slack"? List everyone and be done with it? Do nothing for now and have them follow the recurrent contributor nomination process as if being in slack was an automatic nomination but not an automatic membership card?

There are many people on slack with whom I have never interacted, nor do I think they have been active at all in the last 1-2 years.

Member

My opinion about this is to not keep a public list for recurring contributors.

Member Author

I did not express myself very well before, but the way I see it, the list of recurring contributors being public or even the existence of such list altogether is a rather insignificant part of everything that hides behind the questions above.

From what I understand, this governance defines being accepted as recurrent contributor as the requirement to join slack.

If that is the case, it leaves us with the questions above, should everyone on slack be a recurrent contributor? Why out of the gsoc participants who did not contribute after gsoc finished (to the best of my knowledge) only some are still on slack? Did they leave voluntarily (I for one have no idea how to leave slack workspaces, only how to log out) or were they removed due to lack of activity?

If that is not the case, then we need to define how/when/why people are added or removed from slack. Because I do not know the answer, and there "not being rules about it" basically ends up meaning that in practice the answer is that to join slack (or the project) one needs to be friends with the core contributors or well connected. And this will not do.

OriolAbril and others added 2 commits October 16, 2021 18:15
Co-authored-by: Michael Osthege <michael.osthege@outlook.com>
Co-authored-by: Michael Osthege <michael.osthege@outlook.com>
Member

@ColCarroll ColCarroll left a comment

thanks for doing this -- it looks comprehensive! i left a few "nits" and questions

@junpenglao
Member

Thanks for taking the time to do this @OriolAbril. I only have one comment related to admin:

  • Do you think we should include admin policy re our other communication channels and forums, including but not limited to Twitter, Medium, and Discourse? There are some physical limitations as well, for example, with the free tier Discourse only allows 5 staff (all admin access), which I shuffle around from time to time according to availability.

@OriolAbril
Member Author

OriolAbril commented Oct 19, 2021

Do you think we should include admin policy re our other communication channels and forums, including but not limited to Twitter, Medium, and Discourse? There are some physical limitations as well, for example, with the free tier Discourse only allows 5 staff (all admin access), which I shuffle around from time to time according to availability.

Thanks, yes! I will try to be a bit vague so we don't need to change the governance constantly, but being clear that admin rights are shared with relevant team members and council members, and rotated when necessary, is I think general enough, and a great practice!

Then privately we could maybe have a list of services we have and 1-2 "contact" persons on that (even if more people have admin rights). I had no idea we had medium until the theano is dead post for example.

I'll try to add this and Colin's suggestions to the doc and fix a couple roles in 1-2 days.

Member

@fonnesbeck fonnesbeck left a comment

Thanks for the thoughtful update to this document!

- Lorenzo Toniazzi (docs)
- Olga Khan (docs)
- Peadar Coyle
- Raul Maldonado (docs)
Member

Do we want to be maintaining this list here? You can always identify members by looking at Github activity, no?

Member Author

@OriolAbril OriolAbril Oct 19, 2021

Do we want to be maintaining this list here?

I really want to maintain this list. The only con I see to the list is maintenance burden, which in my opinion is clearly outweighed by all the pros.

Listing people here makes new team members feel part of the team, so far I have yet to find someone who is new to the team but has reservations about this section. It also clearly signals that we value our team members and want them to be recognized (even if they mostly work on fundraising and public relations on twitter and medium). It will log the history of members which at the same time will serve as an indicator to prospective team members that we are an active team continuously onboarding people

You can always identify members by looking at Github activity, no?

Definitely not, and saying so would be going against ourselves, as we have said above on multiple occasions that any and all contributions are welcome, not only code or pull requests. The most similar proxy is the members of the pymc organization, but that doesn't always show everyone.

I really really think that this will be good for the project and won't be hard to maintain. And if it turns out I'm wrong we can just come here a year from now, delete it and be done with it.

those Services for the benefit of the Project and Community.
- Make decisions when regular community discussion doesn’t produce consensus
on an issue in a reasonable time frame.

Member

May want to add something like: "Make decisions about strategic financial issues."

So, not day-to-day decisions like approving reimbursements, but those concerning the use of money in a large-scale sense.

Member Author

I was actually considering reducing the list or downplaying it a little. While I expect council members to hold a lot of power and probably effectively decide the roadmap and strategy of the project, from the paragraph above (which I barely modified) I think they should do that as core contributors, which should be the main governance body, with the steering council intervening only if necessary. After all, council members are core contributors with at least a year of significant contributions and well respected enough by the rest of core contributors to be elected as council members.

voters vote on a yes/neutral/no basis per candidate -- “would I trust this person to lead PyMC?”.
* Candidates are evaluated independently,
each candidate having 60% or more of yes votes and less or
equal than 20% of no votes is chosen.
Member

Seems like it would be easy to deadlock under this rule. So if 50% of members do not vote, you cannot be elected, even if there are no explicit "No" votes? Or am I misreading that?

Member Author

Then we would go directly to step 2. But I would consider 50% of core contributors not voting a huge red flag. I expect 100% of core contributors to vote. Recurrent contributors don't vote.
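To make the first-round rule discussed in this thread concrete, here is a minimal Python sketch of the threshold check (60% or more yes votes and 20% or fewer no votes). It assumes the percentages are computed over the ballots actually cast, with neutral ballots counted in the denominator; the quoted text does not say which denominator is intended. The function name `passes_first_round` is hypothetical.

```python
def passes_first_round(yes, neutral, no, yes_pct=60, no_pct=20):
    """Return True if a candidate meets both first-round thresholds.

    Assumption: percentages are taken over ballots cast (yes + neutral + no).
    Integer arithmetic is used to avoid floating-point edge cases at exactly
    60% or exactly 20%.
    """
    cast = yes + neutral + no
    if cast == 0:
        return False  # nobody voted, so no threshold can be met
    enough_yes = yes * 100 >= yes_pct * cast
    few_enough_no = no * 100 <= no_pct * cast
    return enough_yes and few_enough_no
```

Under this assumption, `passes_first_round(5, 5, 0)` is `False`: a candidate with half the ballots neutral does not reach 60% yes even with no explicit no votes, which is the situation raised in the question above.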

This is repeated until only one of the previously tied candidates
is currently found to have the highest median-grade.
* If ties are still present after this second round, the winner will be chosen at random. First we make a alphanumerically sorted list of the names in the tie. Then we will draw one prior predictive sample from a `pm.Categorical` distribution over the elements in the list to determine the winner.
* At the conclusion of voting, all the results will be posted. And at least 24 hours will be left to challenge the election result in case there were suspicions of irregularities or the process had not been correctly carried out.
Member

Who oversees the election? Should the council include an election officer?

Member Author

Someone external, it is mentioned above. In the last ArviZ election I believe it was a numfocus representative cc @canyon289. The detailed log is here: https://github.com/arviz-devs/arviz/blob/main/elections/ArviZ_2020.md

Member

NumFOCUS oversaw the election for us, and they were both great and kind about it
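The random tie-break quoted in this thread (sort the tied names, then draw once) can be sketched as follows. This is only an illustration: it uses Python's standard-library `random.Random.choice` as a stand-in for the single prior predictive draw from `pm.Categorical`, and it assumes equal probabilities for all tied candidates, which the quoted text does not state explicitly. The function name `break_tie` and the `seed` parameter are hypothetical.

```python
import random


def break_tie(tied_names, seed=None):
    """Pick one winner uniformly at random from the tied candidates."""
    # Sort first so that, for a given seed, the outcome does not depend on
    # the order in which the names were supplied.
    ordered = sorted(tied_names)
    # Stand-in for one pm.Categorical draw with equal probabilities.
    rng = random.Random(seed)
    return rng.choice(ordered)
```

Sorting before drawing means that whoever oversees the election can publish the seed and the sorted list, and anyone can reproduce the draw.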

@OriolAbril
Member Author

Should we wait for more reviews or merge and start the election?

@canyon289
Member

imo chris approved so merge it

@fonnesbeck fonnesbeck merged commit e89c63d into main Nov 2, 2021
@fonnesbeck
Member

Thanks to @OriolAbril for leading this revision!

@OriolAbril OriolAbril deleted the governance branch November 2, 2021 15:01
Development

Successfully merging this pull request may close these issues.

Updating governance doc