Over the past 12 years, GitHub has grown to become the de facto cloud platform for writing code and collaborating on building applications.
With some 50 million users and 100 million repositories on the platform, managing who is allowed access to which resources is a massive undertaking, to say the least.
In order to take on this mission, GitHub has built their own in-house permission management system to help everyone from the smallest open source project owner up to the enterprise teams at Microsoft secure their repositories. This system allows admins to manage their repo permissions with a straightforward process that still succeeds in respecting the hierarchy structure that keeps teams from falling into chaos.
As a developer of permission management systems, I have to tip my hat to a job well done. At the same time, there is always room for improvement. Engineers should always be thinking about how to make a good thing better, so of course I thought I’d share my thoughts on the matter.
But first, it’s important to provide some context on the criteria that we think a permission management system should meet to be considered good and worthy of emulation.
Setting Parameters for a Solid Permission Management System
While preferences and opinions may vary as to what an ideal permission management system should have, there are a couple of features that I believe should be present in any system worth its salt.
For starters, a permission management system should be designed narrowly enough that it actually addresses the team management job that you need it to do. Far too many of these systems are built to meet too wide of an audience, and end up supporting nobody in particular all that well.
Then of course it needs to do the heavy lifting of allowing admins to effectively and efficiently grant and block access to assets as required. You can build a beautiful gate but it isn’t much good if it doesn’t lock properly.
Finally, for our purposes at least, a good system needs to make it simple for admins to gain visibility over all the users and assets connected with their organization.
How Does GitHub Stack Up — Putting Principles into Practice
To understand how these principles apply in practice, let’s take a look at how GitHub’s permission management system holds up to our parameters. What do they get right and where do they have space to improve?
When it comes to team management capabilities, GitHub has to support three types of use cases:
- Open source git projects that are open to everyone with a few project managers running the show. The contributors use their own personal accounts and emails for access.
- Organizations whose proprietary git projects can only be accessed with organization-based user identities and are built on strong security structures (including SAML users, team hierarchy, etc.).
- The hybrid model where organizations have a mixture of open source git projects along with proprietary and protected projects, and the users are able to use their personal accounts to access them as contributors.
Adding to the complexity soup is the fact that while access to a repository is generally based on being a part of a given Organization, the Organization contains within it a wide variety of subgroups and roles that each have their own levels of privileges that must be carefully mapped out and controlled.
GitHub has designed their permission management system to account for organizational roles like “Owners,” who are the top admins, “Billing Managers,” who handle the payments, and “Members,” who are basically everyone else who should be there but doesn’t need the highest level of privileges within the organization.
The next step down in resolution is the Teams, with their “Maintainer” and “Member” roles. Finally, we have the access levels inside the repos, which grant these roles Read, Triage, Write, Maintain or Admin rights.
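To make the bottom layer of that hierarchy concrete, here is a minimal Python sketch of the five repository access levels as an ordered type. It is an illustration, not GitHub's implementation: the names `RepoAccess`, `effective_access` and `can` are hypothetical, and the rule that the highest grant wins when a user holds several is an assumption of the sketch.

```python
from enum import IntEnum


class RepoAccess(IntEnum):
    """Repository access levels named in the text, lowest to highest."""
    READ = 1
    TRIAGE = 2
    WRITE = 3
    MAINTAIN = 4
    ADMIN = 5


def effective_access(grants):
    """Return the highest level among a user's grants, or None if there are none.

    `grants` is a hypothetical list of RepoAccess values collected from the
    user's direct grant and from each team they belong to. Treating the
    highest grant as the effective one is an assumption of this sketch.
    """
    return max(grants, default=None)


def can(grants, required):
    """True if the user's effective access meets or exceeds `required`."""
    level = effective_access(grants)
    return level is not None and level >= required
```

Ordering the levels this way means a single comparison answers "is this person allowed to do X here?", which is the property the layered roles are meant to provide.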
Creating Restrictions and Controls that Fit Just Right
The second key characteristic of a well designed permission management system is that it has to provide real control over who is or is not granted access. It starts by blocking anyone trying to circumvent the barriers.
Next, the controls should be strong enough to make it easy enough to govern the granting of permissions, but not so easy that everyone and their dog is given high level access credentials. This addresses the concern that if lower level members are unable to access the resources that they need easily and quickly enough, then it is not uncommon for admins to simply grant higher level privileges to everyone just so that they don’t have to deal with constant requests for access.
By my estimation, GitHub has done a pretty good job in striking the right balance between usability and security here. The controls are strong but malleable enough that people can work with it responsibly.
Actionable Visibility for Manageable Control
It’s really hard to protect an asset if you lack the visibility to know who has access to it.
As we have laid out, GitHub does a pretty good job of allowing Owners and other admins to control who can do what within our organization. They even have really useful logs to help us track who did what over time.
However, GitHub’s openness to working with collaborators who may not be from your organization creates a challenge that can impact your security.
As we stated above, they are pretty good about controlling who can do what, but they cannot always tell you who is who.
This is because GitHub allows users to work within an organization’s repos utilizing user accounts that are tied to personal emails when using the hybrid model that was mentioned at the top of our discussion. So even if you grant commit rights to [email protected][.]com, you don’t actually have a way of knowing who that person is. Even organization owners cannot really know who the members of the organization are. Sure, they know the username, but if it’s an out-of-domain, non-SAML user, then they are lacking potentially important bits of information.
Stepping back a tick, we should say that GitHub’s permission management will probably be just fine for most people. If knowing that someone is authorized to be in your repo is enough, and not knowing specifically who they are doesn’t run up against your threat model, then this is probably not a big deal for you.
The most obvious solution may be to not allow your organization to use the hybrid model in the first place, requiring users to have a domain email or go through a SAML system. Or perhaps you can just map the usernames to emails manually. A third solution is to run some logic on git commit emails and try to link them to the users that pushed them.
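The third option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a hypothetical list of `(pusher_login, commit_author_email)` pairs that you have already collected from your own push history, and the function names are made up for this example.

```python
from collections import Counter, defaultdict


def emails_by_pusher(push_records):
    """Count which commit author emails appear under each pushing account.

    `push_records` is a hypothetical iterable of (login, email) pairs.
    """
    counts = defaultdict(Counter)
    for login, email in push_records:
        counts[login][email.strip().lower()] += 1
    return counts


def likely_email(counts, login):
    """Return the email most often seen for `login`, or None if unseen."""
    seen = counts.get(login)
    if not seen:
        return None
    return seen.most_common(1)[0][0]
```

The most frequent address is only a guess: a user who pushes other people's commits will muddy the counts, which is part of the friction mentioned here.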
All are options but they all add an extra layer of friction that can be avoided with a bit of smarts and automation.
Making the Good Thing Better
In reviewing GitHub’s existing permission system, I think that they get the first two bits down pretty well. However, on the third point of having visibility over who is who, Authomize’s approach can help GitHub admins with identifying who is in their repo.
When an organization connects their apps with our API, we gather up a lot of information about how your team uses its applications. This means learning the ins and outs of who your users are with a wide view across all the resources that they are using. This helps to not only determine if they have the right level of access to their resources, but also to help offer actionable, automated recommendations for future permission requests based on your team’s tailored needs.
So how does this 30,000 ft view help with our non-domain email issue with GitHub?
Well, for starters, we can check the email associated with a git commit against the GitHub user who pushed it. From there we try to associate GitHub usernames and the visible emails with the organization email account. For example, if I am using a GitHub account associated with my [email protected] email address, then Authomize can take a data-driven educated guess and understand that I am probably the same person that uses [email protected].
As we collect more data within an organization, we can move from the more obvious examples like the one above and create more connections and conclusions. Authomize now knows to associate the email address [email protected] in Active Directory as an alias of [email protected] because it identifies me using both of those addresses as identities in my work.
While there will always be some cases where there simply isn’t enough data to connect the external addresses with a primary work email, this approach has shown itself to be sufficient to cover 80% of the task at hand. For the other 20%, we have made it simple enough to handle with our support of manual mapping for aliases.
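As a rough illustration of the idea, and not a description of how Authomize actually does it, the Python sketch below resolves an external address to an organization address. It checks a manual alias table first and otherwise falls back to a deliberately simple heuristic that compares the letters of the local part. The function names, the heuristic and the `corp.example` addresses are all assumptions made for the example.

```python
import re


def name_key(email):
    """Reduce an address to a comparable key.

    Takes the local part, lower-cases it, drops any +tag, and keeps
    letters only, so "Jane.Doe+git" and "janedoe" compare equal.
    """
    local = email.split("@", 1)[0].lower()
    local = local.split("+", 1)[0]
    return re.sub(r"[^a-z]", "", local)


def resolve_identity(external_email, org_emails, manual_aliases=None):
    """Return the matching organization email, or None if unresolved.

    Manual aliases win. The automatic guess is accepted only when
    exactly one organization address shares the same key.
    """
    manual_aliases = manual_aliases or {}
    ext = external_email.strip().lower()
    if ext in manual_aliases:
        return manual_aliases[ext]
    key = name_key(ext)
    if not key:
        return None
    matches = [e for e in org_emails if name_key(e) == key]
    return matches[0] if len(matches) == 1 else None
```

Refusing to guess when the match is ambiguous keeps the automatic step conservative and leaves the remaining cases to the manual table.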
What is important is that we are helping to fill a gap for those organizations that need the extra bit of visibility to meet their security needs. Our goal is to help them take the good thing that GitHub provides and kick it up a notch to make it even better.
What This Means for GitHub Admins
Permission management is tough to get right, especially when it comes to large organizations. Working out the nuances of who should have which kinds of access while keeping track of all the identities in the mix is a constant challenge.
But thankfully we are now getting better at providing the controls that go beyond the basics of general access to a resource and pushing towards providing stronger granular control to admins. Hopefully by harnessing the power of automation and more data, we can continue to make the applications that we love like GitHub a little easier to use and a lot more secure.