Authoring an RBAC API for your application

When creating access to systems, applications and environments it’s important to keep security top of mind. Even when working at a rapid pace, it is crucial to think through what credentials and controls will be used to provide access to a resource, whether that be login to a Windows Server, pulling Docker images on an AKS cluster, or accessing your application’s administrative web portal. These types of actions all require some form of authentication and authorization.

Role-based access control (RBAC) is one of the most popular ways to provide access control for applications due to its conceptual simplicity and flexibility. However, it can be deceivingly difficult to implement performant RBAC into your applications.

In this blog post, I’d like to cover some approaches to RBAC and trade-offs you should consider when implementing RBAC in your applications. Throughout the article, I’ll use the familiar scenario of an Internet forum board where users can create posts and reply to posts via comments:

Example UI for posts in a forum application

Concepts

Authentication vs Authorization

Before we begin, let’s differentiate between authentication and authorization.

Authentication is proving your identity to a computer system, often via a credential such as username/password or cryptographically signed certificate. It proves who you are, but not what you are allowed to do.

Authorization is the inverse – given an identity, it defines what actions they might perform or resources they may access; however it makes no guarantees about the validity of the identity.

Together, authorization and authentication are the cornerstones of security for computing. They prove the identity of someone interacting with the application and define the granted permissions for that identity.

RBAC

Role-Based Authentication (RBAC) is an authorization system that provides fine-grained access management of resources. Access management via RBAC allows you to better control the scope of what your users and applications can access along with what they authorized to do (permissions).

In role-based access control (RBAC), the named roles are the primitives through which permissions are granted. At its simplest, access control could be based on nothing more than a role name stored in a field of the user’s profile. Your application would check the user’s profile and ensure those with the user role could only create post or comments, and those with the admin role have ability to perform administrative operations.

However, it’s generally bad practice to couple your access control directly to role names. What if as the forum grows more popular, we need a new moderator role to help manage spam posts but don’t want to grant them unrestricted admin access? We’d have to refactor the entire app to add new checks for a moderator role alongside the user and admin roles.

Instead, we should think in terms of permissions, the individual actions users can do in your application, separately from the named roles we use to grant access to actors in the application.

A more flexible RBAC design would leverage four key concepts:

Permissions: the set of fine-grained capabilities or actions in your application (e.g. “create comment” or “delete user”)
Roles: Named groups of permissions. Roles are configured with a set of granular permissions that encapsulate a subset of desired system functionality (e.g. our new moderator role could have “create comment” and “delete comment” permissions).
*Subjects: the system actors that can be assigned roles. Typically, these would be interactive user accounts and/or non-interactive service/system accounts.
Role assignments: assignments map a role (and its permissions) to a subject.

This approach provides the flexibility for your application to perform granular access control based on permissions which are granted to subjects via their assigned roles, but the application doesn’t have to care about which role specifically granted the permission. This lets you group those permissions into roles according to your needs now or in the future without having to refactor your application.

With that in mind, let’s take another look at structuring roles and permissions for the forum application:

We can break down actions in the app into distinct permissions: create post, create comment, delete own content, delete any content, and finally delete user.

We can also configure different permissions on the 3 named roles:

Role	Configured Permissions
`user`	create post, create comment, delete own content
`moderator`	delete any content, delete user
`admin`	create post, create comment, delete own content, delete any content, delete user

We have some role assignments on user accounts (subjects):
1. User 1: admin
2. User 2, 3, 4: user
3. User 5: user and moderator

This requires the application to maintain a data store mapping user identities to their roles as well as the roles to permissions.

Now when users interact with the forum application, the app can lookup the user’s current role assignments to identify their effective permissions and decide to permit the action or not based on those granular permissions. Here’s what an account with the user role would see:

Example UI for a post and its comments

On the other hand, an account with moderator role would inherit the “delete any content” permission and could have extra delete buttons rendered:

Example UI for moderation actions on comments

Let’s take it further – REST API

As the forum site gains traffic, we want to add a REST API so that developers can write integrations that can post and reply on behalf of users.

Before, things were simple — the server would just check the profile of the currently logged in user. With the REST API, we now have to deal with two types of subjects: a user’s account, and a developer’s bot. So how can we do authorization and check roles?

Approaches to validating authorization

1. Lookup on the server

The first approach is to use the credential sent with an API request to perform a lookup of the subject and its role assignments:

Flow diagram of server-side RBAC lookups

This approach works great for credentials like API keys that can be tied to a subject’s identity on the server-side, or when there are a very large number of role assignments to verify. The tradeoff is that the role assignment lookup and mapping the roles to permissions adds latency to your API requests, and this lookup is required on every single API request made by the subjects interacting with your app.

2. Cache claims on the client

A second approach is to front-load the role assignment lookup by issuing the subject a secure credential during authentication with their role assignments embedded in it. If the subject provides the credential in the requests to the application, the application server can authorize requests immediately and avoid costly role assignment lookups.

For this to work, the client needs to obtain its secure credential with their identity and role assignments before talking to the application. The application sends the identity provider a list of role assignments for that subject during the process.

Flow diagram for issuing credentials with embedded RBAC information

Once the secure credential is obtained, subjects can submit as many requests as they want and the server can avoid having to do a role assignment lookup every time, since it’s provided in their credential:

Flow diagram for client-side RBAC lookups

Keen observers will note this looks a lot like OAuth, and that’s because it is! OAuth or JSON Web Tokens pair very well with this strategy. The identity provider can authenticate users and issue secure credentials (such as Bearer tokens) that are cryptographically signed, fast to verify on the application server, and hold our role assignments in a fields called “claims”. Microsoft Entra ID (formerly Azure AD) supports this via the Custom Claims Provider functionality.

The disadvantage to this approach is that (1) roles and/or permissions are cached in the secure credential (e.g. JWT), and thus changes to permissions might not be immediately reflected when role permissions or role assignments are modified; (2) this approach becomes expensive or impossible for subjects with many role assignments because the secure credential must be attached to every request. It is possible that it grows too large to include in HTTP headers or pass through gateways with lots of role data in it.

Keep in mind that your client will also need a secure cache to store its credential. Anyone with access to that credential could impersonate your user (which is why JWT tokens are usually short-lived).

Access control with hierarchy of resources

As more users are posting on the forum application, we may want to start grouping posts into communities of expertise by topic, forming a simple hierarchy. Forum posts are now made under a forum topic, and moderators should moderate posts within their assigned topic(s):

Example UI with posts grouped into topics

If we moved to a set of “create topicName post” and “delete topicName post” permissions for every forum topic, the explosion of permissions would make role management untenable as we created or removed topics. We need to use the same “create post” and “delete post” permissions as earlier, but restrict their use to certain topics for a given user.

Access control under a hierarchy of resources in this way can be easily achieved by adding a scope field to the role assignment objects. With the scope field, our forum application can lookup role assignments and verify not only that (1) the subject has necessary permissions, but that (2) the target of the action is contained within the indicated scope on the role assignment.

To assign a moderator for the ‘house plants’ and ‘gardening’ topics, the role assignments would look like:

User 5: { "role": moderator, "scope": [ "/topics/houseplants", "/topics/gardening" ] }
User 5: { "role": user, "scope": [ "/topics/*" ] }

Just like that, they can now moderate under house plants and gardening forum topics, but they’re a regular user everywhere else!

Considerations

Subtree grants

Carefully consider whether your RBAC API needs to support wildcard matching in its role assignment scopes to automatically grant permissions on nested entities/subtrees in the entity hierarchy.

Granting access on entity hierarchies’ subtrees is a common need, but also introduces lots of complexity and heightens the risk for edge cases or bugs in a security-critical component of your application.

Scope with entity graphs instead of hierarchies

One crucial consideration when modelling your role assignments with a scope field is that the straightforward implementation also requires a straightforward tree structure for the entity hierarchy; that is to say, any given entity only has a single parent. For example, if forum posts strictly belong to one topic, granting access to the topic grants access to the posts made within. Posts don’t live anywhere outside a topic, and access can’t be granted to an individual post.

If your entity model is not a tree and behaves more like a graph, evaluation of role assignment scopes becomes complex. Say your application manages access to buildings, consider how these scopes might be semantically different, or overlap with each other:

/buildings/122: building 122, regardless of the customer
/customers/1/buildings/122: building 122, but only if accessed through a user belonging to customer 1
/customers/1/buildings/: buildings assigned to customer 1 (which might include 122, among possible others)
**/buildings/***: all buildings, regardless of customer

When there exists more than one path to reference a particular entity, scope evaluation logic on the server needs to be careful to understand the relationships between entities and parse them accordingly.

Nested roles and permission wildcards

As your application grows more complex, it might be tempting to make your life easier by implementing wildcard permissions in addition to scopes (e.g. admin role would have a global wildcard * permission) or to nest roles so that the admin role might just be composed of the two moderator and user roles, and automatically inherit the union of the permissions in either.

Our opinionated recommendation would be to advise against doing so. Access control is one of the most security critical components in your app, and you want changes to it to be explicit and to avoid non-obvious side effects. Wildcard permissions also violate the principle of least privilege because new permissions are automatically included by the wildcard, even if they are not strictly necessary for that role.

The setup overhead of adjusting each role as necessary when adding new features/permissions to your application can be viewed as the lesser of two evils compared to having a permission change cascade to a seemingly unrelated role without your team realizing because of nested roles or wildcard permission evaluation.

In fact, it is best practice to regularly perform access reviews, not just during development. Roles, their permissions, and role assignments should be regularly evaluated and adjusted to ensure the least privilege.

A small amount of extra development time is cheap compared to a security breach!

Application Gateways

For small and medium sized applications, it’s often sufficient to simply perform authentication/authorization directly within the application.

However for larger services, it can be effective to setup a trust boundary and perform authentication/authorization verification at the gateway level.

In this pattern, traffic established behind the gateway to your internal application services is implicitly trusted as the gateway has already performed authentication and authorization. The internal services are not made available to the public (e.g. placed on a private VNet).

The application gateway serves as the central entrypoint for exposing your application services, and therefore authentication and authorization decisions (and monitoring) can be centralized in the gateway.

For more information on this pattern with Azure API Gateway, visit Authentication and authorization to APIs in Azure API Management and Connect to a virtual network using Azure API Management.

Authoring an RBAC API for your application