# kube-rebac-authorizer
Welcome to a prototype of how to integrate Kubernetes Authorization with Relation-Based Access Control (ReBAC)!
The core idea is that this project provides an Authorizer implementation that the Kubernetes API server can call via webhook, replacing the need for both the RBAC authorizer and the Node authorizer (deployed in practically all Kubernetes clusters), while offering a wider, general-purpose way to write authorization rules for your Kubernetes cluster or kcp-like generic control planes.
The work in this repository is described, and its core idea proposed, in a Cloud Native Rejekts talk from November 5, 2023.
I highly recommend watching this talk before reading the rest of the README: https://www.youtube.com/live/tWWBzsZLrIw?t=396
I also highly recommend checking out the talk presentation: https://speakerdeck.com/luxas/beyond-rbac-implementing-relation-based-access-control-for-kubernetes-with-openfga
## :star: Features
- Intuitive Integration with non-Kubernetes sources: Utilizing the policy data that you have specified in Kubernetes is great, but sometimes you have other sources of data affecting the authorization state, such as user groups in LDAP/GitHub/etc., cloud IAM roles, or team permissions in your own system's databases, all of which "speak different languages" when it comes to authorization. Using ReBAC, you now have a "lingua franca": you can one-way sync (project, map, reconcile) all your other authoritative permission and user data into the ReBAC graph data model, and make authorization decisions based on relations, without having to move the data from their "native places".
- Natively-supported Deny Roles: The Kubernetes authorizer framework itself supports denying specific requests, but RBAC is only additive. With the ReBAC approach, we can natively and intuitively integrate deny roles.
- Superset of RBAC: Everything you can do with RBAC today, you can do with ReBAC! This Authorizer is also compatible with Kubernetes RBAC types. However, you can also do much more than with plain RBAC.
- Inheritance: You can now specify, for example, that if a user can access a given API object in the Kubernetes API (e.g., a `Deployment`), then they can also access all the child `ReplicaSet`s and grandchild `Pod`s, without having to grant the user access to list all `ReplicaSet`s or `Pod`s. Now, this specific Deployment-ReplicaSet-Pod example is perhaps not very useful in practice, but what is, for example, is to grant a user or controller access only to such `Secret`s that given, say, `Ingress` or `Gateway` objects reference, and not all `Secret`s cluster-wide (a really serious privilege if the used credential is compromised, or for confused-deputy scenarios).
:handshake: IMPORTANT: While this project demonstrates a re-implementation of the RBAC and Node authorizers in Kubernetes, the aim is not to make those obsolete in any way. The aim is to demonstrate how the concepts relate, showcase that ReBAC can offer what we already have, and more, as well as experiment with making Kubernetes' (in-core and out-of-core) authorization primitives even better in the future.
:warning: WARNING: This is a prototype which, although close, is not (at the time of writing) 100% feature-compatible with either the RBAC or the Node authorizer, and this project is not in any way production-ready. Use it at your own risk.
## Looking for feedback!
The goal here is to discuss in the Cloud Native / Kubernetes community how we can apply the great recent developments in open source ReBAC projects.
:rocket: Get in touch! I'm looking for feedback on the approach. I'm planning to bring this to the Kubernetes Special Interest Group for Authorization soon.
You can open an issue, star :star: this repo so I see your interest, or reach out to me on Twitter (I'm @kubernetesonarm) or the CNCF or Kubernetes Slacks (I'm @luxas). Looking forward to speaking with you!
## What is Relation-Based Access Control?
ReBAC is an evolution of Role-Based Access Control and Attribute-Based Access Control, and was popularized by the Google Zanzibar paper. The core idea is that the authorization state (who X can do Y on resource Z) is modelled as a graph with nodes (for example, users and documents) and relations (directed edges) between them (a user named `lucas` is related to a document named `secret` through the relation `read`):
```mermaid
graph LR
  lucas["user:lucas"] -- read --> secret["document:secret"]
```
A ReBAC service, such as the CNCF project OpenFGA used here, provides an API for reading and writing the authorization state (what nodes and edges there are), for defining a schema/model for how the graph can be constructed, and for querying the graph (e.g., "does this user node have access to this document node?" or "what documents does this user have access to through this relation?").
ReBAC can be an effective tool in combating the Roles Explosion phenomenon often seen in RBAC setups: while in theory it is possible to specify very fine-grained RBAC definitions, in practice the sheer number of roles required limits the usefulness of RBAC deployments. In order to understand the true usefulness of ReBAC, however, let's dive into how the authorization state is specified!
### Authorization Model
Each node and relation has a specific type; e.g., a node representing a user named lucas could be named `user:lucas`, where `user` is the node type and `lucas` is the node identifier. Furthermore, a node of a certain type can be related to nodes of other types through specific types of relations (e.g., `read`, `write`, `admin`, and so forth), as described by the authorization model. You can think of the authorization model as schema validation (e.g., OpenAPI / JSON Schemas) for the graph, defining what can relate to what, and how.
TODO: Insert picture from the slides on the type model.
### Definition Schema (Tuples)
The graph describing the authorization state is made up of a list of Tuples.
A Tuple describes a direct edge in the graph of the form `{source_node, relation, destination_node}`. The source node is often referred to as the subject node or user node of the tuple, and the destination node as the object node.
Note: The fact that relations have a direction means that even though `user:lucas` is related to `document:secret` through the `access` relation, there is no way for anyone related to `document:secret` to traverse towards the `user:lucas` node (unless another relation from document to user is added, of course). This has direct implications for parent-child relationships: if you want to model that someone with access to a parent (something general) should be able to access child nodes as well (something specific), then you create a relation from the parent to the child; another relation from child to parent is needed if you want to model that people with access to the child (something specific) should also get access to the parent (which is more general).
There are two ways to specify the subject node(s) of a Tuple (a small code sketch follows the list):

- Direct Node Assignment: The straightforward case, e.g., `user:lucas` is related to `document:secret` through the `access` relation.
- UserSet through Relation Assignment: This matches a set of subject nodes to bind the tuple to, and is a powerful way to describe more advanced, propagating relationships. For example, given:
  - a Tuple from `user:alice` to `folder:clients` through the `owner` relation, and
  - a Tuple from `user:bob` to `folder:clients` through the `owner` relation,
  - we can create a Tuple from `folder:clients#owner` to `document:customercase` through the `editor` relation. Notable here is that the UserSet `folder:clients#owner` will resolve to both `user:alice` and `user:bob` without ever creating direct edges between `user`s and `document`s. The power of the graph becomes unlocked!
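To make the tuple shape concrete, here is a minimal Go sketch of the example above (the `Tuple` type and the names in it are illustrative only, not this project's actual API):

```go
package main

import "fmt"

// Tuple is a direct edge in the authorization graph:
// {source_node, relation, destination_node}.
type Tuple struct {
	Subject  string // a node ("user:alice") or a UserSet ("folder:clients#owner")
	Relation string // the edge type, e.g., "owner" or "editor"
	Object   string // the destination node, e.g., "document:customercase"
}

func main() {
	tuples := []Tuple{
		// Direct node assignments: alice and bob own the clients folder.
		{"user:alice", "owner", "folder:clients"},
		{"user:bob", "owner", "folder:clients"},
		// UserSet assignment: whoever owns folder:clients is an editor of
		// the document; no direct user->document edges are created.
		{"folder:clients#owner", "editor", "document:customercase"},
	}
	for _, t := range tuples {
		fmt.Printf("{%s, %s, %s}\n", t.Subject, t.Relation, t.Object)
	}
}
```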
### Evaluation Schema (Rewrites)
Although UserSets are powerful, as seen above, sometimes they are not quite enough. With the most recent example, we could say that a set of users can relate to one document through a certain relation. But if we want to give the owners of the clients folder (`folder:clients#owner`) access to another document, say `document:marketingplan`, we would need another Tuple, as follows: `{folder:clients#owner, editor, document:marketingplan}`. This needs to be done because we otherwise don't know which documents belong to which folders. However, this gets tedious quickly.
The solution to this is to define rules, rewrites, over the defined tuple data, that are computed at evaluation (query) time, e.g., when issuing a `Check` request asking "is this subject node related to this object node through this relation?", or in simpler terms, "can this actor perform this action on this object?"
We'd like to just define relations between `user`s and `folder`s, and between `folder`s and `document`s, and then dynamically resolve the (possibly recursive) relations between user -> folder -> document. We can do this by specifying simple tuples like `{user:alice, owner, folder:clients}` and `{folder:clients, contains, document:customercase}`, and defining rules for how the ReBAC engine shall resolve (or rewrite) these edges. Notably, at this point, there is no relation between `user` and `document` nodes at all (only between `user` and `folder`, and between `folder` and `document`).
The Zanzibar paper proposes several rewrites, one of which is the Tuple to UserSet rewrite. With the most recent example, we could define a rule on the `document` type which says: "any subject node that is an `owner` of a `folder` node that I am connected to through the `contains` relation is related to me as `editor`". In this case, the rewrite creates a path between `user` and `document` nodes. The name comes from the fact that, in this case, the `{folder:clients, contains, document:customercase}` Tuple with subject type `folder` is traversed "backwards" (or forwards, depending on how you think about it) in order to yield a UserSet of the subject types of the `owner` relation (here, `user`).
There are other rewrites available as well, but surprisingly few at the end of the day, given that Google uses Zanzibar for authorization in all of their services TODO link. The other ones are:
- Computed UserSets: Recall that a UserSet represents a set of subject nodes that are bound to a given node through a given relation. Let's say `user:alice` and `user:bob` are bound to `folder:clients` as `owners`. Now, if we want to define another relation, `viewer`, for folders, it would be tedious (and error-prone) to specify that, indeed, `user:alice` and `user:bob` are bound to `folder:clients` as `viewer` too, as the client application expects that owners can indeed view. Furthermore, consistency issues could arise if there are two or more Tuples, where some tuples are "derived" and need to be atomically created or deleted when the "defining" tuple is. In this case, we can assign the computed UserSet `owners` on `folder` to the `viewer` relation on `folder`. So whenever the ReBAC engine evaluates who is bound to the `viewer` relation, all subject nodes of the `owners` relation are added to the search (see the sketch after this list).
- Set operations: Union (`or`), Intersection (`and`), and Exclusion (`not`) operations can be applied to UserSets at evaluation time.
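Continuing the toy resolver from above, a Computed UserSet combined with a union could be expressed like this (illustrative only):

```go
// checkViewer shows a Computed UserSet combined with a set-operation
// union: someone is a viewer of a folder if they are directly assigned
// as viewer OR if they resolve through the owners relation. No derived
// viewer tuples are ever stored.
func checkViewer(tuples []Tuple, subject, folder string) bool {
	return hasTuple(tuples, subject, "viewer", folder) ||
		hasTuple(tuples, subject, "owners", folder)
}
```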
## Kubernetes Authorization Model
### Kubernetes API server
The Kubernetes API server is an extensible, uniform, and declarative REST API server. Let's look at the implications of this:

- Declarative: The user-declared desired "end state" of the system is stored, not just actions for the user to invoke to reach some end state. For example, instead of exposing a `POST /resource/start` method, which would be imperative, the API server lets the user `POST` or `PUT /resource` with a body of `{"spec":{"shouldBeRunning": true}}`, where, by convention, the `spec` object contains the user-defined desired state.
- Uniform: Every type in the API server must
  - be part of an API group, which logically groups a set of API types evolving together (e.g., `rbac.authorization.k8s.io`),
  - have an explicit API version, specifying the schema of the JSON object to use for encoding and decoding (e.g., `v1beta1` or `v1`),
  - reserve a Kind name which is unique in the API group (e.g., `Role` or `RoleBinding`), and a resource name which is commonly the Kind name in lower-case and plural (i.e., `roles` or `rolebindings`),
  - always encode the API Group, Version, and Kind information, so that a reader/decoder of the object in encoded form knows exactly how to decode it, without relying on intrinsics,
  - be either namespaced or not; a namespace allows logically segregating sets of API objects of a particular type from each other, e.g., for policy enforcement (`team-1` members shall not see resources in the `team-2` namespace),
  - have a user-settable name field, an identifier that is unique API server-wide or namespace-wide, depending on the namespacedness, and
  - have a user-settable namespace field, if the type is namespaced.

  This uniformity allows the API server to provide uniform REST paths for all objects. Here is the template for the paths supported by it:

  - For API server-wide resources:
    - Create a new API object (`create`): `POST /apis/<apiGroup>/<apiVersion>/<resource>`
    - List API objects (`list`): `GET /apis/<apiGroup>/<apiVersion>/<resource>`
    - Watch API objects (`watch`): `GET /apis/<apiGroup>/<apiVersion>/<resource>?watch=1`
    - Get an API object with a given name (`get`): `GET /apis/<apiGroup>/<apiVersion>/<resource>/<name>`
    - Update an API object with a given name (`update`): `PUT /apis/<apiGroup>/<apiVersion>/<resource>/<name>`
    - Patch an API object with a given name (`patch`): `PATCH /apis/<apiGroup>/<apiVersion>/<resource>/<name>`
    - Delete an API object with a given name (`delete`): `DELETE /apis/<apiGroup>/<apiVersion>/<resource>/<name>`
  - For namespaced resources:
    - Exactly the same logic as for API server-wide resources, but the base path is `/apis/<apiGroup>/<apiVersion>/namespaces/<namespace>/<resource>`
- Extensible: Kubernetes users can customize the system in almost any way imaginable. One can define new types for the API server to serve just by providing the above-mentioned required data for each type, and preferably a JSON schema, and then Kubernetes will start serving your custom types on demand. This concept is called CustomResourceDefinitions TODO link. Kubernetes also provides pluggable authentication, authorization, pre-storage API object mutation/defaulting (mutating admission controllers), validation hooks (validating admission controllers), and a whole paradigm of "control through choreography" TODO link, where controller loops work together to turn the user's desired state into actual state.
### Kubernetes Authentication
One of the first stages for an API server request is authentication, where some user-provided credential (often a client certificate or a JWT token) is resolved into what Kubernetes calls `UserInfo` (sketched in Go below), with the following fields:

- a username,
- a set of groups,
- a UID,
- and possibly a map of extra string-string values for custom parameters.
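In Go terms, the resolved identity carries roughly these fields (a simplified stand-in mirroring Kubernetes' `UserInfo`, not the actual upstream type):

```go
// UserInfo is a simplified mirror of the identity attributes a
// Kubernetes authenticator produces for a request.
type UserInfo struct {
	Username string              // e.g., "system:node:foo-node"
	Groups   []string            // e.g., ["system:nodes", "system:authenticated"]
	UID      string              // a unique identifier for the user
	Extra    map[string][]string // custom string-string parameters
}
```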
### Kubernetes Authorization
Authorization is executed after the authentication stage, but before the HTTP request body (if any) is deserialized. In other words, at this stage, authorization decisions cannot be made based on the user-provided data, only based on metadata. Additionally, the current API server data is also not available for making decisions.
Essentially, the parameters defined in the API server section, together with the user info, make up the attributes that are at the Authorizer's disposal for producing a response.
Kubernetes also provides an API, `SubjectAccessReview`, which allows an actor to check if another actor is authorized to perform a given action.
Kubernetes supports multiple Authorizers, which are asked serially from the first to the last. Each Authorizer can produce an `Allow`, `Deny`, or `NoOpinion` response. If `Allow` is returned, the API server short-circuits and proceeds to the next stage of API request processing. If `NoOpinion` is returned, the API server asks the next Authorizer in the chain, or denies the request if there is no further authorizer to ask. If `Deny` is returned, the API server short-circuits as well, and denies the request. In summary, order matters in the authorizer list.
The de-facto default authorizer chain consists of the built-in Node and RBAC authorizers. Let's cover them next.
### Kubernetes' RBAC Authorizer
Kubernetes implements declarative Role-Based Access Control by defining the `ClusterRole`, `Role`, `ClusterRoleBinding`, and `RoleBinding` types, where the `Cluster` prefix indicates that the role or binding applies cluster-wide, and its absence means that the role or binding applies in a given namespace.
RBAC rules are purely additive; in other words, the RBAC authorizer only ever returns `Allow` or `NoOpinion`, never `Deny`.
A (Cluster)RoleBinding defines a mapping between a set of users or groups (matching the username and groups of `UserInfo`) and the one (Cluster)Role they are bound to.
A (Cluster)Role defines a set of policies for allowed requests to match. Each policy defines:

- a set of apiGroups to match (any of them),
- a set of resources to match (any of them), and
- a set of verbs to match (any of them).

An allowed request must contain a matching apiGroup, resource, and verb. Optionally, a rule can contain a set of resourceNames to match as well, in which case the request's name must be in that list too (see the sketch below).
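A sketch of the per-rule matching in Go (simplified; the real evaluation also handles wildcards, subresources, and non-resource URLs):

```go
// PolicyRule is a simplified RBAC rule.
type PolicyRule struct {
	APIGroups     []string
	Resources     []string
	Verbs         []string
	ResourceNames []string // optional: empty matches any name
}

func has(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// ruleMatches reports whether one rule allows the request attributes:
// the apiGroup, resource, and verb must each match, and if the rule
// names specific resources, the request's name must be among them.
func ruleMatches(r PolicyRule, attrs Attributes) bool {
	return has(r.APIGroups, attrs.APIGroup) &&
		has(r.Resources, attrs.Resource) &&
		has(r.Verbs, attrs.Verb) &&
		(len(r.ResourceNames) == 0 || has(r.ResourceNames, attrs.Name))
}
```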
The RBAC authorizer is built into the API server in Go code, but it receives RBAC objects through a regular client issuing REST requests (i.e., it does not "bypass" the normal request path), so it is topologically the same as running the RBAC authorizer outside the API server binary, except that in-binary there is no need to issue an HTTP request to query for the authorization decision, which is a necessity out-of-tree. The RBAC authorizer makes decisions based on the RBAC API objects it has cached in memory (which are updated through a standard API server watch TODO link).
Notably, the RBAC storage layer (not the authorizer) prevents privilege escalation when creating/updating Roles and RoleBindings by making sure the user trying to change a policy already has access to all the policies that would be applied to the new role or binding.
RBAC supports aggregated ClusterRoles (such that one ClusterRole can inherit policies from others), but this is done through a controller copying the source PolicyRules into the aggregated ClusterRole's Rules. With ReBAC, we can instead just create an edge from one ClusterRole to another without further ado.
Additionally, RBAC is extensible, and allows users and programs to define custom verbs in addition to the ones used by the API server. This can be used in conjunction with the `SubjectAccessReview` API, where operators, admission controllers, or extensions can delegate an authorization decision (e.g., whether a user is able to set a specific field) to the authorizers already used by the API server, for which the user already has a defined authorization state.
### Kubernetes' Node Authorizer
API objects in Kubernetes follow the "control through choreography" design pattern, where the idea is to design the types such that they can be actuated or reconciled (the user's desired state is turned into the actual state of the system through controller actions) independently, thus favoring multiple semi-independent objects over one big, conflated object. The coupling between API objects is loose, meaning that, unlike in databases, missing references between objects can momentarily exist (the idea is that, while the link and the referenced object might, out of necessity, be created out of order, they will eventually both appear).
Kubernetes being a highly dynamic system poses a challenge to RBAC: sometimes one wants to provide the policy in a relation/graph-based manner, as one might not know the API object names beforehand. For example, the Pod definition has a `nodeName` field, which the Kubernetes scheduler sets once it has found a suitable node to schedule the workload on. The node agent, the kubelet, needs, by necessity, to be able to read the Pod specification and the objects the Pod needs for execution (Secrets, PersistentVolumes, etc.) in order to know which containers to run and how. However, it is not ideal to let every node agent read every Pod, Secret, PersistentVolume, etc., as then it is enough to hack one Node to leak all Secrets. While RBAC allows specifying `resourceNames`, which in theory would make it possible to assign an individual Role per Node, listing the Pods, Secrets, etc. it may access, such a design is not ideal. As we don't know the names of the Pods and Secrets up front, a controller would need to update the RBAC roles dynamically, but the churn/write rate of such a ClusterRole would be relatively high, resulting in a lot of conflicts. Furthermore, instead of relying on a graph-based design that specifies, e.g., that one Pod references two Secrets, the whole tree of linked resources would need to be flattened into the ClusterRole.
Kubernetes has solved this in a specific way through the Node authorizer, built in Go code directly into the API server. The Node authorizer watches Node, Pod, Secret, etc. resources and builds a graph in-memory, from which it then computes authorization decisions.
## Kubernetes + ReBAC = possible?!?
The idea with this project is to generalize both the RBAC and Node authorizers into one, by using a Zanzibar-inspired ReBAC implementation. For this PoC, I chose to integrate with OpenFGA, a CNCF project implementing Zanzibar-style fine-grained authorization.
Further, I want to explore the possibilities that using a native, general-purpose, graph-based authorization engine brings to the table.
But first, how can we replicate the existing behavior?
### Mapping RBAC API Objects to Tuples
We need to define Zanzibar node types and relations between the nodes from the Kubernetes types. In addition, we need to create a mapping function from an API object of a certain kind to a list of Tuples.
Let's see an example of how to map RBAC API objects to Tuples, and what kind of authorization model we'd need (a code sketch follows the lists below):
`ClusterRoleBinding`:

- Node type: `clusterrolebinding`
- Node ID: As the `ClusterRoleBinding` is cluster-scoped, we can use the name directly, as it is guaranteed to be unique in the cluster. We might need to escape some characters, depending on the ReBAC implementation.
- Per subject: a tuple from `user:<subject-username>` or `group:<subject-groupname>` to `clusterrolebinding:<name>` through an `assignee` relation.
- As `ClusterRoleBinding` to `ClusterRole` is a 1:1 mapping, one tuple of the form: `clusterrolebinding:<name>#assignee` to `clusterrole:<clusterrole-name>` through the ClusterRole's `assignee` relation.

`ClusterRole`:

- Node type: `clusterrole`
- Node ID: As the `ClusterRole` is cluster-scoped, we can use the name directly, as it is guaranteed to be unique in the cluster. We might need to escape some characters, depending on the ReBAC implementation.
- Loop over each policy rule, the rule's `apiGroups`, the rule's `verbs`, and the rule's `resources`, and:
  - If the rule is for a collection (the resourceNames list is empty): create a Tuple from `clusterrole:<name>#assignee` to `resource:<apiGroup>/<resource>` through the `<verb>` relation.
  - If the rule is for individual, named API objects: create a Tuple from `clusterrole:<name>#assignee` to `resourceinstance:<apiGroup>/<resource>/<object-name>`, which is a more specific/fine-grained type carrying three (not two) pieces of data, compared to `resource`.
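A sketch of that mapping in Go, reusing the `Tuple` type from earlier (the shapes and helper names here are illustrative, not this project's actual code):

```go
// Subject is a simplified RBAC subject: Kind is "User" or "Group".
type Subject struct{ Kind, Name string }

// ClusterRoleBinding is a simplified shape of the RBAC object.
type ClusterRoleBinding struct {
	Name     string
	Subjects []Subject
	RoleName string // the referenced ClusterRole
}

// tuplesForClusterRoleBinding maps one ClusterRoleBinding to its tuples.
func tuplesForClusterRoleBinding(crb ClusterRoleBinding) []Tuple {
	var out []Tuple
	binding := "clusterrolebinding:" + crb.Name
	for _, s := range crb.Subjects {
		subject := "user:" + s.Name
		if s.Kind == "Group" {
			// bind the group's members UserSet rather than the group node
			subject = "group:" + s.Name + "#members"
		}
		out = append(out, Tuple{subject, "assignee", binding})
	}
	// The binding-to-role mapping is 1:1: everyone assigned to the
	// binding becomes an assignee of the ClusterRole.
	out = append(out, Tuple{binding + "#assignee", "assignee", "clusterrole:" + crb.RoleName})
	return out
}
```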
The authorizer would now work along the lines of the following (see the sketch after this list):

- Issue one check request per piece of data in `UserInfo`, as the user could be allowed through its name node or through any of the group nodes it belongs to, where
- the target object (which we check whether the user has a path in the graph to) would be `resource:<apiGroup>/<resource>` if no name is specified (which is the case for, e.g., a `list` of all Pods), or `resourceinstance:<apiGroup>/<resource>/<object-name>` if the name is specified (e.g., `update` Pod with name `foo`), and
- the relation would be the verb of the request, e.g., `get`, `list`, `create`, etc. (the API server request types are listed above).
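In the same sketched Go terms, deriving the check target from the request attributes might look like this (illustrative; a `fmt` import is assumed):

```go
// checkTarget computes the object node and relation for a Check request
// from the Kubernetes request attributes.
func checkTarget(attrs Attributes) (object, relation string) {
	if attrs.Name == "" {
		// collection request, e.g., list all Pods
		object = fmt.Sprintf("resource:%s/%s", attrs.APIGroup, attrs.Resource)
	} else {
		// request against a single named object, e.g., update Pod "foo"
		object = fmt.Sprintf("resourceinstance:%s/%s/%s",
			attrs.APIGroup, attrs.Resource, attrs.Name)
	}
	// the relation is simply the request verb, e.g., "get" or "list"
	return object, attrs.Verb
}
```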
This works fairly well, and is a good start for re-implementing RBAC with ReBAC. However, Kubernetes RBAC allows defining wildcards, such as `verbs=*` or `resources=*`, which this implementation does not currently support. Thus, enter contextual check tuples.
### Providing Context for Check Requests
Sometimes it is useful to provide contextual tuples when checking whether there is a path in the graph between two nodes, especially when, in the name of resource efficiency, not all possible nodes and edges have been created (e.g., why create a controller that watches every single object in the cluster and creates one graph node for each, if all authorization policies are based on collections and not instances?).
In our example above, in order to support the wildcard, we can send a contextual tuple (that is never saved to the database) with the check request that "gets us the last mile" in the graph: from a wildcard node that might exist in the graph, e.g., `resource:apps.*` (which would allow the user to access any resource in the apps API group), to the specific node that is part of the check request, e.g., `resource:apps.deployments`. As we want to create this "forwarding rule" for any verb, we create a new `resource` to `resource` relation (e.g., with the name `wildcardmatch`), and specify a "Tuple to UserSet" rewrite (see above for the definition) such that if the user has, e.g., `get` access on `resource:apps.*`, then they also have `get` access on `resource:apps.deployments`, as long as the contextual tuple relating `resource:apps.*` to `resource:apps.deployments` through `wildcardmatch` is present.
Furthermore, we can have an `anyverb` relation on the `resource` type, which corresponds to `verb=*`, for which we specify a "Computed UserSet" for all the other verbs, such that a user is considered to be related to `resource:<id>` as `get` if they are directly related, related through the wildcard match, or related through the `anyverb` relation.
A similar thing can be done if we have a `get` request for a specific API object (e.g., Deployment `foo`). If the user can get all Deployments in the cluster, then surely they can get that one specific Deployment. As the target node is now more specific (`resourceinstance:apps/deployments/foo` through `get`), there needs to be contextual "forwarding" from `resource:apps.deployments` to `resourceinstance:apps/deployments/foo`, using a contextual tuple and a Tuple to UserSet rewrite similar to `wildcardmatch` earlier.
Finally, in order to reduce the number of check requests from the Authorizer to the ReBAC engine, we can contextually bind the user to its groups for that specific check request, and thus make do with only one check request. This can be done without Kubernetes knowing, in general, which users belong to which groups; we only know, for this specific authorization request, that the user with this name belongs to these groups, as that info comes from the authenticator. Thus, a contextual tuple is enough to implement this.
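Putting these pieces together, the contextual tuples sent along with a single check request could be assembled roughly like this (an illustrative sketch; the relation name `instancematch` for the collection-to-instance forwarding is made up here, as the text above only names `wildcardmatch`):

```go
// contextualTuples builds per-request tuples that are evaluated together
// with the stored graph, but never written to it.
func contextualTuples(attrs Attributes) []Tuple {
	collection := fmt.Sprintf("resource:%s.%s", attrs.APIGroup, attrs.Resource)
	tuples := []Tuple{
		// wildcard forwarding, e.g., resource:apps.* -> resource:apps.deployments
		{fmt.Sprintf("resource:%s.*", attrs.APIGroup), "wildcardmatch", collection},
	}
	// Bind the user to its groups for this request only: the group info
	// comes from the authenticator, so no group tuples need to be stored.
	for _, g := range attrs.User.Groups {
		tuples = append(tuples, Tuple{"user:" + attrs.User.Username, "members", "group:" + g})
	}
	if attrs.Name != "" {
		// collection-to-instance forwarding for requests on named objects
		instance := fmt.Sprintf("resourceinstance:%s/%s/%s",
			attrs.APIGroup, attrs.Resource, attrs.Name)
		tuples = append(tuples, Tuple{collection, "instancematch", instance})
	}
	return tuples
}
```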
### Mapping Node Authorizer API Objects to Relations
With the above, we have implemented most (though not all) of the RBAC functionality in pseudocode. Let's proceed to sketch how the Node authorizer can be reimplemented. First and foremost, just like the Node authorizer, we need to list and watch all Nodes, Pods, and other related resources, create one node in the graph for each API object, and create edges between related objects.
Let's look at a couple of API objects and what information they contain:
- `Node`: The Node object does not know which Pods are scheduled to it, but it has a name, and the convention used by Kubernetes is that the node agents (kubelets) have the username `system:node:<node-name>` after the authentication phase. Let's give the node in the graph the type name `core.node`, for example. Thus, we should make a Tuple between `user:system:node:<node-name>` and `core.node:<node-name>`, e.g., for the relations `get`, `update`, and optionally a special `get_linked_resources`.
- `Pod`:
  - To `Node`: The Pod carries information about which Node it runs on in the `spec.nodeName` field. Therefore, we'll make an incoming edge, i.e., `core.node:<spec.nodeName>` to `core.pod:<pod-namespace>/<pod-name>`, through a relation called, e.g., `node_to_pod`. Notably, the Pod is namespaced, so we can only fully identify it by specifying both namespace and name. Now, through a Tuple to UserSet rewrite, we can specify that if a user is related to the Node as `get_linked_resources`, then it should also be related to the Pod as `get_linked_resources` (we just "extend" or "duplicate" the relation further along the graph). Through a Computed UserSet on the `core.pod` type, we can then specify that all users related through the `get_linked_resources` relation on the `core.pod` type are also related as `get`.
  - To `Secret`: The Pod can contain a set of references to Secrets in environment variable or volume definitions. We can think of these relations as outgoing from the Pod. A sample tuple would be `core.pod:<pod-namespace>/<pod-name>` to `core.secret:<pod-namespace>/<linked-secret-name>` as `pod_to_secret`. Notably, the Pod and Secret must reside in the same namespace. Again, we can propagate the `get_linked_resources` relation forward, and point the `core.secret`'s `get` relation to it.
- `Secret`: Does not contain any bindings to other resources. Thus, we don't need to watch for or create any tuples based on Secrets.
With this setup, the node user `user:system:node:<node-name>` is related as `get` to all the `core.node`, `core.pod`, and `core.secret` resources it needs to have access to, just like with the Node authorizer. The ReBAC authorizer would either create a contextual forwarding tuple from, e.g., `resourceinstance:core.pods/foo` to `core.pod:<namespace>/foo`, or perform two check requests if it knows that this Kubernetes type is "fine-grained".
### Generically Building an Authorization Model
Now, what similarities are there between the above mappings, from Kubernetes API objects and authorization styles to the "ReBAC way"?
Each Kubernetes API kind/resource should define the following (see the sketch after this list):

- Its type name in the Zanzibar authorization model.
- How to extract the node ID from the API object (this is pretty much always just the Kubernetes name, possibly with the namespace, as that is the "primary key" used for cross-object references).
- A list of incoming direct relations:
  - For the `ClusterRoleBinding`, these are the mappings from its subjects to itself; for the `Pod`, from the Node to itself, and so on.
  - We need to know the subject node type and how to extract the node ID from the Kubernetes API object being processed, and optionally the name of the UserSet relation to bind to (e.g., if a `ClusterRoleBinding` has a group with name `foo` as subject, it would relate to `group:foo#members`, where members is the UserSet relation being bound, which in turn resolves to `user` nodes, so a check request reaches all the way from `user` to `resource`, or similar).
  - Further, we need to define what type of relation the dynamically computed subject node and the API object node being processed should have.
- A list of outgoing direct relations:
  - E.g., for the `ClusterRole`, the relations from itself to `resource` or `resourceinstance` nodes; or for the `Pod`, from itself to the Secrets it refers to.
  - Similarly, the object node type needs to be known, and the object node ID is computed dynamically based on the API object (e.g., the names of the referenced Secrets can be acquired from the Pod's environment variable and volume definitions).
  - We need to define whether it is the processed node itself that should be the subject, or one of its UserSet relations (e.g., in `pod_to_secret`, the Pod itself is the subject, but in the case of a ClusterRoleBinding, its `assignee` UserSet is the subject).
  - Finally, the relation the processed node and the computed object node should have.
- A set of rewrite rules for the type's relations:
  - For example, a user is bound to the `resource` node if it is directly bound to the specific verb, through a wildcard relation, or through the `anyverb` relation.
Given this information, an authorization model can automatically be generated in a way that, e.g., OpenFGA understands.
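A Go sketch of what such a per-kind contract could look like (purely illustrative of the shape; this is not the project's actual API):

```go
// KindMapping describes how one Kubernetes kind projects into the graph.
type KindMapping struct {
	TypeName string               // Zanzibar node type, e.g., "clusterrolebinding"
	NodeID   func(obj any) string // extract the node ID, usually name or namespace/name

	Incoming []RelationMapping // e.g., binding subjects -> this object
	Outgoing []RelationMapping // e.g., this object -> referenced Secrets
	Rewrites []string          // rewrite rules for this type's relations
}

// RelationMapping describes one direct relation to or from the object.
type RelationMapping struct {
	PeerType   string                 // node type at the other end of the edge
	PeerIDs    func(obj any) []string // compute the peer node IDs from the object
	ViaUserSet string                 // optional UserSet to bind, e.g., "assignee"
	Relation   string                 // relation name for the generated tuples
}
```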
### Reconciling Tuples in a Generic Way
Furthermore, these mapping rules are more or less declarative, given that we can use something like the Common Expression Language TODO link for writing the mapping functions that extract the node ID strings from the API object. With that, we can create a declarative CustomResourceDefinition API that lets users define their desired graph-based authorization logic without:

- writing Go code,
- creating yet another webhook authorizer, with its own deployment challenges, that multiplexes requests to other authorizers (as Kubernetes currently supports only one webhook authorizer),
- writing their own graph engine like Kubernetes did, or
- trying to figure out a scalable data model for authorization data (Zanzibar already provides one that empirically works).
One important aspect of reconciling the tuples from the Kubernetes API (which is authoritative in the case of RBAC and Node authorization, although in general the source could be other places too) into a ReBAC engine like OpenFGA is how to detect deletions in (or "shrinking of") the desired state.
Consider that you had a ClusterRoleBinding `foo` with two subjects, `user:lucas` and `group:admins`, pointing to the ClusterRole `bar`. This will generate three tuples, which are applied on the first reconciliation:

- `{user:lucas, assignee, clusterrolebinding:foo}`
- `{group:admins#members, assignee, clusterrolebinding:foo}`
- `{clusterrolebinding:foo#assignee, assignee, clusterrole:bar}`
Now, imagine that, as the ClusterRoleBinding's subjects are mutable, someone goes and deletes the `user:lucas` subject. On the next reconciliation, the following Tuples are generated:

- `{group:admins#members, assignee, clusterrolebinding:foo}`
- `{clusterrolebinding:foo#assignee, assignee, clusterrole:bar}`
This means that, even though the ReBAC engine provides an idempotent "apply tuples" API, that is not enough. We need to inspect the tuples bound to the ClusterRoleBinding for the relations it defines or "owns", and compute which Tuples currently existing in the graph should be deleted.
Why is it important to specify which relations a Kubernetes API kind "owns" in the graph, and why did we mention that some relations are "incoming" and some "outgoing"? Imagine that we reconcile the ClusterRole `bar`, which lets subjects get deployments. It generates the following tuple, which is applied on the first reconciliation:

- `{clusterrole:bar#assignee, get, resource:apps.deployments}`

The second time we reconcile, say, the ClusterRole has not changed. In order to be able to process deletes, as highlighted above, if we now read all tuples touching the `clusterrole:bar` node in the graph, we get:

- `{clusterrolebinding:foo#assignee, assignee, clusterrole:bar}`
- `{clusterrole:bar#assignee, get, resource:apps.deployments}`
Oh, we got the `clusterrolebinding` -> `clusterrole` tuple as well! However, the ClusterRole has no knowledge of which bindings are bound to it, so it cannot generate tuples for those. Thus, unless we filter out the `clusterrolebinding` -> `clusterrole` tuple, due to that relation not being "owned" by the ClusterRole API object, we would either never be able to delete stale tuples, or we would delete all tuples unknown from a given type's perspective.
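The delete detection then boils down to a set difference over the relations the kind owns; a sketch in Go (keying ownership by relation name alone is a simplification here, a real implementation would likely also consider the edge direction):

```go
// diffOwnedTuples compares the tuples an object should generate (desired)
// against the tuples read back from the graph (existing), ignoring
// relations the kind does not own, and returns what to write and delete.
func diffOwnedTuples(desired, existing []Tuple, ownedRelations map[string]bool) (writes, deletes []Tuple) {
	want := map[Tuple]bool{}
	for _, t := range desired {
		want[t] = true
	}
	have := map[Tuple]bool{}
	for _, t := range existing {
		// Skip foreign tuples, e.g., the clusterrolebinding -> clusterrole
		// edge when reconciling the ClusterRole itself.
		if !ownedRelations[t.Relation] {
			continue
		}
		have[t] = true
		if !want[t] {
			// stale: present in the graph but no longer in the desired state
			deletes = append(deletes, t)
		}
	}
	for _, t := range desired {
		if !have[t] {
			writes = append(writes, t)
		}
	}
	return writes, deletes
}
```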
TODO: talk about Kubernetes finalizers and whole object deletions.
## Deployment Topologies
This project consists of two parts:

- an authorizer, which implements the Kubernetes SubjectAccessReview Webhook interface, and
- a controller, which maps RBAC- and Node-related API objects to tuples, and reconciles the tuples into OpenFGA.

At the time of writing, the mapping policy is statically defined, but the idea is to create a CRD API for defining the authorization model and tuple mappings.
TODO: Add picture.
The supported deployment topologies are:

- Controller, Authorizer, and OpenFGA as separate binaries
- Controller and Authorizer in the same binary, OpenFGA as a separate binary
- Controller, Authorizer, and OpenFGA all in the same binary
- kcp, Controller, Authorizer, and OpenFGA all in the same binary, for a generic control plane implementation with generic graph-based authorization, just like Kubernetes + Node + RBAC authorizers, but more extendable
OpenFGA can be deployed with persistent storage (a SQL database, e.g., MySQL or Postgres) or with an in-memory graph. If

- the tuples are generated from another consistent source like Kubernetes (and can thus be recomputed from scratch at any time),
- the graph is small enough to fit in memory (it is, at least, for all existing Kubernetes installations), and
- some downtime can be tolerated when it needs to be upgraded and rebuilt from scratch,

then the in-memory backend can be a good alternative. In fact, this is exactly what Kubernetes does with the RBAC and Node authorizers: the authoritative data is stored in a consistent store (etcd) and then replicated to in-memory caches in the authorizers, for fast response times and availability, trading away consistency as per the CAP theorem. In the case of a partition, the authorizer would happily respond based on potentially stale policy rules. But in this case, as the authorizer is coupled with the API server, the API server would be part of the partition as well, so the request fails regardless.
## Running the demo
### Setup
- Start an instance of OpenFGA locally, serving the gRPC port on `localhost:8081`.
- Start etcd and API server containers locally using Docker Compose, with the following configuration:
  - Authentication is statically defined using the static-tokens authenticator, and there are three users:
    - one `superuser` user, which can do anything, as it bypasses authorization completely by being in the `system:masters` group,
    - one `normal-user` user, which has no privileges from the start, letting us demo the RBAC authorizer reimplementation, and
    - one `system:node:foo-node` user, which lets us demo the Node authorizer reimplementation.
  - No RBAC or Node authorizers, only `Webhook` mode, which calls our ReBAC authorizer. The API server does not cache any authz requests.
  - The API server serves HTTPS requests on `localhost:6443`.
- Start the `rebac-authorizer` binary using the `deploy/default-config.yaml` config file and the kubeconfig `deploy/superuser-kubeconfig.yaml`.
  - Being a superuser allows it to apply the bootstrap RBAC rules that Kubernetes normally deploys by default for an empty cluster, so the components are authorized as expected. The access could indeed be more limited, but there is a chicken-and-egg problem: the rebac-authorizer would deny its own request for reading the RBAC rules on first startup, when there are no tuples in the graph yet. And because the authorizer can answer any authorization request, it is effectively already a superuser.
  - The Webhook server runs on `localhost:9443`.
### RBAC demo
```bash
export KUBECONFIG=demo/demo-kubeconfig.yaml
kubectl config use-context normal-user

# all denied
kubectl get pods
kubectl get pods foo
kubectl get pods -A
kubectl get pods -A -w

# use superuser mode
kubectl config use-context admin-user
# create only the role and no binding to our user
kubectl create clusterrole view-pods --verb get,list,watch --resource pods

kubectl config use-context normal-user
# still denied
kubectl get pods
kubectl get pods foo

kubectl config use-context admin-user
# bind our user to the role => now should become allowed
kubectl create clusterrolebinding normal-view-pods --clusterrole view-pods --user normal-user

kubectl config use-context normal-user
# all allowed
kubectl get pods
kubectl get pods foo
kubectl get pods -A
kubectl get pods -A -w

kubectl config use-context admin-user
# remove the "list" and "watch" verbs from the clusterrole, leave the "get"
EDITOR=nano kubectl edit clusterrole view-pods

kubectl config use-context normal-user
# not allowed anymore
kubectl get pods
# allowed
kubectl get pods foo
# allowed, in any namespace
kubectl -n sample-namespace get pods foo
# not allowed anymore
kubectl get pods -A -w
```
### Node Authorization demo
```bash
export KUBECONFIG=demo/demo-kubeconfig.yaml
kubectl config use-context node-user

# all denied
kubectl get nodes
kubectl get node foo-node
kubectl get pods
kubectl get pods hello
kubectl get secrets
kubectl get secrets missioncritical
kubectl get secrets very-secret

kubectl config use-context admin-user
# apply the following objects:
# - Node foo-node (node-kubeconfig.yaml has corresponding user system:node:foo-node)
# - Pod hello, which runs on foo-node and references Secrets missioncritical and very-secret
# - Secret missioncritical
# - Secret very-secret
# NOTE: No RBAC rules are changed here!
kubectl apply -f deploy/node-pod-secret.yaml

# Switch back to node context
kubectl config use-context node-user
# denied
kubectl get nodes
# allowed
kubectl get node foo-node
# denied
kubectl get pods
# allowed
kubectl get pods hello
# denied
kubectl get secrets
# allowed
kubectl get secrets missioncritical
# allowed
kubectl get secrets very-secret
```
## Future improvements
Features yet to be implemented:
- RulesReview API
- Correct implementation of ClusterRole NonResourceURLs
- ClusterRole aggregation with more than one labelSelector label
- Finalizer support
- Assignment of a ClusterRole to a RoleBinding
- Wildcard and resource contextual tuples through UserSets and not Tuple to UserSet
- Contextual Tuple to fine-grained resources
- Common Expression Language support for mapping functions
- Implementation of other Node Authorizer kinds than Node, Pod and Secret
- Declarative CustomResourceDefinition Policy API
- ABAC constraints (upcoming feature in OpenFGA)
- Investigate synergies with kcp
- Preventing privilege escalation
  - For example, when a user adds a resource with links to other objects: a user should not be able to gain access to a `Secret` just by adding something it already has access to (e.g., an `Ingress`) with a link to that `Secret`.
  - This can be implemented with an admission controller.
- Related to privilege escalation: enforcing immutability of some inter-object links
- This is to avoid mutating the authorization state when it should not be possible. For example, a Pod should probably not be able to change its Secret and Node references while running, but a new Pod should be created if such mutations are needed.
- This can be implemented with an admission controller
- Add information on how to generate CA, server and client certificates for the demo
- Add information on improvements to contribute upstream
- And much more
## Request for Feedback!
The point of this project is to gather feedback and see how authorization for declarative control planes can be improved. Please comment on what you think about this in a GitHub Issue or Discussion, or on the Kubernetes or Cloud Native Slack (I'm @luxas there as well). Or shoot me a tweet at @kubernetesonarm.
## Credits
Thanks to Jonathan Whitaker and Andrés Aguiar from the OpenFGA team, who helped with questions about OpenFGA, and thanks to Jonathan for giving the Rejekts presentation together with me.
Project generated with Kubebuilder.
The authorization webhook implementation is heavily inspired by the authentication webhook in the controller-runtime project. If desired by the community, I can contribute that part upstream.
As the RBAC reconciliation code is not easily vendorable from Kubernetes, I carry an extracted version of the RBAC storage in a forked repo. That file is outside the scope of the license and copyright of this project.