🔑 Kubernetes Authentication & Authorization WebHook Server


Guard

Guard by AppsCode is a Kubernetes Webhook Authentication server. Using guard, you can log into your Kubernetes cluster using various auth providers. Guard also sets the groups of authenticated users appropriately, which allows cluster administrators to set up RBAC rules based on group membership. Guard supports a number of auth providers.

Supported Versions

Kubernetes 1.9+

Installation

To install Guard, please follow the guide here.

Using Guard

Want to learn how to use Guard? Please start here.

Contribution guidelines

Want to help improve Guard? Please start here.


Guard binaries collect anonymous usage statistics to help us learn how the software is being used and how we can improve it. To disable stats collection, run the binary with the flag --analytics=false.


Acknowledgement

Support

We use Slack for public discussions. To chit chat with us or the rest of the community, join us in the AppsCode Slack team channel #guard. To sign up, use our Slack inviter.

If you have found a bug with Guard or want to request new features, please file an issue.

Owner
Kubernetes Guard
Comments
  • Logo Proposal for Guard

    Hi, I'm a graphic designer and I like to collaborate with open source projects. A project's visual identity matters a great deal, so I would like to design a logo for your project, Guard.

    I will be pleased to collaborate with you.

  • Non interactive logins - Azure provider

    Hey hey !

    In the past few days, I've had a chance to look over the AKS-AAD integrations (which use guard under the hood), as part of some research into implementing non-interactive logins for AAD-enabled clusters (both legacy and managed).

    Specifically, I'm looking at auth via service principals, passing a client_credentials-flow token, which holds a service principal's object id claim and no UPN claim.

    Managed AAD clusters have it mostly solved (a-la kubelogin) - one only needs to issue a token for the multi tenant AKS server app, and any directory entity could be used in this flow - the groups JWT claim is considered along with any overage data.

    However, in case of overage, MS Graph is consulted and the request fails for SPNs, since MS Graph returns 404 when fetching group memberships for service principals; the Graph API currently used only supports retrieving memberships for users (not service principals).

    As for legacy clusters, the groups JWT claim isn't considered at all - MS Graph is always consulted to fetch group memberships for the given object id; specifically, the "/users/id/getMemberGroups" endpoint is used.

    When an SPN object id is passed, the API above returns 404 in the same manner and fails the request altogether.

    I believe that ms graph had no API for retrieving groups for any given entity back in the day, but now one does exist -"/directoryObjects/id/getMemberGroups".

    I was wondering if it would be a good idea to migrate to the new endpoint - this would enable non interactive login flows to legacy clusters and fix the flow for managed integrations for SPNs assigned to many groups.

    Otherwise, it might make sense to have the MS Graph call be best-effort, returning empty groups on error. It's a bit unfortunate that failing to retrieve groups fails the auth attempt altogether: when reaching that code path we already have a verified JWT with some object id, so it might make sense to pass it onward and check for any direct k8s role mapping.

    Another suggestion might be to flip the flag which considers the given groups claim (on AKS side) for legacy clusters - closing the disparity between the two integrations.
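
    For illustration, a minimal sketch of calling the newer endpoint (not Guard's current code; the bearer token and names are placeholders, and the request shape follows the public Microsoft Graph docs for getMemberGroups):

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    // getMemberGroups returns the group object ids the given directory object
    // (user or service principal) is a member of, via the directoryObjects API.
    func getMemberGroups(graphToken, objectID string) ([]string, error) {
        url := fmt.Sprintf("https://graph.microsoft.com/v1.0/directoryObjects/%s/getMemberGroups", objectID)
        body, _ := json.Marshal(map[string]bool{"securityEnabledOnly": false})

        req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
        if err != nil {
            return nil, err
        }
        req.Header.Set("Authorization", "Bearer "+graphToken)
        req.Header.Set("Content-Type", "application/json")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            return nil, fmt.Errorf("getMemberGroups: unexpected status %s", resp.Status)
        }

        // Graph responds with {"value": ["<group-object-id>", ...]}.
        var out struct {
            Value []string `json:"value"`
        }
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            return nil, err
        }
        return out.Value, nil
    }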

    Would love to hear your two cents on this @weinong

    Thanks !

  • NTP sync causing periodic crashes

    NTP sync was introduced in 0.1.0 via https://github.com/appscode/guard/issues/83

    Since upgrading to 0.1.0+ (including latest version) we see periodically the following:

    F0529 18:25:05.766404 1 server.go:44] read udp [--IP REDACTED--]->128.138.141.172:123: i/o timeout
    goroutine 27 [running]:
    github.com/appscode/guard/vendor/github.com/golang/glog.stacks(0xc420339900, 0xc42019a320, 0x69, 0xa0)
        /go/src/github.com/appscode/guard/vendor/github.com/golang/glog/glog.go:766 +0xcf
    github.com/appscode/guard/vendor/github.com/golang/glog.(*loggingT).output(0x22972e0, 0xc400000003, 0xc4200ad4a0, 0x21e0a82, 0x9, 0x2c, 0x0)
        /go/src/github.com/appscode/guard/vendor/github.com/golang/glog/glog.go:717 +0x322
    github.com/appscode/guard/vendor/github.com/golang/glog.(*loggingT).printDepth(0x22972e0, 0xc400000003, 0x1, 0xc420b04f98, 0x1, 0x1)
        /go/src/github.com/appscode/guard/vendor/github.com/golang/glog/glog.go:646 +0x12a
    github.com/appscode/guard/vendor/github.com/golang/glog.(*loggingT).print(0x22972e0, 0xc400000003, 0xc420b04f98, 0x1, 0x1)
        /go/src/github.com/appscode/guard/vendor/github.com/golang/glog/glog.go:637 +0x5a
    github.com/appscode/guard/vendor/github.com/golang/glog.Fatal(0xc420b04f98, 0x1, 0x1)
        /go/src/github.com/appscode/guard/vendor/github.com/golang/glog/glog.go:1125 +0x53
    github.com/appscode/guard/server.Server.ListenAndServe.func1(0xc420526480, 0xc420366900)
        /go/src/github.com/appscode/guard/server/server.go:44 +0xf1
    created by github.com/appscode/guard/server.Server.ListenAndServe
        /go/src/github.com/appscode/guard/server/server.go:41 +0xe1b

    Running host 128.138.141.172 shows this is utcnist2.colorado.edu.

    I see the default for clock-check-interval is five minutes. Our errors are not that frequent, nor are they regular, so I don't think this can be put down to an overly restrictive security group on our part. It happens approximately 10-20 times per week.

    Ideally - i/o timeouts should be dealt with, rather than any error here causing a fatal exception.

    Alternatively - a simple way to turn this off should be provided. Best I can see right now is to make clock-check-interval big.
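
    For illustration, a minimal sketch of the non-fatal alternative (checkClockSkew is a hypothetical stand-in for the NTP query; this is not Guard's actual code):

    package main

    import (
        "time"

        "github.com/golang/glog"
    )

    // checkClockSkew stands in for the NTP query Guard performs.
    func checkClockSkew() error { return nil }

    // runClockChecks logs transient NTP failures (such as i/o timeouts) and retries
    // on the next tick, instead of exiting the whole process with glog.Fatal.
    func runClockChecks(interval time.Duration) {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for range ticker.C {
            if err := checkClockSkew(); err != nil {
                glog.Errorf("clock skew check failed, will retry in %s: %v", interval, err)
            }
        }
    }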

  • Add proxy support in Guard (#20)

    Added proxy support for guard and additional metrics for authz

    Proxy support: added 4 parameters for the proxy in the installer command.

    1. Three are for the common proxy environment variables (HTTP_PROXY, HTTPS_PROXY, NO_PROXY): --proxy-http, --proxy-https, --proxy-skip-range. A "guard-proxy" secret (installer/secrets.go) is created and loaded as environment variables in the deployment.yaml.
    2. --proxy-cert is for a proxy with cert authentication. For this, the user provides the path to their proxy cert via the --proxy-cert arg. Internally, we read the data from the file and create another secret ("guard-proxy-cert"). For a proxy with a cert, the cert should be part of "/etc/ssl/certs/ca-certificates.pem". This is achieved by running the update-ca-certificates command, which takes the certs present in /usr/local/share/ca-certificates and adds them to /etc/ssl/certs/ca-certificates.pem. To achieve this, we do the following:
      • Create two volumes: an emptyDir volume called "ssl-certs" and a volume created from the "guard-proxy-cert" secret called "proxy-certstore".
      • An init container (image: nginx:stable-alpine, 9 MB) runs the update-ca-certificates command. This container mounts the "ssl-certs" volume at "/etc/ssl/certs" and the "proxy-certstore" volume at "/usr/local/share/ca-certificates". It runs update-ca-certificates and creates the ca-certificates.pem file in the /etc/ssl/certs directory backed by the volume.
      • The main "guard" container has the "ssl-certs" volume mounted at /etc/ssl/certs. Because the volume is shared between the init container and the guard container, when the init container updates /etc/ssl/certs, the guard container also sees the updated ca-certificates.pem file containing the proxy cert. This way the HTTP client used to make requests can succeed through the proxy.

    The proxy support is for the Azure Arc scenario, where users currently set up guard themselves, and in the future when Arc switches to automating the guard setup for customers.
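
    As context for why loading the secret as environment variables is enough: Go's default HTTP transport already honours HTTP_PROXY, HTTPS_PROXY and NO_PROXY via http.ProxyFromEnvironment. A minimal sketch, assuming the client uses the default transport (the request URL is only an example):

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        req, err := http.NewRequest(http.MethodGet, "https://graph.microsoft.com/", nil)
        if err != nil {
            panic(err)
        }
        // Resolves the proxy (if any) that the environment variables select for this request.
        proxyURL, err := http.ProxyFromEnvironment(req)
        if err != nil {
            panic(err)
        }
        if proxyURL == nil {
            fmt.Println("no proxy: direct connection")
            return
        }
        fmt.Println("proxy selected for request:", proxyURL)
    }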

  • Feature - Azure - Enabling PoP Token verification #331

    This PR allows any Azure Auth provider to leverage AAD based authentication using tokens with proof of possession (PoP).

    Introducing:

    • New flag azure.enable-pop to turn this verification on or off
    • New flag azure.pop-hostname used by PoP to validate the u claim being passed
    • New flag azure.pop-token-validity-duration to set a TTL for the PoP token being passed
    • New flag azure.skip-group-membership-resolution used to bypass the group membership resolution logic
    • New PassthroughAuthMode used when we don't want to run any extra validation against Graph, such as refreshing or getting a new token. This scenario is mainly used by Azure Arc today, since AKSAuthMode, OBOAuthMode and ClientCredentialAuthMode cannot be used
    • New PoPTokenVerifier struct used to verify the PoP token structure. As of today this is unofficial, custom code, but we are following up with the AAD team on how and when this verification will be officially supported in GO-MSAL
    • Added unit tests for each new component
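
    For illustration, a minimal sketch of the host and TTL checks these flags drive (this is not the actual PoPTokenVerifier; claims is assumed to be the already signature-verified PoP token payload, and the "ts" issue-time claim name is an assumption):

    package main

    import (
        "fmt"
        "time"
    )

    // verifyPoPClaims checks that the token was minted for the expected host and
    // is still within the configured validity window.
    func verifyPoPClaims(claims map[string]interface{}, popHostname string, validity time.Duration) error {
        host, _ := claims["u"].(string)
        if host != popHostname {
            return fmt.Errorf("pop token: u claim %q does not match expected host %q", host, popHostname)
        }
        ts, ok := claims["ts"].(float64) // issue time as unix seconds (assumed claim name)
        if !ok {
            return fmt.Errorf("pop token: missing ts claim")
        }
        if time.Since(time.Unix(int64(ts), 0)) > validity {
            return fmt.Errorf("pop token: issued more than %s ago", validity)
        }
        return nil
    }
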
  • Issues with LDAP and guard get installer

    Hello, I'm trying to set up Guard 0.2.1 in order to use it with an internal LDAP server. I've of course followed the official guide.

    I think I hit a couple of issues when using guard get installer:

    • The default path for server and CA certificates is set to /etc/guard/pki even if the variable GUARD_DATA_DIR is unset (following this logic, the path should default to the user's home folder).

    • Also, the following options are set by default: --tls-cert-file="/etc/guard/pki/tls.crt" --tls-private-key-file="/etc/guard/pki/tls.key"

    Instead, the server certificates are named server.crt and server.key respectively (generated, of course, with guard init ca). If I change the options above accordingly, I get this error even when the file exists:

    server.go:141] open /etc/guard/pki/server.crt: no such file or directory

    Am I missing something?

    Thank you!

  • Avoid storing result in cache on checkaccess error

    When checkaccess returns an error, we have made a change to not store the result as denied in the cache, since it's not actually a deny and could be a transient HTTP error (like 500, 504, etc.).

    We still store the result in the cache if the error is not an HTTP error, and also when the error code is 429 (too many requests).
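
    A minimal sketch of that caching rule (names are placeholders, not the actual implementation):

    package main

    import "net/http"

    // shouldCacheOnError decides whether a failed checkaccess result is cached.
    func shouldCacheOnError(isHTTPError bool, statusCode int) bool {
        if !isHTTPError {
            // Non-HTTP errors keep the previous behaviour and are still cached.
            return true
        }
        // Transient HTTP failures (500, 504, ...) are not cached as a deny;
        // 429 is the exception, so we back off while being throttled.
        return statusCode == http.StatusTooManyRequests
    }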

  • increase write timeout for guard

    We have been seeing stream Internal Errors in kube-apiserver for webhook authorizer requests to guard recently. On investigating, we found that the write timeout in the guard server is short (10 seconds), so guard closes the client connection whenever the call to Azure to check access takes longer than 10 seconds. Increasing the write timeout handles this.

    More on server timeouts: https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/
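
    A minimal sketch of the change on the net/http side (the 60-second value is illustrative, not necessarily what was merged):

    package main

    import (
        "net/http"
        "time"
    )

    func newGuardServer(handler http.Handler) *http.Server {
        return &http.Server{
            Addr:        ":8443",
            Handler:     handler,
            ReadTimeout: 10 * time.Second,
            // Previously 10s: an authorizer call that waits on Azure checkaccess for
            // longer than the write timeout gets its client connection closed,
            // which kube-apiserver surfaces as a stream Internal Error.
            WriteTimeout: 60 * time.Second,
        }
    }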

  • Implement On-Behalf-Of (OBO) flow

    1. The On-Behalf-Of (OBO) flow is added for unmanaged Kubernetes (third party, using user-specified AAD applications) as well as Azure Kubernetes Service (which uses a first-party OBO service)
    2. Separated the token logic from the Get Group logic
    3. Refactored the original login() function into a client-credential flow that follows the new TokenRefresher interface

    This implements #131 #235

  • Enable guard to use OAuth 2.0 On-Behalf-Of flow to obtain user's group memberships

    Currently guard uses the application identity of the Web API AAD application to get an AAD user's group memberships: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-authentication-scenarios#web-application-to-web-api

    The caveat of using this flow is that it requires the application to have certain kinds of read permissions on all users' profiles in AAD. This is an elevated-access scenario, and some tenant admins (and organizations) don't allow applications to have this level of access without the user's consent.

    A safer/more acceptable model is to use OAuth 2.0 On-Behalf-Of flow: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-v2-protocols-oauth-on-behalf-of#protocol-diagram.

    In this scenario, Guard requests group memberships from AAD on behalf of the user, using its application identity and the user's token, in exchange for a token returned by AAD.

    This issue is a feature request to add On-Behalf-Of flow in Guard to obtain group memberships of the user.
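
    For reference, a minimal sketch of the token exchange the On-Behalf-Of flow performs, based on the linked protocol documentation (not Guard code; tenant, client and scope values are placeholders):

    package main

    import (
        "fmt"
        "net/http"
        "net/url"
    )

    // exchangeOnBehalfOf trades the token the user presented to kube-apiserver for a
    // Graph token issued to the Web API application on behalf of that user.
    func exchangeOnBehalfOf(tenantID, clientID, clientSecret, userToken string) (*http.Response, error) {
        form := url.Values{}
        form.Set("grant_type", "urn:ietf:params:oauth:grant-type:jwt-bearer")
        form.Set("client_id", clientID)
        form.Set("client_secret", clientSecret)
        form.Set("assertion", userToken) // the user's incoming token
        form.Set("scope", "https://graph.microsoft.com/.default")
        form.Set("requested_token_use", "on_behalf_of")

        endpoint := fmt.Sprintf("https://login.microsoftonline.com/%s/oauth2/v2.0/token", tenantID)
        return http.PostForm(endpoint, form)
    }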

  • Diagnosability improvements in authz scenarios

    Improving diagnosability in authz scenarios:

    • adding custom metrics
    • changing levels of various log statements
    • updating cache TTL settings
    • removing the resolve group membership option
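
    As an example of the kind of custom metric this adds, a minimal client_golang sketch (the metric name and label are illustrative, not necessarily the ones used in this change):

    package main

    import (
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promauto"
    )

    // checkAccessRequests counts checkaccess calls partitioned by result code, so
    // failures and throttling show up on a dashboard instead of only in logs.
    var checkAccessRequests = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "guard_azure_checkaccess_requests_total",
            Help: "Azure checkaccess requests partitioned by HTTP result code.",
        },
        []string{"code"},
    )

    func recordCheckAccess(code string) {
        checkAccessRequests.WithLabelValues(code).Inc()
    }
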
  • Add more logs in guard to record where the request is from

    Sometimes the guard pod reports "failed to verify token for azure: oidc: id token issued by a different provider". However, we don't know where the request came from. Could guard add more logs to show the request source for better debugging?
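
    For illustration, a minimal sketch of what such logging could look like around the token review handler (names are placeholders, not Guard's actual code):

    package main

    import (
        "net/http"

        "github.com/golang/glog"
    )

    // withRequestSource logs where each token review request came from before it is
    // handled, so verification failures can be traced back to a source address.
    func withRequestSource(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // RemoteAddr is the direct peer; X-Forwarded-For is set when the request
            // passes through a proxy or load balancer.
            glog.V(3).Infof("token review request from %s (X-Forwarded-For: %q)",
                r.RemoteAddr, r.Header.Get("X-Forwarded-For"))
            next.ServeHTTP(w, r)
        })
    }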

  • Add support for * in subject access review requests

    The Resource, Group and Verb fields can be * in SAR requests. Currently there is no handling for such requests. We will now create a map containing a list of predefined K8s types on startup and use that map to build the list of data actions to send in a checkaccess request, depending on the combination of *'s in the Resource, Group and Verb fields. Structure of the map:

    type AuthorizationEntity struct {
        Id string `json:"Id"`
    }

    type AuthorizationActionInfo struct {
        AuthorizationEntity
        IsDataAction bool `json:"IsDataAction"`
    }

    type DataAction struct {
        ActionInfo           AuthorizationActionInfo
        IsNamespacedResource bool // whether this is a namespace scoped resource or not
    }

    type ResourceAndVerbMap struct {
        // key is the resource name; value is a verb map whose value is the DataAction struct defined above
        ResourceMap map[string]map[string]DataAction
    }

    type OperationsMap struct {
        // key is the apigroup name; value is a resource map
        GroupMap map[string]ResourceAndVerbMap
    }


    The scenario table is:

    Scenario | Namespace is empty (cluster-scope call) | Namespace is not empty (NS scope)
    -- | -- | --
    Verb - *, Res - *, Group - * | All cluster and ns res with all verbs at cluster scope | All ns resources at ns scope
    Res - *, Group - * | All cluster and ns res with specified verb at cluster scope | All ns res with specified verb at ns scope
    Verb - *, Group - * | Resource under all apigroups and with all verbs at cluster scope | NS resource under all apigroups and with all verbs at ns scope
    Verb - *, Resource - * | All cluster and ns res with all verbs under specified apigroup at cluster scope | All ns res with all verbs under specified apigroup at ns scope
    Verb - * | Resource under specified apigroup and with all verbs at cluster scope | Resource under specified apigroup and with all verbs at ns scope
    Resource - * | All CS and NS resources under specified apigroup with specified verb at cluster scope | All NS resources under specified apigroup with specified verb at ns scope
    Group - * | Resource under all apigroups with specified verb at cluster scope | Resource under all apigroups with specified verb at ns scope
    All three are not * | Normal call we make now at cluster scope | Normal call we make now at namespace scope

  • [Google] hard-coded Google OAuth client was deleted

    hi,

    we've been using the Google authenticator (https://appscode.com/products/guard/v0.7.1/guides/authenticator/google/) and everything was working until 2022/10/17 around 4am PST, when the hard-coded Google OAuth client was deleted.

    we are getting an error when running guard get token -o google:

    Unable to connect to the server: failed to refresh token: oauth2: cannot fetch token: 401 Unauthorized
    Response: {
      "error": "deleted_client",
      "error_description": "The OAuth client was deleted."
    }
    

    After communicating with Google, their response was:

    "When entering the Client ID we get the error “App not found” which indicates that the Client ID doesn't exist or is not available outside the organization that contains that Client ID."

    Can the guard team help us understand:

    1. Is the hard-coded Google OAuth client owned by the Guard team?
    2. Can someone check how/why the client was deleted? Thanks.

    hard-coded Google OAuth client: https://github.com/kubeguard/guard/blob/master/auth/providers/google/google.go#L32-L38

    Our suggested fix is to create a new Google OAuth client that is fully managed by you, recompile the Guard binary, and update the guard image running on clusters.

  • [github] certificate signed by unknown authority

    I was following the github auth guide. Everything seems fine.

    When I try to test the webhook:

    kubectl get pods --all-namespaces --user <myghuser> -v9 
    

    I get an error:

    error: You must be logged in to the server (Unauthorized)
    

    It seems that the guard pod does receive the token review request, but fails while communicating with the GitHub API:

    > kubectl logs -n kube-system deploy/guard
    ...
    I0827 10:52:47.002248       1 server.go:168] setting up authz providers
    I0827 10:53:23.761824       1 handler.go:47] Received token review request for github/rancser
    E0827 10:53:23.805177       1 utils.go:130] failed to check user's membership in Org rancser: Get "https://api.github.com/user/memberships/orgs/rancser": x509: certificate signed by unknown authority
    

    I tried it with versions: v0.6.1 and v0.11.0

  • [Security] Workflow release.yml is using vulnerable action actions/checkout

    The workflow release.yml references the action actions/checkout at the v1 reference. However, this reference is missing commit a6747255bd19d7a757dbdda8c654a9f84db19839, which may contain a fix for some vulnerability. The fix missing from this action version could be related to: (1) a CVE fix, (2) an upgrade of a vulnerable dependency, (3) a fix for a secret leak, among others. Please consider updating the reference to the action.

  • Question : Can Guard support more than 1 OIDC provider ?

    Hi,

    I was wondering if Guard can support more than one OIDC provider today?

    For example, can we do Azure AD auth + GitHub, with an order of preference where Guard will first try Azure AD auth and fall back to GitHub if it is not working?

    Also, can we do two Azure AD auth providers (using different tenants, for example) with the same logic?

    Thanks a lot
