Sbom-operator - Catalogue all images of a Kubernetes cluster to multiple targets with Syft

Last update: Jan 4, 2023

Comments: 12

sbom-operator

Catalogue all images of a Kubernetes cluster to multiple targets with Syft.

Overview

This operator maintains a central place to track all packages and software used by the images running in a Kubernetes cluster. To do so, a Software Bill of Materials (SBOM) is generated from each image with Syft. The SBOMs are stored in one or more targets; currently only Git is supported. This makes it possible to do further analysis, vulnerability scans and much more in a single place. To avoid rescanning images that have already been analyzed, pods are annotated with the imageID of the already processed image.
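The annotation-based skip described above could be sketched roughly as follows. This is only an illustration: the annotation key prefix, the per-container layout and the function name are hypothetical and not taken from the operator's source.

```go
package main

import "fmt"

// annotationPrefix is a hypothetical annotation key prefix; the key
// actually used by the operator is not stated in this document.
const annotationPrefix = "example.com/sbom-operator-"

// needsAnalysis reports whether the image of a container still has to be
// scanned. It returns true when annotations are ignored or when the
// imageID stored in the pod annotation differs from the current imageID.
func needsAnalysis(annotations map[string]string, container, imageID string, ignoreAnnotations bool) bool {
	if ignoreAnnotations {
		return true
	}
	return annotations[annotationPrefix+container] != imageID
}

func main() {
	annotations := map[string]string{
		annotationPrefix + "app": "registry.example.com/app@sha256:0123abcd",
	}
	fmt.Println(needsAnalysis(annotations, "app", "registry.example.com/app@sha256:0123abcd", false))
	fmt.Println(needsAnalysis(annotations, "app", "registry.example.com/app@sha256:4567cdef", false))
}
```

Passing true as the last argument corresponds to the ignore-annotations behaviour of forcing a new analysis.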

Kubernetes Compatibility

The image contains versions of k8s.io/client-go. Kubernetes aims to provide forwards & backwards compatibility of one minor version between client and server:

sbom-operator	k8s.io/{api,apimachinery,client-go}	expected kubernetes compatibility
0.2.0	v0.23.3	1.22.x, 1.23.x, 1.24.x
0.1.0	v0.23.3	1.22.x, 1.23.x, 1.24.x
main	v0.23.3	1.22.x, 1.23.x, 1.24.x

However, the operator will work with more versions of Kubernetes in general.

Container Registry Support

The operator relies on the go-containerregistry library to download images. It should work with most registries. These are officially tested (with authentication):

ACR (Azure Container Registry)
ECR (Amazon Elastic Container Registry)
GAR (Google Artifact Registry)
GCR (Google Container Registry)
GHCR (GitHub Container Registry)
DockerHub

Installation

Manifests

kubectl apply -f deploy/

Helm-Chart

Create a YAML file first with the required configurations or use helm-flags instead.

helm repo add ckotzbauer https://ckotzbauer.github.io/helm-charts
helm install sbom-operator ckotzbauer/sbom-operator -f your-values.yaml

Configuration

All parameters are cli-flags.

Parameter	Required	Default	Description
`verbosity`	`false`	`info`	Log-level (debug, info, warn, error, fatal, panic)
`cron`	`false`	`@hourly`	Background-Service interval (CRON). All options from github.com/robfig/cron are allowed
`ignore-annotations`	`false`	`false`	Force analyzing of all images, including those from annotated pods.
`format`	`false`	`json`	SBOM-Format.
`targets`	`false`	`git`	Comma-delimited list of targets to send the generated SBOMs to. Possible targets: `git`
`git-workingtree`	`false`	`/work`	Directory to place the git-repo.
`git-repository`	`true` when `git` target is used.	`""`	Git-Repository-URL (HTTPS).
`git-branch`	`false`	`main`	Git-Branch to checkout.
`git-path`	`false`	`""`	Folder-Path inside the Git-Repository.
`git-access-token`	`true` when `git` target is used.	`""`	Git-Personal-Access-Token with write-permissions.
`git-author-name`	`true` when `git` target is used.	`""`	Author name to use for Git-Commits.
`git-author-email`	`true` when `git` target is used.	`""`	Author email to use for Git-Commits.
`pod-label-selector`	`false`	`""`	Kubernetes Label-Selector for pods.
`namespace-label-selector`	`false`	`""`	Kubernetes Label-Selector for namespaces.

The flags can be configured as args or as environment variables prefixed with SBOM_. The environment-variable form allows sensitive configs to be injected as secret values.
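As a rough illustration of the naming convention, the sketch below derives the environment-variable name from a flag name. The upper-casing and the dash-to-underscore replacement are inferred from the SBOM_GIT_ACCESS_TOKEN example in this README; the helper name is hypothetical and this is not the operator's own code.

```go
package main

import (
	"fmt"
	"strings"
)

// envName maps a CLI flag name to the matching environment variable name,
// assuming the convention SBOM_ + upper-case flag with "-" replaced by "_".
func envName(flag string) string {
	return "SBOM_" + strings.ToUpper(strings.ReplaceAll(flag, "-", "_"))
}

func main() {
	fmt.Println(envName("git-access-token")) // prints SBOM_GIT_ACCESS_TOKEN
}
```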

Example Helm-Config

args:
  targets: git
  git-author-email: XXX
  git-author-name: XXX
  git-repository: https://github.com/XXX/XXX
  git-path: dev-cluster/sboms
  verbosity: debug
  cron: "0 30 * * * *"

envVars:
  - name: SBOM_GIT_ACCESS_TOKEN
    valueFrom:
      secretKeyRef:
        name: "sbom-operator"
        key: "accessToken"

Targets

It is possible to store the generated SBOMs in different targets (even multiple at once). Currently the only available target is Git, but this will change soon.

Git

The operator saves all files in a specific folder structure, as described below. When a git-path is configured, all folders above this path are not touched by the application. When no git-path is given, the structure is placed directly in the repository root. The structure is basically <git-path>/<registry-server>/<image-path>/<image-digest>/sbom.json. The file extension may differ when another output format is configured. Token-based authentication is used to access the Git repository. The example below assumes that git-path is set to dev-cluster/sboms.

dev-cluster
│
└───sboms
    │
    ├───docker.io
    │   │
    │   └───library
    │       │
    │       └───busybox
    │           │
    │           └───sha256_ae39a6f5...
    │               │   sbom.json
    │
    └───ghcr.io
        │
        └───kyverno
            │
            ├───kyverno
            │   │
            │   └───sha256_9e3f14e5...
            │       │   sbom.json
            │
            ├───kyvernopre
            │   │
            │   └───sha256_e48f87fd...
            │       │   sbom.json
            │
            └───policy-reporter
                │
                └───sha256_b70caa7a...
                    │   sbom.json
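
The path layout shown above can be reproduced with a small helper. This is a minimal sketch, assuming the image is given as "registry/path@sha256:digest" and that the colon in the digest is replaced by an underscore, as the folder names suggest. The function name and the shortened digest are placeholders.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// sbomPath builds <git-path>/<registry-server>/<image-path>/<image-digest>/<file>.
// An empty gitPath places the structure directly in the repository root.
func sbomPath(gitPath, image, fileName string) (string, error) {
	repo, digest, ok := strings.Cut(image, "@")
	if !ok {
		return "", fmt.Errorf("image %q has no digest", image)
	}
	// "sha256:..." becomes "sha256_..." to match the folder names above.
	dir := strings.ReplaceAll(digest, ":", "_")
	return path.Join(gitPath, repo, dir, fileName), nil
}

func main() {
	// The digest is a shortened placeholder, not a real image digest.
	p, err := sbomPath("dev-cluster/sboms", "docker.io/library/busybox@sha256:0123abcd", "sbom.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p) // prints dev-cluster/sboms/docker.io/library/busybox/sha256_0123abcd/sbom.json
}
```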

Security

The Docker image is based on scratch to reduce the attack surface and keep the image small. Furthermore, the image and release artifacts are signed with cosign and attested with provenance files. The release process satisfies SLSA Level 2. All of those "metadata files" are also stored in a dedicated repository, ghcr.io/ckotzbauer/sbom-operator-metadata. Both SLSA and the signatures are still experimental for this project.

Contributing

License

Changelog

Owner

Christian Kotzbauer

Web Developer working with TypeScript and Aurelia. Interested in Node.js, Security, Docker and Kubernetes.

https://github.com/ckotzbauer/sbom-operator

Comments

Private ECR repositories give '401 Unauthorized'
Hi,

First of all: thanks for this great work! 🥳

When running v0.9.0 with dependency-track as target, most publicly available images work fine, except for ECR-hosted ones:

sbom-operator-77fdbbfd87-dbznp sbom-operator time="2022-04-20T10:09:53Z" level=error msg="Image-Pull failed" error="GET https://602401143452.dkr.ecr.eu-west-1.amazonaws.com/v2/amazon-k8s-cni-init/manifests/sha256:6c70af7bf257712105a89a896b2afb86c86ace865d32eb73765bf29163a08c56: unexpected status code 401 Unauthorized: Not Authorized\n"

This ECR repo is provided by AWS and should be available for everyone. Other private ECRs give the same 401 error.

Some information about the environment:

Kubernetes 1.21 (EKS)

Instance has AmazonEC2ContainerRegistryReadOnly policy.

Can someone point me in the right direction? I'll add it to the README if useful for others!
Docker Image ID Parsing -- Could not parse
Image ID's are retrieved as follows: https://github.com/ckotzbauer/sbom-operator/blob/151dde7a4046091f2ea56eaaabfc900ced1800da/internal/kubernetes/kubernetes.go#L106-L122

And they are parsed as follows: https://github.com/ckotzbauer/sbom-operator/blob/151dde7a4046091f2ea56eaaabfc900ced1800da/internal/syft/syft.go#L42

I am seeing the following error logs:

time="2022-04-06T19:40:03Z" level=error msg="Could not parse imageID docker-pullable://<rest of image id>" error="Error parsing reference: \"docker-pullable://<rest of image id>\" is not a valid repository/tag"

I believe this is because the Image ID is prefixed with docker-pullable://

I wrote a small program to parse <rest of image id> using the same library used by sbom-operator (docker-parser), and it works fine.

I think the simplest fix to this is to either remove the docker-pullable:// prefix or use the img.Image field instead.
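
The first option could look roughly like this. It is a minimal sketch with a hypothetical helper name; it only strips the prefix before the ID is handed to the reference parser and is not the change that was actually made in the project.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeImageID removes the "docker-pullable://" prefix that the
// Docker runtime adds to image IDs in the pod status. IDs without the
// prefix are returned unchanged.
func normalizeImageID(imageID string) string {
	return strings.TrimPrefix(imageID, "docker-pullable://")
}

func main() {
	// The image reference is a made-up example.
	fmt.Println(normalizeImageID("docker-pullable://registry.example.com/app@sha256:0123abcd"))
}
```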
Feature discussion: AWS Lambda SBOM generation
Hello - thank you for starting this project - it has saved me from attempting to build the same thing! ❤️

Would you be open to a contribution to allow SBOM generation from AWS Lambda functions?

Broadly, something like:

Use the AWS Lambda Go SDK to call the GetFunction operation, to obtain the Code.Location URL

Fetch the Lambda's function code from the URL to a local temporary dir; this is generally a ZIP file

Invoke Syft on the local code package

(Tidy up?)

This would enable use of this tool in an environment in which there is a mix of Kubernetes workloads and serverless ones.

I wanted to gauge your interest and check whether this aligns with your project goals before contributing a PR.
Support "Dependency Track" as alternative target for SBOMs

Instead of storing the generated SBOMs to Git it would be good to support different targets, e.g. "Dependency Track".

/kind feature /cc @stevespringett
Clone fails with status 400 when trying to clone a private Azure DevOps repo

When I try to configure a private Azure DevOps git repo as a target cloning fails with the following message:

level=error msg="Open or clone failed" error="unexpected client error: unexpected requesting \"https://*******@dev.azure.com/****/*************/_git/**************/git-upload-pack\" status code: 400"

I suspect the go-git lib to be responsible here as other people report similar issues: https://github.com/src-d/go-git/issues/335

This seems to have been solved for other projects by falling back to git client: https://github.com/argoproj/argo-cd/pull/1244
Registry authentication fails when secret contains only .dockercfg

Secrets with type = "kubernetes.io/dockercfg" containing the pull secret in the field .dockercfg do not work.

OpenShift uses this type of secret for the internal registry - so, you cannot provide the type = "kubernetes.io/dockerconfigjson".
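
One way to support both secret types is to convert the legacy content before it is used. The sketch below assumes the usual layouts: .dockercfg maps registries directly to credentials, while .dockerconfigjson nests the same map under an "auths" key. The function name is hypothetical and this is not the operator's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toDockerConfigJSON wraps legacy .dockercfg content (registry -> credentials)
// in the "auths" key expected in .dockerconfigjson content.
func toDockerConfigJSON(dockercfg []byte) ([]byte, error) {
	var entries map[string]json.RawMessage
	if err := json.Unmarshal(dockercfg, &entries); err != nil {
		return nil, fmt.Errorf("parse .dockercfg: %w", err)
	}
	return json.Marshal(map[string]map[string]json.RawMessage{"auths": entries})
}

func main() {
	// Registry name and credentials are made-up example values.
	legacy := []byte(`{"registry.example.com":{"auth":"dXNlcjpwYXNz"}}`)
	out, err := toDockerConfigJSON(legacy)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```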
store sbom result on OCI registry along with an image

Hello @ckotzbauer, in cosign there is a support for attaching SBOM files to an OCI registry along with an image. So, maybe we can support that too as an alternative way of storing SBOM files instead of just storing them in git. WDYT?
Mirror configuration for registries

Hello, first thanks for the work!

I'm looking at it and I've found something that may be a blocker for us: use of mirrors

We deploy our kubernetes in airgap environment and we configure the mirrors in containerd. That means that on the "kubernetes side", you don't see the mirrors but they are present...

Would it be feasible to add a configuration for mirror registries?

would be awesome :)

Feature request: Map k8s pod labels as project tags

As a System Operator, I would like to add pod labels as project tags in DependencyTrack, so that grouping/filtering by label in dtrack is possible.

Background: We use k8s pod labels to determine and group things like application, stage, department, ...

Given the following deployment was applied to k8s cluster:
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: dependencytrack-frontend
    namespace: dependencytrack-sales-live
  spec:
    template:
      metadata:
        labels:
          app: dependencytrack
          stage: live
          department: sales
          service: inventory
      spec:
        containers:
          - name: dependencytrack-frontend
            image: dependencytrack-frontend:4.5.1
  ---
When the sbom-operator scans the pod and adds the project to dtrack
Then the following tags should be added to dtrack:
  [namespace=dependencytrack-sales-live, app=dependencytrack, stage=live, department=sales, ...]
But currently only the following tags are added:
  [namespace=dependencytrack-sales-live, ...]

What do you think about the idea of adding Labels map[string]string to struct libk8s.PodInfo and allow custom mapping of labels to dtrack project tags? What would be an appropriate way to configure the custom mapping?

sbom-operator is awesome. Thank you!
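
The mapping proposed in this request could be sketched as follows. PodInfo here is a simplified, hypothetical stand-in for libk8s.PodInfo with the suggested Labels field, and the list of label keys is an assumed form of configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// PodInfo is a hypothetical stand-in for libk8s.PodInfo, extended with
// the Labels field proposed in this request.
type PodInfo struct {
	Namespace string
	Labels    map[string]string
}

// projectTags returns the namespace tag followed by one key=value tag for
// every configured label key that is present on the pod. Keys are sorted
// so that the result is deterministic.
func projectTags(pod PodInfo, labelKeys []string) []string {
	tags := []string{"namespace=" + pod.Namespace}
	keys := append([]string(nil), labelKeys...)
	sort.Strings(keys)
	for _, key := range keys {
		if value, ok := pod.Labels[key]; ok {
			tags = append(tags, key+"="+value)
		}
	}
	return tags
}

func main() {
	pod := PodInfo{
		Namespace: "dependencytrack-sales-live",
		Labels: map[string]string{
			"app":        "dependencytrack",
			"stage":      "live",
			"department": "sales",
		},
	}
	fmt.Println(projectTags(pod, []string{"app", "stage", "department"}))
}
```

Restricting the mapping to an explicit list of keys keeps unrelated labels (for example pod-template-hash) out of the project tags.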

feat: delete unused images from Dependency Track
This adds tags to the Dependency-Track projects to track which were created by the SBOM operator. The following tags are created:

sbom-operator

kubernetes-cluster={kubernetes-cluster-id}

The kubernetes-cluster-id can be set via a command line parameter (when not specified default is used).

In the Dependency-Track UI, you can use these tags to filter for images in a certain cluster.

Closes #27
Deleting project from DependencyTrack fails when deleting pod
Deleting a pod results in error

time="2022-09-28T14:21:56Z" level=error msg="Could not load project: The project could not be found. (status: 404)"

In DependencyTrackTarget.Remove(), populating g.imageProjectMap fails and remains empty after g.LoadImages(). As a result, dtrack project uuid for the image cannot be found.

There is a bug in DependencyTrackTarget.LoadImages(): imageId is reset to an empty string at the beginning of each iteration of the loop over the project tags. Therefore, the image is only added to imageProjectMap if raw-image-id is the last tag in project.Tags. However, the sbom-operator tag usually follows raw-image-id, so imageId is reset again. To fix this particular issue, imageId = "" should be moved above the loop.
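
The effect of that fix can be illustrated with a simplified sketch. It assumes tags are plain strings of the form raw-image-id=<id>; the real project and tag types in the operator differ, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const rawImageIDPrefix = "raw-image-id="

// imageIDFromTags returns the value of the raw-image-id tag, or "" if no
// such tag exists. imageID is initialised once, above the loop: resetting
// it inside the loop would discard a match whenever another tag follows.
func imageIDFromTags(tags []string) string {
	imageID := ""
	for _, tag := range tags {
		if strings.HasPrefix(tag, rawImageIDPrefix) {
			imageID = strings.TrimPrefix(tag, rawImageIDPrefix)
		}
	}
	return imageID
}

func main() {
	// The image reference is a made-up example.
	tags := []string{"raw-image-id=registry.example.com/app@sha256:0123abcd", "sbom-operator"}
	fmt.Println(imageIDFromTags(tags)) // prints registry.example.com/app@sha256:0123abcd
}
```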

However, this is not enough to fix the actual problem that deleting a project from DependencyTrack fails. It seems there are inconsistencies in the imageProjectMap. Once sbom-operator is running, new pods successfully create new dtrack projects, but the imageId and corresponding uuid are not added to imageProjectMap. When the pod gets deleted, the project uuid resolved from imageProjectMap is 00000000-0000-0000-0000-000000000000 (the default for empty uuid.UUID{}.String()), which cannot be found in dtrack. A potential fix could be to update imageProjectMap at the end of ProcessSbom.

Cloning with git-fallback-clone seems to fail

Hi @ckotzbauer. Thank you very much for your work trying to support Azure DevOps repos. Unfortunately it seems to be not working for me. This is the log output from a sbom-operator pod deployed into kubernetes:

time="2022-09-20T09:11:41Z" level=info msg="Commit: 0f4635d8a13131aa655e6096beb6f46a92199ce9"
time="2022-09-20T09:11:41Z" level=info msg="Built at: 2022-09-17T08:59:11Z"
time="2022-09-20T09:11:41Z" level=info msg="Built by: goreleaser"
time="2022-09-20T09:11:41Z" level=info msg="Go Version: go1.19"
time="2022-09-20T09:11:41Z" level=debug msg="Targets set to: [git]"
time="2022-09-20T09:11:41Z" level=info msg="Webserver is running at port 8080"
time="2022-09-20T09:11:41Z" level=info msg="Wait for cache to be synced"
time="2022-09-20T09:11:41Z" level=error msg="Open or clone failed" error="'git clone -b sbom-operator https://******@dev.azure.com/******/**********/_git/******************* /work/sbom' failed: Cloning into '/work/sbom'..."
time="2022-09-20T09:11:41Z" level=info msg="Start pod-informer"
time="2022-09-20T09:11:41Z" level=info msg="Processing image ghcr.io/ckotzbauer/sbom-operator@sha256:846b4d38b700d4c995404ff038f97b1748e3018e234b1255d7989a8ed7647a2e"
time="2022-09-20T09:11:41Z" level=info msg="Finished cache sync"
time="2022-09-20T09:11:41Z" level=error msg="An error occurred while processing /work/sbom/aks/sbom" error="lstat /work/sbom/aks/sbom: permission denied"
time="2022-09-20T09:11:41Z" level=debug msg="Pod sbom-operator/sbom-operator-788b8f6697-7k59x needs to be analyzed"
time="2022-09-20T09:11:41Z" level=debug msg="Skip image ghcr.io/ckotzbauer/sbom-operator@sha256:846b4d38b700d4c995404ff038f97b1748e3018e234b1255d7989a8ed7647a2e"
time="2022-09-20T09:11:41Z" level=error msg="An error occurred while processing /work/sbom/aks/sbom" error="lstat /work/sbom/aks/sbom: permission denied"
time="2022-09-20T09:11:44Z" level=error msg="Directory could not be created" error="mkdir /work/sbom/aks: permission denied"
time="2022-09-20T09:12:16Z" level=info msg="Processing image registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:549e71a6ca248c5abd51cdb73dbc3083df62cf92ed5e6147c780e30f7e007a47"
time="2022-09-20T09:12:18Z" level=error msg="Directory could not be created" error="mkdir /work/sbom/aks: permission denied"
time="2022-09-20T09:12:18Z" level=debug msg="Image registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:549e71a6ca248c5abd51cdb73dbc3083df62cf92ed5e6147c780e30f7e007a47 marked for removal"
time="2022-09-20T09:12:18Z" level=debug msg="Start to remove old SBOMs"
time="2022-09-20T09:12:18Z" level=error msg="Open failed" error="stat /work/sbom/.git: permission denied"
time="2022-09-20T09:12:18Z" level=debug msg="Deleted old SBOM: /work/sbom/aks/sbom/registry.k8s.io/ingress-nginx/kube-webhook-certgen/sha256_549e71a6ca248c5abd51cdb73dbc3083df62cf92ed5e6147c780e30f7e007a47/sbom.json"

Could it be that the plain git clone return is seen as an error by the wrapper code? Just guessing since it is saying 'Cloning into ...' as if it worked.
