kured - Kubernetes Reboot Daemon

Introduction

Kured (KUbernetes REboot Daemon) is a Kubernetes daemonset that performs safe automatic node reboots when the need to do so is indicated by the package management system of the underlying OS.

  • Watches for the presence of a reboot sentinel e.g. /var/run/reboot-required
  • Utilises a lock in the API server to ensure only one node reboots at a time
  • Optionally defers reboots in the presence of active Prometheus alerts or selected pods
  • Cordons & drains worker nodes before reboot, uncordoning them after

Kubernetes & OS Compatibility

The daemon image contains versions of k8s.io/client-go and k8s.io/kubectl (in older releases, the kubectl binary itself) for the purposes of maintaining the lock and draining worker nodes. Kubernetes aims to provide forwards and backwards compatibility of one minor version between client and server:

kured    kubectl   k8s.io/client-go   k8s.io/apimachinery   expected kubernetes compatibility
master   1.19.4    v0.19.4            v0.19.4               1.18.x, 1.19.x, 1.20.x
1.6.0    1.19.4    v0.19.4            v0.19.4               1.18.x, 1.19.x, 1.20.x
1.5.1    1.18.8    v0.18.8            v0.18.8               1.17.x, 1.18.x, 1.19.x
1.4.4    1.17.7    v0.17.0            v0.17.0               1.16.x, 1.17.x, 1.18.x
1.3.0    1.15.10   v12.0.0            release-1.15          1.15.x, 1.16.x, 1.17.x
1.2.0    1.13.6    v10.0.0            release-1.13          1.12.x, 1.13.x, 1.14.x
1.1.0    1.12.1    v9.0.0             release-1.12          1.11.x, 1.12.x, 1.13.x
1.0.0    1.7.6     v4.0.0             release-1.7           1.6.x, 1.7.x, 1.8.x

See the release notes for specific version compatibility information, including which combinations have been formally tested.

Versions >=1.1.0 enter the host mount namespace to invoke systemctl reboot, and should therefore work on any distribution that uses systemd.
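
Conceptually, the reboot is equivalent to running something like the following from a privileged pod (an illustrative sketch; the exact command is internal to kured):

nsenter -m/proc/1/ns/mnt -- systemctl reboot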

Installation

To obtain a default installation without Prometheus alerting interlock or Slack notifications:

latest=$(curl -s https://api.github.com/repos/weaveworks/kured/releases | jq -r .[0].tag_name)
kubectl apply -f "https://github.com/weaveworks/kured/releases/download/$latest/kured-$latest-dockerhub.yaml"

If you want to customise the installation, download the manifest and edit it in accordance with the following section before applying it.

Configuration

The following arguments can be passed to kured via the daemonset pod template:

Flags:
      --alert-filter-regexp regexp.Regexp   alert names to ignore when checking for active alerts
      --blocking-pod-selector stringArray   label selector identifying pods whose presence should prevent reboots
      --ds-name string                      name of daemonset on which to place lock (default "kured")
      --ds-namespace string                 namespace containing daemonset on which to place lock (default "kube-system")
      --end-time string                     schedule reboot only before this time of day (default "23:59:59")
  -h, --help                                help for kured
      --lock-annotation string              annotation in which to record locking node (default "weave.works/kured-node-lock")
      --lock-ttl duration                   expire lock annotation after this duration (default: 0, disabled)
      --message-template-drain string       message template used to notify about a node being drained (default "Draining node %s")
      --message-template-reboot string      message template used to notify about a node being rebooted (default "Rebooting node %s")
      --period duration                     reboot check period (default 1h0m0s)
      --prefer-no-schedule-taint string     Taint name applied during pending node reboot (to prevent receiving additional pods from other rebooting nodes). Disabled by default. Set e.g. to "weave.works/kured-node-reboot" to enable tainting.
      --prometheus-url string               Prometheus instance to probe for active alerts
      --reboot-days strings                 schedule reboot on these days (default [su,mo,tu,we,th,fr,sa])
      --reboot-sentinel string              path to file whose existence signals need to reboot (default "/var/run/reboot-required")
      --slack-channel string                slack channel for reboot notifications
      --slack-hook-url string               slack hook URL for reboot notifications
      --slack-username string               slack username for reboot notifications (default "kured")
      --start-time string                   schedule reboot only after this time of day (default "0:00")
      --time-zone string                    use this timezone for schedule inputs (default "UTC")
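
These flags are set on the kured container in the daemonset manifest. A minimal, incomplete sketch of the relevant part of the pod template (the image tag and flag values here are illustrative, not recommendations):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kured
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: kured
          image: docker.io/weaveworks/kured:1.6.0
          command:
            - /usr/bin/kured
            - --period=30m
            - --reboot-days=mon,tue,wed,thu,fri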

Reboot Sentinel File & Period

By default kured checks for the existence of /var/run/reboot-required every sixty minutes; you can override these values with --reboot-sentinel and --period. Each replica of the daemon uses a random offset derived from the period on startup so that nodes don't all contend for the lock simultaneously.

Setting a schedule

By default, kured will reboot any time it detects the sentinel, but this may cause reboots during odd hours. While service disruption does not normally occur, anything is possible and operators may want to restrict reboots to predictable schedules. Use --reboot-days, --start-time, --end-time, and --time-zone to set a schedule. For example, business hours on the west coast USA can be specified with:

  --reboot-days=mon,tue,wed,thu,fri
  --start-time=9am
  --end-time=5pm
  --time-zone=America/Los_Angeles

Times can be formatted in numerous ways, including 5pm, 5:00pm, 17:00, and 17. --time-zone represents a Go time.Location, and can be UTC, Local, or any entry in the standard Linux tz database.

Note that when using smaller time windows, you should consider shortening the sentinel check period (--period).

Blocking Reboots via Alerts

You may find it desirable to block automatic node reboots when there are active alerts - you can do so by providing the URL of your Prometheus server:

--prometheus-url=http://prometheus.monitoring.svc.cluster.local

By default the presence of any active (pending or firing) alerts will block reboots, however you can ignore specific alerts:

--alert-filter-regexp=^(RebootRequired|AnotherBenignAlert|...)$

See the section on Prometheus metrics for an important application of this filter.

Blocking Reboots via Pods

You can also block reboots of an individual node when specific pods are scheduled on it:

--blocking-pod-selector=runtime=long,cost=expensive

Since label selector strings use commas to express logical 'and', you can specify this parameter multiple times for 'or':

--blocking-pod-selector=runtime=long,cost=expensive
--blocking-pod-selector=name=temperamental

In this case, the presence of either an (appropriately labelled) expensive long-running job or a known temperamental pod on a node will prevent it from rebooting.
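
For example, a running pod can be marked as blocking by labelling it to match one of the selectors above (the pod name here is hypothetical):

kubectl label pod temperamental-0 name=temperamental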

Try not to abuse this mechanism - it's better to strive for restartability where possible. If you do use it, make sure you set up a RebootRequired alert as described in the next section so that you can intervene manually if reboots are blocked for too long.

Prometheus Metrics

Each kured pod exposes a single gauge metric on :8080/metrics that indicates the presence of the sentinel file:

# HELP kured_reboot_required OS requires reboot due to software updates.
# TYPE kured_reboot_required gauge
kured_reboot_required{node="ip-xxx-xxx-xxx-xxx.ec2.internal"} 0
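
To check the metric by hand, you can port-forward to the daemonset and scrape the endpoint (a sketch; 8080 is the hard-coded metrics port):

kubectl -n kube-system port-forward ds/kured 8080:8080 &
curl -s http://localhost:8080/metrics | grep kured_reboot_required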

The purpose of this metric is to power an alert which will summon an operator if the cluster cannot reboot itself automatically for a prolonged period:

# Alert if a reboot is required for any machines. Acts as a failsafe for the
# reboot daemon, which will not reboot nodes if there are pending alerts save
# this one.
ALERT RebootRequired
  IF          max(kured_reboot_required) != 0
  FOR         24h
  LABELS      { severity="warning" }
  ANNOTATIONS {
    summary = "Machine(s) require being rebooted, and the reboot daemon has failed to do so for 24 hours",
    impact = "Cluster nodes more vulnerable to security exploits. Eventually, no disk space left.",
    description = "Machine(s) require being rebooted, probably due to kernel update.",
  }
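
The rule above uses the Prometheus 1.x rule syntax. If you run Prometheus 2.x, a sketch of the equivalent rule in YAML form (annotations abbreviated):

groups:
  - name: kured
    rules:
      - alert: RebootRequired
        expr: max(kured_reboot_required) != 0
        for: 24h
        labels:
          severity: warning
        annotations:
          summary: "Machine(s) require being rebooted, and the reboot daemon has failed to do so for 24 hours"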

If you choose to employ such an alert and have configured kured to probe for active alerts before rebooting, be sure to specify --alert-filter-regexp=^RebootRequired$ to avoid deadlock!

Slack Notifications

If you specify a Slack hook via --slack-hook-url, kured will notify you immediately prior to rebooting a node:

We recommend setting --slack-username to be the name of the environment, e.g. dev or prod.

Alternatively, you can use the --message-template-drain and --message-template-reboot flags to customize the message text, e.g.

--message-template-drain="Draining node %s part of *my-cluster* in region *xyz*"

Overriding Lock Configuration

The --ds-name and --ds-namespace arguments should match the name and namespace of the daemonset used to deploy the reboot daemon - the locking is implemented by means of an annotation on this resource. The defaults match the daemonset YAML provided in the repository.

Similarly --lock-annotation can be used to change the name of the annotation kured will use to store the lock, but the default is almost certainly safe.
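
You can inspect the current lock holder at any time (a sketch, assuming the default daemonset name, namespace and annotation):

kubectl -n kube-system get ds kured \
  -o jsonpath='{.metadata.annotations.weave\.works/kured-node-lock}'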

Operation

The example commands in this section assume that you have not overridden the default lock annotation, daemonset name or namespace; if you have, you will have to adjust the commands accordingly.

Testing

You can test your configuration by provoking a reboot on a node:

sudo touch /var/run/reboot-required
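
You can then follow the daemon logs to watch the lock being acquired and the node being drained (this assumes the name=kured pod label from the default manifest):

kubectl -n kube-system logs -f -l name=kured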

Disabling Reboots

If you need to temporarily stop kured from rebooting any nodes, you can take the lock manually:

kubectl -n kube-system annotate ds kured weave.works/kured-node-lock='{"nodeID":"manual"}'

Don't forget to release it afterwards!

Manual Unlock

In exceptional circumstances, such as a node experiencing a permanent failure whilst rebooting, manual intervention may be required to remove the cluster lock:

kubectl -n kube-system annotate ds kured weave.works/kured-node-lock-

NB the - at the end of the command is important - it instructs kubectl to remove that annotation entirely.

Automatic Unlock

In exceptional circumstances (especially when kured is used alongside the cluster autoscaler), a node holding the lock might be terminated before releasing it, leaving the annotation in place forever.

Setting --lock-ttl=30m allows other nodes to take over once the TTL has expired (30 minutes in this case) and continue the reboot process.

Building

Kured now uses Go Modules, so build instructions vary depending on where you have checked out the repository:

Building outside $GOPATH:

make

Building inside $GOPATH:

GO111MODULE=on make

You can find the current preferred version of Go in the go.mod file.

If you are interested in contributing code to kured, please take a look at our development docs.

Frequently Asked/Anticipated Questions

Why is there no latest tag on Docker Hub?

Use of latest for production deployments is bad practice - see here for details. The manifest on master refers to latest for local development testing with minikube only; for production use choose a versioned manifest from the release page.

Getting Help

If you have any questions about, feedback for, or problems with kured:

We follow the CNCF Code of Conduct.

Your feedback is always welcome!

Comments
  • Migrate to kubereboot/kured

    As we are now a CNCF Sandbox project, we want to move to our new home kubereboot/kured, but we want to do it sensibly.

    Items likely involved are:

    • [x] ~~create a kubereboot/kured docker hub~~
    • [x] ~~push our old releases there~~
    • [x] Ask Weaveworks IT to transfer the repo
    • [x] Make sure our Helm setup continues to work
    • [x] Update instructions
    • [x] ... please add ...
  • Implement universal notification mechanism using shoutrrr

    Closes https://github.com/weaveworks/kured/issues/117. Closes https://github.com/weaveworks/kured/issues/94. This PR doesn't interfere with the slack-notification logic that is in place. Instead of using only slack-related args, one can pass kured the flag:

    • --notify-url=<URL link that follows this syntax as f(chat tech solution): https://containrrr.dev/shoutrrr/services/overview/>

    So far, I've tested it on SUSE CaaSP 4.2 with k8s 1.17, on VMware, using Rocket.Chat; it worked as it should. This functionality could be elaborated further, e.g. specifying messages (perhaps depending on a --verbosity input) for each situation (draining, cordoning, rebooting, and so on).

  • Option to allow reboots only in a time window

    Hello there,

    as a possible feature, can you add an option so that reboots can be constrained in a specific time window? This way I would be able to have the reboots, let's say, happening only at night time.

    Thank you

  • Kured v.1.2 fails on Kubernetes v.1.16 (on prem, kubeadm created)

    Kured v.1.2 fails on my kubeadm-created K8s v.1.16.0 cluster. It seems that this issue has been solved by fix #75, which isn't part of a release yet.

    The error message I received before compiling kured from the latest source was:

    time="2019-09-22T19:45:07Z" level=info msg="Blocking Pod Selectors: []" time="2019-09-22T19:45:07Z" level=fatal msg="Error testing lock: the server could not find the requested resource"

    It would be really nice to get a kured release v.1.3 including an updated stable helm chart. This will hopefully make the stable kured helm chart work on K8s v.1.16 clusters without modifications.

    I have not used Kured on older Kubernetes versions (yet).

  • Does Kured help in updating the AKS node OS images of type Availability sets?

    Team,

    We have an AKS cluster v1.23.5 with nodes of type Availability Sets. Does kured help in updating these node images for security patches/updates?

    We tried updating them using the Azure CLI with the --node-image-only flag, but we get an error because we don't have a virtual machine scale set:

    - This cluster is not using VirtualMachineScaleSets. Node image upgrade only operation can only be applied on VirtualMachineScaleSets cluster
    

    Please let me know if kured can help in this case. We have been facing a lot of vulnerability challenges for many months and have to resolve them asap. We would really appreciate your support.

    Thank you

  • Problems pulling docker image

    This morning I'm getting this error when trying to pull the kured docker image.

    sudo docker pull quay.io/weaveworks/kured:master-5731b98
    Error response from daemon: Get https://quay.io/v2/weaveworks/kured/manifests/master-5731b98: unknown: Namespace weaveworks has been disabled. Please contact a system administrator.

  • Add `label-with-exclude-from-external-lbs` CLI argument to enable graceful removal/addition from external load balancers

    Previously, kured issued the system reboot command without first removing nodes from any connected external load balancers (ELBs).

    This behavior caused downtime on restart because ELBs send traffic to kube-proxy pods running on nodes until the ELB health checks fail or the node is de-registered explicitly.

    This patch solves the problem by adding a command line argument (label-with-exclude-from-external-lbs) that, when enabled, adds a "node.kubernetes.io/exclude-from-external-load-balancers" label to nodes undergoing a kured reboot. This label tells the Kubernetes control plane to de-register the affected node from any connected ELBs. The label is removed after restart, which causes the control plane to re-register the node with the ELBs.

    Close https://github.com/weaveworks/kured/issues/358

  • Adjust kured to work with RancherOS

    In RancherOS, PID 1 is docker, and its mount namespace doesn't contain /usr/bin/test. To reboot a RancherOS node:

    IN_DOCKER=true /sbin/reboot

    Tested on RancherOS. This change checks the name of PID 1: if the name contains 'systemd', it proceeds normally; otherwise it sends the commands (testing the sentinel and rebooting) the RancherOS way.

    fix #65

  • Testing KureD on AKS Cluster and Frequency of Reboots on AKS?

    We have a lab cluster sitting in AKS that has had the KureD daemonset running on it since March 10, about 20 days ago, but we haven't gotten any reboot messages. Does anyone know how often nodes need to be rebooted on AKS? And does KureD log when it reboots a node? Is SSHing into the VMSS the only way to trigger a test?

    time="2020-03-10T16:31:29Z" level=info msg="Kubernetes Reboot Daemon: 1.3.0" time="2020-03-10T16:31:29Z" level=info msg="Node ID: aks-XXXX-XXXXXX-vmssXXXX" time="2020-03-10T16:31:29Z" level=info msg="Lock Annotation: kube-system/kured:weave.works/kured-node-lock" time="2020-03-10T16:31:29Z" level=info msg="Reboot Sentinel: /var/run/reboot-required every 12h0m0s" time="2020-03-10T16:31:29Z" level=info msg="Blocking Pod Selectors: []" time="2020-03-10T16:31:29Z" level=info msg="Reboot on: SunMonTueWedThuFriSat between 19:00 and 07:00 America/New_York"

  • Replaced --annotationTTL with --lockTTL and made it work correctly

    This is a follow-up on #54, #119 & #143

    • I renamed the --annotationTTL flag to --lockTTL, as I believe it is a more pertinent nomenclature
    • I fixed its implementation, which was reported as non-functional in #143
  • Support unprivileged container

    Add support for running kured as a non-privileged container, restraining it from using any capability apart from SYS_BOOT.

    Relates to https://github.com/weaveworks/kured/issues/60 and https://github.com/SUSE/skuba/issues/1237

  • Prevent cluster autoscaler from scaling down rebooted node

    If the cluster autoscaler is quick enough to scale up when rebooting occurs, the workload from the rebooting node is moved to the new node before the rebooting process has finished, leaving the newly rebooted node unneeded. To prevent this from happening, we would need to be able to annotate the rebooting node with the following annotation:

    "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true"

    The idea is that, with this annotation, the cluster autoscaler will let the newly rebooted node live while it figures out that, by moving the workload back to this node, the scaled-up node can be removed instead of the rebooted one.

    It would also be advisable to introduce an option to remove the annotation mentioned above after a configurable amount of time, so the cluster autoscaler can go about doing its business as usual again.

    @jackfrancis has volunteered to do the work required for this.

  • New Annotation reboot-required

    Hi

    We have some workload in the cluster which cannot be moved at will and requires maintenance windows. It would be much easier for our tools to move stuff around if kured set an annotation like weave.works/kured-reboot-required=true, so we didn't have to fall back to querying Prometheus.

    weave.works/kured-most-recent-reboot-needed does not work for us, since it is only set once the reboot process has already started. We need an indicator beforehand, so that we don't schedule long-running workloads on the node and can move individual instances off it during their respective maintenance windows.

    Is this something the community would be interested in?

  • Azure/container-scan is not maintained any more

    From https://github.com/Azure/container-scan

    Deprecation Notice

    This project is no longer actively maintained, and has had some deficiencies for some time now. If anyone is interested in implementing the action logic on their own or forking the repo, feel free to do so.

  • Make metrics port configurable

    Kured is hard-coded to serve metrics on port 8080. Can this be made configurable?

    There are certain cases where a port conflict could occur. For instance, we run node-local-dns cache, which also runs as a privileged workload and uses port 8080 to serve a health check endpoint.

  • How to manage control plane and etcd nodes with the kured reboot daemonset

    We are looking for a solution where, after auto-patching, any Kubernetes Ubuntu node that requires it (worker, etcd, and control-plane nodes) is automatically rebooted (drained, cordoned, and uncordoned after the reboot). This works on worker nodes as expected, but we observed that the kured solution does not work on etcd and control-plane nodes; in fact, it seems the kured daemonset does not manage them.

    Now the concern is: does kured manage etcd and control-plane nodes as well? If yes, how can we configure it?

    It would be great if we could get guidance or help on this subject.

    Thanks Ganesh
