Machine controller manager provider local

machine-controller-manager-provider-local

Out-of-tree (controller-based) implementation of the local provider for Machine Controller Manager (MCM). The local out-of-tree provider implements the interface defined by the MCM OOT driver.

Fundamental Design Principles

The following basic principles guided the development of this external plugin.

  • Communication between this Machine Controller (MC) and Machine Controller Manager (MCM) is achieved using the Kubernetes native declarative approach.
  • Machine Controller (MC) behaves as the controller used to interact with the local provider and manage the VMs corresponding to the machine objects (in the local setup, the "backing VM" is a Pod).
  • Machine Controller Manager (MCM) deals with higher level objects such as machine-set and machine-deployment objects.
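
The split of responsibilities above can be sketched in Go. The block below is a simplified, self-contained illustration and not the actual MCM driver package: the `Driver` interface, the `Machine` type, and the in-memory `localDriver` are hypothetical names chosen for this example, and the `machine-` ID prefix is an assumption made for readability. It shows a provider-specific controller that only knows how to create, delete, and list the "VMs" behind machine objects, while anything about machine-sets or machine-deployments stays outside of it.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Machine is a hypothetical, stripped-down stand-in for a machine object.
type Machine struct {
	Name string
}

// Driver is a hypothetical, simplified provider interface for illustration.
// It only covers the VM lifecycle, not higher-level objects.
type Driver interface {
	CreateMachine(m Machine) (providerID string, err error)
	DeleteMachine(m Machine) error
	ListMachines() []string
}

// ErrAlreadyExists is returned when a VM for the machine already exists.
var ErrAlreadyExists = errors.New("machine already exists")

// localDriver keeps its "VMs" in memory instead of talking to a real backend.
type localDriver struct {
	vms map[string]struct{}
}

func newLocalDriver() *localDriver {
	return &localDriver{vms: make(map[string]struct{})}
}

// CreateMachine registers a VM and returns its (assumed) provider ID.
func (d *localDriver) CreateMachine(m Machine) (string, error) {
	id := "machine-" + m.Name
	if _, ok := d.vms[id]; ok {
		return "", ErrAlreadyExists
	}
	d.vms[id] = struct{}{}
	return id, nil
}

// DeleteMachine is idempotent: deleting an unknown VM is not an error.
func (d *localDriver) DeleteMachine(m Machine) error {
	delete(d.vms, "machine-"+m.Name)
	return nil
}

// ListMachines returns the provider IDs of all known VMs in sorted order.
func (d *localDriver) ListMachines() []string {
	ids := make([]string, 0, len(d.vms))
	for id := range d.vms {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	var d Driver = newLocalDriver()
	id, err := d.CreateMachine(Machine{Name: "worker-0"})
	fmt.Println(id, err)
	fmt.Println(d.ListMachines())
	_ = d.DeleteMachine(Machine{Name: "worker-0"})
	fmt.Println(d.ListMachines())
}
```

Keeping the interface this narrow is what allows a provider to be swapped without touching the higher-level controllers.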

Testing the Controller

  1. Open a terminal, change directory to $GOPATH/src/github.com/gardener, and clone this repository.

  2. Navigate to $GOPATH/src/github.com/gardener/machine-controller-manager-provider-local:

    • In the Makefile, make sure $TARGET_KUBECONFIG points to the cluster where you wish to manage machines. $CONTROL_NAMESPACE is the namespace in which MCM looks for machine CR objects, and $CONTROL_KUBECONFIG points to the cluster that holds these machine CRs.

    • Run the machine controller (driver) using the command below.

      make start
  3. In a second terminal, in $GOPATH/src/github.com/gardener:

    • Clone the latest MCM code:

      git clone git@github.com:gardener/machine-controller-manager.git
    • Navigate to the newly created directory:

      cd machine-controller-manager
    • Deploy the required CRDs from the machine-controller-manager repo:

      kubectl apply -f kubernetes/crds.yaml
    • Run the machine-controller-manager:

      make start
  4. In a third terminal, in $GOPATH/src/github.com/gardener/machine-controller-manager-provider-local:

    • Fill in the object files listed below and deploy them in the order shown.

    • Deploy the machine-class

      kubectl apply -f kubernetes/machine-class.yaml
    • Deploy the kubernetes secret if required.

      kubectl apply -f kubernetes/secret.yaml
    • Deploy the machine object and make sure it joins the cluster successfully.

      kubectl apply -f kubernetes/machine.yaml
    • Once the machine has joined, deploy the machine-deployment object and make sure its machines join the cluster successfully.

      kubectl apply -f kubernetes/machine-deployment.yaml
    • Make sure to delete both the machine and machine-deployment object after use.

      kubectl delete -f kubernetes/machine.yaml
      kubectl delete -f kubernetes/machine-deployment.yaml
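
Step 2 depends on three settings being filled in before `make start` is run. As a small aid, the following Go sketch checks that they are present. It assumes the values are also exported as environment variables under the same names, which the repository's Makefile may or may not do; `requiredSettings` and `missingSettings` are hypothetical helpers written for this example.

```go
package main

import (
	"fmt"
	"os"
)

// requiredSettings lists the variable names mentioned in step 2.
var requiredSettings = []string{"TARGET_KUBECONFIG", "CONTROL_KUBECONFIG", "CONTROL_NAMESPACE"}

// missingSettings returns the names for which lookup yields no non-empty value.
// The lookup function is injected so the check can run against any source.
func missingSettings(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range requiredSettings {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumption: the settings are available in the process environment.
	if missing := missingSettings(os.LookupEnv); len(missing) > 0 {
		fmt.Println("missing settings:", missing)
		return
	}
	fmt.Println("all settings present")
}
```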

Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.

Feedback and Support

Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).

Comments
  • Vendor `machine-controller-manager@0.47.0`

    What this PR does / why we need it: Vendor machine-controller-manager@0.47.0

    Which issue(s) this PR fixes: Part of https://github.com/gardener/gardener/issues/6567

    Special notes for your reviewer:

    Release note:

    The following dependency is updated:
    - github.com/gardener/machine-controller-manager v0.45.0 -> 0.47.0
    
  • Bump `kindest/node` to `v1.21.12`

    What this PR does / why we need it: Bump kindest/node to v1.21.12. This implicitly upgrades containerd, which in turn pulls in a newer version of the github.com/imdario/mergo dependency. That version fixes https://github.com/imdario/mergo/issues/90, which is required to properly import additional containerd configuration.

  • Explicitly delete artefacts in `DeleteMachine` function and do not rely on garbage collection

    Otherwise, the Machine object might already disappear from the system while the "backing VM" (aka the Pod) still exists. This can lead to interesting situations (e.g. we saw that the Node of a deleted machine was re-registered after deletion).

    Related to https://github.com/gardener/gardener/issues/5895
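
The ordering argument in this PR can be sketched as follows. This is an illustration under stated assumptions, not the provider's actual code: `artefactStore`, `memStore`, `errNotFound`, and the artefact name suffixes are hypothetical. The point it shows is that `deleteMachine` removes each backing artefact itself, treats "not found" as success so that retries are safe, and only reports success once nothing is left behind.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for "object does not exist".
var errNotFound = errors.New("not found")

// artefactStore is a hypothetical abstraction over the API that holds
// the objects backing a machine (for example the Pod).
type artefactStore interface {
	Delete(name string) error
}

// memStore is an in-memory artefactStore used only for this example.
type memStore struct{ objects map[string]bool }

func (s *memStore) Delete(name string) error {
	if !s.objects[name] {
		return errNotFound
	}
	delete(s.objects, name)
	return nil
}

// deleteMachine deletes every artefact explicitly instead of relying on
// garbage collection. A missing artefact counts as already deleted.
func deleteMachine(store artefactStore, machineName string) error {
	for _, suffix := range []string{"pod", "userdata"} { // assumed artefact kinds
		name := machineName + "-" + suffix
		if err := store.Delete(name); err != nil && !errors.Is(err, errNotFound) {
			return fmt.Errorf("deleting %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	store := &memStore{objects: map[string]bool{"m1-pod": true, "m1-userdata": true}}
	fmt.Println(deleteMachine(store, "m1"), len(store.objects))
	// A second call finds nothing to delete and still succeeds.
	fmt.Println(deleteMachine(store, "m1"))
}
```

Because the function returns an error while any artefact fails to delete, the caller can keep the Machine object around and retry, which avoids the situation where the Machine is gone but its Pod still exists.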

  • Install `jq`

    ref https://github.com/gardener/gardener/blob/master/docs/extensions/operatingsystemconfig.md#contract-operatingsystemconfig-resource

    /invite @timebertt

  • `machine-controller-manager-provider-local node` image build fails

    What happened: machine-controller-manager-provider-local node image build fails with:

    #10 [linux/amd64 2/9] RUN apt-get update -yq &&     apt-get install -yq --no-install-recommends wget apparmor apparmor-utils jq
    #0 0.193 Ign:1 http://security.ubuntu.com/ubuntu impish-security InRelease
    #0 0.210 Err:2 http://security.ubuntu.com/ubuntu impish-security Release
    #0 0.210   404  Not Found [IP: 91.189.91.39 80]
    #0 0.318 Ign:3 http://archive.ubuntu.com/ubuntu impish InRelease
    #0 0.396 Ign:4 http://archive.ubuntu.com/ubuntu impish-updates InRelease
    ...
    #0 0.711 Err:8 http://archive.ubuntu.com/ubuntu impish-backports Release
    #0 0.712   404  Not Found [IP: 185.125.190.39 80]
    #0 0.719 Reading package lists...
    #0 0.725 E: The repository 'http://security.ubuntu.com/ubuntu impish-security Release' does not have a Release file.
    #0 0.726 E: The repository 'http://archive.ubuntu.com/ubuntu impish Release' does not have a Release file.
    #0 0.728 E: The repository 'http://archive.ubuntu.com/ubuntu impish-updates Release' does not have a Release file.
    #0 0.728 E: The repository 'http://archive.ubuntu.com/ubuntu impish-backports Release' does not have a Release file.
    #10 ERROR: process "/bin/sh -c apt-get update -yq &&     apt-get install -yq --no-install-recommends wget apparmor apparmor-utils jq" did not complete successfully: exit code: 100
    ------
    

    This is due to https://github.com/kubernetes-sigs/kind/issues/2863, which is fixed in the kindest/node:v1.25.0 image. We cannot use that image yet because of https://github.com/gardener/gardener/issues/5325.

    What you expected to happen: Node image build to be successful.

    How to reproduce it (as minimally and precisely as possible): Run docker build -t test:v1 ./node/. locally with the current kind image (https://github.com/gardener/machine-controller-manager-provider-local/blob/c85c4eaf942d87fa1cb694d6df79016c5ffe28e2/node/Dockerfile#L1) and see it fail at step https://github.com/gardener/machine-controller-manager-provider-local/blob/c85c4eaf942d87fa1cb694d6df79016c5ffe28e2/node/Dockerfile#L6-L7 with the above error.

    Run again with kindest/node:v1.25.0 and see it pass.

    Anything else we need to know:

    Environment:

  • Missing kubectl binary after docker-for-desktop restart

    After restarting the Docker daemon on my local machine and a subsequent reconciliation of the local Shoot, I observed that the pod representing the Shoot's machine reported the following event:

    Warning  Unhealthy  2m36s (x543 over 92m)  kubelet  Readiness probe failed: sh: 1: /opt/bin/kubectl: not found
    

    Relevant MCM logs

    • it seems like the machine was detected as orphaned and terminated
    I0321 10:37:11.826964       1 machine_safety.go:297] SafetyController: Orphan VM found and terminated VM: machine-shoot--local--local-local-58d9b-dvnb7, machine-shoot--local--local-local-58d9b-dvnb7
    I0321 10:37:11.827059       1 machine_safety.go:68] reconcileClusterMachineSafetyOrphanVMs: End, reSync-Period: 30m0s
    W0321 10:42:20.627178       1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
    I0321 10:43:14.405600       1 machine_util.go:554] Conditions of Machine "shoot--local--local-local-58d9b-dvnb7" with providerID "machine-shoot--local--local-local-58d9b-dvnb7" and backing node "machine-shoot--local--local-local-58d9b-dvnb7" are changing
    W0321 10:43:14.405711       1 machine_util.go:562] Machine shoot--local--local-local-58d9b-dvnb7 is unhealthy - changing MachineState to Unknown
    I0321 10:43:14.417333       1 machine_util.go:715] Machine State has been updated for "shoot--local--local-local-58d9b-dvnb7" with providerID "machine-shoot--local--local-local-58d9b-dvnb7" and backing node "machine-shoot--local--local-local-58d9b-dvnb7"
    W0321 10:47:40.397396       1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
    W0321 10:52:48.194677       1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
    E0321 10:53:29.006090       1 machine_util.go:688] Machine shoot--local--local-local-58d9b-dvnb7 is not healthy since 10m0s minutes. Changing status to failed. Node Conditions: [{Type:KernelDeadlock Status:False LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:18 +0000 UTC Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:False LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:18 +0000 UTC Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only} {Type:FrequentKubeletRestart Status:Unknown LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:20 +0000 UTC Reason:NoFrequentKubeletRestart Message:error watching journald: failed to stat the log path "/var/log/journal": stat /v} {Type:FrequentDockerRestart Status:Unknown LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:21 +0000 UTC Reason:NoFrequentDockerRestart Message:error watching journald: failed to stat the log path "/var/log/journal": stat /v} {Type:FrequentContainerdRestart Status:Unknown LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:21 +0000 UTC Reason:NoFrequentContainerdRestart Message:error watching journald: failed to stat the log path "/var/log/journal": stat /v} {Type:FrequentUnregisterNetDevice Status:Unknown LastHeartbeatTime:2022-03-18 18:37:41 +0000 UTC LastTransitionTime:2022-03-18 14:02:20 +0000 UTC Reason:NoFrequentUnregisterNetDevice Message:error watching journald: failed to stat the log path "/var/log/journal": stat /v} {Type:MemoryPressure Status:Unknown LastHeartbeatTime:2022-03-18 18:38:04 +0000 UTC LastTransitionTime:2022-03-21 10:43:14 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.} {Type:DiskPressure Status:Unknown LastHeartbeatTime:2022-03-18 18:38:04 +0000 UTC LastTransitionTime:2022-03-21 10:43:14 +0000 UTC 
Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.} {Type:PIDPressure Status:Unknown LastHeartbeatTime:2022-03-18 18:38:04 +0000 UTC LastTransitionTime:2022-03-21 10:43:14 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.} {Type:Ready Status:Unknown LastHeartbeatTime:2022-03-18 18:38:04 +0000 UTC LastTransitionTime:2022-03-21 10:43:14 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.}]
    

    I suspect that it is related to the Docker daemon restart, but I was unable to investigate the issue because the machine was replaced by MCM. I will update this issue if it happens again.
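
The first log line suggests that the safety controller compares the VMs reported by the provider with the existing Machine objects. A minimal sketch of such a comparison, assuming a VM counts as orphaned when no Machine object references its provider ID; `findOrphanVMs` and its inputs are hypothetical and not taken from the MCM code base:

```go
package main

import (
	"fmt"
	"sort"
)

// findOrphanVMs returns the VM IDs that no Machine object refers to.
// vmIDs are the IDs reported by the provider; machineProviderIDs are the
// provider IDs recorded on the existing Machine objects.
func findOrphanVMs(vmIDs, machineProviderIDs []string) []string {
	known := make(map[string]struct{}, len(machineProviderIDs))
	for _, id := range machineProviderIDs {
		known[id] = struct{}{}
	}
	var orphans []string
	for _, id := range vmIDs {
		if _, ok := known[id]; !ok {
			orphans = append(orphans, id)
		}
	}
	sort.Strings(orphans) // stable order for logging
	return orphans
}

func main() {
	vms := []string{"machine-a", "machine-b", "machine-c"}
	machines := []string{"machine-a", "machine-c"}
	fmt.Println(findOrphanVMs(vms, machines))
}
```

Under this model, a VM would be flagged whenever the list of Machine objects is incomplete or stale at the time of the check, which is one way a daemon restart could plausibly lead to the termination seen in the logs.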

  • The Machine objects should not be deployed in the shoot namespace.

    Currently, the machine objects are deployed in the shoot namespace in the seed. This is not compatible with control plane migration, because that namespace is deleted at the end of the migration, and all machine objects with it. To avoid this, the machine objects should be created in a different namespace, most probably one dedicated to them.

    cc @plkokanov @rfranzke
