Kubegres is a Kubernetes operator that creates a cluster of PostgreSql instances and manages database replication, failover and backup.

Kubegres is a Kubernetes operator that deploys a cluster of PostgreSql pods with data replication enabled out of the box. It brings simplicity to running PostgreSql on Kubernetes, where managing a StatefulSet's life-cycle and data replication can otherwise be complex.

Features

  • It creates a cluster of PostgreSql servers with data replication enabled: it creates a Primary PostgreSql pod and a number of Replica PostgreSql pods, and replicates the Primary's database in real-time to the Replica pods.

  • It manages failover: if the Primary PostgreSql pod crashes, it automatically promotes a Replica PostgreSql pod to Primary.

  • It has a data backup option that dumps PostgreSql data regularly to a given volume.

  • It provides a very simple YAML with properties specialised for PostgreSql (a minimal example follows this list).

  • It is resilient, has over 55 automated test cases and has been running in production.
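
For example, a minimal manifest is along these lines (a sketch adapted from the examples quoted in the comments further down this page; the referenced Secret and its keys must exist beforehand):

    apiVersion: kubegres.reactive-tech.io/v1
    kind: Kubegres
    metadata:
      name: mypostgres
      namespace: default
    spec:
      replicas: 3                           # 1 Primary pod + 2 Replica pods
      image: postgres:14.1
      database:
        size: 8Gi
      env:
        - name: POSTGRES_PASSWORD           # super-user password, read from a Secret
          valueFrom:
            secretKeyRef:
              name: mypostgres-secret
              key: superUserPassword
        - name: POSTGRES_REPLICATION_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mypostgres-secret
              key: replicationUserPassword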

Please click here to get started.

More details

Kubegres was developed by Reactive Tech Limited, with Alex Arica as the lead developer. Reactive Tech offers support services for Kubegres, Kubernetes and PostgreSql.

It was developed with the framework Kubebuilder version 3, an SDK for building Kubernetes APIs using CRDs. Kubebuilder is maintained by the official Kubernetes API Machinery Special Interest Group (SIG).

Please click here to get started.

Contribute

If you would like to contribute to Kubegres' documentation, the Git repo is available here. Any changes to the documentation will update the website https://www.kubegres.io/

Comments
  • Configure SSL

    Configure SSL

    I mounted a TLS secret successfully, but I can't set up permissions for postgres to access the file. Is there any way to set fsGroup? Also, is there any way to set init containers?
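
    For reference, the permissions fix being asked about looks like this in a plain pod/StatefulSet template (a generic Kubernetes sketch, not a Kubegres field reference; 999 is the uid/gid of the postgres user in the official image, so adjust it for other images, and postgres-tls is a made-up secret name):

    # pod-level security context: fsGroup makes mounted volumes (including the TLS secret)
    # group-owned by 999 so the postgres process can read the key file
    securityContext:
      runAsUser: 999
      runAsNonRoot: true
      fsGroup: 999
    volumes:
      - name: tls
        secret:
          secretName: postgres-tls
          defaultMode: 0440   # key readable by owner and group only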

  • Backup failed

    Backup failed

    We enabled the backup function as documented here and get the following error.

    28/08/2021 00:00:01 - Starting DB backup of Kubegres resource postgres into file: /var/lib/backup/postgres-backup-28_08_2021_00_00_01.gz
    28/08/2021 00:00:01 - Running: pg_dumpall -h postgres-replica -U postgres -c | gzip > /var/lib/backup/postgres-backup-28_08_2021_00_00_01.gz
    pg_dump: error: Dumping the contents of table "table" failed: PQgetResult() failed.
    pg_dump: error: Error message from server: ERROR:  canceling statement due to conflict with recovery
    DETAIL:  User query might have needed to see row versions that must be removed.
    pg_dump: error: The command was: COPY public.table(column1, column2) TO stdout;
    pg_dumpall: error: pg_dump failed on database "db", exiting
    28/08/2021 00:00:01 - DB backup completed for Kubegres resource postgres into file: /var/lib/backup/postgres-backup-28_08_2021_00_00_01.gz
    

    In this Stack Overflow post they recommend activating hot_standby_feedback or increasing the max_standby settings. Is there a suggested solution from your side to overcome this issue?
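
    If the Stack Overflow advice is applied, the setting would go into the postgres.conf entry of the ConfigMap referenced by spec.customConfig (a sketch assuming the documented custom-config mechanism; mypostgres-conf is a made-up name, and the settings from the Kubegres base postgres.conf still need to be carried over):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: mypostgres-conf
    data:
      postgres.conf: |
        # ... keep the base Kubegres postgres.conf settings here, then add:
        hot_standby_feedback = on              # replica reports its oldest snapshot, avoiding recovery conflicts
        max_standby_streaming_delay = 300s     # or: give conflicting queries on the replica more time

    The trade-off is some extra bloat on the Primary while long dumps run against the Replica.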

  • Allow kubegres cluster to run on secure Kubernetes environments (Pod security policies)

    Allow kubegres cluster to run on secure Kubernetes environments (Pod security policies)

    Hi,

    First thanks for your work on a Postgres cluster Kubernetes Operator.

    We are deploying Kubernetes clusters in a secure-by-design manner using Rancher's RKE2 (aka RKE Government, v1.20.11+rke2r2).

    This creates a cluster with a hardened Pod Security Policy which forbids, among other things, pods from running as root.

    This implies the workloads must have a security context defined with at least these two settings (1001 is the user the postgres image runs as):

    securityContext:
      runAsNonRoot: true
      runAsUser: 1001
    

    Looking at the baseConfigMap I also see you do some chown postgres:postgres when copying data from primary to replicas. I guess these would fail with these settings.

    In "enterprise" setups this is actually not needed, as CSI-provisioned PVs belong to the pod's running user AFAIK.

    While I understand this level of security is not needed by everyone, and some users want to be able to run things in smaller clusters where security is not mandatory, I was wondering if it was possible to have some boolean flag in the CR YAML (e.g. hardened: true/false) that would allow the workload to run in hardened PSP clusters.

    This flag would basically use an alternate baseConfigMap with no chown commands, and add the securityContext to the StatefulSet.

    FYI, right now we run a single node not replicated postgres server with the above securityContext with no issues whatsoever.

    Let me know what you think about this.

    many thanks,

    Eric

  • Add support to manage volumes from Kubegres YAML

    Add support to manage volumes from Kubegres YAML

    By default, the official PostgreSQL Docker image only configures 64MB for /dev/shm (see the Caveats section on Docker Hub). This could cause problems for larger databases.

    # This is taken from the pod generated by Kubegres
    Filesystem                  Size  Used Avail Use% Mounted on
    shm                          64M   64K   64M   1% /dev/shm
    

    One possible method is to increase /dev/shm inside the container to the same size as on the host OS, which defaults to 50% of RAM. It is mentioned on Stack Overflow.

    But currently, the Kubegres Kind definition only allows "volumeMount:". It does not recognize "volumes:" and "volumeMounts:", so it seems there is no way to increase the size of the shared memory.
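
    For context, the generic workaround mentioned above mounts a memory-backed emptyDir over /dev/shm. In a raw pod template it is a sketch like the one below, which is exactly what cannot currently be expressed through the Kubegres Kind:

    # raw pod-template sketch of the /dev/shm workaround (not expressible via the Kubegres YAML today)
    spec:
      volumes:
        - name: dshm
          emptyDir:
            medium: Memory    # tmpfs instead of the 64MB default shm
            sizeLimit: 1Gi    # optional cap; otherwise bounded by node memory
      containers:
        - name: postgres
          image: postgres:14.1
          volumeMounts:
            - name: dshm
              mountPath: /dev/shm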

  • master pod failure

    master pod failure

    When the master pod (1) crashes, the second pod takes over as the master and a new pod (4) is created to replace the crashed first pod. While the second pod and the fourth pod work synchronously, the third pod works alone.

  • Improve the failover process for Replica

    Improve the failover process for Replica

    Hi Alex, many thanks for your amazing work with Kubegres. I tried to test a crash of a secondary Postgres. I conducted 2 separate tests. In the 1st test I scaled down the STS of the secondary Postgres to zero. In the 2nd test I stopped the k3d node on which that STS was running. Unfortunately, in both tests nothing happened. It would be great if Kubegres would run a new instance of the secondary Postgres in that case to achieve the desired state. Regards, Juliusz.

  • Disable expand Storage

    Disable expand Storage

    I just started using the operator and I think it's got really good potential. How do you expand the storage class for new pods in the same StatefulSet without affecting existing pods?

  • Is this project alive and actively maintained?

    Is this project alive and actively maintained?

    Just found it, and as I see it, it's a little outdated: the last commit was 4 months ago, and some of the issues we've seen are showstoppers for us.

    Is this project alive and worth pursuing and trying to help, or should we rather redirect our efforts to some other place?

    Sorry for being that rude, but as promising as this project is, it seems not very ready for production use...

  • The update of an existing Kubegres resource fails if the field 'resources' contains a value with a decimal point

    The update of an existing Kubegres resource fails if the field 'resources' contains a value with a decimal point

    Thank you for maintaining this repo.

    I am looking for steps/recommendations for upgrading between minor versions and major versions.

    I am guessing that upgrading between minor versions is as simple as changing the container image, i.e. postgres:13.2 -> postgres:13.4.

    Now that the official image for Postgres 14 is available, are there any steps that need to be followed to go from postgres:13.2 -> postgres:14.0 ?

    Cheers.
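
    As the comment guesses, a minor-version bump only touches the image field of the Kubegres YAML (sketch below, using the spec fields shown elsewhere on this page). A major-version jump such as 13.x -> 14.0 also changes PostgreSQL's on-disk format, so it normally needs a dump/restore (for example via pg_dumpall) or pg_upgrade rather than only a new image tag.

    spec:
      # minor-version bump: the data directory format is unchanged, so swapping the tag is enough
      image: postgres:13.4   # previously postgres:13.2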

  • When the field "spec.database.storageClassName" is omitted in Kubegres YAML, Kubegres operator should assign the default storageClass to the deployed PostgreSql cluster

    When the field "spec.database.storageClassName" is omitted in Kubegres YAML, Kubegres operator should assign the default storageClass to the deployed PostgreSql cluster

    Hi, I found that Kubegres's controller segfaults and stays down when I omit the storageClassName property. It recovers once the offending Kubegres object is removed.

    Expected behavior: Kubegres would request PVCs with the default storage class, i.e. simply omit the storageClassName field when generating its PVCs.

    Steps to reproduce: Apply the following yaml:

    apiVersion: v1
    kind: Secret
    metadata:
      name: postgres
    type: Opaque
    stringData:
      rootpasswd: foo
      replpasswd: bar
      userpasswd: baz
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: postgres-conf
    data:
      primary_init_script.sh: |
        #!/bin/bash
        set -e
        psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
        CREATE DATABASE $MY_POSTGRES_USER;
        CREATE USER $MY_POSTGRES_USER WITH PASSWORD '$MY_POSTGRES_PASSWORD';
        GRANT ALL PRIVILEGES ON DATABASE $MY_POSTGRES_USER to $MY_POSTGRES_USER;
        EOSQL
    ---
    apiVersion: kubegres.reactive-tech.io/v1
    kind: Kubegres
    metadata:
      name: postgres
    spec:
      image: postgres:13.2
      port: 5432
      replicas: 1
      database:
        size: 10Gi
      env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres
              key: rootpasswd
        - name: POSTGRES_REPLICATION_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres
              key: replpasswd
        - name: MY_POSTGRES_USER
          value: notisvc
        - name: MY_POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres
              key: userpasswd
      customConfig: postgres-conf
    

    And observe how the manager container of kubegres-controller-manager panics immediately:

    2021-06-07T17:14:53.192Z    INFO controllers.Kubegres =======================================================
    2021-06-07T17:14:53.192Z    INFO controllers.Kubegres =======================================================
    2021-06-07T17:14:54.192Z    INFO controllers.Kubegres KUBEGRES {"name": "postgres", "Status": {"blockingOperation":{"statefulSetOperation":{},"statefulSetSpecUpdateOperation":{}},"previousBlockingOperation":{"statefulSetOperation":{},"statefulSetSpecUpdateOperation":{}}}}
    2021-06-07T17:14:54.192Z    INFO controllers.Kubegres Corrected an undefined value in Spec. {"spec.database.volumeMount": "New value: /var/lib/postgresql/data"}
    2021-06-07T17:14:54.192Z    INFO controllers.Kubegres Updating Kubegres Spec {"name": "postgres"}
    2021-06-07T17:14:54.192Z    DEBUG controller-runtime.manager.events Normal {"object": {"kind":"Kubegres","namespace":"kubegres-crash","name":"postgres","uid":"c4fcc9d9-e21c-4af3-9e33-18c1f980654e","apiVersion":"kubegres.reactive-tech.io/v1","resourceVersion":"62606471"}, "reason": "SpecCheckCorrection", "message": "Corrected an undefined value in Spec. 'spec.database.volumeMount': New value: /var/lib/postgresql/data"}
    E0607 17:14:54.207006 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
    goroutine 403 [running]:
    k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15f20e0, 0x22f5700)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa6
    k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x89
    panic(0x15f20e0, 0x22f5700)
        /usr/local/go/src/runtime/panic.go:969 +0x1b9
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).getSpecStorageClassName(...)
        /workspace/controllers/states/DbStorageClassStates.go:80
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).GetStorageClass(0xc000ae8b50, 0x8, 0xc000ea7be0, 0x19)
        /workspace/controllers/states/DbStorageClassStates.go:62 +0x47
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).loadStates(0xc000ae8b50, 0x8, 0xc000ea7be0)
        /workspace/controllers/states/DbStorageClassStates.go:46 +0x2f
    reactive-tech.io/kubegres/controllers/states.loadDbStorageClass(...)
        /workspace/controllers/states/DbStorageClassStates.go:39
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadDbStorageClassStates(0xc000ae9370, 0x14d91f5, 0x8)
        /workspace/controllers/states/ResourcesStates.go:75 +0xe5
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadStates(0xc000ae9370, 0xc000843200, 0x0)
        /workspace/controllers/states/ResourcesStates.go:46 +0x45
    reactive-tech.io/kubegres/controllers/states.LoadResourcesStates(...)
        /workspace/controllers/states/ResourcesStates.go:40
    reactive-tech.io/kubegres/controllers/ctx/resources.CreateResourcesContext(0xc0005d22c0, 0x19a3880, 0xc000eaee70, 0x19ac260, 0xc0001ba2e0, 0x19b6f60, 0xc00044bae0, 0x19a06c0, 0xc00054a2c0, 0x0, ...)
        /workspace/controllers/ctx/resources/ResourcesContext.go:105 +0x6c5
    reactive-tech.io/kubegres/controllers.(*KubegresReconciler).Reconcile(0xc00054a300, 0x19a3880, 0xc000eaee70, 0xc0006bf2c0, 0xe, 0xc0006bf2a0, 0x8, 0xc000eaee70, 0x40a1ff, 0xc000030000, ...)
        /workspace/controllers/kubegres_controller.go:74 +0x17f
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0005448c0, 0x19a37c0, 0xc00029a000, 0x16482e0, 0xc001000460)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263 +0x317
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0005448c0, 0x19a37c0, 0xc00029a000, 0x203000)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x205
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x19a37c0, 0xc00029a000)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198 +0x4a
    k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x37
    k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0004b0f50)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
    k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000afbf50, 0x196dee0, 0xc000991620, 0xc00029a001, 0xc0003621e0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xad
    k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0004b0f50, 0x3b9aca00, 0x0, 0xc00029a001, 0xc0003621e0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
    k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x19a37c0, 0xc00029a000, 0xc0001547c0, 0x3b9aca00, 0x0, 0xc000704501)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0xa6
    k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x19a37c0, 0xc00029a000, 0xc0001547c0, 0x3b9aca00)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99 +0x57
    created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:195 +0x4e7
    panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14aafe7]
    goroutine 403 [running]:
    k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x10c
    panic(0x15f20e0, 0x22f5700)
        /usr/local/go/src/runtime/panic.go:969 +0x1b9
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).getSpecStorageClassName(...)
        /workspace/controllers/states/DbStorageClassStates.go:80
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).GetStorageClass(0xc000ae8b50, 0x8, 0xc000ea7be0, 0x19)
        /workspace/controllers/states/DbStorageClassStates.go:62 +0x47
    reactive-tech.io/kubegres/controllers/states.(*DbStorageClassStates).loadStates(0xc000ae8b50, 0x8, 0xc000ea7be0)
        /workspace/controllers/states/DbStorageClassStates.go:46 +0x2f
    reactive-tech.io/kubegres/controllers/states.loadDbStorageClass(...)
        /workspace/controllers/states/DbStorageClassStates.go:39
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadDbStorageClassStates(0xc000ae9370, 0x14d91f5, 0x8)
        /workspace/controllers/states/ResourcesStates.go:75 +0xe5
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadStates(0xc000ae9370, 0xc000843200, 0x0)
        /workspace/controllers/states/ResourcesStates.go:46 +0x45
    reactive-tech.io/kubegres/controllers/states.LoadResourcesStates(...)
        /workspace/controllers/states/ResourcesStates.go:40
    reactive-tech.io/kubegres/controllers/ctx/resources.CreateResourcesContext(0xc0005d22c0, 0x19a3880, 0xc000eaee70, 0x19ac260, 0xc0001ba2e0, 0x19b6f60, 0xc00044bae0, 0x19a06c0, 0xc00054a2c0, 0x0, ...)
        /workspace/controllers/ctx/resources/ResourcesContext.go:105 +0x6c5
    reactive-tech.io/kubegres/controllers.(*KubegresReconciler).Reconcile(0xc00054a300, 0x19a3880, 0xc000eaee70, 0xc0006bf2c0, 0xe, 0xc0006bf2a0, 0x8, 0xc000eaee70, 0x40a1ff, 0xc000030000, ...)
        /workspace/controllers/kubegres_controller.go:74 +0x17f
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0005448c0, 0x19a37c0, 0xc00029a000, 0x16482e0, 0xc001000460)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263 +0x317
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0005448c0, 0x19a37c0, 0xc00029a000, 0x203000)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x205
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x19a37c0, 0xc00029a000)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198 +0x4a
    k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x37
    k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0004b0f50)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
    k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000afbf50, 0x196dee0, 0xc000991620, 0xc00029a001, 0xc0003621e0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xad
    k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0004b0f50, 0x3b9aca00, 0x0, 0xc00029a001, 0xc0003621e0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
    k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x19a37c0, 0xc00029a000, 0xc0001547c0, 0x3b9aca00, 0x0, 0xc000704501)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0xa6
    k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x19a37c0, 0xc00029a000, 0xc0001547c0, 0x3b9aca00)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99 +0x57
    created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:195 +0x4e7
    
  • Node scheduling

    Node scheduling

    Is it possible to specify node scheduling parameters? For example, if I want to ensure the pods schedule on on-demand nodes (instead of spot nodes) or if I want to add a toleration so it's able to schedule on a node with a taint.
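
    For reference, the settings being asked about take the following form in a plain pod spec (a generic Kubernetes sketch, not a Kubegres field reference; the node-lifecycle label and dedicated-db taint below are made-up examples):

    # keep the pods on on-demand nodes (hypothetical node label)
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: node-lifecycle
                  operator: In
                  values: ["on-demand"]
    # allow scheduling onto tainted nodes (hypothetical taint)
    tolerations:
      - key: "dedicated-db"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"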

  • service clusterip: none  , is this normal?

    service clusterip: none , is this normal?

    Hi,

    after installing Postgres with Kubegres, I see there are no IPs on the services mypostgres/mypostgres-replica. How can this work?

    kubernetes: MicroK8s v1.25.4 revision 4221 kubegres: 1.16

    Steps:
    1. kubectl apply -f https://raw.githubusercontent.com/reactive-tech/kubegres/v1.16/kubegres.yaml
    2. create secret
    3. deploy this yaml:
    
    apiVersion: kubegres.reactive-tech.io/v1
    kind: Kubegres
    metadata:
      name: mypostgres
      namespace: default
    
    spec:
    
       replicas: 3 
       image: postgres:latest
    
       database:
          size: 1000Mi
    
       env:
          - name: POSTGRES_PASSWORD
            valueFrom:
               secretKeyRef:
                  name: mypostgres-secret
                  key: superUserPassword
    
          - name: POSTGRES_REPLICATION_PASSWORD
            valueFrom:
               secretKeyRef:
                  name: mypostgres-secret
                  key: replicationUserPassword
    
    
    
    5: results:
    
    k8sadm@microk8s-nas:~$ kubectl get all -l 'app=mypostgres' --show-managed-fields -A -o wide
    
    NAMESPACE   NAME                 READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
    default     pod/mypostgres-1-0   1/1     Running   0          88m   10.1.91.141   microk8s-nas     <none>           <none>
    default     pod/mypostgres-2-0   1/1     Running   0          87m   10.1.69.16    microk8s-node2   <none>           <none>
    default     pod/mypostgres-3-0   1/1     Running   0          86m   10.1.91.143   microk8s-nas     <none>           <none>
    
    NAMESPACE   NAME                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE   SELECTOR
    default     service/mypostgres           ClusterIP   None         <none>        5432/TCP   88m   app=mypostgres,replicationRole=primary
    default     service/mypostgres-replica   ClusterIP   None         <none>        5432/TCP   86m   app=mypostgres,replicationRole=replica
    
    NAMESPACE   NAME                            READY   AGE   CONTAINERS     IMAGES
    default     statefulset.apps/mypostgres-1   1/1     89m   mypostgres-1   postgres:latest
    default     statefulset.apps/mypostgres-2   1/1     88m   mypostgres-2   postgres:latest
    default     statefulset.apps/mypostgres-3   1/1     86m   mypostgres-3   postgres:latest
    
    
     dns tests:
    k8sadm@microk8s-nas:~$ kubectl get svc -A -l "k8s-app=kube-dns" -o wide
    NAMESPACE     NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
    kube-system   kube-dns   ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   4h54m   k8s-app=kube-dns
    
    k8sadm@microk8s-nas:~$ nslookup mypostgres.svc.cluster.local 10.152.183.10
    Server:		10.152.183.10
    Address:	10.152.183.10#53
    
    ** server can't find mypostgres.svc.cluster.local: NXDOMAIN
    
    
    

    regards
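
    One note on the output above: ClusterIP: None is the expected, headless form of these services for StatefulSet-backed pods, and the DNS name needs to include the namespace, which the nslookup in the test omits. With the default namespace the lookup would be:

    # headless service lookup with the namespace included ("default" in this setup)
    nslookup mypostgres.default.svc.cluster.local 10.152.183.10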

  • backup job doesn't start cause:  Not starting job because prior execution is running and concurrency policy is Forbid

    backup job doesn't start cause: Not starting job because prior execution is running and concurrency policy is Forbid

    Hi, I have a simple cluster: 1 master and 2 replicas. I have added the following backup configuration to my cluster's deployment:

    backup:
      schedule: "0 */1 * * *"
      pvcName: my-backup-pvc
      volumeMount: /var/lib/backup

    But it doesn't work well, because if I describe my cronjob I get this message:

    Normal  JobAlreadyActive  112s (x5 over 121m)  cronjob-controller  Not starting job because prior execution is running and concurrency policy is Forbid

    And inside the backup cronjob pod I have the following log:

    [root@worker-sgpd postgis]# kubectl logs -f backup-mypostgres-27851640-tlqcx
    15/12/2022 14:54:37 - Starting DB backup of Kubegres resource mypostgres into file: /var/lib/backup/mypostgres-backup-15_12_2022_14_54_37.gz
    15/12/2022 14:54:37 - Running: pg_dumpall -h mypostgres-replica -U postgres -c | gzip > /var/lib/backup/mypostgres-backup-15_12_2022_14_54_37.gz
    pg_dumpall: error: connection to server at "mypostgres-replica" (192.168.108.126), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
    connection to server at "mypostgres-replica" (192.168.108.100), port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?

    If I use a Postgres client like this: kubectl run postgresql-dev-client --rm --tty -i --restart='Never' --namespace default --image docker.io/bitnami/postgresql:14.1.0-debian-10-r80 --env="PGPASSWORD=admin" -- bash

    and inside the container I can connect to my postgres: psql --host mypostgres -U postgres -d postgres -p 5432

    postgres-# \conninfo
    You are connected to database "postgres" as user "postgres" on host "mypostgres" (address "192.168.108.123") at port "5432".

    I don't know why I can connect to my Postgres from inside a pod, but the connection from the cronjob fails. Another strange thing: if I delete the corresponding pod of my backup cronjob, the first backup works and the second fails due to a connection timeout. Can you help me? Regards, Antonio

  • Backup container not being upgraded when Postgres is being upgraded.

    Backup container not being upgraded when Postgres is being upgraded.

    Hi,

    Just documenting this so that it can be fixed when you guys have a chance.

    As an example, install Postgres 14.1. Backup container uses Postgres 14.1. Then upgrade to 14.2. DB containers are upgraded to 14.2, but backup container is still at Postgres 14.1.

    Thanks.

  • kubegres can't recover if all statefulsets are deleted

    kubegres can't recover if all statefulsets are deleted

    Starting with a healthy cluster with 3 replicas:

    $ kubectl describe kubegres postgres-uaa
    Status:
      Blocking Operation:
        Stateful Set Operation:
        Stateful Set Spec Update Operation:
      Enforced Replicas:            4
      Last Created Instance Index:  5
      Previous Blocking Operation:
        Operation Id:  Replica DB count spec enforcement
        Stateful Set Operation:
          Instance Index:  5
          Name:            postgres-uaa-5
        Stateful Set Spec Update Operation:
        Step Id:                   Replica DB is deploying
        Time Out Epoc In Seconds:  1669789919
    Events:                        <none>
    

    Delete the statefulsets:

    $ kubectl delete sts postgres-uaa-2 postgres-uaa-4 postgres-uaa-5
    

    The following error is seen:

    Events:
      Type    Reason                                   Age                From                 Message
      ----    ------                                   ----               ----                 -------
      Normal  FailoverCannotHappenAsNoReplicaDeployed  25s (x2 over 26s)  Kubegres-controller  A failover is required for a Primary Pod as it is not healthy. However, a failover cannot happen because there is not any Replica deployed.
    

    The error makes sense because no replica is available. However, it's unclear how to recover the cluster. Although the StatefulSets were deleted, the PVCs still exist, and the database is intact.

    Using promotePod is not possible because we cannot promote a pod that is not running.

    As a workaround, I was able to manually create a statefulset out of band, and then promote the pod. But this process was kind of error prone (editing index labels) and unclear. I'm not sure I did it right, but it seemed to work eventually.

    Feature idea: maybe a promotePVC option that can start the statefulset from an existing PVC.

  • Error when starting up postgres, using version 1.16

    Error when starting up postgres, using version 1.16

    I am not able to start up Postgres. I installed version 1.16 and am getting the following error. Any thoughts?

    1.667944382668021e+09 INFO controllers.Kubegres KUBEGRES {"name": "test-postgres", "Status": {"blockingOperation":{"statefulSetOperation":{},"statefulSetSpecUpdateOperation":{}},"previousBlockingOperation":{"statefulSetOperation":{},"statefulSetSpecUpdateOperation":{}}}}
    1.667944383520829e+09 ERROR controllers.Kubegres Unable to load any deployed BackUp CronJob. {"CronJob name": "backup-test-postgres", "error": "no matches for kind "CronJob" in version "batch/v1""}
    reactive-tech.io/kubegres/controllers/ctx/log.(*LogWrapper).ErrorEvent
        /workspace/controllers/ctx/log/LogWrapper.go:62
    reactive-tech.io/kubegres/controllers/states.(*BackUpStates).getDeployedCronJob
        /workspace/controllers/states/BackUpStates.go:91
    reactive-tech.io/kubegres/controllers/states.(*BackUpStates).loadStates
        /workspace/controllers/states/BackUpStates.go:49
    reactive-tech.io/kubegres/controllers/states.loadBackUpStates
        /workspace/controllers/states/BackUpStates.go:43
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadBackUpStates
        /workspace/controllers/states/ResourcesStates.go:95
    reactive-tech.io/kubegres/controllers/states.(*ResourcesStates).loadStates
        /workspace/controllers/states/ResourcesStates.go:66
    reactive-tech.io/kubegres/controllers/states.LoadResourcesStates
        /workspace/controllers/states/ResourcesStates.go:40
    reactive-tech.io/kubegres/controllers/ctx/resources.CreateResourcesContext
        /workspace/controllers/ctx/resources/ResourcesContext.go:110
    reactive-tech.io/kubegres/controllers.(*KubegresReconciler).Reconcile
        /workspace/controllers/kubegres_controller.go:76
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
    1.6679443835210457e+09 ERROR Reconciler error {"controller": "kubegres", "controllerGroup": "kubegres.reactive-tech.io", "controllerKind": "Kubegres", "kubegres": {"name":"test-postgres","namespace":"default"}, "namespace": "default", "name": "test-postgres", "reconcileID": "cb49cdb7-2464-4a48-8365-822d9e0af891", "error": "no matches for kind "CronJob" in version "batch/v1""}
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
    1.6679443835211165e+09 DEBUG events Warning {"object": {"kind":"Kubegres","namespace":"default","name":"test-postgres","uid":"b380917d-6c13-4471-995e-8069960adc3b","apiVersion":"kubegres.reactive-tech.io/v1","resourceVersion":"4535076"}, "reason": "BackUpCronJobLoadingErr", "message": "Unable to load any deployed BackUp CronJob. 'CronJob name': backup-test-postgres - no matches for kind "CronJob" in version "batch/v1""}

    =====================================
    apiVersion: kubegres.reactive-tech.io/v1
    kind: Kubegres
    metadata:
      name: test-postgres
      namespace: default
    spec:
      replicas: 2
      image: postgres:14.1
      database:
        size: 8Gi
        storageClassName: local-storage
      env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: superUserPassword
        - name: POSTGRES_REPLICATION_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: replicationUserPassword

    ================ k8s version: Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.15", GitCommit:"8f1e5bf0b9729a899b8df86249b56e2c74aebc55", GitTreeState:"clean", BuildDate:"2022-01-19T17:23:01Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}

    ================ storage classes:
    NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  20d
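
    The "no matches for kind CronJob in version batch/v1" part of the log is consistent with the cluster version above: CronJob only graduated to batch/v1 in Kubernetes 1.21, and this server reports v1.20.15, which serves CronJobs under batch/v1beta1. A quick way to confirm what a given cluster offers:

    # list the kinds served by the batch API group and the versions they are served at
    kubectl api-resources --api-group=batch
    # on a 1.20 cluster, cronjobs appear under batch/v1beta1 rather than batch/v1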

  • References for PITR / Continuous Archiving implementation?

    References for PITR / Continuous Archiving implementation?

    Hi.

    I've had some issues with Zalando, and now I'm looking for a simpler operator. Kubegres seems to fit the bill, and my experience deploying a cluster was great. I have a custom image set up to run pg_dump and pg_restore scripts, CronJobs for the dump and an on-demand Job for the restoration process. This is really simple and works well, but with restrictions: it won't work for larger databases, it is slow, and the RPO is very high.

    I've been looking at strategies to implement PITR and continuous backup. Zalando had this baked in using pg_basebackup and WAL-G (I think). Outside the k8s world, I've read a lot about pgBackRest, Barman, WAL-G and a couple of other solutions. But those don't look all that simple to set up when the DB is running in containers (they might be, but I don't find much information on it except one or two repos). I know Timescale runs pgBackRest as a sidecar, Zalando runs a custom image with WAL-G/E + pg_basebackup, Percona also uses pgBackRest (not sure about the architecture), Crunchy PGO also uses pgBackRest, and Stackgres is, I think, a custom solution, but I'm not sure.

    I tried running a separate container for pgBackRest, so I changed the VolumeClaim access mode to ReadWriteMany (so that pgBackRest could connect directly to the data directory), but I had quite a few issues throughout the process and couldn't make it work (yet? I will keep trying).

    I understand Kubegres is not particularly going in this direction at the moment, but I wonder if this could be an option for the future. There has been a brief discussion about it here, but it stopped at pg_dump. Stackgres has an interesting approach with several CRDs. Although this looks complex at first, having multiple CRDs also allows for more flexibility. Zalando's approach tries to put everything into the cluster definition and/or configuration file, so things are not always trivial to grasp. (I'll follow up in a bit with potential implementations.)

    I imagine that this should be a common requirement for folks deploying PSQL to k8s, so even if this is not a plan for Kubegres in the future, the pain still exists. I was wondering if there were any examples, references or other material to implement this solution with Kubegres, or any experience people could share.

    Thanks a lot!
