Nebula Operator

Nebula Operator manages NebulaGraph clusters on Kubernetes and automates tasks related to operating a NebulaGraph cluster. It evolved from NebulaGraph Cloud Service and makes NebulaGraph a truly cloud-native database.

Quick Start

Install nebula operator

See install/uninstall nebula operator.
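
For a rough sketch of what that installation typically looks like with Helm (the chart repository URL is an assumption; the release and namespace names match the ones used in the issue reports further down this page):

$ helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts
$ helm repo update
$ kubectl create namespace nebula-operator-system
$ helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system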

Create and destroy a nebula cluster

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

A non-HA-mode nebula cluster will be created.
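
The sample manifest defines a NebulaCluster custom resource roughly like the sketch below; the storaged block mirrors the sample reproduced later in this guide, while the graphd and metad blocks are illustrative and omit the resources and storage settings the actual file contains:

apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: nebula
spec:
  graphd:
    replicas: 1
    image: vesoft/nebula-graphd
    version: v2.0.0
  metad:
    replicas: 1
    image: vesoft/nebula-metad
    version: v2.0.0
  storaged:
    replicas: 3
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

Check that all pods are up: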

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          1m
nebula-metad-0      1/1     Running   0          1m
nebula-storaged-0   1/1     Running   0          1m
nebula-storaged-1   1/1     Running   0          1m
nebula-storaged-2   1/1     Running   0          1m

See client service for how to access nebula clusters created by the operator.
If you are working with kubeadm locally, create a NodePort service and test that nebula is responding:

$ kubectl create -f config/samples/graphd-nodeport-service.yaml
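
The sample is essentially a NodePort Service selecting the graphd pods; a minimal sketch, assuming graphd listens on its default port 9669 and that the selector labels and the node port (32236, matching the console example below) are the ones used by the sample file:

apiVersion: v1
kind: Service
metadata:
  name: nebula-graphd-svc-nodeport
spec:
  type: NodePort
  selector:
    app.kubernetes.io/instance: nebula
    app.kubernetes.io/component: graphd
  ports:
    - name: thrift
      port: 9669
      targetPort: 9669
      nodePort: 32236

With the service exposed, connect from nebula-console using any node IP and the node port: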

/ # nebula-console -u user -p password --address=192.168.8.26 --port=32236
2021/04/15 16:50:23 [INFO] connection pool is initialized successfully

Welcome to Nebula Graph!
(user@nebula) [(none)]> 
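
Once connected, the standard nGQL statement SHOW HOSTS lists the storaged hosts registered with metad and is a quick sanity check that the cluster is functional:

(user@nebula) [(none)]> SHOW HOSTS;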

Destroy the nebula cluster:

$ kubectl delete -f config/samples/apps_v1alpha1_nebulacluster.yaml

Resize a nebula cluster

Create a nebula cluster:

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

In config/samples/apps_v1alpha1_nebulacluster.yaml the initial storaged replica count is 3.
Modify the file and change replicas from 3 to 5.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 5
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

Apply the replicas change to the cluster CR:

$ kubectl apply -f config/samples/apps_v1alpha1_nebulacluster.yaml

The storaged cluster will scale to 5 members (5 pods):

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          2m
nebula-metad-0      1/1     Running   0          2m
nebula-storaged-0   1/1     Running   0          2m
nebula-storaged-1   1/1     Running   0          2m
nebula-storaged-2   1/1     Running   0          2m
nebula-storaged-3   1/1     Running   0          5m
nebula-storaged-4   1/1     Running   0          5m

Similarly we can decrease the size of the cluster from 5 back to 3 by changing the replicas field again and reapplying the change.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 3
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
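
Editing the sample file and re-applying it is the documented flow; as a convenience, the same field can also be patched in place (a sketch, assuming the NebulaCluster created by the sample is named nebula and lives in the current namespace):

$ kubectl patch nebulacluster nebula --type merge -p '{"spec": {"storaged": {"replicas": 3}}}'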

We should see the storaged cluster eventually scale back down to 3 pods:

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          10m
nebula-metad-0      1/1     Running   0          10m
nebula-storaged-0   1/1     Running   0          10m
nebula-storaged-1   1/1     Running   0          10m
nebula-storaged-2   1/1     Running   0          10m

Failover

If a minority of nebula component replicas crash, the nebula operator will automatically recover from the failure. Let's walk through this in the following steps.

Create a nebula cluster:

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

Wait until pods are up. Simulate a member failure by deleting a storaged pod:

$ kubectl delete pod nebula-storaged-2 --now
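
You can watch the replacement pod come up as it happens (plain kubectl, no operator-specific tooling):

$ kubectl get pods -l app.kubernetes.io/instance=nebula -w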

The nebula operator will recover from the failure by creating a new pod, nebula-storaged-2:

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          15m
nebula-metad-0      1/1     Running   0          15m
nebula-storaged-0   1/1     Running   0          15m
nebula-storaged-1   1/1     Running   0          15m
nebula-storaged-2   1/1     Running   0          19s

FAQ

Please refer to FAQ.md

Community

Feel free to reach out if you have any questions. The maintainers of this project are reachable via:

Contributing

Contributions are welcome and greatly appreciated.

  • Start with some issues
  • Submit Pull Requests to us. Please refer to how-to-contribute.

Acknowledgements

nebula-operator refers to tidb-operator, which is a very good product. We share a similar architecture; although the product pattern and application scenarios differ, we would like to express our gratitude here.

License

NebulaGraph is under the Apache 2.0 license. See the LICENSE file for details.

Comments
  • Error while upgrading to 1.3

    When I upgrade to the latest version using helm upgrade I get the error:

    Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(NebulaCluster.spec.storaged): unknown field "dataVolumeClaims" in io.nebula-graph.apps.v1alpha1.NebulaCluster.spec.storaged

  • Failed to deploy k8s in huawei cloud

    General Question

    Following the documentation, I deployed with nebula-operator on Huawei Cloud, Kubernetes version 1.21. nebula-metad cannot start (screenshot). The PV and PVC statuses look normal (screenshot). The pod's describe output shows the process is not ready (screenshot).

    Looking at the log files inside the pod: instance 0 fails to connect to instance 2 (screenshot). Instance 2 did not start because of a persistence error (screenshot). Inside instance 2, the directory mount path looks fine and the read/write permissions also appear normal (screenshot).

    Restarting the pod does not solve the problem. How can I further narrow down the problem and fix it?

  • Install Operator failure

    As you can see in the picture below, I have attempted to install nebula-operator with helm many times, but I always encounter this failure. Why does it happen? (screenshot)

  • Deploy nebula operator with ArgoCD

    Hi,

    we want to deploy nebula operator with the CD tool ArgoCD, but we have a problem specifying the Kubernetes namespace: we cannot define the namespace in the values.yaml file. Do you have any suggestions on how to solve this problem?

    Thank you

  • GraphService.cpp:228] Unknown auth type:

    Cluster environment: nebula-operator 1.0.0, nebula-graphd 3.0.2.

    [root@k8s-master1 ~]# kubectl get pod -n kube-nebula
    NAME                                                            READY   STATUS    RESTARTS   AGE
    nebula-graphd-0                                                 1/1     Running   0          34m
    nebula-metad-0                                                  1/1     Running   0          29m
    nebula-operator-controller-manager-deployment-79fb98f94-pbfj7   2/2     Running   0          76m
    nebula-operator-controller-manager-deployment-79fb98f94-vk98s   2/2     Running   2          5h7m
    nebula-operator-scheduler-deployment-7898449b87-469zx           2/2     Running   1          5h7m
    nebula-operator-scheduler-deployment-7898449b87-blpxw           2/2     Running   2          5h7m
    nebula-storaged-0                                               1/1     Running   1          46m
    nebula-storaged-1                                               1/1     Running   1          46m
    nebula-storaged-2                                               1/1     Running   1          46m

    The client connection log reports an error (screenshot 1649927696).

    The nebula-graphd log reports errors (screenshots 1649927740, 1649927790).

  • fix typo and remove redundant if cases

    A little work:

    1. run go mod tidy to remove unused dependencies
    2. run gofmt -s to add and remove spaces
    3. remove unnecessary if cases
    4. add comments for exported functions
    5. fix typo in UpdateNebualCluster to make FakeClusterControl implement ControlInterface interface
  • Use clearer phrasing to state the assigned value `gp2` for the k8s property `storageClassName`

    Introduction

    When I tried to deploy nebula-cluster with helm, the comment for storageClassName states:

    export STORAGE_CLASS_NAME=gp2 # the storage class for the nebula cluster

    I think such phrasing misled me. While it may be familiar to operators and developers who are using or have used AWS EBS, whether on K8s or not, I originally thought gp2 might be some special storage value that NebulaGraph provides for the K8s container runtime under the same property name during cluster deployment (since the phrasing says it is for the nebula cluster), until I found out it is just a K8s StorageClass value.

    Contents

    We could state that the value of storageClassName is applied to K8s itself, and that operators should pick whichever valid value suits their own needs, instead of just pointing out that it is for the nebula cluster.

    Alternatively, I suggest setting the default value of storageClassName to local in values.yaml, and stating in the comments that the preference is up to operators and developers.
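
    For instance, the comment could be reworded along these lines (a hypothetical rewording, not the current chart contents):

      # Pick an existing Kubernetes StorageClass from `kubectl get storageclass`;
      # "gp2" is only the AWS EBS default, not a Nebula-specific value.
      export STORAGE_CLASS_NAME=gp2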

    Related work

    None

  • Clarify add-ons installation instructions

    Summary

    Here, it says users can install add-ons, but it's unclear what should happen after the installation. Do I need to configure anything for nebula to work with these add-ons?

    Please add more detailed documentation on how to configure Nebula to work with these add-ons, how these add-ons work with Nebula, and what each of them is for.

  • Cannot download latest nebula-cluster chart

    helm upgrade nebula nebula-operator/nebula-cluster -f values.yaml --version=1.1.0
    Error: Failed to render chart: exit status 1: Error: failed to download "nebula-operator/nebula-cluster" at version "1.1.0"
    

    nebula-operator/nebula-operator @ 1.1.0 seems to work fine though

  • add more configurable options for pod template

    During tests by MEG friends, it was found that there could be a conflict between NebulaCluster pods and istio-proxy sidecar pods; in order to disable the sidecar, the annotation fields below are needed.

    (screenshot: IMG_0546)

    For now, however, there is no configuration interface to do so.

    Is it feasible to add configurable options for this kind of k/v pairs via values.yaml?
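
    For reference, the key/value pair in question is typically the well-known Istio annotation on the pod template; something like the sketch below, where podAnnotations is a hypothetical field name, since the CRD does not expose such an option yet (which is what this issue asks for):

      graphd:
        podAnnotations:                    # hypothetical field, not in the current CRD
          sidecar.istio.io/inject: "false" # standard Istio annotation to skip sidecar injection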

  • Add sidecar for nebula logging

    Currently, we cannot review nebula logs via kubectl logs or other logging systems. Add a sidecar for nebula-graphd, nebula-metad, and nebula-storaged so that we can capture the logs via stdout.

    e.g. (screenshot)
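
    As a sketch of the idea, a generic Kubernetes sidecar that tails log files from a shared volume to stdout would do; the container name, image, log path, and volume name below are assumptions, not the operator's actual layout:

        - name: graphd-log-tail            # sidecar added to the graphd pod template
          image: busybox
          args: ["sh", "-c", "tail -n +1 -F /usr/local/nebula/logs/*.INFO"]
          volumeMounts:
            - name: log                    # assumed name of the volume holding graphd logs
              mountPath: /usr/local/nebula/logs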

  • Cluster creation fails due to webhook error.

    Describe the bug (required): I am unable to install nebula-cluster 1.3.0 on GKE; it seems to be failing on some webhook error at install time.

    The kruise, cert-manager, and operator pods are actually working; it is at the time of creating the nebula-cluster that we get the failure.

    Your Environments (required): already tested on two GKE versions.

    • GKE: 1.22.12-gke.2300
    • GKE: 1.24.7-gke.900

    How To Reproduce(required)

    prerequisites:

    kruise 1.1.0 cert-manager 1.10.0 nebula-operator 1.3.0

    helm install kruise openkruise/kruise --version 1.1.0
    NAME: kruise
    LAST DEPLOYED: Tue Nov 22 18:09:47 2022
    NAMESPACE: default
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    
    
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.10.0/cert-manager.yaml
    namespace/cert-manager created
    customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
    serviceaccount/cert-manager-cainjector created
    serviceaccount/cert-manager created
    serviceaccount/cert-manager-webhook created
    configmap/cert-manager-webhook created
    clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
    clusterrole.rbac.authorization.k8s.io/cert-manager-view created
    clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
    clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
    role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
    role.rbac.authorization.k8s.io/cert-manager:leaderelection created
    role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
    rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
    rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
    rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
    service/cert-manager created
    service/cert-manager-webhook created
    deployment.apps/cert-manager-cainjector created
    deployment.apps/cert-manager created
    deployment.apps/cert-manager-webhook created
    mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
    validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
    
    
    kubectl create ns nebula-operator-system
    
    namespace/nebula-operator-system created
    helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system -f values/dev.operator.values.yaml --version 1.3.0
    NAME: nebula-operator
    LAST DEPLOYED: Tue Nov 22 18:12:17 2022
    NAMESPACE: nebula-operator-system
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    Nebula Operator installed!
    

    Error when installing the actual nebula-cluster v1.3.0:

    helm install starmatch-nebula-dev nebula-operator/nebula-cluster -f values/dev.values.yaml --version 1.3.0
    Error: Internal error occurred: failed calling webhook "nebulaclustervalidating.nebula-graph.io": failed to call webhook: Post "https://nebula-operator-webhook-service.nebula-operator-system.svc:443/apis/admission.nebula-graph.io/v1alpha1/nebulaclustervalidating?timeout=10s": no endpoints available for service "nebula-operator-webhook-service"
    exit status 1
    
  • Conf change precisely reconcile

    Some of the configurations can be changed live with curl via the HTTP interface; some cannot.

    If we scope them properly, ideally the operator could choose how to reconcile the corresponding configurations:

    • for those that need a process restart, do it
    • for those that don't need a restart, update them via the HTTP interface and update the conf file on the fly.
  • compatibility: v0.8.0 --> v0.9

    There are some pain points that could be addressed if possible:

    • [ ] a manual CRD update is required; this could be a pitfall even in the case of a fresh deployment of the v0.9.0 operator
    • [ ] the upgrade path from v0.8.0 (v2.5.x) to v0.9 (v2.5.x) seems not to be possible due to the CRD change; it will force end-users running v0.8.0 in production into a redeployment

    Thanks :) P.S. Great job on the rolling upgrade support in v0.9.0!

    • https://discuss.nebula-graph.com.cn/t/topic/6656
    • https://discuss.nebula-graph.com.cn/t/topic/6702
    • https://discuss.nebula-graph.com.cn/t/topic/6573
  • ngctl: small improvements

    First, it's awesome to have ngctl, really love it, thank you so much!!

    • [x] On the initial connection, the first line of +--------------------+ would read better with a newline before it:
    $ngctl console
    (root@nebula) [(none)]> show spaces
    (root@nebula) [(none)]> +--------------------+
    | Name               |
    +--------------------+
    | "basketballplayer" |
    | "shareholding"     |
    +--------------------+
    
    • [ ] Would it be possible to support the up-arrow key for command history, or even Ctrl-R?