Nebula Operator

Nebula Operator manages NebulaGraph clusters on Kubernetes and automates tasks related to operating a NebulaGraph cluster. It evolved from NebulaGraph Cloud Service and makes NebulaGraph a truly cloud-native database.

Quick Start

Install nebula operator

See install/uninstall nebula operator.
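
For a rough sketch of what that installation typically looks like with Helm (the chart repository URL is an assumption; the release and namespace names match the ones used in the issue reports further down this page):

$ helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts
$ helm repo update
$ kubectl create namespace nebula-operator-system
$ helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system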

Create and destroy a nebula cluster

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

A non-HA-mode nebula cluster will be created.
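
The sample manifest defines a NebulaCluster custom resource roughly like the sketch below; the storaged block mirrors the sample reproduced later in this guide, while the graphd and metad blocks are illustrative and omit the resources and storage settings the actual file contains:

apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: nebula
spec:
  graphd:
    replicas: 1
    image: vesoft/nebula-graphd
    version: v2.0.0
  metad:
    replicas: 1
    image: vesoft/nebula-metad
    version: v2.0.0
  storaged:
    replicas: 3
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

Check that all pods are up: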

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          1m
nebula-metad-0      1/1     Running   0          1m
nebula-storaged-0   1/1     Running   0          1m
nebula-storaged-1   1/1     Running   0          1m
nebula-storaged-2   1/1     Running   0          1m

See client service for how to access nebula clusters created by the operator.
If you are working with kubeadm locally, create a NodePort service and test that nebula is responding:

$ kubectl create -f config/samples/graphd-nodeport-service.yaml
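
The sample is essentially a NodePort Service selecting the graphd pods; a minimal sketch, assuming graphd listens on its default port 9669 and that the selector labels and the node port (32236, matching the console example below) are the ones used by the sample file:

apiVersion: v1
kind: Service
metadata:
  name: nebula-graphd-svc-nodeport
spec:
  type: NodePort
  selector:
    app.kubernetes.io/instance: nebula
    app.kubernetes.io/component: graphd
  ports:
    - name: thrift
      port: 9669
      targetPort: 9669
      nodePort: 32236

With the service exposed, connect from nebula-console using any node IP and the node port: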

/ # nebula-console -u user -p password --address=192.168.8.26 --port=32236
2021/04/15 16:50:23 [INFO] connection pool is initialized successfully

Welcome to Nebula Graph!
(user@nebula) [(none)]> 
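
Once connected, the standard nGQL statement SHOW HOSTS lists the storaged hosts registered with metad and is a quick sanity check that the cluster is functional:

(user@nebula) [(none)]> SHOW HOSTS;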

Destroy the nebula cluster:

$ kubectl delete -f config/samples/apps_v1alpha1_nebulacluster.yaml

Resize a nebula cluster

Create a nebula cluster:

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

In config/samples/apps_v1alpha1_nebulacluster.yaml the initial storaged replica count is 3.
Modify the file and change replicas from 3 to 5.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 5
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks

Apply the replicas change to the cluster CR:

$ kubectl apply -f config/samples/apps_v1alpha1_nebulacluster.yaml

The storaged cluster will scale to 5 members (5 pods):

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          2m
nebula-metad-0      1/1     Running   0          2m
nebula-storaged-0   1/1     Running   0          2m
nebula-storaged-1   1/1     Running   0          2m
nebula-storaged-2   1/1     Running   0          2m
nebula-storaged-3   1/1     Running   0          5m
nebula-storaged-4   1/1     Running   0          5m

Similarly we can decrease the size of the cluster from 5 back to 3 by changing the replicas field again and reapplying the change.

  storaged:
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    replicas: 3
    image: vesoft/nebula-storaged
    version: v2.0.0
    storageClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
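
Editing the sample file and re-applying it is the documented flow; as a convenience, the same field can also be patched in place (a sketch, assuming the NebulaCluster created by the sample is named nebula and lives in the current namespace):

$ kubectl patch nebulacluster nebula --type merge -p '{"spec": {"storaged": {"replicas": 3}}}'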

We should see the storaged cluster eventually scale back down to 3 pods:

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          10m
nebula-metad-0      1/1     Running   0          10m
nebula-storaged-0   1/1     Running   0          10m
nebula-storaged-1   1/1     Running   0          10m
nebula-storaged-2   1/1     Running   0          10m

Failover

If a minority of nebula component replicas crash, the nebula operator will automatically recover from the failure. Let's walk through this in the following steps.

Create a nebula cluster:

$ kubectl create -f config/samples/apps_v1alpha1_nebulacluster.yaml

Wait until pods are up. Simulate a member failure by deleting a storaged pod:

$ kubectl delete pod nebula-storaged-2 --now
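
You can watch the replacement pod come up as it happens (plain kubectl, no operator-specific tooling):

$ kubectl get pods -l app.kubernetes.io/instance=nebula -w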

The nebula operator will recover from the failure by creating a new pod, nebula-storaged-2:

$ kubectl get pods -l app.kubernetes.io/instance=nebula
NAME                READY   STATUS    RESTARTS   AGE
nebula-graphd-0     1/1     Running   0          15m
nebula-metad-0      1/1     Running   0          15m
nebula-storaged-0   1/1     Running   0          15m
nebula-storaged-1   1/1     Running   0          15m
nebula-storaged-2   1/1     Running   0          19s

FAQ

Please refer to FAQ.md

Community

Feel free to reach out if you have any questions. The maintainers of this project are reachable via:

Contributing

Contributions are welcome and greatly appreciated.

  • Start with some issues
  • Submit Pull Requests to us. Please refer to how-to-contribute.

Acknowledgements

nebula-operator refers to tidb-operator, which is a very good product. We share a similar architecture; although the product pattern and application scenarios differ, we would like to express our gratitude here.

License

NebulaGraph is under the Apache 2.0 license. See the LICENSE file for details.

Comments
  • Error while upgrading to 1.3

    When I upgrade to the latest version using helm upgrade I get the error:

    Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(NebulaCluster.spec.storaged): unknown field "dataVolumeClaims" in io.nebula-graph.apps.v1alpha1.NebulaCluster.spec.storaged

  • Failed to deploy k8s in huawei cloud

    General Question

    Following the documentation, I deployed with nebula-operator on Huawei Cloud, Kubernetes version 1.21. nebula-metad cannot start (screenshot). The PV and PVC statuses look normal (screenshot). The pod's describe output shows the process is not ready (screenshot).

    Looking at the log files inside the pod: instance 0 fails to connect to instance 2 (screenshot). Instance 2 did not start because of a persistence error (screenshot). Inside instance 2, the directory mount path looks fine and the read/write permissions also appear normal (screenshot).

    Restarting the pod does not solve the problem. How can I further narrow down the problem and fix it?

  • Install Operator failure

    As you can see in the picture below, I have attempted to install nebula-operator with helm many times, but I always encounter this failure. Why does it happen? (screenshot)

  • Deploy nebula operator with ArgoCD

    Hi,

    we want to deploy nebula operator with the CD tool ArgoCD, but we have a problem specifying the Kubernetes namespace: we cannot define the namespace in the values.yaml file. Do you have any suggestions on how to solve this problem?

    Thank you

  • GraphService.cpp:228] Unknown auth type:

    Cluster environment: nebula-operator 1.0.0, nebula-graphd 3.0.2.

    [root@k8s-master1 ~]# kubectl get pod -n kube-nebula
    NAME                                                            READY   STATUS    RESTARTS   AGE
    nebula-graphd-0                                                 1/1     Running   0          34m
    nebula-metad-0                                                  1/1     Running   0          29m
    nebula-operator-controller-manager-deployment-79fb98f94-pbfj7   2/2     Running   0          76m
    nebula-operator-controller-manager-deployment-79fb98f94-vk98s   2/2     Running   2          5h7m
    nebula-operator-scheduler-deployment-7898449b87-469zx           2/2     Running   1          5h7m
    nebula-operator-scheduler-deployment-7898449b87-blpxw           2/2     Running   2          5h7m
    nebula-storaged-0                                               1/1     Running   1          46m
    nebula-storaged-1                                               1/1     Running   1          46m
    nebula-storaged-2                                               1/1     Running   1          46m

    The client connection log reports an error (screenshot 1649927696).

    The nebula-graphd log reports errors (screenshots 1649927740, 1649927790).

  • fix typo and remove redundant if cases

    A little work:

    1. run go mod tidy to remove unused dependencies
    2. run gofmt -s to add and remove spaces
    3. remove unnecessary if cases
    4. add comments for exported functions
    5. fix typo in UpdateNebualCluster to make FakeClusterControl implement ControlInterface interface
  • Use clearer phrasing to state the assigned value `gp2` for the k8s property `storageClassName`

    Introduction

    When I tried to deploy nebula-cluster with helm, the comment for storageClassName states:

    export STORAGE_CLASS_NAME=gp2 # the storage class for the nebula cluster

    I think such phrasing misled me. While it may be familiar to operators and developers who are using or have used AWS EBS, whether on K8s or not, I originally thought gp2 might be some special storage value that NebulaGraph provides for the K8s container runtime under the same property name during cluster deployment (since the phrasing says it is for the nebula cluster), until I found out it is just a K8s StorageClass value.

    Contents

    We could state that the value of storageClassName is applied to K8s itself, and that operators should pick whichever valid value suits their own needs, instead of just pointing out that it is for the nebula cluster.

    Alternatively, I suggest setting the default value of storageClassName to local in values.yaml, and stating in the comments that the preference is up to operators and developers.
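
    For instance, the comment could be reworded along these lines (a hypothetical rewording, not the current chart contents):

      # Pick an existing Kubernetes StorageClass from `kubectl get storageclass`;
      # "gp2" is only the AWS EBS default, not a Nebula-specific value.
      export STORAGE_CLASS_NAME=gp2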

    Related work

    None

  • Clarify add-ons installation instructions

    Summary

    Here, it says users can install add-ons, but it's unclear what should happen after the installation. Do I need to configure anything for nebula to work with these add-ons?

    Please add more detailed documentation on how to configure Nebula to work with these add-ons, how these add-ons work with Nebula, and what each of them is for.

  • Cannot download latest nebula-cluster chart

    helm upgrade nebula nebula-operator/nebula-cluster -f values.yaml --version=1.1.0
    Error: Failed to render chart: exit status 1: Error: failed to download "nebula-operator/nebula-cluster" at version "1.1.0"
    

    nebula-operator/nebula-operator @ 1.1.0 seems to work fine though

  • add more configurable options for pod template

    During tests by MEG friends, it was found that there could be a conflict between NebulaCluster pods and istio-proxy sidecar pods; in order to disable the sidecar, the annotation fields below are needed.

    (screenshot: IMG_0546)

    For now, however, there is no configuration interface to do so.

    Is it feasible to add configurable options for this kind of k/v pairs via values.yaml?
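
    For reference, the key/value pair in question is typically the well-known Istio annotation on the pod template; something like the sketch below, where podAnnotations is a hypothetical field name, since the CRD does not expose such an option yet (which is what this issue asks for):

      graphd:
        podAnnotations:                    # hypothetical field, not in the current CRD
          sidecar.istio.io/inject: "false" # standard Istio annotation to skip sidecar injection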

  • Add sidecar for nebula logging

    Currently, we cannot review nebula logs via kubectl logs or other logging systems. Add a sidecar for nebula-graphd, nebula-metad, and nebula-storaged so that we can capture the logs via stdout.

    e.g. (screenshot)
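
    As a sketch of the idea, a generic Kubernetes sidecar that tails log files from a shared volume to stdout would do; the container name, image, log path, and volume name below are assumptions, not the operator's actual layout:

        - name: graphd-log-tail            # sidecar added to the graphd pod template
          image: busybox
          args: ["sh", "-c", "tail -n +1 -F /usr/local/nebula/logs/*.INFO"]
          volumeMounts:
            - name: log                    # assumed name of the volume holding graphd logs
              mountPath: /usr/local/nebula/logs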

  • Cluster creation fails due to webhook error.

    Describe the bug (required): I am unable to install nebula-cluster 1.3.0 on GKE; it seems to be failing on some webhook error at install time.

    The kruise, cert-manager, and operator pods are actually working; it is at the time of creating the nebula-cluster that we get the failure.

    Your Environments (required): already tested on two GKE versions.

    • GKE: 1.22.12-gke.2300
    • GKE: 1.24.7-gke.900

    How To Reproduce(required)

    prerequisites:

    kruise 1.1.0 cert-manager 1.10.0 nebula-operator 1.3.0

    helm install kruise openkruise/kruise --version 1.1.0
    NAME: kruise
    LAST DEPLOYED: Tue Nov 22 18:09:47 2022
    NAMESPACE: default
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    
    
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.10.0/cert-manager.yaml
    namespace/cert-manager created
    customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
    customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
    serviceaccount/cert-manager-cainjector created
    serviceaccount/cert-manager created
    serviceaccount/cert-manager-webhook created
    configmap/cert-manager-webhook created
    clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
    clusterrole.rbac.authorization.k8s.io/cert-manager-view created
    clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
    clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
    clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
    clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
    role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
    role.rbac.authorization.k8s.io/cert-manager:leaderelection created
    role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
    rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
    rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
    rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
    service/cert-manager created
    service/cert-manager-webhook created
    deployment.apps/cert-manager-cainjector created
    deployment.apps/cert-manager created
    deployment.apps/cert-manager-webhook created
    mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
    validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
    
    
    kubectl create ns nebula-operator-system
    
    namespace/nebula-operator-system created
    helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system -f values/dev.operator.values.yaml --version 1.3.0
    NAME: nebula-operator
    LAST DEPLOYED: Tue Nov 22 18:12:17 2022
    NAMESPACE: nebula-operator-system
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    Nebula Operator installed!
    

    Error when installing the actual nebula-cluster v1.3.0:

    helm install starmatch-nebula-dev nebula-operator/nebula-cluster -f values/dev.values.yaml --version 1.3.0
    Error: Internal error occurred: failed calling webhook "nebulaclustervalidating.nebula-graph.io": failed to call webhook: Post "https://nebula-operator-webhook-service.nebula-operator-system.svc:443/apis/admission.nebula-graph.io/v1alpha1/nebulaclustervalidating?timeout=10s": no endpoints available for service "nebula-operator-webhook-service"
    exit status 1
    
  • Conf change precisely reconcile

    Some of the configurations can be changed live with curl via the HTTP interface; some cannot.

    If we scope them properly, ideally the operator could choose how to reconcile the corresponding configurations:

    • for those that need a process restart, do it
    • for those that don't need a restart, update them via the HTTP interface and update the conf file on the fly.
  • compatibility: v0.8.0 --> v0.9

    There are some pain points that could be addressed if possible:

    • [ ] a manual CRD update is required; this could be a pitfall even in the case of a fresh deployment of the v0.9.0 operator
    • [ ] the upgrade path from v0.8.0 (v2.5.x) to v0.9 (v2.5.x) seems not to be possible due to the CRD change; it will force end-users running v0.8.0 in production into a redeployment

    Thanks :) P.S. Great job on the rolling upgrade support in v0.9.0!

    • https://discuss.nebula-graph.com.cn/t/topic/6656
    • https://discuss.nebula-graph.com.cn/t/topic/6702
    • https://discuss.nebula-graph.com.cn/t/topic/6573
  • ngctl: small improvements

    First, it's awesome to have ngctl, really love it, thank you so much!!

    • [x] On the initial connection, the first line of +--------------------+ would read better with a newline before it:
    $ngctl console
    (root@nebula) [(none)]> show spaces
    (root@nebula) [(none)]> +--------------------+
    | Name               |
    +--------------------+
    | "basketballplayer" |
    | "shareholding"     |
    +--------------------+
    
    • [ ] Would it be possible to support the up-arrow key for command history, or even Ctrl-R?