Crane-scheduler is a Kubernetes scheduler that schedules pods based on actual node load.

Crane-scheduler

Overview

Crane-scheduler is a collection of scheduler plugins based on the Kubernetes scheduler framework, including the Dynamic plugin, a load-aware plugin that filters and scores nodes by their actual utilization.

Get Started

1. Install Prometheus

Make sure your Kubernetes cluster has Prometheus installed. If not, please refer to Install Prometheus.

2. Configure Prometheus Rules

  1. Configure Prometheus recording rules to produce the expected aggregated data:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-record
spec:
  groups:
  - name: cpu_mem_usage_active
    interval: 30s
    rules:
    - record: cpu_usage_active
      expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
    - record: mem_usage_active
      expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
  - name: cpu-usage-5m
    interval: 5m
    rules:
    - record: cpu_usage_max_avg_1h
      expr: max_over_time(cpu_usage_avg_5m[1h])
    - record: cpu_usage_max_avg_1d
      expr: max_over_time(cpu_usage_avg_5m[1d])
  - name: cpu-usage-1m
    interval: 1m
    rules:
    - record: cpu_usage_avg_5m
      expr: avg_over_time(cpu_usage_active[5m])
  - name: mem-usage-5m
    interval: 5m
    rules:
    - record: mem_usage_max_avg_1h
      expr: max_over_time(mem_usage_avg_5m[1h])
    - record: mem_usage_max_avg_1d
      expr: max_over_time(mem_usage_avg_5m[1d])
  - name: mem-usage-1m
    interval: 1m
    rules:
    - record: mem_usage_avg_5m
      expr: avg_over_time(mem_usage_active[5m])

⚠️ Troubleshooting: The sampling interval of Prometheus must be less than 30 seconds, otherwise the above rules (such as cpu_usage_active) may not take effect.
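To make the recording rules concrete, here is a small Python sketch (illustrative only, not part of crane) of what the mem_usage_active rule computes:

```python
# What the mem_usage_active recording rule expresses:
#   100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
# i.e. the percentage of node memory currently in use.

def mem_usage_active(mem_available_bytes: float, mem_total_bytes: float) -> float:
    """Mirror of the PromQL expression above, as plain arithmetic."""
    return 100 * (1 - mem_available_bytes / mem_total_bytes)

# Example: a node with 16 GiB total and 4 GiB available is 75% used.
print(mem_usage_active(4 * 2**30, 16 * 2**30))  # 75.0
```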

  2. Update the Prometheus service discovery configuration to ensure that node_exporter/telegraf uses the node name as the instance label:
    - job_name: kubernetes-node-exporter
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      scheme: https
      kubernetes_sd_configs:
      ...
      # Host name
      - source_labels: [__meta_kubernetes_node_name]
        target_label: instance
      ...

Note: This step can be skipped if the node name itself is the host IP.
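The relabel rule above simply copies the node-name meta label onto instance. A minimal Python illustration of the effect (not Prometheus code; the label values are made up):

```python
# Effect of the relabel rule: the "instance" label on each scraped target is
# overwritten with the Kubernetes node name, so metrics can be looked up by
# node name instead of by scrape address.

def apply_relabel(target_labels: dict) -> dict:
    relabeled = dict(target_labels)
    relabeled["instance"] = target_labels["__meta_kubernetes_node_name"]
    return relabeled

before = {"__meta_kubernetes_node_name": "node-01", "instance": "10.0.0.5:9100"}
print(apply_relabel(before)["instance"])  # node-01
```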

3. Install Crane-scheduler

There are two options:

  1. Install Crane-scheduler as a second scheduler:
    helm repo add crane https://gocrane.github.io/helm-charts
    helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
  2. Replace the native kube-scheduler with Crane-scheduler:
    1. Back up /etc/kubernetes/manifests/kube-scheduler.yaml:
    cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
    2. Modify the kube-scheduler config file (scheduler-config.yaml) to enable the Dynamic scheduler plugin and configure its args:
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    ...
    profiles:
    - schedulerName: default-scheduler
      plugins:
        filter:
          enabled:
          - name: Dynamic
        score:
          enabled:
          - name: Dynamic
            weight: 3
      pluginConfig:
      - name: Dynamic
        args:
          policyConfigPath: /etc/kubernetes/policy.yaml
    ...
    3. Create /etc/kubernetes/policy.yaml, used as the scheduling policy of the Dynamic plugin:
     apiVersion: scheduler.policy.crane.io/v1alpha1
     kind: DynamicSchedulerPolicy
     spec:
       syncPolicy:
         ##cpu usage
         - name: cpu_usage_avg_5m
           period: 3m
         - name: cpu_usage_max_avg_1h
           period: 15m
         - name: cpu_usage_max_avg_1d
           period: 3h
         ##memory usage
         - name: mem_usage_avg_5m
           period: 3m
         - name: mem_usage_max_avg_1h
           period: 15m
         - name: mem_usage_max_avg_1d
           period: 3h
    
       predicate:
         ##cpu usage
         - name: cpu_usage_avg_5m
           maxLimitPecent: 0.65
         - name: cpu_usage_max_avg_1h
           maxLimitPecent: 0.75
         ##memory usage
         - name: mem_usage_avg_5m
           maxLimitPecent: 0.65
         - name: mem_usage_max_avg_1h
           maxLimitPecent: 0.75
    
       priority:
         ##cpu usage
         - name: cpu_usage_avg_5m
           weight: 0.2
         - name: cpu_usage_max_avg_1h
           weight: 0.3
         - name: cpu_usage_max_avg_1d
           weight: 0.5
         ##memory usage
         - name: mem_usage_avg_5m
           weight: 0.2
         - name: mem_usage_max_avg_1h
           weight: 0.3
         - name: mem_usage_max_avg_1d
           weight: 0.5
    
       hotValue:
         - timeRange: 5m
           count: 5
         - timeRange: 1m
           count: 2
    4. Modify kube-scheduler.yaml and replace the kube-scheduler image with the Crane-scheduler image:
    ...
     image: docker.io/gocrane/crane-scheduler:0.0.23
    ...
    5. Install crane-scheduler-controller:
    kubectl apply -f ./deploy/controller/rbac.yaml && kubectl apply -f ./deploy/controller/deployment.yaml
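The priority section of policy.yaml weights each metric's contribution to a node's score. The sketch below is a rough illustration only, assuming a simple weighted average of idle capacity; the real Dynamic plugin also normalizes scores and applies the hotValue penalty, so this is not crane's implementation:

```python
# Rough illustration (NOT crane's actual code) of how the priority weights
# from policy.yaml could combine node load metrics into a single score.
# More idle capacity (100 - usage%) on each metric means a higher score.

WEIGHTS = {
    "cpu_usage_avg_5m": 0.2,
    "cpu_usage_max_avg_1h": 0.3,
    "cpu_usage_max_avg_1d": 0.5,
    "mem_usage_avg_5m": 0.2,
    "mem_usage_max_avg_1h": 0.3,
    "mem_usage_max_avg_1d": 0.5,
}

def node_score(usages: dict) -> float:
    """Weighted average of idle capacity (100 - usage%) over all metrics."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(w * (100 - usages[m]) for m, w in WEIGHTS.items())
    return weighted / total_weight

# A lightly loaded node outranks a heavily loaded one.
idle = {m: 10.0 for m in WEIGHTS}   # 10% usage on every metric
busy = {m: 80.0 for m in WEIGHTS}   # 80% usage on every metric
print(node_score(idle) > node_score(busy))  # True
```

The longer-window metrics (1h/1d maxima) carry larger weights in the sample policy, so sustained load influences placement more than a momentary spike.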

4. Schedule Pods With Crane-scheduler

Test Crane-scheduler with the following example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  selector:
    matchLabels:
      app: cpu-stress
  replicas: 1
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      schedulerName: crane-scheduler
      hostNetwork: true
      tolerations:
      - key: node.kubernetes.io/network-unavailable
        operator: Exists
        effect: NoSchedule
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest
        command: ["stress", "-c", "1"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "1Gi"
            cpu: "1"

Note: Change schedulerName from crane-scheduler to default-scheduler if Crane-scheduler is used as the default scheduler.

If the test pod is scheduled successfully, the following event appears:

Type    Reason     Age   From             Message
----    ------     ----  ----             -------
Normal  Scheduled  28s   crane-scheduler  Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu
Owner: Crane (Cloud Resource Analytics and Economics)
Comments
  • first issue !!

    Hello, I deployed crane-scheduler using the helm chart. My Prometheus svc addr is as below:

    but there are error logs in the controller pod:

    Post "192.168.15.25/api/v1/query": unsupported protocol scheme ""

    Can someone explain why this happens? Thanks.

  • Syntax error in the helm template file

    templates/scheduler-deployment.yaml in the helm chart has a syntax error; the if block can be fixed as follows:

    containers:
          - command:
            - /scheduler
            - --leader-elect=false
            - --config=/etc/kubernetes/kube-scheduler/scheduler-config.yaml
            {{- if ge .Capabilities.KubeVersion.Minor "22" }}
            image: "{{ .Values.scheduler.image.repository }}:0.0.23"
            {{- else }}
            image: "{{ .Values.scheduler.image.repository }}:0.0.20"
            {{- end }}
    
  • dynamic plugin score is abnormal

    What happened?

    applying score defaultWeights on Score plugins: plugin "Dynamic" returns an invalid score -8, it should in the range of [0, 100] after normalizing

    What did you expect to happen?

    The pod can be scheduled successfully.

    How can we reproduce it (as minimally and precisely as possible)?

    This problem occurs when the Prometheus query times out while the hotValue is normal.

  • Cannot pull the 0.0.20 docker image

    Hello, when I deployed version 0.0.20 it reported that the image could not be found. Is the problem in my deployment process, or something else? k8s version: v1.21, helm version: v3.3.3

    Deployment process:

    1. Deployed directly with helm: failed. (screenshot)

    2. Cloned the project and deployed successfully with kubectl.exe apply -f rbac.yaml, then on the k8s server failed to change the version with the following commands: KUBE_EDITOR="sed -i 's/v1beta2/v1beta1/g'" kubectl edit cm scheduler-config -n crane-system && KUBE_EDITOR="sed -i 's/0.0.23/0.0.20/g'" kubectl edit deploy crane-scheduler -n crane-system (screenshots)

    3. Modified the yaml files directly; after deployment it still reported that the image was not found.

    Changed v1beta2 to v1beta1 in git\crane-scheduler\deploy\manifests\scheduler-config.yaml:

    apiVersion: kubescheduler.config.k8s.io/v1beta1
    kind: KubeSchedulerConfiguration
    leaderElection:
    ......

    Changed 0.0.23 to 0.0.20 in git\crane-scheduler\deploy\controller\deployment.yaml:

    ......
    command:
    - /controller
    - --policy-config-path=/data/policy.yaml
    - --prometheus-address=PROMETHEUS_ADDRESS
    image: docker.io/gocrane/crane-scheduler-controller:0.0.20
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /data
      name: dynamic-scheduler-policy
    ......

    It reported that the image was not found. (screenshot)

  • replace the k8s scheduler with crane scheduler, new pod pending

    I replaced the k8s scheduler with crane scheduler and then created a new pod. The new pod stayed "Pending", with no related event info.

    ... ...
    QoS Class:       Burstable
    Node-Selectors:
    Tolerations:     node-role.kubernetes.io/master:NoSchedule
                     node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                     node.kubernetes.io/not-ready:NoExecute op=Exists
                     node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                     node.kubernetes.io/unreachable:NoExecute op=Exists
                     node.kubernetes.io/unschedulable:NoSchedule op=Exists
    Events:

    I could only find some possibly useful info in the logs, as follows: I0614 02:18:42.492977 1 eventhandlers.go:118] "Add event for unscheduled pod" pod="kube-system/kubernetes-dashboard-jqhhq"

    I wonder whether the new pod was popped from the 'SchedulingQueue', and how I can solve this problem.

  • Problem replacing the default scheduler

    [root@zcsmaster1 manifests]# cat kube-scheduler.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        component: kube-scheduler
        tier: control-plane
      name: kube-scheduler
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-scheduler
        - /scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=192.168.40.180
        - --config=/etc/kubernetes/kube-scheduler/scheduler-config.yaml
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
        image: docker.io/gocrane/crane-scheduler:0.0.20
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 192.168.40.180
            path: /healthz
            port: 10259
            scheme: HTTPS
          initialDelaySeconds: 12
          periodSeconds: 10
          timeoutSeconds: 15
        name: kube-scheduler
        resources:
          requests:
            cpu: 100m
        startupProbe:
          failureThreshold: 24
          httpGet:
            host: 192.168.40.180
            path: /healthz
            port: 10259
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 15
        volumeMounts:
        - mountPath: /etc/kubernetes/scheduler.conf
          name: kubeconfig
          readOnly: true
        - name: scheduler-config
          mountPath: /etc/kubernetes/kube-scheduler
          readOnly: true
        - name: dynamic-scheduler-policy
          mountPath: /etc/kubernetes
      hostNetwork: true
      priorityClassName: system-node-critical
      volumes:
      - hostPath:
          path: /etc/kubernetes/scheduler.conf
          type: FileOrCreate
        name: kubeconfig
      - name: scheduler-config
        configMap:
          name: scheduler-config
      - name: dynamic-scheduler-policy
        configMap:
          name: dynamic-scheduler-policy
    status: {}
    [root@zcsmaster1 manifests]#

    Hi, replacing the default scheduler this way keeps failing. Could you publish more detailed documentation?

    Events:
    Type     Reason     Age                From               Message
    ----     ------     ----               ----               -------
    Normal   Scheduled  93s                default-scheduler  Successfully assigned kube-system/kube-scheduler to zcsnode2
    Normal   Pulled     36s (x4 over 93s)  kubelet            Container image "docker.io/gocrane/crane-scheduler:0.0.20" already present on machine
    Normal   Created    36s (x4 over 93s)  kubelet            Created container kube-scheduler
    Normal   Started    36s (x4 over 93s)  kubelet            Started container kube-scheduler
    Warning  BackOff    3s (x10 over 91s)  kubelet            Back-off restarting failed container

    [root@zcsmaster1 manifests]# kubectl describe pod kube-scheduler -n kube-system

    Events:
    Type     Reason     Age                    From               Message
    ----     ------     ----                   ----               -------
    Normal   Scheduled  53m                    default-scheduler  Successfully assigned kube-system/crane-scheduler-controller-7845b4cbf7-dhrkm to zcsnode2
    Normal   Pulled     52m (x2 over 53m)      kubelet            Container image "docker.io/gocrane/crane-scheduler-controller:0.0.23" already present on machine
    Normal   Created    52m (x2 over 53m)      kubelet            Created container controller
    Normal   Started    52m (x2 over 53m)      kubelet            Started container controller
    Normal   Killing    52m                    kubelet            Container controller failed liveness probe, will be restarted
    Warning  Unhealthy  51m (x5 over 53m)      kubelet            Liveness probe failed: Get "http://10.244.234.118:8090/healthz": dial tcp 10.244.234.118:8090: connect: connection refused
    Warning  BackOff    8m29s (x116 over 46m)  kubelet            Back-off restarting failed container
    Warning  Unhealthy  3m40s (x138 over 53m)  kubelet            Readiness probe failed: Get "http://10.244.234.118:8090/healthz": dial tcp 10.244.234.118:8090: connect: connection refused

    [root@zcsmaster1 manifests]#

  • crane-scheduler-controller fails to fetch Prometheus metrics

    crane-scheduler-controller version: 0.0.23, craned version: 0.5.0, k8s version: 1.21.10, docker version: 19.3.14, OS: Ubuntu 20.04.3 LTS

    Calling the Prometheus API manually returns the expected metrics:

    curl -g http://prometheus-k8s.monitoring.svc.cluster.local:9090/api/v1/query?query=cpu_usage_avg_5m
    {"status":"success","data":{"resultType":"vector","result":[{"metric":{"name":"cpu_usage_avg_5m","instance":"ceph-01"},"value":[1656488784.456,"2.7104166665715894"]},{"metric":{"name":"cpu_usage_avg_5m","instance":"ceph-02"},"value":[1656488784.456,"1.9583333333351618"]},{"metric":{"name":"cpu_usage_avg_5m","instance":"ceph-03"},"value":[1656488784.456,"2.6000000000931323"]},{"metric":{"name":"cpu_usage_avg_5m","instance":"node-01"},"value":[1656488784.456,"4.0291666666841195"]},{"metric":{"name":"cpu_usage_avg_5m","instance":"node-04"},"value":[1656488784.456,"6.870833333426461"]},{"metric":{"name":"cpu_usage_avg_5m","instance":"ykj"},"value":[1656488784.456,"5.891666666672492"]}]}}

    curl -g http://prometheus-k8s.monitoring.svc.cluster.local:9090/api/v1/query?query=mem_usage_avg_5m {"status":"success","data":{"resultType":"vector","result":[{"metric":{"name":"mem_usage_avg_5m","instance":"ceph-01","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-sn9lp"},"value":[1656488826.549,"32.75862684328356"]},{"metric":{"name":"mem_usage_avg_5m","instance":"ceph-02","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-dgd54"},"value":[1656488826.549,"15.044355868789062"]},{"metric":{"name":"mem_usage_avg_5m","instance":"ceph-03","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-td7k2"},"value":[1656488826.549,"34.21244570563606"]},{"metric":{"name":"mem_usage_avg_5m","instance":"node-01","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-zzxmd"},"value":[1656488826.549,"57.21168005976536"]},{"metric":{"name":"mem_usage_avg_5m","instance":"node-04","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-2zkgk"},"value":[1656488826.549,"72.4792896090607"]},{"metric":{"name":"mem_usage_avg_5m","instance":"ykj","job":"node-exporter","namespace":"monitoring","pod":"node-exporter-xfq4n"},"value":[1656488826./

  • update condition for topology scheduling

    Signed-off-by: Garrybest [email protected]

    We now support not only Guaranteed pods: any pod that has guaranteed containers can now be allocated.

  • How should kube-scheduler.yaml be modified?

    [root@zcsmaster1 manifests]# cat kube-scheduler.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        component: kube-scheduler
        tier: control-plane
      name: kube-scheduler
      namespace: kube-system
    spec:
      containers:
      - command:
        - /scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=127.0.0.1  # path of the KubeSchedulerConfiguration file inside the container
        - --kubeconfig=/etc/kubernetes/policy.yaml
        - --config=/etc/kubernetes/scheduler-config.yaml
        - --leader-elect=true
        image: docker.io/gocrane/crane-scheduler:0.0.20
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 127.0.0.1
            path: /healthz
            port: 10259
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 15
        name: kube-scheduler
        resources:
          requests:
            cpu: 100m
        startupProbe:
          failureThreshold: 24
          httpGet:
            host: 127.0.0.1
            path: /healthz
            port: 10259
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 15
        volumeMounts:
        - mountPath: /etc/kubernetes
          name: kubeconfig
          readOnly: true
      hostNetwork: true
      priorityClassName: system-node-critical
      volumes:
      - hostPath:
          path: /etc/kubernetes/
          type: Directory
        name: kubeconfig
    status: {}

    Which part is wrong?

    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 18 Oct 2022 09:49:30 +0800
      Finished:     Tue, 18 Oct 2022 09:49:30 +0800
    Ready:          False
    Restart Count:  0

  • Is the minimum version k8s 1.18? Can 1.16 be used?

    Can k8s 1.16 be used, or does this rely on features introduced only in 1.18?

    https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/624-scheduling-framework/README.md#graduation-criteria

    According to the graduation criteria described there, the scheduling framework has been supported since 1.16.

  • Scheduler problems after cluster restart (using the replaced default scheduler)

    E1018 09:42:10.621700 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=mem_usage_max_avg_1h, float64=33.699000000000005)
    E1018 09:42:10.621708 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=mem_usage_max_avg_1d, float64=33.699000000000005)
    I1018 09:42:10.621717 1 plugins.go:92] [crane] Node[zcsnode2]'s finalscore is 6, while score is 16 and hotvalue is 1.000000
    E1018 09:48:25.615198 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=cpu_usage_max_avg_1d, float64=45.77980000000001)
    E1018 09:48:25.615301 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=mem_usage_max_avg_1d, float64=71.2381)
    I1018 09:48:25.615339 1 plugins.go:92] [crane] Node[zcsmaster1]'s finalscore is 35, while score is 35 and hotvalue is 0.000000
    E1018 09:48:25.615397 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=cpu_usage_max_avg_1d, float64=47.9795)
    E1018 09:48:25.615417 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=mem_usage_max_avg_1d, float64=75.73570000000001)
    I1018 09:48:25.615424 1 plugins.go:92] [crane] Node[zcsnode1]'s finalscore is 37, while score is 37 and hotvalue is 0.000000
    E1018 09:48:25.615447 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=cpu_usage_max_avg_1d, float64=47.5513)
    E1018 09:48:25.615461 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=mem_usage_max_avg_1d, float64=83.1259)
    I1018 09:48:25.615468 1 plugins.go:92] [crane] Node[zcsnode2]'s finalscore is 41, while score is 41 and hotvalue is 0.000000
    E1018 09:48:56.352200 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=cpu_usage_max_avg_1d, float64=45.77980000000001)
    E1018 09:48:56.352275 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=mem_usage_max_avg_1d, float64=71.2381)
    I1018 09:48:56.352287 1 plugins.go:92] [crane] Node[zcsmaster1]'s finalscore is 35, while score is 35 and hotvalue is 0.000000
    E1018 09:48:56.352346 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=cpu_usage_max_avg_1d, float64=47.9795)
    E1018 09:48:56.352368 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=mem_usage_max_avg_1d, float64=75.73570000000001)
    I1018 09:48:56.352379 1 plugins.go:92] [crane] Node[zcsnode1]'s finalscore is 37, while score is 37 and hotvalue is 0.000000
    E1018 09:48:56.352415 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=cpu_usage_max_avg_1d, float64=47.5513)
    E1018 09:48:56.352455 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=mem_usage_max_avg_1d, float64=83.1259)
    I1018 09:48:56.352466 1 plugins.go:92] [crane] Node[zcsnode2]'s finalscore is 41, while score is 41 and hotvalue is 0.000000
    E1018 09:51:48.506156 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=cpu_usage_max_avg_1d, float64=45.854200000000006)
    E1018 09:51:48.506282 1 stats.go:128] [crane] failed to get node 's score: zcsmaster1%!(EXTRA string=mem_usage_max_avg_1d, float64=71.34190000000001)
    I1018 09:51:48.506296 1 plugins.go:92] [crane] Node[zcsmaster1]'s finalscore is 35, while score is 35 and hotvalue is 0.000000
    E1018 09:51:48.506329 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=cpu_usage_max_avg_1d, float64=48.017900000000004)
    E1018 09:51:48.506357 1 stats.go:128] [crane] failed to get node 's score: zcsnode1%!(EXTRA string=mem_usage_max_avg_1d, float64=75.80170000000001)
    I1018 09:51:48.506364 1 plugins.go:92] [crane] Node[zcsnode1]'s finalscore is 37, while score is 37 and hotvalue is 0.000000
    E1018 09:51:48.506390 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=cpu_usage_max_avg_1d, float64=47.545500000000004)
    E1018 09:51:48.506408 1 stats.go:128] [crane] failed to get node 's score: zcsnode2%!(EXTRA string=mem_usage_max_avg_1d, float64=83.0675)
    I1018 09:51:48.506416 1 plugins.go:92] [crane] Node[zcsnode2]'s finalscore is 41, while score is 41 and hotvalue is 0.000000

  • Enabling multiple replicas does not take effect

    crane-scheduler-controller version: 0.0.23, craned version: 0.5.0, k8s version: 1.21.10, docker version: 19.3.14, OS: Ubuntu 20.04.3 LTS (screenshots)

    Scheduler pod logs:

    E0920 09:23:09.244915 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:23:09.244970 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:24:33.207711 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:24:33.207754 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:25:10.865375 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:25:10.865408 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:26:07.905010 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:26:07.905083 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:27:33.211667 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:27:33.211720 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:28:33.213730 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:28:33.213767 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:29:33.214463 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:29:33.214499 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"
    E0920 09:30:33.215495 1 scheduler.go:379] scheduler cache AssumePod failed: pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed
    E0920 09:30:33.215532 1 factory.go:338] "Error scheduling pod; retrying" err="pod 3ed0ea4b-407f-427e-a92d-1c1d2adbc55c is in the cache, so can't be assumed" pod="testpods-test/testpods-test-test-pods-65494bf66c-c8k6t"

  • rbac does not have sufficient permissions

    What happened?

    When a pod uses a WaitForFirstConsumer-type PVC, crane-scheduler does not have sufficient permissions to update the PVC's annotations. The scheduler needs permission to update PVCs.

    What did you expect to happen?

    Pod was successfully scheduled.

    How can we reproduce it (as minimally and precisely as possible)?

    Create a pod that uses a PVC, where the PVC uses a WaitForFirstConsumer-type storageclass.
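    The extra permission the reporter describes would look roughly like the following ClusterRole rule. This is a hypothetical sketch: the ClusterRole name is illustrative and should match whatever your deployment actually uses (see deploy/ in the repo), not be taken from here.

    # Hypothetical RBAC fragment granting the scheduler update access to PVCs,
    # as requested in the issue above. The metadata.name is an assumption.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: crane-scheduler   # assumption: use your deployment's ClusterRole name
    rules:
    - apiGroups: [""]
      resources: ["persistentvolumeclaims"]
      verbs: ["get", "list", "watch", "update", "patch"]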
