Core - Eru, a simple, stateless, flexible, production-ready orchestrator designed to integrate easily into existing workflows. It can run anything virtualized, for long-running or short-lived workloads.

Eru


Eru is a stateless, flexible, production-ready resource scheduler designed to easily integrate into existing systems.

Eru can use multiple engines to run anything for the long or short term.

This project is Eru Core. The Core handles resource allocation and manages resource lifetimes.

Testing

Run make test

Compile

  • Run make build if you want a binary.
  • Run ./make-rpm if you want an RPM for el7. Note that we use FPM for packaging, so you need to have it installed first.

Developing

Run make deps to generate the vendor directory.

You can use our footstone image for testing and compiling.

GRPC

Generate golang grpc definitions.

go get -u github.com/golang/protobuf/{proto,protoc-gen-go}
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
make grpc

Run it

$ eru-core --config /etc/eru/core.yaml.sample

or

$ export ERU_CONFIG_PATH=/path/to/core.yaml
$ eru-core

Dockerized Core manually

Image: projecteru2/core

docker run -d \
  --name eru_core_$HOSTNAME \
  --net host \
  --restart always \
  -v <HOST_CONFIG_DIR_PATH>:/etc/eru \
  projecteru2/core \
  /usr/bin/eru-core

Build and Deploy by Eru itself

Since bootstrap support was implemented in Eru, you can now build and deploy Eru itself with the cli tool.

  1. Test source code and build image
<cli_execute_path> --name <image_name> http://bit.ly/EruCore

Make sure you can clone the code. After the fresh image is named and tagged, it is pushed automatically to the remote registry defined in the config file.

  2. Deploy core itself
<cli_execute_path> workloads deploy --pod <pod_name> [--node <node_name>] --entry core --network <network_name> --image <projecteru2/core>|<your_own_image> --file <core_config_yaml>:/core.yaml [--count <count_num>] [--cpu 0.3 | --mem 1024000000] http://bit.ly/EruCore

Now you will find that core has been started on the nodes.

Comments
  • Resource Plugin

    Resource Plugin

    This PR is about making resources pluggable.

    Design idea: core does not need to know what each resource does or how it is allocated; it only forwards the request parameters to the resource plugins. Each plugin performs the actual allocation and returns engine args to core, which the engine layer later uses to create the workload.

    Resource plugins are executables: core loads every executable found in a configured directory as a resource plugin.

    During workload creation, eru-cli converts all resource-related request parameters into a map[string][]string, e.g. {"cpu": ["1.2"]}. Since core does not know which parameters belong to which plugin, it forwards all parameters to every resource plugin in turn; each plugin parses the parameters it cares about and handles them accordingly (see the sketch at the end of this item).

    Because of the plugin design, resource-related information is no longer stored in node metadata; each plugin is responsible for its own storage. It does not have to be ETCD either; with some care MySQL / MongoDB would also work (just a bit more trouble to deploy).

    As a result, the resource description in workload metadata becomes a map[string]map[string][]string (admittedly a bit convoluted), for example:

    {
      "cpu-plugin": {
        "cpu": ["1.2"],
        "file": ["some-file"]
      },
      "mem-plugin": {
        "mem": ["1PB"],
        "file": ["some-other-file"]
      }
    }
    

    The resources allocated by each plugin must be recorded separately; otherwise, when resources have to be released during rollback / remove, there is no way to know how to release them.

    Main tasks:

    • Convert the current resource-handling logic into plugins.
    • Adapt the cluster layer and the engine layer to the resource args changes.
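
    A minimal sketch of the forwarding flow described above, not the actual core code; the Plugin interface and the function names are assumptions for illustration only:

    package resources

    // Plugin is a hypothetical view of a resource plugin: it receives the full
    // raw parameter map and returns engine args plus the resource args it
    // actually allocated.
    type Plugin interface {
        Alloc(raw map[string][]string) (engineArgs map[string]interface{}, resourceArgs map[string][]string, err error)
    }

    // allocate forwards every raw parameter to every plugin in turn and records
    // the per-plugin allocation, so rollback / remove knows what to release.
    func allocate(plugins map[string]Plugin, raw map[string][]string) (map[string]interface{}, map[string]map[string][]string, error) {
        engineArgs := map[string]interface{}{}
        workloadResources := map[string]map[string][]string{} // stored in workload metadata
        for name, p := range plugins {
            ea, ra, err := p.Alloc(raw) // every plugin sees every parameter
            if err != nil {
                return nil, nil, err
            }
            for k, v := range ea {
                engineArgs[k] = v
            }
            workloadResources[name] = ra
        }
        return engineArgs, workloadResources, nil
    }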
  • Abstract Resources

    Abstract Resources

    Guiding principles

    1. DeployOptions no longer carries individual resource fields; instead there is a unified []ResourceRequest.
    2. Clear boundaries for NodeInfo:
       a. NodeInfo is split into two objects, NodeInfo + StrategyInfo; the new NodeInfo provides information to the scheduler, while StrategyInfo provides information for strategy deploy.
       b. The results of the scheduler and the strategy are no longer recorded in NodeInfo and StrategyInfo but are returned as separate data structures.
       c. NodeInfo appears only in the SelectNode function and StrategyInfo only in strategy.Deploy; no leaking, no misuse.
    3. Transactions:
       a. doAllocResource no longer commits changes to the store; it only does the computation, and the caller commits the changes, which makes transactions easier to implement.
       b. doCreateWorkloads implements the resource-metadata transaction; both commit and rollback of resource metadata happen in this function.
       c. doDeployOneWorkload commits workload metadata; consistency between workload metadata and the workload instance is guaranteed in this function.
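
    A rough sketch of the shapes implied by the points above; apart from DeployOptions, ResourceRequest, NodeInfo and StrategyInfo, every field and name here is hypothetical:

    package types

    // ResourceRequest is the unified request type carried by DeployOptions.
    type ResourceRequest interface {
        Type() string
        Validate() error
    }

    // DeployOptions no longer has per-resource fields, only the unified slice.
    type DeployOptions struct {
        Name             string
        ResourceRequests []ResourceRequest
    }

    // NodeInfo only feeds the scheduler (SelectNode).
    type NodeInfo struct {
        Name     string
        Capacity int
    }

    // StrategyInfo only feeds strategy.Deploy.
    type StrategyInfo struct {
        Nodename string
        Usage    float64
        Capacity int
    }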
  • safeSplit will discard quoted single word (w/o space)

    safeSplit will discard quoted single word (w/o space)

    https://github.com/projecteru2/core/blob/30f925f8346f578e6d0d9e95530eaba34b846647/utils/utils.go#L271-L298

    If s contains a single quoted word, block is not appended to result correctly, which leaves that word missing after the split.

    e.g. with eru-cli and a single quoted word:

    eru-cli lambda --pod eru echo \"123\"
    INFO[2020-11-10 17:07:16] [EruResolver] start sync service discovery
    [0778c83]: &&
    

    with a quoted phrase of more than one word it works correctly:

    eru-cli lambda --pod eru echo \"123 456\"
    INFO[2020-11-10 17:07:06] [EruResolver] start sync service discovery
    [63e1660]: && 123 && 456
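
    For reference, a quote-aware split sketch that preserves a single quoted word by flushing the final block; this only illustrates the expected behaviour and is not the project's safeSplit:

    package example

    import "strings"

    // splitCommand splits on spaces outside double quotes and flushes the
    // trailing block, so a single quoted word such as "123" is preserved.
    func splitCommand(s string) []string {
        var result []string
        var block strings.Builder
        inQuote := false
        for _, r := range s {
            switch {
            case r == '"':
                inQuote = !inQuote
            case r == ' ' && !inQuote:
                if block.Len() > 0 {
                    result = append(result, block.String())
                    block.Reset()
                }
            default:
                block.WriteRune(r)
            }
        }
        if block.Len() > 0 { // flush the trailing block, even a single quoted word
            result = append(result, block.String())
        }
        return result
    }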
    
  • optimize engine cache checker

    optimize engine cache checker

    Same as https://github.com/projecteru2/core/pull/539

    P.S. This is probably because an earlier change of mine was submitted as 2 PRs, one to master and one to an alpha of v21.01.05, so rebasing directly onto v21.01.05 and opening a PR produced a pile of commits, which does not feel right. In the future I should open only 1 PR against the alpha version and merge to master after QA finishes testing.

  • Systemd engine improvements

    Systemd engine improvements

    APIs already implemented

    1. Info
    2. VirtualizationCreate
    3. VirtualizationCopyTo
    4. VirtualizationStart
    5. VirtualizationStop
    6. VirtualizationRemove
    7. VirtualizationInspect

    APIs that could be added

    1. VirtualizationLogs

    Deployment improvements - package the systemd execution content as a standard distributable (Docker Image, Tar, ...)

    • If a Docker image is used, it can either be run as a container, or its files can be extracted and run via systemd
    • Distribute content via images (the image contains standardized systemd execution content)
    1. Implement ImageLocalDigests
    2. Implement ImagePull
    • Build the systemd runtime content from a Docker image
    1. Improve VirtualizationCreate
    • Build systemd images, implementing the following APIs
    1. BuildRefs
    2. BuildContent
    3. ImageBuild
    4. ImageBuildCachePrune
    5. ImageBuildFromExist
    6. ImageList
    7. ImageRemove
    8. ImagesPrune

    Other APIs that could be implemented

    1. VirtualizationResize

    Other discussion

    1. Replace the ssh client? Under certain conditions, escape from a container on the target machine to issue systemctl commands? Benefit: everything is unified on the docker client implementation
  • HA: server side

    HA: server side

    features:

    1. eru-core registers service address in etcd once up
    2. eru-core unregisters service address in etcd once down
    3. if eru-core is killed brutally (kill -9, OOM, machine shutdown, network issue), the service address key expires in 10s
    4. eru-core exposes a WatchServiceStatus RPC to publish all available service addresses (a registration sketch follows below)
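
    A minimal sketch of the register-with-TTL idea in points 1-3, using go.etcd.io/etcd/client/v3; the key layout and helper name are assumptions:

    package ha

    import (
        "context"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    // register puts the service address under a 10s lease and keeps it alive;
    // if eru-core dies without unregistering, the key simply expires.
    func register(ctx context.Context, cli *clientv3.Client, addr string) error {
        lease, err := cli.Grant(ctx, 10)
        if err != nil {
            return err
        }
        key := "/services/eru-core/" + addr // hypothetical key layout
        if _, err := cli.Put(ctx, key, addr, clientv3.WithLease(lease.ID)); err != nil {
            return err
        }
        keepalive, err := cli.KeepAlive(ctx, lease.ID)
        if err != nil {
            return err
        }
        go func() {
            for range keepalive { // drain keepalive responses until ctx is cancelled
            }
        }()
        return nil
    }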
  • [WIP] Regular integration test using YAML

    [WIP] Regular integration test using YAML

    While writing the image test cases the requirements became clear, so I simply rewrote this; the original https://github.com/projecteru2/core/pull/469 is closed.

    This approach solves several problems:

    1. Requests can be declared directly in YAML without fighting the pb types; this is done via gRPC reflection. The direct motivation is to depend neither on eru-cli nor even on the core/client module, but to use the proto files directly, avoiding potential issues caused by bugs in intermediate tools such as eru-cli.

    2. A parameter-combination strategy generates requests in batches, for example:

    requests:
      name: zc
      entrypoint:
        name: zc
        command: sleep 1000000
      podname: test
      image: bash
      count@:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6 
      deploy_strategy@:
        - AUTO
        - FILL
        - EACH
        - GLOBAL
        - DUMMY
      resource_opts:
        cpu_quota_limit: 0.1
        cpu_bind: true
        memory_limit: 14000000
    
    

    In the Deploy request declaration above, the strategy and count fields are not arrays by nature, but if a field name ends with @, the array declared as its value is used to generate combinations; the example above generates 5*6=30 requests covering every parameter combination (the Cartesian product). This makes it easy to construct, in bulk, requests that cover the full set of parameter combinations; a rough sketch of the expansion follows.
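
    A rough sketch of how such @ expansion could work; this is an illustration under assumed names, not the tool's actual implementation:

    package combos

    import "strings"

    // expand turns keys ending in "@" whose values are lists into the Cartesian
    // product of all candidate values; plain keys are copied into every request.
    func expand(tmpl map[string]interface{}) []map[string]interface{} {
        results := []map[string]interface{}{{}}
        for key, val := range tmpl {
            if vals, ok := val.([]interface{}); ok && strings.HasSuffix(key, "@") {
                field := strings.TrimSuffix(key, "@")
                var next []map[string]interface{}
                for _, base := range results {
                    for _, v := range vals {
                        req := map[string]interface{}{}
                        for k, bv := range base {
                            req[k] = bv
                        }
                        req[field] = v
                        next = append(next, req)
                    }
                }
                results = next
                continue
            }
            for _, base := range results {
                base[key] = val
            }
        }
        return results
    }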

    3. Asserts can be written in the YAML, and they support bash, which is very flexible; for example, after deleting a workload my asserts look like this:
      asserts:
        for_each:
          equals:
            - actual: jq -r '.id' <<<$resp
              expected: jq -r '.id' <<<$req
            - actual: eru-cli -o json workload get $(jq -r '.id' <<<$req) || echo not found
              expected: echo not found
            - actual: etcdctl get /eru/ --prefix | grep $(jq -r '.id' <<<$req) -c
              expected: echo 0
    

    There are three equals asserts: the first checks that the gRPC API response contains the correct id field, the second checks via eru-cli that getting the workload returns not found, and the third checks the data in etcd directly. The flexibility of writing raw bash allows very thorough checks; further examples include checking settings via docker inspect after assigning resources, or inspecting cgroups directly, etc.

    4. gRPC request parameters may also run bash, for example:

      requests:
        ids: $bash(test -p /tmp/fifo || mkfifo /tmp/fifo; cat /tmp/fifo)
        force: true
    

    Here the ids are read from /tmp/fifo, whose write end is fed by the result of an earlier request.


    As for the image tests, they should be a piece of cake.

  • There might be bug when scheduling volumes

    There might be bug when scheduling volumes

    On one host two containers are assigned to the same volume:

    INFO[2021-05-21 14:34:09] CPU Used: 46.00                              
    INFO[2021-05-21 14:34:09] Memory Used: 380104605696/1099511627776 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data1: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data2: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data3: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data4: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data5: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data6: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09]   Volume /data7: Used 721554505728/943819063296 bytes 
    INFO[2021-05-21 14:34:09] Volume Used: 5050881540096/6606733443072 bytes 
    INFO[2021-05-21 14:34:09] Storage Used: 5050881540096 bytes
    

    Each container requires 682GB of space, yet both are assigned to volume /data4.

  • Label filter support on agent

    Label filter support on agent

    Background

    In some scenarios one node may be added to several Eru instances, a sort of reuse or sharing.
    This lets us share the same node among different types of businesses, but there is a problem on the agent: it cannot tell which set of containers belongs to it.

    Proposal

    • Introduce label filters to the agent.
    • It would be better to also introduce a new label on containers, such as ERU_NAME.
  • docker engine BuildFromExist

    docker engine BuildFromExist

    Usage:

    cli image build --exist --name zctest $CONTAINER_ID
    

    It then builds and pushes; the image name is $(hub domain from core.yaml)/$(docker namespace from core.yaml)/$(--name from the command line):latest. In my test the result was harbor.shopeemobile.com/zc/zctest:latest

  • [WIP] fix vet error

    [WIP] fix vet error

    There are two tests under engine/docker, one of which does not compile, but they are not included when running make test.

    I also noticed that the GitHub CI seems a bit odd: the unit test step may contain several commands, but only a failure of the last command makes the CI fail.

    For example, the unit test step currently has three commands: one go vet and two go test. Failures in the go vet or the first go test do not fail the CI. When I get a chance I would like to find out why...

  • logs-related

    logs-related

    In the future:

    • use pkg/errors to create the error objects in types/errors.go, because the standard errors package does not include stack info.
    • introduce an Error() helper to avoid the mandatory third "format" argument of Errorf.
    • delete the unused "ctx" argument from those log functions.
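
    A minimal sketch of the first point, assuming github.com/pkg/errors; ErrNotFound and lookup are hypothetical names, not identifiers from types/errors.go:

    package types

    import "github.com/pkg/errors"

    // errors.New from pkg/errors records the stack at creation time,
    // unlike the standard library's errors.New.
    var ErrNotFound = errors.New("not found")

    // Wrapping adds context while keeping the original stack information;
    // printing the error with %+v shows the stack trace.
    func lookup(id string) error {
        return errors.Wrapf(ErrNotFound, "workload %s", id)
    }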
  • [WIP] Add send large file from cli

    [WIP] Add send large file from cli

    About the cli send command: previously send pushed the whole file to core in a single message, but core's gRPC is configured with a maximum message size of 20M, so some files could not be transferred to core at all.

    This adds a method called SendLargeFile, which receives the data sent from the cli in chunks over a stream, reassembles them, and then sends the whole file into the container.
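
    A minimal sketch of the client-side chunking idea; FileChunk, FileSender and the chunk size are assumptions, not the actual pb definitions:

    package cli

    import (
        "io"
        "os"
    )

    // FileChunk is a stand-in for the streamed message type.
    type FileChunk struct {
        Dst  string
        Data []byte
    }

    // FileSender is a stand-in for the client-side gRPC stream.
    type FileSender interface {
        Send(*FileChunk) error
    }

    const chunkSize = 1 << 20 // 1 MiB, well under the 20M message limit

    // sendLargeFile reads the file in chunks and sends each chunk on the stream;
    // core reassembles the chunks before copying the file into the container.
    func sendLargeFile(stream FileSender, path, dst string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        buf := make([]byte, chunkSize)
        for {
            n, err := f.Read(buf)
            if n > 0 {
                if sendErr := stream.Send(&FileChunk{Dst: dst, Data: buf[:n]}); sendErr != nil {
                    return sendErr
                }
            }
            if err == io.EOF {
                return nil
            }
            if err != nil {
                return err
            }
        }
    }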

  • How to remove a volume on a node in Eru

    How to remove a volume on a node in Eru

    Background

    Sometimes we add the wrong volume to a node, but with eru-cli node set we can only change its capacity or resource declaration; we cannot remove it completely. So things can end up like this:

      init_volume:
        /data: 945966546944
        /data1: 945966546944
        /data2: 945966546944
        /data3: 945966546944
        /data4: 945966546944
        /data5: 945966546944
        /data6: 945966546944
        /data7: 945966546944
        /data8: 0
      memory: 719407022080
      memory_used: 380104605696
      name: -------
      podname: --------
      storage: 2516850835456
      storage_used: 5050881540096
      volume:
        /data: 945966546944
        /data1: 224412041216
        /data2: 224412041216
        /data3: 224412041216
        /data4: 224412041216
        /data5: 224412041216
        /data6: 224412041216
        /data7: 224412041216
        /data8: 0
      volume_used: 5050881540096
    

    Expectation

    To be able to remove it completely. Is there a way to do so?
