Controller for ModelMesh

ModelMesh Serving

ModelMesh Serving is the Controller for managing ModelMesh, a general-purpose model serving management/routing layer.

Getting Started

To quickly get started with ModelMesh Serving, check out the Quick Start Guide.

For help, please open an issue in this repository.

Components and their Repositories

ModelMesh Serving currently comprises components spread over a number of repositories. The supported versions for the latest release are documented here.

(Architecture diagram)

Issues across all components are tracked centrally in this repo.

Core Components

Runtime Adapters

  • modelmesh-runtime-adapter - the containers which run in each model serving pod and act as an intermediary between ModelMesh and third-party model-server containers. Its build produces a single "multi-purpose" image which can be used as an adapter to work with each of the out-of-the-box supported model servers. It also incorporates the "puller" logic which is responsible for retrieving the models from storage before handing over to the respective adapter logic to load the model (and to delete after unloading). This image is also used for a container in the load/unload path of custom ServingRuntime Pods, as a "standalone" puller.

Model Serving runtimes

ModelMesh Serving provides out-of-the-box integration with the following model servers.

ServingRuntime custom resources can be used to add support for other existing or custom-built model servers; see the docs on implementing a custom Serving Runtime.
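As a rough sketch only (the resource name, image, and model format below are placeholders, and the linked doc is the authoritative reference for the full field set), a custom ServingRuntime generally declares which model formats it supports and the model server container to run:

    apiVersion: serving.kserve.io/v1alpha1
    kind: ServingRuntime
    metadata:
      name: example-custom-runtime          # hypothetical name
    spec:
      supportedModelFormats:
        - name: custom-format               # hypothetical model format name
          version: "1"
          autoSelect: true
      multiModel: true                      # ModelMesh-managed runtimes serve many models per pod
      grpcDataEndpoint: "port:8001"         # port the model server's inference gRPC API listens on
      containers:
        - name: custom-model-server         # hypothetical model server container
          image: example.com/custom-model-server:latest
          resources:
            requests:
              cpu: "1"
              memory: 1Gi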

Supplementary

  • KServe V2 REST Proxy - a reverse-proxy server which translates a RESTful HTTP API into gRPC. This allows sending inference requests using the KServe V2 REST Predict Protocol to ModelMesh models which currently only support the V2 gRPC Predict Protocol.

Libraries

These are helper Java libraries used by the ModelMesh component.

  • kv-utils - Useful KV store recipes abstracted over etcd and Zookeeper
  • litelinks-core - RPC/service discovery library based on Apache Thrift, used only for communications internal to ModelMesh.

Contributing

Please read our contributing guide for details on contributing.

Building Images

# Build develop image
make build.develop

# After building the develop image, build the runtime image
make build
Comments
  • "code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"

    Getting {"code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"} when sending an inference request. Code:

    import pandas as pd
    import requests
    from mlserver.codecs.pandas import PandasCodec  # requires mlserver>=1.1.0

    payloads = "./IDA_en.ndjson"
    df = pd.read_json(payloads, lines=True)
    df = df.fillna('')
    payload = PandasCodec.encode_request(df, use_bytes=False)
    response = requests.post("http://localhost:8008/v2/models/muc-en-aa-predictor/infer", json=payload.json())
    

    I have not seen this error before. What's wrong?

  • Higher payload size not working as described in doc

    I have deployed a model using custom MLServer runtime. The gRPC inferencing is working as expected with small size payload.

    I modified the configuration to make it work with a large payload size:

    1. In the runtime configuration:
    - name: MLSERVER_GRPC_MAX_MESSAGE_LENGTH
    value: "300000000"
    

    as well as: 2. In the global configmap:

    grpcMaxMessageSizeBytes: 300000000

    But it still gives an error showing that the default limit is exceeded:

    io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 16777216: 65844251
            at io.grpc.Status.asRuntimeException(Status.java:530)
            at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:392)
            at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:272)
            at io.grpc.internal.MessageDeframer.request(MessageDeframer.java:162)
            at io.grpc.internal.AbstractStream$TransportState$1RequestRunnable.run(AbstractStream.java:236)
            at io.grpc.netty.NettyServerStream$TransportState$1.run(NettyServerStream.java:198)
            at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
            at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
            at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
            at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
            at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
            at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
            at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
            at java.base/java.lang.Thread.run(Thread.java:833)
    

    Client side error:

    Traceback (most recent call last):
      File "/temp/docker/grpc_call.py", line 66, in <module>
        response = grpc_stub.ModelInfer(inference_request_g)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.CANCELLED
            details = "Received RST_STREAM with error code 8"
            debug_error_string = "UNKNOWN:Error received from peer ipv4:10.244.8.5:8033 {grpc_message:"Received RST_STREAM with error code 8", grpc_status:1, created_time:"2022-11-03T18:47:46.142301977+00:00"}"
    

    How can I overcome this situation?
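
    For reference, the two settings above live in different objects: MLSERVER_GRPC_MAX_MESSAGE_LENGTH is an env var on the ServingRuntime container, while grpcMaxMessageSizeBytes belongs in the global config ConfigMap. A minimal sketch of that ConfigMap, assuming the model-serving-config name and config.yaml key shown in a later comment on this page:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: model-serving-config        # name taken from a later comment on this page
    data:
      config.yaml: |
        grpcMaxMessageSizeBytes: 300000000

    Whether the serving deployments pick up the new limit without being restarted is an assumption to verify, not something confirmed in this thread.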

  • feat: TorchServe support

    Motivation

    The Triton runtime can be used with model-mesh to serve PyTorch TorchScript models, but it does not support arbitrary PyTorch models, i.e. eager mode. KServe "classic" has integration with TorchServe, but it would be good to have integration with model-mesh too so that these kinds of models can be used in distributed multi-model serving contexts.

    Modifications

    The bulk of the required changes are to the adapter image, covered by PR https://github.com/kserve/modelmesh-runtime-adapter/pull/34.

    This PR contains the minimal controller changes needed to enable the support:

    • TorchServe ServingRuntime spec
    • Add "torchserve" to the list of supported built-in runtime types
    • Add "ID extraction" entry for TorchServe's gRPC Predictions RPC so that model-mesh will automatically extract the model name from corresponding request messages

    Note the supported model format is advertised as "pytorch-mar" to distinguish from the existing "pytorch" format that refers to raw TorchScript .pt files as supported by Triton.

    Result

    TorchServe can be used seamlessly with ModelMesh Serving to serve PyTorch models, including eager mode.

    Resolves #63
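
    As a rough sketch of what this enables (the InferenceService name and storage URI below are hypothetical), a TorchServe .mar model can now be referenced via the new format name, mirroring the sklearn example further down this page:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-torchserve-isvc              # hypothetical name
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
    spec:
      predictor:
        model:
          modelFormat:
            name: pytorch-mar                    # format advertised by the TorchServe runtime
          storageUri: s3://example-bucket/torchserve/example-model.mar   # hypothetical location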

  • feat: storage phase 1 for inference service reconciler

    Motivation

    Rebase #32 onto the new InferenceService reconciler for model-mesh.

    For Storage Spec details, please refer to the design doc: https://docs.google.com/document/d/1rYNh93XNMRp8YaR56m-_zK3bqpZhhGKgpVbLPPsi-Ow/edit#

    Support for additional storage types/parameters will come in phase 2.

    Modifications

    Result

  • chore: Automatically set kube context in development container

    Motivation

    When using the containerized development environment (make develop) to run FVT tests, one needs to configure access to a Kubernetes or OpenShift cluster from inside the container, which has to be done for every make develop session. This can be tricky when cloud-provider-specific CLI tools are needed to connect and authenticate to a cluster.

    Currently there is a short paragraph in the FVT README about how to export a minified kubeconfig file and create it inside the container. It is tedious to repeat those steps for each make develop session and, depending on OS, shell environment, editors, and possible text encoding issues, it is also error-prone.

    Modifications

    This PR proposes to automatically create the kubeconfig file in a local and git-ignored directory inside the local project and automatically mount it to the develop container. All the user then has to do is connect and authenticate to the cluster in the shell that will be running make develop.

    Result

    Kubernetes context is ready inside the development container.

    # shell environment, outside the develop container has access to K8s cluster
    [modelmesh-serving_ckadner]$ kubectl get pods
    
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          17m
    pod/minio                                   1/1     Running   0          17m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          17m
    
    [modelmesh-serving_ckadner]$ make develop
    
    ./scripts/build_devimage.sh
    Pulling dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Building dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Image kserve/modelmesh-controller-develop:6be58b09c25833c1 has 14 layers
    Tagging dev image kserve/modelmesh-controller-develop:6be58b09c25833c1 as latest
    ./scripts/develop.sh
    [root@17c121286549 workspace]# kubectl get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          18m
    pod/minio                                   1/1     Running   0          18m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          18m
    [root@17c121286549 workspace]# 
    

    /cc @njhill

  • chore: Update GH Action workflows

    Both workflows for PRs and pushes were cleaned up. On a push (which includes when PRs merge), the code base is linted and tested first before building and publishing.

  • fix: Ensure delete.sh script works properly with implicit namespace

    Motivation

    The delete.sh cleanup script currently doesn't require a namespace to be provided via the -n option and uses the current kubectl context's namespace otherwise. However, the $namespace variable wasn't set in this case, meaning later parts of the script might not work as intended.

    Modifications

    Ensure $namespace variable is set correctly either way.

    Result

    delete.sh script works properly when -n option isn't used.

  • fvt: a bunch of FVT improvements

    Motivation

    A bunch of improvements to the FVT framework coming from our internal fork.

    After parallelizing and expanding the FVT suite to support testing the REST proxy internally, we had issues with the consistency of the FVTs. It took a few iterations of improvements to get them back to stable while we continued to add support for more tests. With the "fixes" and "features" coupled in a few different PRs the changes cannot be easily disentangled. This is a big mess of a PR, but the final result should be in a good place with all of our internal improvements.

    Modifications

    Test Parallelization:

    • split FVTs into separate suites (go packages)
      • ginkgo can parallelize within a suite, but runs the suites sequentially
    • refactors to enable sharing of code across the FVT suites
    • support parallelization by using ginkgo CLI to execute the tests instead of go test
    • use Ordered/Serial decorators on groups of tests that require it
      • this can help to speed up "inference" tests by creating the predictor once and using it across multiple specs, but it does mean some specs are not independent
      • TLS tests are marked as Serial because they require roll outs of the runtime pods
    • to help debugging, print inference services on failure in Predictor FVTs
    • avoid a nil pointer dereference that can occur if FVTs error during initialization while running in parallel
    • remove sleep in AfterEach of TLS tests
    • update port-forwards to select a pod directly from the Endpoints object corresponding to the service
      • when port-forwarding to a Service, there is no guard against selecting a Terminating pod

    Config and Secrets:

    • specify the full DefaultConfig in code instead of in the user-configmap.yaml file
    • allow TLS config maps to be overlayed on the base config (instead of template string in YAML)
    • generate TLS certificates for each run of the FVTs instead of using hard-coded certs

    REST Proxy Tests:

    • enable the REST proxy for FVTs and add inference tests using proxy
    • have the FVT Client manage port-forwards for each of REST and gRPC

    Result

    • a faster and more efficient FVT suite with parallelization
    • improved FVT stability and extensibility to support future changes
  • feat: Update InferenceService reconciliation logic

    The ModelMesh controller can now reconcile InferenceServices using the new Model Spec in the predictor. Also, an issue was fixed where a leading slash was left in the model path when an InferenceService StorageURI was parsed; this was causing issues in the adapter, preventing models from being loaded as seen in #97.

    Closes: #96, #97 Related: #90

    Result

    Users can now successfully deploy an InferenceService using the predictor Model Spec such as the following:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-sklearn-isvc
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
        serving.kserve.io/secretKey: localMinIO
    spec:
      predictor:
        model:
          modelFormat:
            name: sklearn 
          storageUri: s3://modelmesh-example-models/sklearn/mnist-svm.joblib
    
  • ModelMesh Release Tracker for KServe v0.7.0

    The plan is to cut the KServe 0.7 release mid next week. For this release, ModelMesh will be loosely integrated with KServe.

    Action Items:

    • [x] ModelMesh InferenceService CRD Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/34
      • [x] Documentation on using InferenceService CR with ModelMesh
        • https://github.com/kserve/modelmesh-serving/pull/47
    • [x] ModelMesh REST Proxy Sidecar Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/27
      • [x] Documentation on using REST inferencing
    • [x] Add KServe ModelMesh DeploymentMode Annotation checker
      • https://github.com/kserve/kserve/pull/1851
    • [ ] Update KServe hack/quick_install.sh to include ModelMesh-Serving as part of installation.
      • https://github.com/kserve/kserve/pull/1844
    • [x] Documentation updates
      • [x] Refresh External ModelMesh Documentation
        • https://github.com/kserve/modelmesh-serving/pull/39
      • [x] Update KServe Website with ModelMesh Documentation
        • https://github.com/kserve/website/issues/17
        • Can view website here: https://kserve.github.io/website/
        • https://github.com/kserve/website/pull/32
        • https://github.com/kserve/website/pull/37
    • [x] Assemble release process items
      • Tag release for version v0.7.0 to follow suit with KServe.
      • [x] GitHub workflow for tagged release
      • [x] Release process documentation
        • Create a document outlining the process of creating a release branch and tagging from a commit in that branch. KServe should already have a document like this.
        • https://github.com/kserve/modelmesh-serving/pull/40
        • https://github.com/kserve/modelmesh/pull/7
        • https://github.com/kserve/modelmesh-runtime-adapter/pull/7
  • Adjust FVT GH-Actions workflow

    Motivation

    Decrease the flakiness of FVT runs that occur when certain tests are run back to back.

    Modifications

    The rollingUpdate strategy is adjusted in a preprocessing step of the FVT GitHub Actions workflow to allow better stability in low-resource environments. The defaultTimeout was increased to account for the changes in strategy, since we ran into some intermittent failures due to timeouts when the deployment doesn't become ready in time.

    Result

    Less flakiness in FVT runs.

  • test: Add TorchServe FVT

    Motivation

    Support for TorchServe was added in #250 and https://github.com/kserve/modelmesh-runtime-adapter/pull/34. A test should be added for it as well.

    Modifications

    • Adds basic FVT for load/inference with a TorchServe MAR model using the native TorchServe gRPC API

    Result

    Closes #280

  • Remove residual "Watson" references

    Model-mesh was originally developed as part of IBM Watson. Now that it is part of KServe, we should scrub any remaining places where "watson" is used in the codebase, at least starting with those that are straightforward to change.

  • Update Dockerfile packages

    Motivation

    Fix vulnerabilities in the ModelMesh Controller image.

    Modifications

    1. Removed dependencies on any packages in the runtime image.
    2. Updated the Go version to the latest.

    Result

    The change is functionally working; I have tested it on my own cluster.

  • OutOfDirectMemoryError on setting higher grpc input size

    Describe the bug

    Followed this doc and set grpcMaxMessageSizeBytes to 400000000.
    Here's my config.yaml from the model-serving-config ConfigMap:

    podsPerRuntime: 1
    metrics:
      enabled: true
    grpcMaxMessageSizeBytes: 400000000
    

    When I make a grpc call, it throws

    Dec 01, 2022 8:15:15 PM io.grpc.netty.NettyServerTransport notifyTerminated
    INFO: Transport failed
    io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 75497758, max: 76546048)
    	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:806)
    	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:735)
    	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:649)
    	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:624)
    	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:203)
    	at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:187)
    	at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
    	at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
    	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:396)
    	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
    	at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:116)
    	at io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(ByteToMessageDecoder.java:541)
    	at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:97)
    	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:277)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
    	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:487)
    	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:385)
    
    

    In Triton, I can see that the model is working fine; the logs below also show the input shape and that the model did return a response.

    I am not sure what other changes I need to make to get this to work.
    I suppose this is where the limit is being set; I tried to track it back to some setting but couldn't.

    I1201 20:15:13.885755 1 grpc_server.cc:3466] Process for ModelInferHandler, rpc_ok=1, 15 step START
    I1201 20:15:13.885773 1 grpc_server.cc:3459] New request handler for ModelInferHandler, 17
    I1201 20:15:13.885778 1 model_repository_manager.cc:593] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885783 1 model_repository_manager.cc:593] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885792 1 infer_request.cc:675] prepared: [0x0x7f3bc0007b50] request id: , model: 6230834ea7f575001e824ce9__isvc-14826e7e9a, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
    original inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    override inputs:
    inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    original requested outputs:
    requested outputs:
    classes
    scores
    
    I1201 20:15:13.885874 1 python.cc:616] model 6230834ea7f575001e824ce9__isvc-14826e7e9a, instance 6230834ea7f575001e824ce9__isvc-14826e7e9a, executing 1 requests
    I1201 20:15:13.947333 1 infer_response.cc:166] add response output: output: classes, type: BYTES, shape: [1,1]
    I1201 20:15:13.947355 1 grpc_server.cc:2555] GRPC: using buffer for 'classes', size: 18, addr: 0x7f3aec004b90
    I1201 20:15:13.947360 1 infer_response.cc:166] add response output: output: scores, type: FP32, shape: [1,1]
    I1201 20:15:13.947363 1 grpc_server.cc:2555] GRPC: using buffer for 'scores', size: 4, addr: 0x7f3aec004d70
    I1201 20:15:13.947367 1 grpc_server.cc:3618] ModelInferHandler::InferResponseComplete, 15 step ISSUED
    I1201 20:15:13.947375 1 grpc_server.cc:2637] GRPC free: size 18, addr 0x7f3aec004b90
    I1201 20:15:13.947379 1 grpc_server.cc:2637] GRPC free: size 4, addr 0x7f3aec004d70
    I1201 20:15:13.947442 1 grpc_server.cc:3194] ModelInferHandler::InferRequestComplete
    I1201 20:15:13.947451 1 python.cc:1960] TRITONBACKEND_ModelInstanceExecute: model instance name 6230834ea7f575001e824ce9__isvc-14826e7e9a released 1 requests
    I1201 20:15:13.947455 1 grpc_server.cc:3466] Process for ModelInferHandler, rpc_ok=1, 15 step COMPLETE
    

    Additional context

    The model being used is a video sequence classification model and its input is a sequence of 32 cropped frames, hence the huge input size. I did try encoding the cropped sequence into H.264 and decoding it in model.py, but that adds a lot of overhead to inference speed; hence I am trying to infer using the large input tensor.

  • Payload logging/events

    Payload logging is needed for various reasons, including monitoring by external systems for things like drift / outlier detection.

    It should support CloudEvents and be compatible with the logger in KServe "classic", so that it can be used in a similar way, as illustrated in these samples:

    • https://github.com/kserve/kserve/tree/master/docs/samples/logger/basic
    • https://github.com/kserve/kserve/tree/master/docs/samples/drift-detection/alibi-detect/cifar10 / https://github.com/kserve/kserve/tree/master/docs/samples/outlier-detection/alibi-detect/cifar10

    Some considerations / possible complications:

    • In KServe the logger can be configured per InferenceService. We need to decide whether we support this with model-mesh, or a simpler global configuration, or both. Another possibility could be allowing a logging destination to be configured globally and enabled/disabled per model.
    • Model-mesh doesn't really touch the payloads currently; it only routes gRPC/protobuf. So we could emit the raw protobuf messages, but this would differ from the existing KServe case and would not necessarily be compatible with the same integrations. We could transcode to JSON on the fly, but this would introduce processing overhead that may be undesirable and affect data path performance.
    • The KServe examples are based on the V1 API; we should check whether the existing logger works with the V2 API, since the runtimes supported by model-mesh are primarily V2 based.

    cc @rafvasq

  • Create isolation between serving runtimes

    Is your feature request related to a problem? If so, please describe.

    In my team’s use case, we are currently using the KServe V2 Inference Protocol REST API for sending inference requests. On top of this protocol, we also make use of multiple virtual services to direct traffic to the modelmesh-serving service such that each serving runtime should be mapped to one virtual service.

    Our use case does not allow us to match on the /v2/models/<model-id>/infer path in our virtual services, and this creates a problem for us because requests sent to the virtual service for serving runtime X can end up reaching models loaded in serving runtime Y, due to the fact that:

    • All serving runtimes share the same modelmesh-serving service
    • Users can set the model id to any existing inference service name in the request path

    Describe your proposed solution

    Since this is unwanted behaviour for my team, we see two possible solutions:

    1. Support mm-vmodel-id header in the REST API and allow it to take precedence over the model id specified in the V2 inference path
    2. Create a dedicated service per serving runtime instead of having all serving runtimes share the same service

    Describe alternatives you have considered

    Additional context
