Controller for ModelMesh

ModelMesh Serving

ModelMesh Serving is the Controller for managing ModelMesh, a general-purpose model serving management/routing layer.

Getting Started

To quickly get started with ModelMesh Serving, check out the Quick Start Guide.

For help, please open an issue in this repository.

Components and their Repositories

ModelMesh Serving currently comprises components spread over a number of repositories. The supported versions for the latest release are documented here.

(Architecture diagram)

Issues across all components are tracked centrally in this repo.

Core Components

Runtime Adapters

  • modelmesh-runtime-adapter - the containers which run in each model serving pod and act as an intermediary between ModelMesh and third-party model-server containers. Its build produces a single "multi-purpose" image which can be used as an adapter to work with each of the out-of-the-box supported model servers. It also incorporates the "puller" logic which is responsible for retrieving the models from storage before handing over to the respective adapter logic to load the model (and to delete after unloading). This image is also used for a container in the load/unload path of custom ServingRuntime Pods, as a "standalone" puller.

Model Serving runtimes

ModelMesh Serving provides out-of-the-box integration with the following model servers.

ServingRuntime custom resources can be used to add support for other existing or custom-built model servers; see the docs on implementing a custom Serving Runtime.
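As a rough sketch only (the resource name, image, and model format below are placeholders, and the linked doc is the authoritative reference for the full field set), a custom ServingRuntime generally declares which model formats it supports and the model server container to run:

    apiVersion: serving.kserve.io/v1alpha1
    kind: ServingRuntime
    metadata:
      name: example-custom-runtime          # hypothetical name
    spec:
      supportedModelFormats:
        - name: custom-format               # hypothetical model format name
          version: "1"
          autoSelect: true
      multiModel: true                      # ModelMesh-managed runtimes serve many models per pod
      grpcDataEndpoint: "port:8001"         # port the model server's inference gRPC API listens on
      containers:
        - name: custom-model-server         # hypothetical model server container
          image: example.com/custom-model-server:latest
          resources:
            requests:
              cpu: "1"
              memory: 1Gi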

Supplementary

  • KServe V2 REST Proxy - a reverse-proxy server which translates a RESTful HTTP API into gRPC. This allows sending inference requests using the KServe V2 REST Predict Protocol to ModelMesh models which currently only support the V2 gRPC Predict Protocol.

Libraries

These are helper Java libraries used by the ModelMesh component.

  • kv-utils - Useful KV store recipes abstracted over etcd and Zookeeper
  • litelinks-core - RPC/service discovery library based on Apache Thrift, used only for communications internal to ModelMesh.

Contributing

Please read our contributing guide for details on contributing.

Building Images

# Build develop image
make build.develop

# After building the develop image, build the runtime image
make build
Comments
  • "code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"

    Getting {"code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"} when sending an inference request. Code:

    import pandas as pd
    import requests
    from mlserver.codecs.pandas import PandasCodec  # requires mlserver>=1.1.0

    payloads = "./IDA_en.ndjson"
    df = pd.read_json(payloads, lines=True)
    df = df.fillna('')
    payload = PandasCodec.encode_request(df, use_bytes=False)
    response = requests.post("http://localhost:8008/v2/models/muc-en-aa-predictor/infer", json=payload.json())
    

    I have not seen this error before. What's wrong?

  • Higher payload size not working as described in doc

    I have deployed a model using custom MLServer runtime. The gRPC inferencing is working as expected with small size payload.

    I modified the configuration to make it work with a large payload size:

    1. In the runtime configuration:
    - name: MLSERVER_GRPC_MAX_MESSAGE_LENGTH
    value: "300000000"
    

    as well as: 2. In the global configmap:

    grpcMaxMessageSizeBytes: 300000000

    But it still gives an error showing that the default limit is exceeded:

    io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 16777216: 65844251
            at io.grpc.Status.asRuntimeException(Status.java:530)
            at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:392)
            at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:272)
            at io.grpc.internal.MessageDeframer.request(MessageDeframer.java:162)
            at io.grpc.internal.AbstractStream$TransportState$1RequestRunnable.run(AbstractStream.java:236)
            at io.grpc.netty.NettyServerStream$TransportState$1.run(NettyServerStream.java:198)
            at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
            at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
            at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
            at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
            at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
            at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
            at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
            at java.base/java.lang.Thread.run(Thread.java:833)
    

    Client side error:

    Traceback (most recent call last):
      File "/temp/docker/grpc_call.py", line 66, in <module>
        response = grpc_stub.ModelInfer(inference_request_g)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.CANCELLED
            details = "Received RST_STREAM with error code 8"
            debug_error_string = "UNKNOWN:Error received from peer ipv4:10.244.8.5:8033 {grpc_message:"Received RST_STREAM with error code 8", grpc_status:1, created_time:"2022-11-03T18:47:46.142301977+00:00"}"
    

    How can I overcome this situation?
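
    For reference, the two settings above live in different objects: MLSERVER_GRPC_MAX_MESSAGE_LENGTH is an env var on the ServingRuntime container, while grpcMaxMessageSizeBytes belongs in the global config ConfigMap. A minimal sketch of that ConfigMap, assuming the model-serving-config name and config.yaml key shown in a later comment on this page:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: model-serving-config        # name taken from a later comment on this page
    data:
      config.yaml: |
        grpcMaxMessageSizeBytes: 300000000

    Whether the serving deployments pick up the new limit without being restarted is an assumption to verify, not something confirmed in this thread.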

  • feat: TorchServe support

    Motivation

    The Triton runtime can be used with model-mesh to serve PyTorch TorchScript models, but it does not support arbitrary PyTorch models, i.e. eager mode. KServe "classic" has integration with TorchServe, but it would be good to have integration with model-mesh too so that these kinds of models can be used in distributed multi-model serving contexts.

    Modifications

    The bulk of the required changes are to the adapter image, covered by PR https://github.com/kserve/modelmesh-runtime-adapter/pull/34.

    This PR contains the minimal controller changes needed to enable the support:

    • TorchServe ServingRuntime spec
    • Add "torchserve" to the list of supported built-in runtime types
    • Add "ID extraction" entry for TorchServe's gRPC Predictions RPC so that model-mesh will automatically extract the model name from corresponding request messages

    Note the supported model format is advertised as "pytorch-mar" to distinguish from the existing "pytorch" format that refers to raw TorchScript .pt files as supported by Triton.

    Result

    TorchServe can be used seamlessly with ModelMesh Serving to serve PyTorch models, including eager mode.

    Resolves #63
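
    As a rough sketch of what this enables (the InferenceService name and storage URI below are hypothetical), a TorchServe .mar model can now be referenced via the new format name, mirroring the sklearn example further down this page:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-torchserve-isvc              # hypothetical name
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
    spec:
      predictor:
        model:
          modelFormat:
            name: pytorch-mar                    # format advertised by the TorchServe runtime
          storageUri: s3://example-bucket/torchserve/example-model.mar   # hypothetical location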

  • feat: storage phase 1 for inference service reconciler

    Motivation

    Rebase #32 onto the new InferenceService reconciler for model-mesh.

    For Storage Spec details, please refer to the design doc: https://docs.google.com/document/d/1rYNh93XNMRp8YaR56m-_zK3bqpZhhGKgpVbLPPsi-Ow/edit#

    Support for additional storage types/parameters will come in phase 2.

    Modifications

    Result

  • chore: Automatically set kube context in development container

    Motivation

    When using the containerized development environment (make develop) to run FVT tests, one needs to configure access to a Kubernetes or OpenShift cluster from inside the container, which has to be done for every make develop session. This can be tricky when cloud-provider-specific CLI tools are needed to connect and authenticate to a cluster.

    Currently there is a short paragraph in the FVT README about how to export a minified kubeconfig file and create it inside the container. It is tedious to repeat those steps for each make develop session and, depending on OS, shell environment, editors, and possible text encoding issues, it is also error-prone.

    Modifications

    This PR proposes to automatically create the kubeconfig file in a local and git-ignored directory inside the local project and automatically mount it to the develop container. All the user then has to do is connect and authenticate to the cluster in the shell that will be running make develop.

    Result

    Kubernetes context is ready inside the development container.

    # shell environment, outside the develop container has access to K8s cluster
    [modelmesh-serving_ckadner]$ kubectl get pods
    
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          17m
    pod/minio                                   1/1     Running   0          17m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          17m
    
    [modelmesh-serving_ckadner]$ make develop
    
    ./scripts/build_devimage.sh
    Pulling dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Building dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Image kserve/modelmesh-controller-develop:6be58b09c25833c1 has 14 layers
    Tagging dev image kserve/modelmesh-controller-develop:6be58b09c25833c1 as latest
    ./scripts/develop.sh
    [root@17c121286549 workspace]# kubectl get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          18m
    pod/minio                                   1/1     Running   0          18m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          18m
    [root@17c121286549 workspace]# 
    

    /cc @njhill

  • chore: Update GH Action workflows

    Both workflows for PRs and pushes were cleaned up. On a push (which includes when PRs merge), the code base is linted and tested first before building and publishing.

  • fix: Ensure delete.sh script works properly with implicit namespace

    Motivation

    The delete.sh cleanup script currently doesn't require a namespace to be provided via the -n option and uses the current kubectl context's namespace otherwise. However, the $namespace variable wasn't set in this case, meaning later parts of the script might not work as intended.

    Modifications

    Ensure $namespace variable is set correctly either way.

    Result

    delete.sh script works properly when -n option isn't used.

  • fvt: a bunch of FVT improvements

    Motivation

    A bunch of improvements to the FVT framework coming from our internal fork.

    After parallelizing and expanding the FVT suite to support testing the REST proxy internally, we had issues with the consistency of the FVTs. It took a few iterations of improvements to get them back to stable while we continued to add support for more tests. With the "fixes" and "features" coupled in a few different PRs the changes cannot be easily disentangled. This is a big mess of a PR, but the final result should be in a good place with all of our internal improvements.

    Modifications

    Test Parallelization:

    • split FVTs into separate suites (go packages)
      • ginkgo can parallelize within a suite, but runs the suites sequentially
    • refactors to enable sharing of code across the FVT suites
    • support parallelization by using ginkgo CLI to execute the tests instead of go test
    • use Ordered/Serial decorators on groups of tests that require it
      • this can help to speed up "inference" tests by creating the predictor once and using it across multiple specs, but it does mean some specs are not independent
      • TLS tests are marked as Serial because they require roll outs of the runtime pods
    • to help debugging, print inference services on failure in Predictor FVTs
    • avoid a nil pointer dereference that can occur if FVTs error during initialization while running in parallel
    • remove sleep in AfterEach of TLS tests
    • update port-forwards to select a pod directly from the Endpoints object corresponding to the service
      • when port-forwarding to a Service, there is no guard against selecting a Terminating pod

    Config and Secrets:

    • specify the full DefaultConfig in code instead of in the user-configmap.yaml file
    • allow TLS config maps to be overlayed on the base config (instead of template string in YAML)
    • generate TLS certificates for each run of the FVTs instead of using hard-coded certs

    REST Proxy Tests:

    • enable the REST proxy for FVTs and add inference tests using proxy
    • have the FVT Client manage port-forwards for each of REST and gRPC

    Result

    • a faster and more efficient FVT suite with parallelization
    • improved FVT stability and extensibility to support future changes
  • feat: Update InferenceService reconciliation logic

    The ModelMesh controller can now reconcile InferenceServices using the new Model Spec in the predictor. Also, an issue was fixed where a leading slash was left in the model path when an InferenceService StorageURI was parsed; this was causing issues in the adapter, preventing models from being loaded as seen in #97.

    Closes: #96, #97 Related: #90

    Result

    Users can now successfully deploy an InferenceService using the predictor Model Spec such as the following:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-sklearn-isvc
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
        serving.kserve.io/secretKey: localMinIO
    spec:
      predictor:
        model:
          modelFormat:
            name: sklearn 
          storageUri: s3://modelmesh-example-models/sklearn/mnist-svm.joblib
    
  • ModelMesh Release Tracker for KServe v0.7.0

    The plan is to cut the KServe 0.7 release mid next week. For this release, ModelMesh will be loosely integrated with KServe.

    Action Items:

    • [x] ModelMesh InferenceService CRD Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/34
      • [x] Documentation on using InferenceService CR with ModelMesh
        • https://github.com/kserve/modelmesh-serving/pull/47
    • [x] ModelMesh REST Proxy Sidecar Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/27
      • [x] Documentation on using REST inferencing
    • [x] Add KServe ModelMesh DeploymentMode Annotation checker
      • https://github.com/kserve/kserve/pull/1851
    • [ ] Update KServe hack/quick_install.sh to include ModelMesh-Serving as part of installation.
      • https://github.com/kserve/kserve/pull/1844
    • [x] Documentation updates
      • [x] Refresh External ModelMesh Documentation
        • https://github.com/kserve/modelmesh-serving/pull/39
      • [x] Update KServe Website with ModelMesh Documentation
        • https://github.com/kserve/website/issues/17
        • Can view website here: https://kserve.github.io/website/
        • https://github.com/kserve/website/pull/32
        • https://github.com/kserve/website/pull/37
    • [x] Assemble release process items
      • Tag release for version v0.7.0 to follow suit with KServe.
      • [x] GitHub workflow for tagged release
      • [x] Release process documentation
        • Create a document outlining the process of creating a release branch and tagging from a commit in that branch. KServe should already have a document like this.
        • https://github.com/kserve/modelmesh-serving/pull/40
        • https://github.com/kserve/modelmesh/pull/7
        • https://github.com/kserve/modelmesh-runtime-adapter/pull/7
  • Adjust FVT GH-Actions workflow

    Motivation

    Decrease the flakiness of FVT runs that occur when certain tests are run back to back.

    Modifications

    The rollingUpdate strategy is adjusted in a preprocessing step of the FVT GitHub Actions workflow to allow better stability in low-resource environments. The defaultTimeout was increased to account for the changes in strategy, since we ran into some intermittent failures due to timeouts when the deployment doesn't become ready in time.

    Result

    Less flakiness in FVT runs.

  • test: Add TorchServe FVT

    Motivation

    Support for TorchServe was added in #250 and https://github.com/kserve/modelmesh-runtime-adapter/pull/34. A test should be added for it as well.

    Modifications

    • Adds basic FVT for load/inference with a TorchServe MAR model using the native TorchServe gRPC API

    Result

    Closes #280

  • Remove residual "Watson" references

    Model-mesh was originally developed as part of IBM Watson. Now that it is part of KServe, we should scrub any remaining places where "watson" is used in the codebase, at least starting with those that are straightforward to change.

  • Update Dockerfile packages

    Motivation

    Fix vulnerabilities in the ModelMesh Controller image.

    Modifications

    1. Removed dependencies on any packages in the runtime image.
    2. Updated the Go version to the latest.

    Result

    The change is functionally working; I have tested it on my own cluster.

  • OutOfDirectMemoryError on setting higher grpc input size

    Describe the bug

    Followed this doc and set grpcMaxMessageSizeBytes to 400000000.
    Here's my config.yaml from the model-serving-config ConfigMap:

    podsPerRuntime: 1
    metrics:
      enabled: true
    grpcMaxMessageSizeBytes: 400000000
    

    When I make a grpc call, it throws

    Dec 01, 2022 8:15:15 PM io.grpc.netty.NettyServerTransport notifyTerminated
    INFO: Transport failed
    io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 75497758, max: 76546048)
    	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:806)
    	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:735)
    	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:649)
    	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:624)
    	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:203)
    	at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:187)
    	at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
    	at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
    	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:396)
    	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
    	at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:116)
    	at io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(ByteToMessageDecoder.java:541)
    	at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:97)
    	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:277)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
    	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:487)
    	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:385)
    
    

    In Triton, I can see that the model is working fine; the logs below also show the input shape and that the model did return a response.

    I am not sure what other changes I need to make to get this to work.
    I suppose this is where the limit is being set; I tried to track it back to some setting but couldn't.

    I1201 20:15:13.885755 1 grpc_server.cc:3466] Process for ModelInferHandler, rpc_ok=1, 15 step START
    I1201 20:15:13.885773 1 grpc_server.cc:3459] New request handler for ModelInferHandler, 17
    I1201 20:15:13.885778 1 model_repository_manager.cc:593] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885783 1 model_repository_manager.cc:593] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885792 1 infer_request.cc:675] prepared: [0x0x7f3bc0007b50] request id: , model: 6230834ea7f575001e824ce9__isvc-14826e7e9a, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
    original inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    override inputs:
    inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    original requested outputs:
    requested outputs:
    classes
    scores
    
    I1201 20:15:13.885874 1 python.cc:616] model 6230834ea7f575001e824ce9__isvc-14826e7e9a, instance 6230834ea7f575001e824ce9__isvc-14826e7e9a, executing 1 requests
    I1201 20:15:13.947333 1 infer_response.cc:166] add response output: output: classes, type: BYTES, shape: [1,1]
    I1201 20:15:13.947355 1 grpc_server.cc:2555] GRPC: using buffer for 'classes', size: 18, addr: 0x7f3aec004b90
    I1201 20:15:13.947360 1 infer_response.cc:166] add response output: output: scores, type: FP32, shape: [1,1]
    I1201 20:15:13.947363 1 grpc_server.cc:2555] GRPC: using buffer for 'scores', size: 4, addr: 0x7f3aec004d70
    I1201 20:15:13.947367 1 grpc_server.cc:3618] ModelInferHandler::InferResponseComplete, 15 step ISSUED
    I1201 20:15:13.947375 1 grpc_server.cc:2637] GRPC free: size 18, addr 0x7f3aec004b90
    I1201 20:15:13.947379 1 grpc_server.cc:2637] GRPC free: size 4, addr 0x7f3aec004d70
    I1201 20:15:13.947442 1 grpc_server.cc:3194] ModelInferHandler::InferRequestComplete
    I1201 20:15:13.947451 1 python.cc:1960] TRITONBACKEND_ModelInstanceExecute: model instance name 6230834ea7f575001e824ce9__isvc-14826e7e9a released 1 requests
    I1201 20:15:13.947455 1 grpc_server.cc:3466] Process for ModelInferHandler, rpc_ok=1, 15 step COMPLETE
    

    Additional context

    The model being used is a video sequence classification model and its input is a sequence of 32 cropped frames, hence the huge input size. I did try encoding the cropped sequence into H.264 and decoding it in model.py, but that adds a lot of overhead to inference speed; hence I am trying to infer using the large input tensor.

  • Payload logging/events

    Payload logging is needed for various reasons, including monitoring by external systems for things like drift / outlier detection.

    It should support CloudEvents and be compatible with the logger in KServe "classic", so that it can be used in a similar way, as illustrated in these samples:

    • https://github.com/kserve/kserve/tree/master/docs/samples/logger/basic
    • https://github.com/kserve/kserve/tree/master/docs/samples/drift-detection/alibi-detect/cifar10 / https://github.com/kserve/kserve/tree/master/docs/samples/outlier-detection/alibi-detect/cifar10

    Some considerations / possible complications:

    • In KServe the logger can be configured per InferenceService. We need to decide whether we support this with model-mesh, or a simpler global configuration, or both. Another possibility could be allowing a logging destination to be configured globally and enabled/disabled per model.
    • Model-mesh doesn't really touch the payloads currently; it only routes gRPC/protobuf. So we could emit the raw protobuf messages, but this would differ from the existing KServe case and would not necessarily be compatible with the same integrations. We could transcode to JSON on the fly, but this would introduce processing overhead that may be undesirable and affect data path performance.
    • The KServe examples are based on the V1 API; we should check whether the existing logger works with the V2 API, since the runtimes supported by model-mesh are primarily V2 based.

    cc @rafvasq

  • Create isolation between serving runtimes

    Is your feature request related to a problem? If so, please describe.

    In my team’s use case, we are currently using the KServe V2 Inference Protocol REST API for sending inference requests. On top of this protocol, we also make use of multiple virtual services to direct traffic to the modelmesh-serving service such that each serving runtime should be mapped to one virtual service.

    Our use case does not allow us to match on the /v2/models/<model-id>/infer path in our virtual services, and this creates a problem for us because requests sent to the virtual service for serving runtime X can end up reaching models loaded in serving runtime Y, due to the fact that:

    • All serving runtimes share the same modelmesh-serving service
    • Users can set the model id to any existing inference service name in the request path

    Describe your proposed solution

    Since this is unwanted behaviour for my team, we see two possible solutions:

    1. Support mm-vmodel-id header in the REST API and allow it to take precedence over the model id specified in the V2 inference path
    2. Create a dedicated service per serving runtime instead of having all serving runtimes share the same service

    Describe alternatives you have considered

    Additional context
