Deploy, manage, and scale machine learning models in production


Website • Slack • Docs


Cortex is a cloud native model serving platform for machine learning engineering teams.


Use cases

  • Realtime machine learning - build NLP, computer vision, and other APIs and integrate them into any application.
  • Large-scale inference - scale realtime or batch inference workloads across hundreds or thousands of instances.
  • Consistent MLOps workflows - create streamlined and reproducible MLOps workflows for any machine learning team.

Deploy

  • Deploy TensorFlow, PyTorch, ONNX, and other models using a simple CLI or Python client (see the Python sketch below).
  • Run realtime inference, batch inference, asynchronous inference, and training jobs.
  • Define preprocessing and postprocessing steps in Python and chain workloads seamlessly.
$ cortex deploy apis.yaml

• creating text-generator (realtime API)
• creating image-classifier (batch API)
• creating video-analyzer (async API)

all APIs are ready!
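
The same deployment can also be driven from Python. A minimal sketch, assuming the cortex package is installed via pip and that the client exposes client() / create_api() / get_api() as in recent releases (names and signatures vary between Cortex versions):

# hedged sketch of the Python client workflow; the environment name "aws"
# and the exact create_api() signature are assumptions
import cortex

cx = cortex.client("aws")

api_spec = {
    "name": "text-generator",
    "kind": "RealtimeAPI",
    "predictor": {"type": "python", "path": "predictor.py"},
}

cx.create_api(api_spec, project_dir=".")
print(cx.get_api("text-generator"))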

Manage

  • Create A/B tests and shadow pipelines with configurable traffic splitting (see the example configuration below).
  • Automatically stream logs from every workload to your favorite log management tool.
  • Monitor your workloads with pre-built Grafana dashboards and add your own custom dashboards.
$ cortex get

API                 TYPE        GPUs
text-generator      realtime    32
image-classifier    batch       64
video-analyzer      async       16
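
For example, an A/B test is expressed declaratively as a traffic splitter. A minimal sketch, assuming the TrafficSplitter kind and its weight fields match your Cortex version, with made-up API names:

# traffic_splitter.yaml (illustrative)
- name: text-generator
  kind: TrafficSplitter
  apis:
    - name: text-generator-a
      weight: 80
    - name: text-generator-b
      weight: 20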

Scale

  • Configure workload and cluster autoscaling to efficiently handle large-scale production workloads.
  • Create clusters with different types of instances for different types of workloads.
  • Spend less on cloud infrastructure by letting Cortex manage spot or preemptible instances.
$ cortex cluster info

provider: aws
region: us-east-1
instance_types: [c5.xlarge, g4dn.xlarge]
spot_instances: true
min_instances: 10
max_instances: 100

Comments
  • Display realtime output

    Display realtime output

    I have a text-generator language model that is compressed into .bin format and can be accessed from the command line. It generates about one word per second and prints every word in realtime when run in a terminal. I would like to deploy my model using Cortex, but I'm struggling to get the output in realtime, word by word. Right now my code prints only one line at a time.

    import subprocess

    def run_command(text):
        command = ['./mycommand', text]
        process = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True, bufsize=-1)
        while True:
            # readline() blocks until a full line (~20 words) is available, so output appears line by line
            output = process.stdout.readline()
            if output == '' and process.poll() is not None:
                break
            if output:
                print(output.strip())

    run_command('TEXT')
    

    One line may include about 20 words, which means one line is displayed every ~20 seconds (since my model outputs roughly 1 word/second). I would really like the output to be more dynamic and print one word at a time (as it does in the terminal) instead of one line at a time. Is there a way this can be achieved?
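
    A possible direction (independent of anything Cortex-specific) is to read the child process's stdout in small chunks rather than full lines, so each word can be emitted as soon as it appears. A minimal sketch, assuming ./mycommand flushes its output per word (if it block-buffers when not attached to a terminal, something like stdbuf or a pseudo-terminal may also be needed); exposing this through an API would additionally require a streaming/chunked response:

    import subprocess

    def run_command_streaming(text):
        command = ['./mycommand', text]
        # bufsize=0 disables Python-side buffering so bytes arrive as soon as they are written
        process = subprocess.Popen(command, stdout=subprocess.PIPE, bufsize=0)
        word = b''
        while True:
            ch = process.stdout.read(1)  # read one byte at a time
            if not ch:  # EOF: the process closed its stdout
                break
            if ch.isspace():
                if word:
                    print(word.decode(), flush=True)  # emit each completed word immediately
                    word = b''
            else:
                word += ch
        if word:
            print(word.decode(), flush=True)
        process.wait()

    run_command_streaming('TEXT')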

  • Persistent private instances

    Persistent private instances

    I would like to use Cortex to create an application where each user can request and communicate with an AWS instance for a period of time. In this scenario, each user's data would be processed and stored on one whole AWS instance. From the documentation, I understand that each API call will use whichever instance is not busy at the moment. It wouldn't be ideal if, by making an API call, a user received sensitive data stored by another user on the same instance. Would it be possible to somehow mark the instance to which an API call is being made? That way the data of individual users wouldn't be accessible to everyone, but only to the users who request/use that instance.

  • Resource exhausted error

    Resource exhausted error

    I'm trying to send audio files, which are fairly large, to the server and am getting a resource exhausted error. Is there any way to configure the server in order to increase the maximum allowed message size?

    Here's the stack trace:

    2020-12-24 23:30:14.941839:cortex:pid-2247:INFO:500 Internal Server Error POST /
    2020-12-24 23:30:14.942071:cortex:pid-2247:ERROR:Exception in ASGI application
    Traceback (most recent call last):
      File "/opt/conda/envs/env/lib/python3.6/site-packages/uvicorn/protocols/http/httptools_impl.py", line
    390, in run_asgi
        result = await app(self.scope, self.receive, self.send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
        return await self.app(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/applications.py", line 181, in __call__
        await super().__call__(scope, receive, send)  # pragma: no cover
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/applications.py", line 111, in __call__
        await self.middleware_stack(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/errors.py", line 181, in __call__
        raise exc from None
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/errors.py", line 159, in __call__
        await self.app(scope, receive, _send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 25, in __call__
        response = await self.dispatch_func(request, self.call_next)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/serve/serve.py", line 187, in parse_payload
        return await call_next(request)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 45, in call_next
        task.result()
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 38, in coro
        await self.app(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 25, in __call__
        response = await self.dispatch_func(request, self.call_next)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/serve/serve.py", line 134, in register_request
        response = await call_next(request)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 45, in call_next
        task.result()
     File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 38, in coro
        await self.app(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/exceptions.py", line 82, in __call__
        raise exc from None
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/exceptions.py", line 71, in __call__
        await self.app(scope, receive, sender)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 566, in __call__
        await route.handle(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 227, in handle
        await self.app(scope, receive, send)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 41, in app
        response = await func(request)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/routing.py", line 183, in app
        dependant=dependant, values=values, is_coroutine=is_coroutine
      File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/routing.py", line 135, in run_endpoint_function
        return await run_in_threadpool(dependant.call, **values)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/concurrency.py", line 34, in run_in_threadpool
        return await loop.run_in_executor(None, func, *args)
      File "/opt/conda/envs/env/lib/python3.6/concurrent/futures/thread.py", line 56, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/serve/serve.py", line 200, in predict
        prediction = predictor_impl.predict(**kwargs)
      File "/mnt/project/serving/cortex_server.py", line 10, in predict
        return self.client.predict({"waveform": np.array(payload["audio"]).astype("float32")})
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/lib/client/tensorflow.py", line
    114, in predict
        return self._run_inference(model_input, consts.SINGLE_MODEL_NAME, model_version)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/lib/client/tensorflow.py", line
    164, in _run_inference
        return self._client.predict(model_input, model_name, model_version)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/cortex_internal/lib/model/tfs.py", line 376, in
    predict
        response_proto = self._pred.Predict(prediction_request, timeout=timeout)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/grpc/_channel.py", line 826, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/opt/conda/envs/env/lib/python3.6/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.RESOURCE_EXHAUSTED
            details = "Received message larger than max (102484524 vs. 4194304)"
            debug_error_string = "{"created":"@1608852614.937822193","description":"Received message larger than max (102484524 vs. 4194304)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":203,"grpc_status":8}"
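
    The 4194304 in the error is gRPC's default 4 MiB maximum message size. Whether Cortex exposes this as configuration depends on the version, but at the gRPC level the relevant knobs are standard channel options. A hedged sketch only, with a hypothetical TensorFlow Serving address (this is not a documented Cortex setting):

    import grpc

    MAX_MESSAGE_BYTES = 128 * 1024 * 1024  # comfortably above the ~100 MB payload in the error

    channel = grpc.insecure_channel(
        "localhost:9000",  # hypothetical TF Serving gRPC address
        options=[
            ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
            ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
        ],
    )

    An alternative that sidesteps the limit entirely is to send a reference to the audio (e.g. an S3 key) in the request and download the file inside the predictor.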
    
  • Per Process GPU Ram

    Per Process GPU Ram

    As mentioned in the gpus.md docs about limiting GPU RAM, I know exactly which code snippet I have to use, but I don't know where exactly in the Cortex source code I should put it.

    mem_limit_mb = 1024
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.set_logical_device_configuration(
            gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=mem_limit_mb)]
        )

  • Support aws_session_token for CLI auth

    Support aws_session_token for CLI auth

    Description

    In order to authenticate with the Cortex operator, the Cortex CLI should be able to use aws_session_token (currently only static credentials are supported).

    Also, consider enabling auth via IAM role (e.g. inherited from Lambda, EC2)
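
    For reference, temporary AWS credentials carry a session token alongside the key pair in ~/.aws/credentials (all values below are placeholders):

    [default]
    aws_access_key_id = ASIAXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    aws_session_token = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX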

  • Package Cortex library into .ZIP

    Package Cortex library into .ZIP

    I'm trying to create a microservice to manage my cluster via Cortex and Lambda. AWS Lambda requires Python dependencies to be packaged and uploaded as a .zip file. How can I package the Cortex library into a .zip?
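
    A common Lambda packaging pattern is to vendor the dependency with pip and zip it together with the handler. A minimal sketch, assuming the client is published on PyPI as cortex and that lambda_handler.py is your (hypothetical) handler module; the package also has to be compatible with the Lambda Python runtime:

    $ pip install cortex --target ./package
    $ cp lambda_handler.py ./package/
    $ cd package && zip -r ../deployment.zip . && cd ..
    # upload deployment.zip as the Lambda function's code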

  • How to make Cortex XmlHttpRequest on HTTPS page?

    How to make Cortex XmlHttpRequest on HTTPS page?

    I have a website which runs on https:// and I can't make Cortex API XMLHttpRequest requests from it.

    When running on localhost using http://, everything works fine:

    async function postData(url = '', data = {}) {
      // Default options are marked with *
      const response = await fetch(url, {
        method: 'POST', // *GET, POST, PUT, DELETE, etc.
        mode: 'cors', // no-cors, *cors, same-origin
        cache: 'no-cache', // *default, no-cache, reload, force-cache, only-if-cached
        credentials: 'same-origin', // include, *same-origin, omit
        headers: {
          'Content-Type': 'application/json'
          // 'Content-Type': 'application/x-www-form-urlencoded',
        },
        redirect: 'follow', // manual, *follow, error
        referrerPolicy: 'no-referrer', // no-referrer, *no-referrer-when-downgrade, origin, origin-when-cross-origin, same-origin, strict-origin, strict-origin-when-cross-origin, unsafe-url
        body: JSON.stringify(data) // body data type must match "Content-Type" header
      });
      return response.json(); // parses JSON response into native JavaScript objects
    }
    

    But making the same request from an https:// page gives the following:

    Mixed Content: The page at 'https://www.@' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://a6cc8d4dee22a448e81bb29862332bf0-93580d7c9d7d2256.elb.us-east-2.amazonaws.com/newtest-user'. This request has been blocked; the content must be served over HTTPS.

    How can I access Cortex API over HTTPS?

  • upstream connect error or disconnect/reset before headers. reset reason: connection failure

    upstream connect error or disconnect/reset before headers. reset reason: connection failure

    Version

    cli version: 0.18.1

    Description

    Intermittent 503 errors on AWS cluster.

    Configuration

    cortex.yaml

    # cortex.yaml
    
    - name: offer-features
      predictor:
        type: python
        path: predictor.py
        config:
          bucket: XXXXXXXXXXXXXXXXXXXX
      compute:
        cpu: 1  # CPU request per replica, e.g. 200m or 1 (200m is equivalent to 0.2) (default: 200m)
        gpu: 0  # GPU request per replica (default: 0)
        inf: 0 # Inferentia ASIC request per replica (default: 0)
        mem: 1Gi
      autoscaling:
        min_replicas: 2
        max_replicas: 3
        init_replicas: 2
        max_replica_concurrency: 13
        target_replica_concurrency: 5
        window: 1m0s
        downscale_stabilization_period: 5m0s
        upscale_stabilization_period: 1m0s
        max_downscale_factor: 0.75
        max_upscale_factor: 1.5
        downscale_tolerance: 0.05
        upscale_tolerance: 0.05
    
    # cluster.yaml
    
    # AWS credentials (if not specified, ~/.aws/credentials will be checked) (can be overridden by $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY)
    aws_access_key_id: XXXXXXXXXXXXXX
    aws_secret_access_key: XXXXXXXXXXXXXXXXX
    
    # optional AWS credentials for the operator which may be used to restrict its AWS access (defaults to the AWS credentials set above)
    cortex_aws_access_key_id: XXXXXXXXXXXXXXXX
    cortex_aws_secret_access_key: XXXXXXXXXXXXXXXXXXXXX
    
    # EKS cluster name for cortex (default: cortex)
    cluster_name: cortex
    
    # AWS region
    region: us-east-1
    
    # S3 bucket (default: <cluster_name>-<RANDOM_ID>)
    # note: your cortex cluster uses this bucket for metadata storage, and it should not be accessed directly (a separate bucket should be used for your models)
    bucket: # cortex-<RANDOM_ID>
    
    # list of availability zones for your region (default: 3 random availability zones from the specified region)
    availability_zones: # e.g. [us-east-1a, us-east-1b, us-east-1c]
    
    # instance type
    instance_type: t3.medium
    
    # minimum number of instances (must be >= 0)
    min_instances: 1
    
    # maximum number of instances (must be >= 1)
    max_instances: 5
    
    # disk storage size per instance (GB) (default: 50)
    instance_volume_size: 50
    
    # instance volume type [gp2, io1, st1, sc1] (default: gp2)
    instance_volume_type: gp2
    
    # instance volume iops (only applicable to io1 storage type) (default: 3000)
    # instance_volume_iops: 3000
    
    # whether the subnets used for EC2 instances should be public or private (default: "public")
    # if "public", instances will be assigned public IP addresses; if "private", instances won't have public IPs and a NAT gateway will be created to allow outgoing network requests
    # see https://docs.cortex.dev/v/0.18/miscellaneous/security#private-cluster for more information
    subnet_visibility: public  # must be "public" or "private"
    
    # whether to include a NAT gateway with the cluster (a NAT gateway is necessary when using private subnets)
    # default value is "none" if subnet_visibility is set to "public"; "single" if subnet_visibility is "private"
    nat_gateway: none  # must be "none", "single", or "highly_available" (highly_available means one NAT gateway per availability zone)
    
    # whether the API load balancer should be internet-facing or internal (default: "internet-facing")
    # note: if using "internal", APIs will still be accessible via the public API Gateway endpoint unless you also disable API Gateway in your API's configuration (if you do that, you must configure VPC Peering to connect to your APIs)
    # see https://docs.cortex.dev/v/0.18/miscellaneous/security#private-cluster for more information
    api_load_balancer_scheme: internet-facing  # must be "internet-facing" or "internal"
    
    # whether the operator load balancer should be internet-facing or internal (default: "internet-facing")
    # note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator (https://docs.cortex.dev/v/0.18/guides/vpc-peering)
    # see https://docs.cortex.dev/v/0.18/miscellaneous/security#private-cluster for more information
    operator_load_balancer_scheme: internet-facing  # must be "internet-facing" or "internal"
    
    # CloudWatch log group for cortex (default: <cluster_name>)
    log_group: cortex
    
    # additional tags to assign to aws resources for labelling and cost allocation (by default, all resources will be tagged with cortex.dev/cluster-name=<cluster_name>)
    tags:  # <string>: <string> map of key/value pairs
    
    # whether to use spot instances in the cluster (default: false)
    # see https://docs.cortex.dev/v/0.18/cluster-management/spot-instances for additional details on spot configuration
    spot: false
    
    # see https://docs.cortex.dev/v/0.18/guides/custom-domain for instructions on how to set up a custom domain
    ssl_certificate_arn: XXXXXXXXXXXXXXXXXXXXXXXXXXXX
    
    

    Steps to reproduce

    • Spin up instances on AWS.
    • Wait a couple of days / hours (varies).
    • Notice sudden 503 errors

    Expected behavior

    It should work

    Actual behavior

    503 errors with the message

    upstream connect error or disconnect/reset before headers. reset reason: connection failure
    

    Screenshots

    NOTE: The endpoint stopped responding around 15:30 in the graphs below.

    Monitoring number of bytes in: [graph]

    Number of requests: [graph]

    Stack traces

    Nothing useful, just:

    2020-08-16 05:38:34.697979:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:37.643022:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:40.577522:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:42.008412:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:43.513294:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:45.425255:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:48.327276:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:38:51.316962:cortex:pid-447:INFO:200 OK POST /predict
    2020-08-16 05:38:54.009212:cortex:pid-447:INFO:200 OK POST /predict
    2020-08-16 05:38:55.852878:cortex:pid-447:INFO:200 OK POST /predict
    2020-08-16 05:38:57.525264:cortex:pid-447:INFO:200 OK POST /predict
    2020-08-16 05:39:00.795236:cortex:pid-447:INFO:200 OK POST /predict
    2020-08-16 05:39:04.437013:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:05.981920:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:09.314293:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:12.343143:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:15.821708:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:19.083554:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:22.048843:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:24.943968:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:26.613330:cortex:pid-448:INFO:200 OK POST /predict
    2020-08-16 05:39:29.702703:cortex:pid-448:INFO:200 OK POST /predict
    
    

    Additional context

    • The prediction takes ~150ms on my Dell with an Intel© Core™ i7-8750H CPU @ 2.20GHz × 6 and 32GB RAM.
    • All the load balancer targets are marked as "unhealthy", even though they work (i.e. I can send requests and receive 2XX responses)
    • The load balancer healthcheck endpoint returns the following
    /healthz
    {
            "service": {
                    "namespace": "istio-system",
                    "name": "ingressgateway-operator"
            },
            "localEndpoints": 0
    }
    


  • Add possibility to export environment variables with .env file

    Add possibility to export environment variables with .env file

    Description

    Add support for exporting environment variables from an .env file placed in the root directory of a Cortex project.

    Motivation

    Some users may not want to export environment variables using the predictor.env field in cortex.yaml, for example to keep the cortex.yaml deployment configuration clean.
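
    For illustration, the current predictor env approach and the proposed .env file might look like this (variable names are made up; the .env syntax is the usual KEY=VALUE form):

    # cortex.yaml (current approach)
    - name: text-generator
      predictor:
        type: python
        path: predictor.py
        env:
          MODEL_BUCKET: my-bucket
          LOG_LEVEL: info

    # .env (proposed)
    MODEL_BUCKET=my-bucket
    LOG_LEVEL=info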

  • Is there a way to speed-up API deployment

    Is there a way to speed-up API deployment

    When deploying an API and observing logs, it seems that the most time-consuming part of deployment is:

    2021-01-25 18:37:27.401057:cortex:pid-1:INFO:downloading the project code
    2021-01-25 18:37:27.483562:cortex:pid-1:INFO:downloading the python serving image
    

    Is there a way to somehow make deploying an API quicker?

  • Why is min_replicas 0 not possible?

    Why is min_replicas 0 not possible?

    We are trying to deploy a text generation API on AWS. We do not expect the API to receive a lot of traffic initially, so we would like to save some costs. My idea was that min_replicas could be set to 0, which would avoid keeping an instance idle when there is no traffic on the API. As soon as a new request came in, Cortex would spawn a new instance and shut it down once the traffic goes back to 0.

    However, I noticed that setting min_replicas to 0 is invalid. Isn't the above a valid use case for this? Also, is this a recent change? I vaguely (very vaguely) remember that this was possible in version 0.20 (please correct me if I'm wrong), but it seems like it is not in 0.26.

    cc @deliahu I opened a new thread here because: 1) it's a different issue than the other thread, and 2) other users might benefit from the conversation here.

  • Fix Grafana dashboard for AsyncAPIs

    Fix Grafana dashboard for AsyncAPIs

    Changes

    • Fix typo: async_queue_length -> async_queued so that the list of api_names is populated (currently empty)
    • Use =~ with api_name where missing to enable displaying multiple AsyncAPIs on a panel (see the example query after this list)
    • For the "In-Flight Requests" panel include the api_name in the legend

    Testing

    I have made the corresponding updates manually through the Grafana UI for our deployed Cortex cluster. AsyncAPIs are now listed in the "Cortex / AsyncAPI" dashboard, and the dashboard works when multiple AsyncAPIs are selected.


    checklist:

    • [ ] run make test and make lint
    • [ ] test manually (i.e. build/push all images, restart operator, and re-deploy APIs)
    • [ ] update examples
    • [ ] update docs and add any new files to summary.md (view in gitbook after merging)
    • [ ] cherry-pick into release branches if applicable
    • [ ] alert the dev team if the dev environment changed
  • Use of root url

    Use of root url

    I don't really know how to word it correctly. Long story short, I need to use "http://$URL/" instead of "http://$URL/$API_NAME" for one of the multiple APIs inside the cluster. I haven't found any way to do it in the documentation, but surely it is implemented.

  • Bump sigs.k8s.io/aws-iam-authenticator from 0.5.3 to 0.5.9

    Bump sigs.k8s.io/aws-iam-authenticator from 0.5.3 to 0.5.9

    Bumps sigs.k8s.io/aws-iam-authenticator from 0.5.3 to 0.5.9.

    Release notes

    Sourced from sigs.k8s.io/aws-iam-authenticator's releases.

    v0.5.9

    Changelog

    • 1209cfe2 Bump version in Makefile
    • 029d1dcf Add query parameter validation for multiple parameters

    v0.5.7

    What's Changed

    New Contributors

    Full Changelog: https://github.com/kubernetes-sigs/aws-iam-authenticator/compare/v0.5.6...v0.5.7

    v0.5.6

    Changelog

    Docker Images

    Note: You must log in with the registry ID and your role must have the necessary ECR privileges:

    $(aws ecr get-login --no-include-email --region us-west-2 --registry-ids 602401143452)
    
    • docker pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-iam-authenticator:v0.5.6
    • docker pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-iam-authenticator:v0.5.6-arm64
    • docker pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-iam-authenticator:v0.5.6-amd64

    v0.5.5

    Changelog

    Docker Images

    Note: You must log in with the registry ID and your role must have the necessary ECR privileges:

    $(aws ecr get-login --no-include-email --region us-west-2 --registry-ids 602401143452)
    

    ... (truncated)

    Commits
    • 1209cfe Bump version in Makefile
    • 029d1dc Add query parameter validation for multiple parameters
    • 0a72c12 Merge pull request #455 from jyotimahapatra/rev2
    • 596a043 revert use of upstream yaml parsing
    • 2a9ee95 Merge pull request #448 from jngo2/master
    • fc4e6cb Remove unused imports
    • f0fe605 Remove duplicate InitMetrics
    • 99f04d6 Merge pull request #447 from nckturner/release-0.5.6
    • 9dcb6d1 Faster multiarch docker builds
    • a9cc81b Bump timeout for image build job
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

  • Consider using the CDK SDK for `cortex cluster up / down` commands

    Consider using the CDK SDK for `cortex cluster up / down` commands

    Description

    Replace cloud provider specific code in cortex cluster commands by using the CDK API.

    Motivation

    Make cluster management commands more independent of each cloud provider, and make it easier to define the infrastructure (i.e. the Cortex cluster) as code.

  • Restrict minimum EC2/EKS IAM policies by resource

    Restrict minimum EC2/EKS IAM policies by resource

    Description

    As described in https://docs.cortex.dev/clusters/management/auth#minimum-iam-policy, the current minimum IAM policy grants the Cortex CLI (and, by extension, eksctl) full control over the EC2/EKS services.

    Motivation

    These should be restricted to a resource-based policy that would limit what an IAM role/user can do. This is especially helpful in bigger corporations where there are more than a handful of developers and the company's policy on what access its devs have is more stringent.
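
    As a purely illustrative sketch of the direction (not a policy Cortex documents today), actions could be conditioned on the cluster tag that Cortex already applies to its resources rather than granted account-wide; note that not every EC2/EKS action supports resource-level conditions, which is part of why this is blocked on eksctl's requirements:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["eks:*", "ec2:*"],
          "Resource": "*",
          "Condition": {
            "StringEquals": {
              "aws:ResourceTag/cortex.dev/cluster-name": "cortex"
            }
          }
        }
      ]
    }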

    Additional context

    This seems to be blocked on what eksctl requires: https://eksctl.io/usage/minimum-iam-policies/. Talk to the eksctl team to see if there's a way to further reduce the IAM policy requirements.

On-line Machine Learning in Go (and so much more)

goml Golang Machine Learning, On The Wire goml is a machine learning library written entirely in Golang which lets the average developer include machi

Jan 5, 2023
Self-contained Machine Learning and Natural Language Processing library in Go

Self-contained Machine Learning and Natural Language Processing library in Go

Jan 8, 2023
Machine Learning for Go

GoLearn GoLearn is a 'batteries included' machine learning library for Go. Simplicity, paired with customisability, is the goal. We are in active deve

Jan 3, 2023
Gorgonia is a library that helps facilitate machine learning in Go.

Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily

Dec 30, 2022
Machine Learning libraries for Go Lang - Linear regression, Logistic regression, etc.

package ml - Machine Learning Libraries ###import "github.com/alonsovidales/go_ml" Package ml provides some implementations of usefull machine learnin

Nov 10, 2022
Gorgonia is a library that helps facilitate machine learning in Go.

Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily

Dec 27, 2022
Prophecis is a one-stop machine learning platform developed by WeBank

Prophecis is a one-stop machine learning platform developed by WeBank. It integrates multiple open-source machine learning frameworks, has the multi tenant management capability of machine learning compute cluster, and provides full stack container deployment and management services for production environment.

Dec 28, 2022
Go Machine Learning Benchmarks

Benchmarks of machine learning inference for Go

Dec 30, 2022
A High-level Machine Learning Library for Go

Overview Goro is a high-level machine learning library for Go built on Gorgonia. It aims to have the same feel as Keras. Usage import ( . "github.

Nov 20, 2022
Katib is a Kubernetes-native project for automated machine learning (AutoML).

Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architec

Jan 2, 2023
PaddleDTX is a solution that focused on distributed machine learning technology based on decentralized storage.

中文 | English PaddleDTX PaddleDTX is a solution that focused on distributed machine learning technology based on decentralized storage. It solves the d

Dec 14, 2022
Example of Neural Network models of social and personality psychology phenomena

SocialNN Example of Neural Network models of social and personality psychology phenomena This repository gathers a collection of neural network models

Dec 5, 2022
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

English ∙ 日本語 ∙ 简体中文 ∙ 繁體中文 | العَرَبِيَّة‎ ∙ বাংলা ∙ Português do Brasil ∙ Deutsch ∙ ελληνικά ∙ עברית ∙ Italiano ∙ 한국어 ∙ فارسی ∙ Polski ∙ русский язы

Jan 9, 2023
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

End-to-end computer vision platform Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises. onepa

Dec 12, 2022
Spice.ai is an open source, portable runtime for training and using deep learning on time series data.

Spice.ai Spice.ai is an open source, portable runtime for training and using deep learning on time series data. ⚠️ DEVELOPER PREVIEW ONLY Spice.ai is

Dec 15, 2022
Reinforcement Learning in Go

Overview Gold is a reinforcement learning library for Go. It provides a set of agents that can be used to solve challenges in various environments. Th

Dec 11, 2022
FlyML perfomant real time mashine learning libraryes in Go

FlyML perfomant real time mashine learning libraryes in Go simple & perfomant logistic regression (~100 LoC) Status: WIP! Validated on mushrooms datas

May 30, 2022
Go (Golang) encrypted deep learning library; Fully homomorphic encryption over neural network graphs

DC DarkLantern A lantern is a portable case that protects light, A dark lantern is one who's light can be hidden at will. DC DarkLantern is a golang i

Oct 31, 2022
A tool for building identical machine images for multiple platforms from a single source configuration

Packer Packer is a tool for building identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs o

Oct 3, 2021