Fadvisor (FinOps Advisor) is a collection of exporters that collect cloud resource pricing and billing data guided by FinOps, giving insight into cost allocation for containers and Kubernetes resources.


Fadvisor: FinOps Advisor


Fadvisor (FinOps Advisor) addresses FinOps observability. It can be integrated with Crane to help users visualize and optimize costs, and it can also be integrated with your monitoring system as a metric exporter.

Fadvisor provides a collection of exporters that collect cost metrics and other FinOps metrics.

  • Exporters collect metrics guided by FinOps.
    • A cost-exporter is available now, and other exporters can be integrated.
  • The apiserver handles data aggregation logic and proxying.

Concept

The Fadvisor cost model is a way to estimate the resource price and break it down to each container, pod, and other cloud native resource in Kubernetes. Users can then view the costs they care about by label or by other dimensions.
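To make "viewing costs by label" concrete, here is a minimal sketch that sums per-pod hourly costs by the value of one label. The pod records, the field names `labels` and `hourly_cost`, and the function `costs_by_label` are hypothetical and exist only for illustration; they are not part of Fadvisor.

```python
from collections import defaultdict


def costs_by_label(pods, label):
    """Sum estimated hourly cost per value of one label (illustrative only)."""
    totals = defaultdict(float)
    for pod in pods:
        # Pods without the label are grouped under a placeholder key.
        key = pod["labels"].get(label, "<unlabeled>")
        totals[key] += pod["hourly_cost"]
    return dict(totals)


# Hypothetical input: estimated hourly cost per pod plus its labels.
pods = [
    {"name": "web-1", "labels": {"team": "shop"}, "hourly_cost": 0.5},
    {"name": "web-2", "labels": {"team": "shop"}, "hourly_cost": 0.25},
    {"name": "batch-1", "labels": {"team": "data"}, "hourly_cost": 1.0},
    {"name": "debug", "labels": {}, "hourly_cost": 0.125},
]

if __name__ == "__main__":
    for team, cost in sorted(costs_by_label(pods, "team").items()):
        print(f"{team}: {cost:.3f} per hour")
```

The same grouping works for any other dimension, such as namespace or node, as long as it is available on each record.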

This idea comes from FinOps: traditional billing and pricing systems for cloud resources are not well adapted to cloud native resources.

Note: the cost model currently only estimates cost. It does not replace billing, because real billing depends on the billing system.

The model is an experimental implementation of cost allocation, showback, and chargeback from FinOps.

1. The simplest cost model estimates the resource price of all nodes or pods using the same price.
   For example, when computing costs, you can assume the CPU and RAM unit prices are the same for all containers: $2 per core per hour and $0.3 per GiB per hour.

2. The advanced cost model estimates the resource price by cost breakdown.
   It is based on the fact that each cloud machine instance has a different price depending on its instance type and charge type,
   so containers on different node types or in EKS pods have different prices.
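The two models can be sketched as follows. The unit prices come from the example in item 1. The breakdown step is only one possible way to split a node price into CPU and RAM unit prices: the `cpu_to_ram_ratio` weighting and all function names are assumptions made for illustration, not Fadvisor's actual implementation.

```python
# Unit prices from the example above; real prices come from the cloud provider.
CPU_PRICE_PER_CORE_HOUR = 2.0
RAM_PRICE_PER_GIB_HOUR = 0.3


def uniform_cost(cpu_cores, ram_gib, hours=1.0):
    """Simple model: every container pays the same unit prices."""
    return (cpu_cores * CPU_PRICE_PER_CORE_HOUR
            + ram_gib * RAM_PRICE_PER_GIB_HOUR) * hours


def split_node_price(node_hourly_price, node_cores, node_ram_gib, cpu_to_ram_ratio):
    """Advanced model, step 1: break a node's hourly price into unit prices.

    cpu_to_ram_ratio is a hypothetical weighting: how many times more one
    core costs than one GiB of RAM. The returned unit prices add back up to
    node_hourly_price over the whole node.
    """
    ram_unit = node_hourly_price / (node_cores * cpu_to_ram_ratio + node_ram_gib)
    cpu_unit = ram_unit * cpu_to_ram_ratio
    return cpu_unit, ram_unit


def breakdown_cost(cpu_cores, ram_gib, cpu_unit, ram_unit, hours=1.0):
    """Advanced model, step 2: price a container with its node's unit prices."""
    return (cpu_cores * cpu_unit + ram_gib * ram_unit) * hours


if __name__ == "__main__":
    print("uniform:", uniform_cost(cpu_cores=0.5, ram_gib=2))
    # A hypothetical 4-core, 8 GiB node priced at 1.0 per hour.
    cpu_unit, ram_unit = split_node_price(1.0, 4, 8, cpu_to_ram_ratio=4)
    print("breakdown:", breakdown_cost(0.5, 2, cpu_unit, ram_unit))
```

With the breakdown model, the same container gets a different cost on a differently priced node, which is the point of item 2.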

Tutorial

A cost-exporter is available now. It currently supports Tencent Cloud as the cloud provider for collecting cloud instance pricing metrics.

Deploy all components with one command

helm install fadvisor deploy/helm/fadvisor -n crane-system  --set cost-exporter.costExporterParam.secretid={{your cloud secret id}} --set cost-exporter.costExporterParam.secretkey={{your cloud secret key}}

Besides cost-exporter, this installs the following components in your system by default.

dependencies:
  - name: kube-state-metrics
    condition: fadvisor.kube-state-metrics.enabled,kube-state-metrics.enabled
    repository: file://./charts/kube-state-metrics
  - name: node-exporter
    condition: fadvisor.node-exporter.enabled,node-exporter.enabled
    repository: file://./charts/node-exporter
  - name: prometheus
    condition: fadvisor.prometheus.enabled,prometheus.enabled
    repository: file://./charts/prometheus
  - name: grafana
    condition: fadvisor.grafana.enabled,grafana.enabled
    repository: file://./charts/grafana

Install one by one

Install cost-exporter. You must specify secretid and secretkey.

helm install cost-exporter deploy/helm/fadvisor/charts/cost-exporter -n crane-system --set costExporterParam.secretid={{your cloud secret id}} --set costExporterParam.secretkey={{your cloud secret key}}

Install the other components

helm install kube-state-metrics deploy/helm/fadvisor/charts/kube-state-metrics -n crane-system
helm install node-exporter deploy/helm/fadvisor/charts/node-exporter -n crane-system
helm install prometheus deploy/helm/fadvisor/charts/prometheus -n crane-system
helm install grafana deploy/helm/fadvisor/charts/grafana -n crane-system

Integrate with existing monitoring components

If you already have Prometheus and Grafana, you only need to deploy the exporter and add some configuration.

You can deploy the cost-exporter to your TKE cluster to collect the metrics, use Prometheus to scrape them, and use the dashboards listed below.

1. Deploy cost-exporter

Install by helm

helm install cost-exporter deploy/helm/fadvisor/charts/cost-exporter -n crane-system --set costExporterParam.secretid={{your cloud secret id}} --set costExporterParam.secretkey={{your cloud secret key}}

Install by kubectl

NOTE: you must specify your cloud secret ID and secret key in the YAML. They are used to access the Tencent Cloud CVM API.

kubectl create -f deploy/cost-exporter/ -n crane-system

The cost-exporter takes the parameters secretId and secretKey. You must provide your cloud provider secret.

containers:
- name: fadvisor-cost-exporter
  image: docker.io/gocrane/fadvisor-cost-exporter:6927f01
  imagePullPolicy: IfNotPresent
  command:
    - /cost-exporter
    - --v=4
    - --secretId=
    - --secretKey=

2. Configure the Prometheus scrape config and rules

Add the following scrape target to your Prometheus configuration.

- job_name: "fadvisor-cost-exporter"
  scrape_interval: 5m
  scheme: http
  metrics_path: /metrics
  static_configs:
    - targets: ['cost-exporter.crane-system.svc.cluster.local:8081']

NOTE: besides the cost-exporter metrics, your Prometheus must also scrape the following Kubernetes metrics:

  • kubelet-cadvisor metrics.
  • node-exporter metrics. This requires node-exporter to be installed.
  • kube-state-metrics metrics. This requires kube-state-metrics to be installed.

Add the following recording rules to your Prometheus.
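Before relying on the scrape job, it can help to confirm that the exporter serves the cost metrics used by the recording rules. The sketch below pulls `node_cpu_hourly_cost`, `node_ram_hourly_cost`, and `node_total_hourly_cost` samples out of Prometheus text-format output, such as the response from the `/metrics` path of the target above. The parser is deliberately simplified (it does not handle escaped quotes or `}` inside label values), the function name `parse_cost_samples` is hypothetical, and the sample text, including the `node` label and the values, is made up for illustration.

```python
import re

# metric_name{label="value",...} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')

COST_METRICS = (
    "node_cpu_hourly_cost",
    "node_ram_hourly_cost",
    "node_total_hourly_cost",
)


def parse_cost_samples(text, wanted=COST_METRICS):
    """Return (name, labels, value) for each wanted sample in exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") not in wanted:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Made-up exposition text standing in for the exporter's /metrics response.
EXAMPLE = """\
# TYPE node_cpu_hourly_cost gauge
node_cpu_hourly_cost{node="node-a"} 0.04
node_ram_hourly_cost{node="node-a"} 0.005
go_goroutines 12
"""

if __name__ == "__main__":
    for name, labels, value in parse_cost_samples(EXAMPLE):
        print(name, labels, value)
```

If the list comes back empty for a real response, the exporter is most likely not yet receiving pricing data, and the recording rules will produce no series.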

recording_rules.yml:
  groups:
    - name: costs.rules
      interval: 3600s
      rules:
        - expr: |
            sum(label_replace(irate(container_cpu_usage_seconds_total{container!="POD", container!="",image!=""}[1h]), "node", "$1", "instance",  "(.*)")) by (container, pod, node, namespace) * on (node) group_left() avg(avg_over_time(node_cpu_hourly_cost[1h])) by (node)
          record: namespace:container_cpu_usage_costs_hourly:sum_rate
        - expr: |
            sum(label_replace(avg_over_time(container_memory_working_set_bytes{container!="POD",container!="",image!=""}[1h]), "node", "$1", "instance",  "(.*)")) by (container, pod, node, namespace) / 1024.0 / 1024.0 / 1024.0 * on (node) group_left() avg(avg_over_time(node_ram_hourly_cost[1h])) by (node)
          record: namespace:container_memory_usage_costs_hourly:sum_rate
        - expr: |
            avg(avg_over_time(node_cpu_hourly_cost[1h])) by (node)
          record: node:node_cpu_hourly_cost:avg
        - expr: |
            avg(avg_over_time(node_ram_hourly_cost[1h])) by (node)
          record: node:node_ram_hourly_cost:avg
        - expr: |
            avg(avg_over_time(node_total_hourly_cost[1h])) by (node)
          record: node:node_total_hourly_cost:avg
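The arithmetic behind the first two rules can be written out in a few lines. This is a sketch under two assumptions: `node_cpu_hourly_cost` is a price per core per hour, and `node_ram_hourly_cost` is a price per GiB per hour (the division by 1024 three times in the memory rule suggests the latter). The function names and the input numbers are hypothetical.

```python
GIB = 1024.0 ** 3  # bytes per GiB, matching "/ 1024.0 / 1024.0 / 1024.0" in the rule


def container_cpu_cost_hourly(cpu_cores_used, node_cpu_hourly_cost):
    """CPU rule: cores in use multiplied by the node's CPU price per core-hour."""
    return cpu_cores_used * node_cpu_hourly_cost


def container_memory_cost_hourly(working_set_bytes, node_ram_hourly_cost):
    """Memory rule: working set converted to GiB, times the node's RAM price."""
    return working_set_bytes / GIB * node_ram_hourly_cost


if __name__ == "__main__":
    # A container using a quarter of a core and 2 GiB on a node priced at
    # 0.04 per core-hour and 0.005 per GiB-hour (made-up numbers).
    cpu = container_cpu_cost_hourly(0.25, 0.04)
    ram = container_memory_cost_hourly(2 * GIB, 0.005)
    print(f"cpu={cpu:.4f} ram={ram:.4f} total={cpu + ram:.4f}")
```

In the rules themselves, the join `* on (node) group_left()` is what pairs each container series with the price of the node it runs on.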

3. Import the following Grafana dashboards into your Grafana

If you already have Grafana installed, the following dashboards are available.

./deploy/helm/fadvisor/charts/grafana/dashboards/cluster-costs.json
./deploy/helm/fadvisor/charts/grafana/dashboards/costs-dimension.json
./deploy/helm/fadvisor/charts/grafana/dashboards/namespace-costs.json

Dashboard screenshots: estimated cluster costs, namespace costs, top-k container costs.

Dependency

  • kube-state-metrics
  • node-exporter
  • prometheus
  • grafana

Owner

  • Crane: Cloud Resource Analytics and Economics