What type of PR is this?
Uncomment only one /kind line, and delete the rest.
For example, > /kind bug would simply become: /kind bug
/kind bug
/kind chore
/kind cleanup
/kind failing-test
/kind enhancement
/kind documentation
/kind code-refactoring
What does this PR do / why we need it:
This PR exposes new metrics to track the statuses of the argo-cd workloads. It defines the following metrics:
- argocd_application_controller_status: Describes the status of the application controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
- argocd_applicationset_controller_status: Describes the status of the applicationSet controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
- argocd_dex_status: Describes the status of the dex workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
- argocd_phase: Describes the phase of argo-cd instance [2='Pending', 4='Available']
- argocd_redis_status: Describes the status of the redis workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
- argocd_repo_server_status: Describes the status of the repo server workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
- argocd_server_status: Describes the status of the argo-cd server workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
It spins up a new metrics server that listens on port 8085.
The workload status reconciliation logic is updated to record each workload's status in the newly defined metrics and expose them on the /metrics endpoint. Individual workload statuses can be queried by specifying the namespace of the instance that the workload is part of.
NOTE: This PR assumes there is at most one argo-cd instance per namespace, since running more than one in the same namespace is an anti-pattern.
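To illustrate the status encoding described above, here is a minimal sketch of how a workload status string could map to a gauge value and to a sample line labeled by namespace. It is not the operator's implementation (the operator is not written in Python); the names `STATUS_VALUES`, `status_to_gauge`, and `render_sample` are hypothetical, and treating an unrecognized status as 'Unknown' is an assumption. The argocd_phase metric uses a different encoding and is not covered here.

```python
# Illustrative sketch only; all names here are hypothetical, not the operator's.

# Numeric encoding copied from the metric help text in this PR.
STATUS_VALUES = {"Unknown": 0, "Failed": 1, "Pending": 2, "Running": 3}


def status_to_gauge(status: str) -> int:
    """Map a workload status string to its gauge value.

    Assumption: any unrecognized status is reported as 0 ('Unknown').
    """
    return STATUS_VALUES.get(status, STATUS_VALUES["Unknown"])


def render_sample(metric: str, namespace: str, status: str) -> str:
    """Render one sample line in the Prometheus text format, labeled by namespace."""
    return f'{metric}{{namespace="{namespace}"}} {status_to_gauge(status)}'


if __name__ == "__main__":
    # Prints: argocd_redis_status{namespace="argocd"} 3
    print(render_sample("argocd_redis_status", "argocd", "Running"))
```

Because each sample carries only a namespace label, two instances in the same namespace would overwrite each other's values, which is why the one-instance-per-namespace assumption matters.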
This PR should be merged after https://github.com/argoproj-labs/argocd-operator/pull/829
Have you updated the necessary documentation?
- [ ] Documentation update is required by this PR.
- [ ] Documentation has been updated.
Which issue(s) this PR fixes:
Fixes https://issues.redhat.com/browse/GITOPS-2456
How to test changes / Special notes to the reviewer:
- Deploy operator locally
- Create an argo-cd instance in the argocd namespace
- Run a GET request against localhost:8085/metrics using Postman or curl
- Verify that the response looks something like this:
# HELP argocd_application_controller_status Describes the status of the application controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_application_controller_status gauge
argocd_application_controller_status{namespace="argocd"} 3
# HELP argocd_applicationset_controller_status Describes the status of the applicationSet controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_applicationset_controller_status gauge
argocd_applicationset_controller_status{namespace="argocd"} 0
# HELP argocd_dex_status Describes the status of the dex workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_dex_status gauge
argocd_dex_status{namespace="argocd"} 0
# HELP argocd_phase Describes the phase of argo-cd instance [2='Pending', 4='Available']
# TYPE argocd_phase gauge
argocd_phase{namespace="argocd"} 4
# HELP argocd_redis_status Describes the status of the redis workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_redis_status gauge
argocd_redis_status{namespace="argocd"} 3
# HELP argocd_repo_server_status Describes the status of the repo server workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_repo_server_status gauge
argocd_repo_server_status{namespace="argocd"} 3
# HELP argocd_server_status Describes the status of the argo-cd server workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_server_status gauge
argocd_server_status{namespace="argocd"} 3
- Edit the Argo CD CR to enable notifications, and replace the ApplicationSet image with an invalid one to force a pending status:
apiVersion: argoproj.io/v1alpha1
kind: ArgoCD
metadata:
  name: example-argocd
spec:
  applicationSet:
    image: quay.io/argoproj/argocd@sha256:8283a9f06033c2377dc61b03daf49
  notifications:
    enabled: true
- Query again and verify that the response now shows a pending status for the ApplicationSet controller and a running status for the notifications controller:
# HELP argocd_applicationset_controller_status Describes the status of the applicationSet controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_applicationset_controller_status gauge
argocd_applicationset_controller_status{namespace="argocd"} 2
...
# HELP argocd_notifications_controller_status Describes the status of the notifications controller workload [0='Unknown', 1='Failed', 2='Pending', 3='Running']
# TYPE argocd_notifications_controller_status gauge
argocd_notifications_controller_status{namespace="argocd"} 3
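For reviewers who prefer to script the check rather than read the output by eye, the following is a rough sketch that fetches the endpoint and looks up a status by metric name and namespace. It assumes the metrics server is reachable at http://localhost:8085/metrics as in the steps above; the helper names `parse_status_metrics` and `fetch_metrics` are hypothetical, and the parser only handles the single-label sample lines shown in this PR, not the full Prometheus text format.

```python
import re
import urllib.request

# Matches sample lines such as: argocd_server_status{namespace="argocd"} 3
SAMPLE_RE = re.compile(r'^(\w+)\{namespace="([^"]*)"\}\s+(\d+(?:\.\d+)?)$')


def parse_status_metrics(text: str) -> dict:
    """Return {(metric, namespace): value} for namespace-labeled samples in text."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:  # lines in any other shape are ignored
            metric, namespace, value = match.groups()
            samples[(metric, namespace)] = float(value)
    return samples


def fetch_metrics(url: str = "http://localhost:8085/metrics") -> str:
    """Fetch the raw metrics text; the default URL is the one used in the test steps."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Example usage (requires the operator to be running locally):
#   samples = parse_status_metrics(fetch_metrics())
#   samples[("argocd_applicationset_controller_status", "argocd")]  # 2 means Pending
```

After the CR edit above, the ApplicationSet controller entry for the argocd namespace should read 2 and the notifications controller entry should read 3.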