Instant Kubernetes-Native Application Observability

Pixie!




Pixie is an open source observability tool for Kubernetes applications. Use Pixie to view the high-level state of your cluster (service maps, cluster resources, application traffic) and also drill down into more detailed views (pod state, flame graphs, individual full-body application requests).

Why Pixie?

Three features enable Pixie's magical developer experience:

  • Auto-telemetry: Pixie uses eBPF to automatically collect telemetry data such as full-body requests, resource and network metrics, application profiles, and more. See the full list of data sources here.

  • In-Cluster Edge Compute: Pixie collects, stores and queries all telemetry data locally in the cluster. Pixie uses less than 5% of cluster CPU, and in most cases less than 2%.

  • Scriptability: PxL, Pixie’s flexible Pythonic query language, can be used across Pixie’s UI, CLI, and client APIs; see the example below.
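
To give a feel for PxL, here is a minimal sketch of a script that pulls recent HTTP spans and displays them as a table. The `http_events` table and column names such as `latency` and `req_path` are assumptions that can vary across Pixie versions, so treat this as illustrative rather than canonical:

    # PxL is a Python dialect that Pixie executes inside your cluster.
    import px

    # Load the last 5 minutes of HTTP spans traced automatically by Pixie.
    df = px.DataFrame(table='http_events', start_time='-5m')

    # Attach the Kubernetes service name using Pixie's metadata context.
    df.service = df.ctx['service']

    # Keep a few columns of interest and render the result in the UI/CLI.
    df = df[['time_', 'service', 'req_path', 'resp_status', 'latency']]
    px.display(df)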

Use Cases

Network Monitoring

Network Flow Graph


Use Pixie to monitor your network, including:

  • The flow of network traffic within your cluster.
  • The flow of DNS requests within your cluster.
  • Individual full-body DNS requests and responses.
  • A map of TCP drops and TCP retransmits across your cluster.

For more details, check out the tutorial or watch an overview.


Infrastructure Health

Infrastructure Monitoring


Monitor your infrastructure alongside your network and application layer, including:

  • Resource usage by Pod, Node, Namespace.
  • CPU flamegraphs per Pod, Node.

For more details, check out the tutorial or watch an overview.
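
As a rough illustration of the kind of PxL query behind these views, the sketch below aggregates resident memory per pod. The `process_stats` table, its `rss_bytes` column, and the `px.mean` aggregate are assumptions here and may differ between Pixie versions:

    import px

    # Per-process resource samples collected by Pixie.
    df = px.DataFrame(table='process_stats', start_time='-5m')

    # Resolve each process to its Kubernetes pod via Pixie's metadata context.
    df.pod = df.ctx['pod']

    # Average resident set size per pod over the time window.
    per_pod = df.groupby('pod').agg(avg_rss_bytes=('rss_bytes', px.mean))
    px.display(per_pod)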


Service Performance


Pixie automatically traces a variety of protocols. Get immediate visibility into the health of your services, including:

  • The flow of traffic between your services.
  • Latency per service and endpoint.
  • Sample of the slowest requests for an individual service.

For more details, check out the tutorial or watch an overview.
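
A hedged PxL sketch of how per-service latency can be computed, in the spirit of Pixie's built-in service scripts; the `http_events` table and its `latency` column are assumptions and may be named differently in your Pixie version:

    import px

    # HTTP spans traced over the last 5 minutes.
    df = px.DataFrame(table='http_events', start_time='-5m')
    df.service = df.ctx['service']

    # Request count and latency distribution per service.
    per_service = df.groupby('service').agg(
        requests=('latency', px.count),
        latency_quantiles=('latency', px.quantiles),
    )
    px.display(per_service)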


Database Query Profiling


Pixie automatically traces a number of different database protocols. Use Pixie to monitor the performance of your database requests:

  • Latency, error and throughput (LET) rate for all pods.
  • LET rate per normalized query.
  • Latency per individual full body query.
  • Individual full-body requests and responses.

For more details, check out the tutorial or watch an overview.


Request Tracing


Pixie makes debugging communication between microservices easy by providing immediate and deep (full-body) visibility into requests flowing through your cluster.


For more details, check out the tutorial or watch an overview.


Continuous Application Profiling


Use Pixie's continuous profiling feature to identify performance issues within application code.


For more details, check out the tutorial or watch an overview.


Distributed bpftrace Deployment

Use Pixie to deploy a bpftrace program to all of the nodes in your cluster. After deploying the program, Pixie captures the output into a table and makes the data available to be queried and visualized in the Pixie UI (TCP drops pictured). For more details, check out the tutorial or watch an overview.

Dynamic Go Logging

Debug Go binaries deployed in production environments without needing to recompile and redeploy. For more details, check out the tutorial or watch an overview.


Get Started

It takes just a few minutes to install Pixie. To get started, check out the Install Guides.


Once installed, you can interact with Pixie using the web UI, CLI, or client APIs.
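
For example, the Python client API can run a PxL script programmatically and stream back its results. The sketch below assumes the `pxapi` package, an API token, and an output table named via `px.display(df, 'http_table')`; exact method names may differ, so check the API reference before relying on it:

    import pxapi

    # PxL script whose output table is named 'http_table'.
    PXL_SCRIPT = """
    import px
    df = px.DataFrame(table='http_events', start_time='-1m')
    px.display(df, 'http_table')
    """

    # Authenticate with a Pixie API token and pick a healthy cluster.
    client = pxapi.Client(token="YOUR_PIXIE_API_TOKEN")
    cluster = client.list_healthy_clusters()[0]
    conn = client.connect_to_cluster(cluster)

    # Run the script and iterate over the rows of the named output table.
    script = conn.prepare_script(PXL_SCRIPT)
    for row in script.results("http_table"):
        print(row["req_path"], row["resp_status"])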


Get Involved

Pixie is a community-driven project; we welcome your contributions! For code contributions, please read our contribution guide.


About Pixie

Pixie was contributed by New Relic, Inc. to the Cloud Native Computing Foundation as a Sandbox project in June 2021.

License

Pixie is licensed under Apache License, Version 2.0.

Comments
  • Self-Hosted Pixie Install Script

    Is your feature request related to a problem? Please describe. We would like to have an install experience for the self-hosted version of Pixie that is as easy to use as the one hosted on withpixie.ai.

    Additional context Our team has been busy at work this month open sourcing Pixie's source code, docs, website, and other assets. We are also actively applying to be a CNCF sandbox project!

    One of our last remaining items is to publish an install script to deploy a self-hosted version of Pixie.

    Who offers a hosted version of Pixie?

    New Relic currently offers a 100% free hosted version of Pixie Cloud. This hosting has no contingencies and will be offered indefinitely to the Pixie Community. All the code used for hosting is open source, including our production manifest files.

    What will the Self-Hosted install script do?

    The Self-Hosted install script will deploy Pixie Cloud so that you can use Pixie without any external dependencies. This is the exact version of Pixie Cloud we deploy, so it'll behave exactly as the hosted version, but will require management/configuration.

    What is the timeline? 

    Good question. :) We had planned to open source this script by 5/4. Unfortunately, we didn’t make it. We need more time to ensure that deploying Pixie Cloud with the script is just as easy as installing the hosted version of Pixie (in < 2 minutes!).

    But I really want to run a Self-Hosted Pixie...now!

    Technically you can build and run a self-hosted Pixie using Skaffold. Check out:

    https://github.com/pixie-labs/pixie/blob/main/skaffold/skaffold_cloud.yaml
    https://github.com/pixie-labs/pixie/tree/main/k8s/cloud
    https://github.com/pixie-labs/pixie/tree/main/k8s/cloud_deps

    These directions are not fully documented and the team is choosing to focus on quickly delivering the self-hosted install script. We'll constantly be iterating on the documentation to make the project more open source friendly.

  • google login hangs

    Trying the pixie online installer. After signing up with google, login hangs forever with:

    Authenticating
    Logging in...
    

    To Reproduce Steps to reproduce the behavior:

    1. Go to signup, use google
    2. Login with google

    Expected behavior To be logged in

  • Modify `dns_data` table to record DNS requests with no response

    Problem: Sometimes DNS fails without the pod receiving DNS errors (e.g. if the network packets carrying the DNS response are being dropped).

    Solution: Change the dns_data table to record DNS requests without responses, so that you can track the number of DNS requests that remain unanswered over time.
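
    A hypothetical PxL sketch of the kind of query this change would enable, assuming that unanswered requests appear in the table with an empty response column (the column names here are purely illustrative):

        import px

        # DNS records captured by Pixie, including (after this change) requests
        # that never received a response.
        df = px.DataFrame(table='dns_data', start_time='-10m')
        df.pod = df.ctx['pod']

        # Treat an empty response as "no response received" -- illustrative only.
        unanswered = df[df.resp_body == '']
        counts = unanswered.groupby('pod').agg(unanswered_requests=('resp_body', px.count))
        px.display(counts)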

  • [Doc issue] no ingress installed so dev_dns_updater did nothing

    Describe the bug I've been following the documentation to deploy Pixie Cloud. The setup-dns section should update /etc/hosts if there are any ingress rules in the k8s cluster, but there weren't any!

    ➜  pixie git:(main) ✗ kubectl get ing
    No resources found in default namespace.
    ➜  pixie git:(main) ✗ kubectl get ing -n plc
    No resources found in plc namespace.
    

    And it of course doesn't change anything:

    ➜  pixie git:(main) ✗ ./dev_dns_updater --domain-name="dev.withpixie.dev"  --kubeconfig=$HOME/.kube/config --n=plc
    INFO[0000] DNS Entries                                   entries="dev.withpixie.dev, work.dev.withpixie.dev, segment.dev.withpixie.dev, docs.dev.withpixie.dev" service=cloud-proxy-service
    INFO[0000] DNS Entries                                   entries=cloud.dev.withpixie.dev service=vzconn-service
    

    It didn't change the /etc/hosts file!

    To Reproduce

    Expected behavior /etc/hosts should be updated so that dev.withpixie.dev can be visited in the browser.

    Screenshots

    Logs

    App information (please complete the following information):

    • Pixie version: master branch
    • K8s cluster version: minikube on macOS 10.15.7 k8s version v1.22.2

    Additional context

  • Can't install pixie to completely air gapped environment

    Describe the bug Can't install pixie to completely air gapped environment.

    To Reproduce Currently I'm trying to install it via the YAML scheme. I've already pushed all images mentioned in the manifests (generated in the extract-manifests step) to my local Artifactory and replaced the original image links with local ones, but during installation Pixie still tries to download some images (e.g. busybox:1.28.0-glibc and nats:1.3.0) from the internet.

    Expected behavior Be able to install pixie to self-hosted k8s cluster with no access to the internet.

    Logs Please attach the logs by running the following command:

    [root@localhost pixie_yamls]# kubectl get pods -n pl
    NAME                                      READY   STATUS                       RESTARTS   AGE
    etcd-operator-6c6f8cb48d-q5t8q            1/1     Running                      0          43m
    kelvin-6c67584687-pwlrg                   0/1     Init:0/1                     0          42m
    nats-operator-7bbff5c756-tt2rl            1/1     Running                      0          43m
    pl-etcd-zs25zbm5ln                        0/1     Init:ImagePullBackOff        0          41m
    pl-nats-1                                 0/1     ImagePullBackOff             0          42m
    vizier-certmgr-58d97fd6b5-8wp9n           0/1     CreateContainerConfigError   0          42m
    vizier-cloud-connector-74c5c84487-m4bmq   1/1     Running                      1          42m
    vizier-metadata-6bc96dd78-g9brg           0/1     Init:0/2                     0          42m
    vizier-pem-bv858                          0/1     Init:0/1                     0          42m
    vizier-pem-dktqv                          0/1     Init:0/1                     0          42m
    vizier-pem-ftd66                          0/1     Init:0/1                     0          42m
    vizier-pem-gmrfq                          0/1     Init:0/1                     0          42m
    vizier-pem-j7xmx                          0/1     Init:0/1                     0          42m
    vizier-pem-jxl7j                          0/1     Init:0/1                     0          42m
    vizier-pem-kcfbf                          0/1     Init:0/1                     0          42m
    vizier-pem-mgzgj                          0/1     Init:0/1                     0          42m
    vizier-pem-v7k7q                          0/1     Init:0/1                     0          42m
    vizier-proxy-8568c9bd48-fdccm             0/1     CreateContainerConfigError   0          42m
    vizier-query-broker-7b74f9cbdc-265m4      0/1     Init:0/1                     0          42m
    
    [root@localhost pixie_yamls]# kc describe pod pl-etcd-zs25zbm5ln -n pl
    Name:         pl-etcd-zs25zbm5ln
    Namespace:    pl
    ...
    Events:
      Type     Reason     Age                  From                             Message
      ----     ------     ----                 ----                             -------
      Normal   Scheduled  56m                  default-scheduler                Successfully assigned pl/pl-etcd-zs25zbm5ln to xxx
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:34516->23.23.116.141:443: read: connection reset by peer
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:59176->54.224.119.26:443: read: connection reset by peer
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:42888->107.23.149.57:443: read: connection reset by peer
      Warning  Failed     54m (x4 over 55m)    kubelet, xxx  Error: ErrImagePull
      Normal   Pulling    54m (x4 over 55m)    kubelet, xxx  Pulling image "busybox:1.28.0-glibc"
      Warning  Failed     54m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:41714->34.238.187.50:443: read: connection reset by peer
      Normal   BackOff    45m (x43 over 55m)   kubelet, xxx  Back-off pulling image "busybox:1.28.0-glibc"
      Warning  Failed     48s (x234 over 55m)  kubelet, xxx  Error: ImagePullBackOff
    
    
    [root@localhost pixie_yamls]# kc describe pod pl-nats-1 -n pl
    Name:         pl-nats-1
    Namespace:    pl
    ...
    Events:
      Type     Reason       Age                    From                             Message
      ----     ------       ----                   ----                             -------
      Normal   Scheduled    57m                    default-scheduler                Successfully assigned pl/pl-nats-1 to yyy
      Warning  FailedMount  57m (x6 over 57m)      kubelet, yyy  MountVolume.SetUp failed for volume "server-tls-certs" : secret "service-tls-certs" not found
      Warning  Failed       56m                    kubelet, yyy  Failed to pull image "nats:1.3.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.18:32860->3.220.36.210:443: read: connection reset by peer
      Warning  Failed       56m                    kubelet, yyy  Failed to pull image "nats:1.3.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.18:52026->107.23.149.57:443: read: connection reset by peer
      Warning  Failed       2m26s (x227 over 56m)  kubelet, yyy Error: ImagePullBackOff
    

    App information (please complete the following information):

    • Pixie version: Pixie CLI 0.5.8+Distribution.a09aa96.20210506210658.1
    • K8s cluster version: Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-09T11:26:42Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:04:18Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
  • Compile error, missing HTTP Tables.

    Describe the bug Cannot run any scripts due to an HTTP events table not being found.

    Script compilation failed: L222 : C22  Table 'http_events' not found.\n
    

    To Reproduce Steps to reproduce the behavior: Install a fresh version of Pixie on a Minikube cluster.

    Expected behavior Pixie scripts to execute

    Screenshots

    Logs Please attach the logs by running the following command:

    ./px collect-logs (See Zip File) 
    

    pixie_logs_20210505024739.zip

    App information (please complete the following information):

    • Pixie version: 0.5.3+Distribution.0ff53f6.20210503183144.1
    • K8s cluster version: v1.20.2
  • No Kafka data available

    Describe the bug After installing Pixie in our Kubernetes cluster, there seems to be no data available when running the Kafka scripts; everything is empty. I used the Helm installation scenario described in the documentation at https://docs.px.dev/installing-pixie/install-schemes/helm, and I can see data from scripts other than the Kafka ones. We are running Kafka using Strimzi. At the moment the Kafka rates are roughly: 65 kb/s incoming, 80 kb/s outgoing, and ~270 messages per second incoming.

    To Reproduce Steps to reproduce the behavior:

    1. Install Pixie according to Helm instructions on https://docs.px.dev/installing-pixie/install-schemes/helm
    2. Go to https://work.withpixie.ai/ and run Kafka scripts

    Expected behavior I expected to see data from the Kafka cluster that we are running in the k8s cluster when running the Kafka scripts in the same way I can when running other scripts available.

    Screenshots pixie_kafka

    Logs pixie_logs_20220305212109.zip

    
    App information (please complete the following information):

    • Pixie operator v0.0.19
    • K8s 1.22.4 (AKS)
    • Node Kernel version 5.4.0-1065-azure
    • Browser Google Chrome 98.0.4758.80 (Official Build) (64-bit)
    • Strimzi Kafka 0.24.0 and Kafka version 2.7.0

    Additional context
    
  • JAVA profiling is not enabled by default as expected.

    I followed the tutorial and passed the following JVM flags:

        java -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -XX:+PreserveFramePointer

    I also enabled Java debug symbols while compiling:

        apply plugin: 'java-library'
        compileJava {
            options.debug = true
            options.debugOptions.debugLevel = "source,lines,vars"
        }

    Still getting hexadecimal values instead of the method names.

  • Add responseless dns requests to dns_data table.

    This pull request is related to https://github.com/pixie-io/pixie/issues/418, which asks for DNS requests for which no matching response was found to be included in the dns_data table.

    Testing Done

    • [X] Verified that the tests in stitcher_test.cc pass.

    noman@px-dev-docker:/pl/src/px.dev/pixie$ bazel test src/stirling/source_connectors/socket_tracer/protocols/dns/...
    INFO: Analyzed 7 targets (0 packages loaded, 0 targets configured).
    INFO: Found 4 targets and 3 test targets...
    INFO: Elapsed time: 3.078s, Critical Path: 2.69s
    INFO: 11 processes: 1 internal, 10 processwrapper-sandbox.
    INFO: Build completed successfully, 11 total actions
    //src/stirling/source_connectors/socket_tracer/protocols/dns:parse_test  PASSED in 0.0s
    //src/stirling/source_connectors/socket_tracer/protocols/dns:stitcher_test PASSED in 0.0s
    //src/stirling/source_connectors/socket_tracer/protocols/dns:types_test  PASSED in 0.0s
    
    INFO: Build completed successfully, 11 total actions
    noman@px-dev-docker:/pl/src/px.dev/pixie$ cat   /home/noman/.cache/bazel/_bazel_noman/4c31fb537ca0f31ab15bbd6a8445d3b6/execroot/px/bazel-out/k8-fastbuild/testlogs/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test/test.log
    exec ${PAGER:-/usr/bin/less} "$0" || exit 1
    Executing tests from //src/stirling/source_connectors/socket_tracer/protocols/dns:stitcher_test
    I20221010 11:30:13.268741  5618 env.cc:47] Started: /home/noman/.cache/bazel/_bazel_noman/4c31fb537ca0f31ab15bbd6a8445d3b6/sandbox/processwrapper-sandbox/21/execroot/px/bazel-out/k8-fastbuild/bin/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test.runfiles/px/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test
    [==========] Running 2 tests from 1 test suite.
    [----------] Global test environment set-up.
    [----------] 2 tests from DnsStitcherTest
    [ RUN      ] DnsStitcherTest.RecordOutput
    [       OK ] DnsStitcherTest.RecordOutput (0 ms)
    [ RUN      ] DnsStitcherTest.OutOfOrderMatching
    [       OK ] DnsStitcherTest.OutOfOrderMatching (0 ms)
    [----------] 2 tests from DnsStitcherTest (0 ms total)
    
    [----------] Global test environment tear-down
    [==========] 2 tests from 1 test suite ran. (0 ms total)
    [  PASSED  ] 2 tests.
    I20221010 11:30:13.269136  5618 env.cc:51] Shutting down
    
  • Pixie is missing data about many pods and services in the cluster

    Describe the bug

    I encountered an issue in a self-hosted installation where Pixie is missing information about the cluster

    E.g. When I checked the pods in a namespace using the px/namespace script from the UI and CLI, only 8 pods were shown. But when I checked with kubectl, I saw 90+ pods. Similarly, Pixie showed 6 services whereas kubectl showed 40+ services.

    Also, at times, when I try to view details of a Pod in the Pixie UI, there is no data for it. E.g. I selected a running pod from the cluster and entered its name in the px/pod script in the UI, but nothing was shown for it. I could only see a "No data available for inbound_requests table" message. (All the widgets in px/pod had the same "no data available" error message.) The start time I set in the Pixie UI was less than the pod's uptime as well.

    I also noticed that autocomplete in the Pixie UI doesn't show the correct resource at times. E.g. In px/pod, the pod that is shown by autocomplete does not exist in the cluster (Probably replaced by a new pod).

    I also noticed the following in the deployment: the vizier-metadata and vizier-cloud-connector pods had many restarts. When I checked the pods, the state change reason for the container was shown as Error.

    At times, newly created pods do appear in Pixie, so this doesn't seem to be a case where Pixie is unable to get any information at all about new pods.

    To Reproduce Not sure how to reproduce this

    Expected behavior Expected to see all pods and services of the cluster in Pixie

    Logs Log lines containing "error" in vizier-metadata. I am including all the repeated log lines to show that they were printed within a short interval (multiple lines within the same second in some cases).

    kubectl logs -f vizier-metadata-0 -n pl | grep -i Error
    time="2022-07-13T17:34:32Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:32Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:36Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:36Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:39Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:56:40Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:56:42Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T18:04:05Z" level=info msg="Failed to get update version for topic" error="<nil>"
    

    vizier-cloud-connector had the following error repeated multiple times

    time="2022-07-13T18:34:46Z" level=info msg="failed vizier health check, will restart healthcheck" error="rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: INTERNAL_ERROR"
    time="2022-07-13T18:34:46Z" level=info msg="failed vizier health check, will restart healthcheck" error="context canceled"
    

    App information (please complete the following information):

    • Pixie version: 0.7.14
    • K8s cluster version: 1.21.9
    • Node Kernel version:
    • Browser version: Chrome Version 103.0.5060.114 (Official Build) (x86_64)

    Additional context

  • gRPC-c data parsing

    Stirling now registers on perf buffers where the gRPC-c eBPF module writes data to. There are 3 buffers:

    1. gRPC events
    2. gRPC headers
    3. close events

    The logic of handling gRPC sessions works for Golang events. This logic is now used for gRPC-c events as well. The data that the gRPC-c eBPF module passes to the user-space differs from the data that the Golang gRPC eBPF module passes to the user-space. This PR is basically an abstraction layer that "translates" gRPC-c eBPF events to the known format of Golang gRPC events.

    gRPC-c events are still not enabled; they will be enabled in the next PR, where the needed probes will be attached by the UProbeManager. However, the gRPC-c eBPF program is now compiled, because the perf buffers must exist in order for the code to find them.

  • Failed Deploy Pixie using YAML

    Describe the bug I used self-hosted Pixie. When I got to the third step, "Deploy Pixie", I deployed Pixie using YAML.

    When I executed the command:

        kubectl apply -f pixie_yamls/

    the job 5ff7d47213a8875e3f1827d728b149498a0fdef08ba74499866d7d51a4b0147 in px-operator was not ready.

    The reason I found for this error was ImagePullBackOff.

    I tried to modify the job batch but got an error.

    I just want to modify imagePullPolicy or image. How can I modify this job's YAML?

  • https://github.com/pixie-io/pixie/blob/main/DEVELOPMENT.md - issue with the cloud secrets

    Describe the bug Need an addition to the documentation; following the steps blindly doesn't work. After running the following command: ./scripts/deploy_cloud_prereqs.sh plc-dev dev I get the following error: Credentials path "/private/credentials/k8s/dev" does not exist. Did you slect the right secret type?

    To Reproduce Steps to reproduce the behavior:

    1. Checkout pixie repo
    2. Follow the steps mentioned here: https://github.com/pixie-io/pixie/blob/main/DEVELOPMENT.md
    3. Under the step titled 'Load the config maps and secrets.', run the command: ./scripts/deploy_cloud_prereqs.sh plc-dev dev
    4. We get the following error: Credentials path "/private/credentials/k8s/dev" does not exist. Did you slect the right secret type?

    Expected behavior The setup steps should work seamlessly.

  • Add global.cluster recognition for clusterName

    Signed-off-by: Max Lemieux [email protected]

    What type of PR is this?

    /kind feature

    What this PR does / why we need it:

    Currently, when this chart is installed as a subchart, the pixie-chart.clusterName value must still be set, although other charts may already set a required value global.cluster to store the cluster name.

    As a person installing the chart bundle as a subchart, this is not a great experience as I must enter the same value twice - only for this value.

    What is the testplan for the PR:

    Which issue(s) this PR fixes:

    Fixes #

    Special notes for your reviewer:

    This change makes it unnecessary to set pixie-chart.clusterName for the chart bundle, as long as global.cluster is set.

    Does this PR introduce a user-facing change?

    
    

    Additional documentation:

    
    
  • Custom registry not used for vizier-operator pod

    Describe the bug Custom registry, although set, is not used for vizier-operator pod. It uses this image instead:

    gcr.io/pixie-oss/pixie-prod/operator/operator_image:0.0.34
    

    When pixie-chart.registry is set, all pods should use the custom image registry. The vizier-operator pod in the px-operator namespace does not use the custom registry.

    To Reproduce Steps to reproduce the behavior:

    1. Install Pixie with custom registry
    2. View description of vizier-operator pod
    3. Notice it doesn't use custom registry (and there are probably other errors related to this in a restricted environment)

    Expected behavior vizier-operator pod uses custom registry to host its app container image.

    App information (please complete the following information):

    • Pixie version: Latest New Relic bundle chart 5.0.2 with Pixie operator version 0.0.34
    • K8s cluster version: 1.25/EKS
    • Node Kernel version: 5.4.219-126.411.amzn2.x86_64
    • Browser version: Chrome latest

    Additional context I was able to host the custom image repository at $registry/gcr.io-pixie-oss-pixie-prod-operator-operator_image (as with the vizier images) and update the ClusterServiceVersion in the px-operator namespace to use the custom image. This seems to work (the vizier-operator pod uses the expected image), but I'm not sure if this is the right way to do it.

  • failed to connect to DB

    Describe the bug Some pods are not Running.

    kubectl logs for the failing pods shows:

        time="2022-12-22T09:20:38Z" level=error msg="failed to connect to DB, retrying" error="FATAL: password authentication failed for user "pl" (SQLSTATE 28P01)"

    kubectl get secrets shows pl-db-secrets (Opaque, 3 data items, 25m old).

    Remark I used the tag release/cloud/prod/1670351615.

  • Self-Hosted Pixie deploy, The provided credentials are invalid

    Describe the bug Self-hosted Pixie deploy; all pods are running. I use Chrome to visit https://work.dev.withpixie.dev/ and log in to the admin account using [email protected] for the email and admin for the password. I get the notification: The provided credentials are invalid, check for spelling mistakes in your password or username, email address, or phone number.

    I don't modify k8s/cloud/base/ory_auth/kratos/kratos_deployment.yaml.

    To Reproduce login https://work.dev.withpixie.dev/

    Expected behavior Login succeed

    Screenshots

    App information (please complete the following information): Version:release/cloud/prod/1670351615

    • gcr.io/pixie-oss/pixie-prod/cloud/auth_server_image:1670351615
    • gcr.io/pixie-oss/pixie-prod/cloud/api_server_image:1670351615
    • ...