Instant Kubernetes-Native Application Observability



What is Pixie?


Pixie gives you instant visibility into your Kubernetes applications by providing access to metrics, events, traces, and logs without any code changes.

We're building up Pixie for broad use by the end of 2020. If you are interested, feel free to try our community beta and join our community on Slack.



Quick Start

Review Pixie's requirements to make sure that your Kubernetes cluster is supported.

Signup

Visit our product page and sign up with your Google account.

Install CLI

Run the command below:

bash -c "$(curl -fsSL https://withpixie.ai/install.sh)"

Or see our Installation Docs to install Pixie using Docker, Debian, RPM or with the latest binary.

(optional) Set up a sandbox

If you don't already have a K8s cluster available, you can use Minikube to set up a local environment:

  • On Linux, run minikube start --cpus=4 --memory=6000 --driver=kvm2 -p=<cluster-name>. The default docker driver is not currently supported, so using the kvm2 driver is important.

  • On Mac, run minikube start --cpus=4 --memory=6000 -p=<cluster-name>.

More detailed instructions are available here.

Start a demo app:

🚀 Deploy Pixie

Use the CLI to deploy the Pixie Platform in your K8s cluster by running:

px deploy

Alternatively, you can deploy with YAML or Helm.


Check out our install guides and walkthrough videos for alternate install schemes.

Get Instant Auto-Telemetry

Run scripts with px CLI

CLI Demo


Service SLA:

px run px/service_stats


Node health:

px run px/node_stats


MySQL metrics:

px run px/mysql_stats


Explore more scripts by running:

px scripts list


Check out our pxl_scripts repo for more examples.


View machine generated dashboards with Live views

CLI Demo

The Pixie Platform auto-generates "Live View" dashboards to visualize script results.

You can view them by clicking on the URLs prompted by px or by visiting:

https://work.withpixie.ai/live


Pipe Pixie dust into any tool

CLI Demo

You can transform and pipe your script results into any other system or workflow by consuming px results with tools like jq.

Example with http_data:

px run px/http_data -o json | jq -r .

More examples here
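The JSON output can also be consumed programmatically instead of through jq. A minimal sketch in Python, assuming `px run ... -o json` emits one JSON record per line (newline-delimited JSON; verify against your px version) and using hypothetical `px/http_data`-style field names:

```python
import json

def parse_px_json(text):
    # Parse newline-delimited JSON: one record per non-empty line
    # (assumed output format of `px run <script> -o json`).
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical sample resembling px/http_data records.
sample = "\n".join([
    '{"req_path": "/healthz", "resp_status": 200}',
    '{"req_path": "/api/v1/users", "resp_status": 500}',
])

# Keep only server errors -- the same filtering jq can do with select().
server_errors = [r for r in parse_px_json(sample) if r["resp_status"] >= 500]
print(server_errors)
```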


To see more script examples and learn how to write your own, check out our docs.


Contributing

We're excited to have you contribute to Pixie. Our community has adopted the Contributor Covenant as its code of conduct, and we expect all participants to adhere to it. Please report any violations to [email protected]. All code contributions require the Contributor License Agreement. The CLA can be signed when creating your first PR.

There are many ways to contribute to Pixie:

  • Bugs: Something not working as expected? Send a bug report.
  • Features: Need new Pixie capabilities? Send a feature request.
  • Views & Scripts Requests: Need help building a live view or PxL script? Send a live view request.
  • PxL Scripts: PxL scripts are used to extend Pixie functionality. They are an excellent way to contribute to golden debugging workflows. Look here for more information.
  • Pixienaut Community: Interested in becoming a Pixienaut and in helping shape our community? Apply here.

Open Source

Along with building Pixie as a freemium SaaS product, contributing open and accessible projects to the broader developer community is integral to our roadmap.

We plan to contribute in two ways:

  • Open-Sourced Pixie Platform Primitives: We plan to open-source components of the Pixie Platform which can be independently useful to developers after our Beta. These include our Community PxL Scripts, Pixie CLI, eBPF Collectors, etc. If you are interested in contributing during our Beta, email us.
  • Unlimited Pixie Community Access: Our Pixie Community product is a completely free offering with all core features of the Pixie developer experience. We will invest in this offering for the long term to give developers across the world an easy and zero cost way to use Pixie.

Under the Hood

Three fundamental innovations enable Pixie's magical developer experience:

Progressive Instrumentation: Pixie Edge Modules (“PEMs”) collect full body request traces (via eBPF), system metrics & K8s events without the need for code-changes and at less than 5% overhead. Custom metrics, traces & logs can be integrated into the Pixie Command Module.

In-Cluster Edge Compute: The Pixie Command Module is deployed in your K8s cluster to isolate data storage and computation within your environment for drastically better intelligence, performance & security.

Command Driven Interfaces: Programmatically access data via the Pixie CLI and Pixie UI which are designed ground-up to allow you to run analysis & debug scenarios faster than any other developer tool.

For more information on the Pixie Platform's architecture, check out our docs or overview deck.

Resources

About Us

Founded in late 2018, we are a San Francisco based stealth machine intelligence startup. Our north star is to build a new generation of intelligent products which empower developers to engineer the future.

We're heads down building Pixie and excited to share it broadly with the community later this year. If you're interested in learning more about us or in our current job openings, we'd love to hear from you.

License

Pixie Community is the free offering of Pixie's proprietary SaaS product catalogue.

Our PxL Scripts are licensed under Apache License, Version 2.0.

Other Pixie Platform components such as the Pixie CLI and eBPF-based data collectors will also be licensed under the Apache License, Version 2.0. Contributions of these components are planned for Oct 2020.

Comments
  • Self-Hosted Pixie Install Script

    Self-Hosted Pixie Install Script

    Is your feature request related to a problem? Please describe. We would like to have an install experience for the self-hosted version of Pixie that is as easy to use as the one hosted on withpixie.ai.

    Additional context Our team has been busy at work this month open sourcing Pixie's source code, docs, website, and other assets. We are also actively applying to become a CNCF sandbox project!

    One of our last remaining items is to publish an install script to deploy a self-hosted version of Pixie.

    Who offers a hosted version of Pixie?

    New Relic currently offers a 100% free hosted version of Pixie Cloud. This hosting has no contingencies and will be offered indefinitely to the Pixie Community. All the code used for hosting is open source, including our production manifest files.

    What will the Self-Hosted install script do?

    The self-hosted install script will deploy Pixie Cloud so that you can use Pixie without any external dependencies. This is the exact version of Pixie Cloud we deploy, so it will behave exactly like the hosted version, but it will require your own management and configuration.

    What is the timeline? 

    Good question. :) We had planned to open source this script by 5/4. Unfortunately, we didn't make it. We need more time to ensure that deploying self-hosted Pixie Cloud is just as easy as installing the hosted version of Pixie (in < 2 minutes!).

    But I really want to run a Self-Hosted Pixie...now!

    Technically you can build and run a self-hosted Pixie using Skaffold. Check out:

    https://github.com/pixie-labs/pixie/blob/main/skaffold/skaffold_cloud.yaml https://github.com/pixie-labs/pixie/tree/main/k8s/cloud https://github.com/pixie-labs/pixie/tree/main/k8s/cloud_deps

    These directions are not fully documented, and the team is choosing to focus on quickly delivering the self-hosted install script. We'll be iterating continually on the documentation to make the project more open-source friendly.

  • google login hangs

    google login hangs

    Trying the pixie online installer. After signing up with google, login hangs forever with:

    Authenticating
    Logging in...
    

    To Reproduce Steps to reproduce the behavior:

    1. Go to signup, use google
    2. Login with google

    Expected behavior To be logged in

  • Modify `dns_data` table to record DNS requests with no response

    Modify `dns_data` table to record DNS requests with no response

    Problem: Sometimes DNS fails without the pod receiving DNS errors (e.g. if the network packets carrying the DNS response are being dropped).

    Solution: Change the dns_data table to record DNS requests without responses, so that you can track the number of DNS requests that remain unanswered over time.

  • [Doc issue] no ingress installed so dev_dns_updater did nothing

    [Doc issue] no ingress installed so dev_dns_updater did nothing

    Describe the bug I've been following the documentation to deploy Pixie Cloud; the setup-dns section says it updates /etc/hosts if there are any Ingress rules in the K8s cluster. But there weren't any!

    ➜  pixie git:(main) ✗ kubectl get ing
    No resources found in default namespace.
    ➜  pixie git:(main) ✗ kubectl get ing -n plc
    No resources found in plc namespace.
    

    So, of course, it doesn't change anything:

    ➜  pixie git:(main) ✗ ./dev_dns_updater --domain-name="dev.withpixie.dev"  --kubeconfig=$HOME/.kube/config --n=plc
    INFO[0000] DNS Entries                                   entries="dev.withpixie.dev, work.dev.withpixie.dev, segment.dev.withpixie.dev, docs.dev.withpixie.dev" service=cloud-proxy-service
    INFO[0000] DNS Entries                                   entries=cloud.dev.withpixie.dev service=vzconn-service
    

    It didn't change /etc/hosts file!

    To Reproduce

    Expected behavior Should update /etc/hosts and we could visit dev.withpixie.dev in browser.

    Screenshots

    Logs

    App information (please complete the following information):

    • Pixie version: master branch
    • K8s cluster version: minikube on macOS 10.15.7 k8s version v1.22.2

    Additional context

  • Can't install pixie to completely air gapped environment

    Can't install pixie to completely air gapped environment

    Describe the bug Can't install pixie to completely air gapped environment.

    To Reproduce Currently I'm trying to install it via YAML scheme. I've already pushed all images mentioned in manifests generated on extract manifests step to my local artifactory and replaced original images links with local ones, but during installation pixie still tries to download some images (e.g. busybox:1.28.0-glibc and nats:1.3.0) from the internet.

    Expected behavior Be able to install pixie to self-hosted k8s cluster with no access to the internet.

    Logs Please attach the logs by running the following command:

    [root@localhost pixie_yamls]# kubectl get pods -n pl
    NAME                                      READY   STATUS                       RESTARTS   AGE
    etcd-operator-6c6f8cb48d-q5t8q            1/1     Running                      0          43m
    kelvin-6c67584687-pwlrg                   0/1     Init:0/1                     0          42m
    nats-operator-7bbff5c756-tt2rl            1/1     Running                      0          43m
    pl-etcd-zs25zbm5ln                        0/1     Init:ImagePullBackOff        0          41m
    pl-nats-1                                 0/1     ImagePullBackOff             0          42m
    vizier-certmgr-58d97fd6b5-8wp9n           0/1     CreateContainerConfigError   0          42m
    vizier-cloud-connector-74c5c84487-m4bmq   1/1     Running                      1          42m
    vizier-metadata-6bc96dd78-g9brg           0/1     Init:0/2                     0          42m
    vizier-pem-bv858                          0/1     Init:0/1                     0          42m
    vizier-pem-dktqv                          0/1     Init:0/1                     0          42m
    vizier-pem-ftd66                          0/1     Init:0/1                     0          42m
    vizier-pem-gmrfq                          0/1     Init:0/1                     0          42m
    vizier-pem-j7xmx                          0/1     Init:0/1                     0          42m
    vizier-pem-jxl7j                          0/1     Init:0/1                     0          42m
    vizier-pem-kcfbf                          0/1     Init:0/1                     0          42m
    vizier-pem-mgzgj                          0/1     Init:0/1                     0          42m
    vizier-pem-v7k7q                          0/1     Init:0/1                     0          42m
    vizier-proxy-8568c9bd48-fdccm             0/1     CreateContainerConfigError   0          42m
    vizier-query-broker-7b74f9cbdc-265m4      0/1     Init:0/1                     0          42m
    
    [root@localhost pixie_yamls]# kc describe pod pl-etcd-zs25zbm5ln -n pl
    Name:         pl-etcd-zs25zbm5ln
    Namespace:    pl
    ...
    Events:
      Type     Reason     Age                  From                             Message
      ----     ------     ----                 ----                             -------
      Normal   Scheduled  56m                  default-scheduler                Successfully assigned pl/pl-etcd-zs25zbm5ln to xxx
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:34516->23.23.116.141:443: read: connection reset by peer
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:59176->54.224.119.26:443: read: connection reset by peer
      Warning  Failed     55m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:42888->107.23.149.57:443: read: connection reset by peer
      Warning  Failed     54m (x4 over 55m)    kubelet, xxx  Error: ErrImagePull
      Normal   Pulling    54m (x4 over 55m)    kubelet, xxx  Pulling image "busybox:1.28.0-glibc"
      Warning  Failed     54m                  kubelet, xxx  Failed to pull image "busybox:1.28.0-glibc": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.33:41714->34.238.187.50:443: read: connection reset by peer
      Normal   BackOff    45m (x43 over 55m)   kubelet, xxx  Back-off pulling image "busybox:1.28.0-glibc"
      Warning  Failed     48s (x234 over 55m)  kubelet, xxx  Error: ImagePullBackOff
    
    
    [root@localhost pixie_yamls]# kc describe pod pl-nats-1 -n pl
    Name:         pl-nats-1
    Namespace:    pl
    ...
    Events:
      Type     Reason       Age                    From                             Message
      ----     ------       ----                   ----                             -------
      Normal   Scheduled    57m                    default-scheduler                Successfully assigned pl/pl-nats-1 to yyy
      Warning  FailedMount  57m (x6 over 57m)      kubelet, yyy  MountVolume.SetUp failed for volume "server-tls-certs" : secret "service-tls-certs" not found
      Warning  Failed       56m                    kubelet, yyy  Failed to pull image "nats:1.3.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.18:32860->3.220.36.210:443: read: connection reset by peer
      Warning  Failed       56m                    kubelet, yyy  Failed to pull image "nats:1.3.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: read tcp 192.168.0.18:52026->107.23.149.57:443: read: connection reset by peer
      Warning  Failed       2m26s (x227 over 56m)  kubelet, yyy Error: ImagePullBackOff
    

    App information (please complete the following information):

    • Pixie version: Pixie CLI 0.5.8+Distribution.a09aa96.20210506210658.1
    • K8s cluster version: Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-09T11:26:42Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:04:18Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
  • Compile error, missing HTTP Tables.

    Compile error, missing HTTP Tables.

    Describe the bug Cannot run any scripts due to a HTTP Event module not found?

    Script compilation failed: L222 : C22  Table 'http_events' not found.\n
    

    To Reproduce Steps to reproduce the behavior: Install fresh version of Pixie on Minikube Cluster

    Expected behavior Pixie scripts to execute

    Screenshots

    Logs Please attach the logs by running the following command:

    ./px collect-logs (See Zip File) 
    

    pixie_logs_20210505024739.zip App information (please complete the following information):

    • Pixie version: 0.5.3+Distribution.0ff53f6.20210503183144.1
    • K8s cluster version: v1.20.2
  • No Kafka data available

    No Kafka data available

    Describe the bug After installing Pixie in our Kubernetes cluster, there seems to be no data available when running the Kafka scripts; the results are empty. I used the Helm installation scenario described in the documentation (https://docs.px.dev/installing-pixie/install-schemes/helm), and I can see data from scripts other than Kafka. We are running Kafka using Strimzi. At the moment the Kafka rates are roughly: 65 kB/s incoming, 80 kB/s outgoing, and ~270 incoming messages per second.

    To Reproduce Steps to reproduce the behavior:

    1. Install Pixie according to Helm instructions on https://docs.px.dev/installing-pixie/install-schemes/helm
    2. Go to https://work.withpixie.ai/ and run Kafka scripts

    Expected behavior I expected to see data from the Kafka cluster that we are running in the k8s cluster when running the Kafka scripts in the same way I can when running other scripts available.

    Screenshots pixie_kafka

    Logs pixie_logs_20220305212109.zip

    
    App information (please complete the following information):

    • Pixie operator v0.0.19
    • K8s 1.22.4 (AKS)
    • Node Kernel version 5.4.0-1065-azure
    • Browser Google Chrome 98.0.4758.80 (Official Build) (64-bit)
    • Strimzi Kafka 0.24.0 and Kafka version 2.7.0

    Additional context
    
  • JAVA profiling is not enabled by default as expected.

    JAVA profiling is not enabled by default as expected.

    I followed the tutorial. I passed the JVM flags java -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -XX:+PreserveFramePointer

    Also, while compiling, I used apply plugin: 'java-library' and compileJava { options.debug = true options.debugOptions.debugLevel = "source,lines,vars" }

    to enable Java debug symbols.

    Still getting the hexadecimal values instead of the method names.

  • Add responseless dns requests to dns_data table.

    Add responseless dns requests to dns_data table.

    This pull request is related to https://github.com/pixie-io/pixie/issues/418, which asks for DNS requests for which no matching response was found to be included in the dns_data table.

    Testing Done

    • [X] Verified that the tests in stitcher_test.cc pass.

    noman@px-dev-docker:/pl/src/px.dev/pixie$ bazel test src/stirling/source_connectors/socket_tracer/protocols/dns/...
    INFO: Analyzed 7 targets (0 packages loaded, 0 targets configured).
    INFO: Found 4 targets and 3 test targets...
    INFO: Elapsed time: 3.078s, Critical Path: 2.69s
    INFO: 11 processes: 1 internal, 10 processwrapper-sandbox.
    INFO: Build completed successfully, 11 total actions
    //src/stirling/source_connectors/socket_tracer/protocols/dns:parse_test  PASSED in 0.0s
    //src/stirling/source_connectors/socket_tracer/protocols/dns:stitcher_test PASSED in 0.0s
    //src/stirling/source_connectors/socket_tracer/protocols/dns:types_test  PASSED in 0.0s
    
    INFO: Build completed successfully, 11 total actions
    noman@px-dev-docker:/pl/src/px.dev/pixie$ cat   /home/noman/.cache/bazel/_bazel_noman/4c31fb537ca0f31ab15bbd6a8445d3b6/execroot/px/bazel-out/k8-fastbuild/testlogs/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test/test.log
    exec ${PAGER:-/usr/bin/less} "$0" || exit 1
    Executing tests from //src/stirling/source_connectors/socket_tracer/protocols/dns:stitcher_test
    I20221010 11:30:13.268741  5618 env.cc:47] Started: /home/noman/.cache/bazel/_bazel_noman/4c31fb537ca0f31ab15bbd6a8445d3b6/sandbox/processwrapper-sandbox/21/execroot/px/bazel-out/k8-fastbuild/bin/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test.runfiles/px/src/stirling/source_connectors/socket_tracer/protocols/dns/stitcher_test
    [==========] Running 2 tests from 1 test suite.
    [----------] Global test environment set-up.
    [----------] 2 tests from DnsStitcherTest
    [ RUN      ] DnsStitcherTest.RecordOutput
    [       OK ] DnsStitcherTest.RecordOutput (0 ms)
    [ RUN      ] DnsStitcherTest.OutOfOrderMatching
    [       OK ] DnsStitcherTest.OutOfOrderMatching (0 ms)
    [----------] 2 tests from DnsStitcherTest (0 ms total)
    
    [----------] Global test environment tear-down
    [==========] 2 tests from 1 test suite ran. (0 ms total)
    [  PASSED  ] 2 tests.
    I20221010 11:30:13.269136  5618 env.cc:51] Shutting down
    
  • Pixie is missing data about many pods and services in the cluster

    Pixie is missing data about many pods and services in the cluster

    Describe the bug

    I encountered an issue in a self-hosted installation where Pixie is missing information about the cluster

    E.g. when I checked the pods in a namespace using the px/namespace script from the UI and CLI, only 8 pods were shown. But when I checked with kubectl, I saw 90+ pods. Similarly, Pixie showed 6 services, whereas kubectl showed 40+ services.

    Also, at times, when I try to view the details of a pod in the Pixie UI, there is no data for it. E.g. I selected a running pod from the cluster and entered its name in the px/pod script in the UI, but nothing was shown for it. I could only see a No data available for inbound_requests table message. (All the widgets in px/pod had the same no-data-available error message.) The start time I set in the Pixie UI was less than the pod's uptime as well.

    I also noticed that autocomplete in the Pixie UI doesn't show the correct resource at times. E.g. In px/pod, the pod that is shown by autocomplete does not exist in the cluster (Probably replaced by a new pod).

    I also noticed that the vizier-metadata and vizier-cloud-connector pods in the deployment had many restarts. When I checked the pods, the state-change reason for the container was shown as Error.

    At times, newly created pods appear in Pixie, so this doesn't seem to be a case where Pixie is unable to get any information at all about new pods.

    To Reproduce Not sure how to reproduce this

    Expected behavior Expected to see all pods and services of the cluster in Pixie

    Logs Log lines containing "error" in vizier-metadata. I am including all the repeated log lines to show that they were printed within a short interval (multiple lines within the same second at times).

    kubectl logs -f vizier-metadata-0 -n pl | grep -i Error
    time="2022-07-13T17:34:32Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:32Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:33Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:34Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:35Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:36Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:36Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:34:39Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:56:40Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T17:56:42Z" level=info msg="Failed to get update version for topic" error="<nil>"
    time="2022-07-13T18:04:05Z" level=info msg="Failed to get update version for topic" error="<nil>"
    

    vizier-cloud-connector had the following error repeated multiple times

    time="2022-07-13T18:34:46Z" level=info msg="failed vizier health check, will restart healthcheck" error="rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: INTERNAL_ERROR"
    time="2022-07-13T18:34:46Z" level=info msg="failed vizier health check, will restart healthcheck" error="context canceled"
    

    App information (please complete the following information):

    • Pixie version: 0.7.14
    • K8s cluster version: 1.21.9
    • Node Kernel version:
    • Browser version: Chrome Version 103.0.5060.114 (Official Build) (x86_64)

    Additional context

  • gRPC-c data parsing

    gRPC-c data parsing

    Stirling now registers on the perf buffers to which the gRPC-c eBPF module writes data. There are 3 buffers:

    1. gRPC events
    2. gRPC headers
    3. close events

    The logic of handling gRPC sessions works for Golang events. This logic is now used for gRPC-c events as well. The data that the gRPC-c eBPF module passes to the user-space differs from the data that the Golang gRPC eBPF module passes to the user-space. This PR is basically an abstraction layer that "translates" gRPC-c eBPF events to the known format of Golang gRPC events.

    gRPC-c events are still not enabled; they will be enabled in the next PR, where the needed probes will be attached by the UProbeManager. However, the gRPC-c eBPF program is now compiled, because the perf buffers must exist in order for the code to find them.

  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.
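For readers unfamiliar with the bug class, here is a minimal sketch of the kind of check such a patch performs (names like `safe_extractall` are illustrative, not the actual PR code): reject any tar member whose resolved path escapes the destination directory before extracting.

```python
import os
import tarfile

def is_within_directory(directory, target):
    # True if `target`, once resolved, still lives under `directory`.
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonprefix(
        [abs_directory + os.sep, abs_target + os.sep]) == abs_directory + os.sep

def safe_extractall(tar, path="."):
    # Validate every member first; a name like "../evil.txt" would
    # otherwise be written outside `path` (CVE-2007-4559).
    for member in tar.getmembers():
        if not is_within_directory(path, os.path.join(path, member.name)):
            raise RuntimeError(
                f"blocked path traversal in tar member: {member.name}")
    tar.extractall(path)
```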

  • CQL Socket Tracer does not capture INSERT

    CQL Socket Tracer does not capture INSERT

    Describe the bug I'm building a feature on top of CQL data and recently discovered that the socket tracer does not capture INSERT calls. I'm not sure of the exact cause, but I'm able to pick up SELECT * calls in the tracer, never INSERT calls. I am also able to see the result of each insert in the returned values of subsequent SELECT calls, but only the SELECT is recorded by Pixie.

    To Reproduce Steps to reproduce the behavior:

    1. Install latest Pixie
    2. Install the k8ssandra demo: https://docs-v2.k8ssandra.io/install/local/single-cluster-helm/
    3. Run queries against the k8ssandra setup.
    4. Open cql_data, search for insert calls and find none.

    I also ran the following script to filter for any Execute calls, or a matching req_body, or anything like that, but nothing comes up.

    import px
    
    df = px.DataFrame(table='cql_events', start_time='-5m')
    df.req_cmd = px.cql_opcode_name(df.req_op)
    df = df[df.req_cmd == 'Execute' or (df.req_op == 10 or px.contains(px.tolower(df.req_body), 'insert')) ]
    px.display(df)
    

    Expected behavior I expect to see 1 record with the insert body after running INSERT.

  • Create a normalize_cql for Cassandra queries

    Create a normalize_cql for Cassandra queries

    Is your feature request related to a problem? Please describe. We have a normalize_pgsql and a normalize_mysql that we use to clean up the queries into a unified format that ignores arguments passed to queries. It would be nice to have the same thing for CQL (Cassandra) as well. A work-around is to use the pgsql function and just default to the req_cmd like so:

          def cql_opname(df, input_col_name, output_col_name):
              df[output_col_name] = 'Unknown'
              df[output_col_name] = px.select(df[input_col_name]==1, 'Startup', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==5, 'Options', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==7, 'Query', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==9, 'Prepare', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==10, 'Execute', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==11, 'Register', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==13, 'Batch', df[output_col_name])
              df[output_col_name] = px.select(df[input_col_name]==15, 'AuthResponse', df[output_col_name])
              return df
          df = ...
          ...
          df = cql_opname(df, 'req_op', 'req_cmd')
          df.query_struct = px.normalize_pgsql(df.req_body, df.req_cmd)
          df.query = px.select(px.pluck(df.query_struct, 'error') == '', px.pluck(df.query_struct, 'query'), df.req_cmd)
    

    Describe the solution you'd like I'd like a px.normalize_cql(<req_body>, <req_cmd>) that parses the CQL natively instead.
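    The normalization itself is straightforward to sketch outside PxL: strip literal arguments and collapse whitespace so that queries differing only in bound values map to the same template. A minimal Python sketch of that idea (the name normalize_cql and the regexes are assumptions for illustration, not part of Pixie's API):

    ```python
    import re

    def normalize_cql(query: str) -> str:
        """Replace literal arguments in a CQL statement with '?' placeholders."""
        normalized = query
        # Single-quoted string literals (CQL escapes quotes by doubling), e.g. 'alice' -> ?
        normalized = re.sub(r"'(?:[^']|'')*'", "?", normalized)
        # Numeric literals (ints and floats) -> ?
        normalized = re.sub(r"\b\d+(?:\.\d+)?\b", "?", normalized)
        # Collapse runs of whitespace into single spaces
        normalized = re.sub(r"\s+", " ", normalized).strip()
        return normalized
    ```

    With this, normalize_cql("INSERT INTO users (id, name) VALUES (42, 'alice')") yields "INSERT INTO users (id, name) VALUES (?, ?)", the same template regardless of the bound values.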

  • Add k8ssandra to the demos

    Is your feature request related to a problem? Please describe. k8ssandra is a great way to demo Cassandra data. We should add it to the demos and to our release test process. https://docs-v2.k8ssandra.io/install/local/single-cluster-helm/

    Describe the solution you'd like

    px demo deploy k8ssandra
    
  • /proc/PID/stat mishandles whitespace

    ProcParser::ParseProcPIDStat() mishandles whitespace characters in the comm field in multiple ways:

    if (std::getline(ifs, line)) {
    

    This reads only one line, but comm can contain newline characters.

    std::vector<std::string_view> split = absl::StrSplit(line, " ", absl::SkipWhitespace());
    

    As I understand it, with SkipWhitespace, multiple consecutive spaces are treated as if they were a single one. But for command_offset, every space is counted.

    if (split.size() < kProcStatNumFields) {
    

    This check doesn't take the spaces inside comm into account at all.
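    The robust approach (what a fix would need to do) is to bracket comm with the first '(' and the last ')' — since comm may itself contain spaces, parentheses, and newlines — and only then split the remaining fields on whitespace. A Python sketch of that technique (field positions follow proc(5); this illustrates the parsing strategy, not the ProcParser code itself):

    ```python
    def parse_proc_pid_stat(stat: str) -> dict:
        """Parse /proc/PID/stat contents, tolerating whitespace in comm."""
        # comm is the only field that can contain spaces or newlines; it is
        # always wrapped in parentheses, so locate the first '(' and last ')'.
        open_paren = stat.index("(")
        close_paren = stat.rindex(")")
        pid = int(stat[:open_paren].strip())
        comm = stat[open_paren + 1:close_paren]
        # Everything after the closing ')' is single space-separated tokens.
        rest = stat[close_paren + 1:].split()
        return {"pid": pid, "comm": comm,
                "state": rest[0],          # field 3 in proc(5)
                "ppid": int(rest[1]),      # field 4
                "utime": int(rest[11]),    # field 14
                "stime": int(rest[12])}    # field 15
    ```

    Splitting from the last ')' also sidesteps the line-count problem: the whole file can be read at once instead of via std::getline, so embedded newlines in comm no longer truncate the input.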

  • Failed Deploy Pixie using YAML

    Describe the bug I'm using self-hosted Pixie. At the third step, "Deploy Pixie", I deployed Pixie using YAML.

    When I executed:

    kubectl apply -f pixie_yamls/

    the job 5ff7d47213a8875e3f1827d728b149498a0fdef08ba74499866d7d51a4b0147 in px-operator never became ready. The reason I found for this was ImagePullBackOff.

    I tried to modify the Job resource but got an error. I just want to change imagePullPolicy or the image. How can I modify this job's YAML?
