Visual Data Preparation (VDP) is an open-source tool to streamline the end-to-end visual data processing pipeline:
- Ingest unstructured visual data from data sources such as data lakes or IoT devices;
- Transform visual data to meaningful structured data representations by Vision AI models;
- Load the structured data into warehouses, applications, or other destinations.
The goal of VDP is to seamlessly bring Vision AI into the modern data stack with a standardised framework. Our blog post "Missing piece in modern data stack: visual data preparation" explains how VDP is intended to streamline unstructured visual data processing across different stakeholders.
Code in the main branch tracks under-development progress towards the next release and may not work as expected. If you are looking for a stable alpha version, please use the latest release.
How VDP works
The core concept of VDP is the pipeline. A pipeline is an end-to-end workflow that automates a sequence of tasks to process visual data. Each pipeline consists of three ordered components:
- data source: where the pipeline starts. It connects the source of image and video data to be processed.
- model: a deployed Vision AI model that processes the ingested visual data and generates structured outputs.
- data destination: where the structured outputs are sent.
Depending on its mode, a pipeline ingests and processes the visual data and sends the outputs to the destination each time the trigger event occurs.
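To make the three-component structure concrete, here is a minimal Go sketch of one pipeline trigger. It is an illustration only: the `Source`, `Model`, `Destination`, `Pipeline` and `Detection` types and the in-memory stand-ins are hypothetical names chosen for this example and are not part of the VDP API.

```go
package main

import (
	"errors"
	"fmt"
)

// Detection is a hypothetical structured output for one detected object.
type Detection struct {
	Label string
	Score float64
}

// Source, Model and Destination are illustrative interfaces, not VDP types.
type Source interface {
	Next() ([]byte, error) // returns raw image bytes
}

type Model interface {
	Infer(image []byte) ([]Detection, error)
}

type Destination interface {
	Write(out []Detection) error
}

// Pipeline wires the three ordered components together.
type Pipeline struct {
	Name string
	Src  Source
	Mdl  Model
	Dst  Destination
}

// Trigger runs one pass through the pipeline: ingest, transform, load.
func (p Pipeline) Trigger() error {
	if p.Src == nil || p.Mdl == nil || p.Dst == nil {
		return errors.New("pipeline needs a source, a model and a destination")
	}
	img, err := p.Src.Next()
	if err != nil {
		return fmt.Errorf("ingest: %w", err)
	}
	out, err := p.Mdl.Infer(img)
	if err != nil {
		return fmt.Errorf("transform: %w", err)
	}
	if err := p.Dst.Write(out); err != nil {
		return fmt.Errorf("load: %w", err)
	}
	return nil
}

// staticSource always returns the same image bytes.
type staticSource struct{ image []byte }

func (s staticSource) Next() ([]byte, error) { return s.image, nil }

// fakeModel returns a fixed detection instead of running real inference.
type fakeModel struct{}

func (fakeModel) Infer(image []byte) ([]Detection, error) {
	if len(image) == 0 {
		return nil, errors.New("empty image")
	}
	return []Detection{{Label: "dog", Score: 0.9}}, nil
}

// memoryDestination collects outputs in a slice.
type memoryDestination struct{ rows []Detection }

func (d *memoryDestination) Write(out []Detection) error {
	d.rows = append(d.rows, out...)
	return nil
}

func main() {
	dst := &memoryDestination{}
	p := Pipeline{
		Name: "hello-pipeline",
		Src:  staticSource{image: []byte{0xFF, 0xD8}},
		Mdl:  fakeModel{},
		Dst:  dst,
	}
	if err := p.Trigger(); err != nil {
		fmt.Println("trigger failed:", err)
		return
	}
	fmt.Printf("%s wrote %d detection(s)\n", p.Name, len(dst.rows))
}
```

Keeping each component behind a small interface mirrors the idea that sources and destinations are interchangeable connectors while the order of the three stages stays fixed.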
We use data connector as a general term to represent data source and data destination. Please find the supported data connectors here.
Quick start
Download and run VDP locally
Execute the following commands to start pre-built images with all the dependencies:
$ git clone https://github.com/instill-ai/vdp.git && cd vdp
# Build instill/vdp:dev local development image
$ make build
# Launch all services
$ make all
Run the samples to trigger an object detection pipeline
We provide sample code showing how to build and trigger an object detection pipeline. Run it against the local VDP:
$ cd examples-go
# Download a YOLOv4 ONNX model for the object detection task (GPU not required)
$ curl -o yolov4-onnx-cpu.zip https://artifacts.instill.tech/vdp/sample-models/yolov4-onnx-cpu.zip
# [optional] Download a test image or use your own images
$ curl -o dog.jpg https://artifacts.instill.tech/dog.jpg
# Deploy the model
$ go run deploy-model/main.go --model-path yolov4-onnx-cpu.zip --model-name yolov4
# Test the model
$ go run test-model/main.go --model-name yolov4 --test-image dog.jpg
# Create an object detection pipeline
$ go run create-pipeline/main.go --pipeline-name hello-pipeline --model-name yolov4
# Trigger the pipeline by using the same test image
$ go run trigger-pipeline/main.go --pipeline-name hello-pipeline --test-image dog.jpg
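The source of the sample programs is not reproduced here. As a rough illustration of the command-line shape used above, the sketch below shows how a Go program could read `--pipeline-name` and `--test-image` with the standard `flag` package, which treats one or two leading dashes as equivalent. The `triggerArgs` type and `parseTriggerArgs` function are hypothetical names for this example, and the actual request to VDP is deliberately left out.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// triggerArgs holds the two options passed to the trigger command.
// This type is illustrative and not taken from the VDP samples.
type triggerArgs struct {
	PipelineName string
	TestImage    string
}

// parseTriggerArgs parses --pipeline-name and --test-image from args.
func parseTriggerArgs(args []string) (triggerArgs, error) {
	var a triggerArgs
	fs := flag.NewFlagSet("trigger-pipeline", flag.ContinueOnError)
	fs.StringVar(&a.PipelineName, "pipeline-name", "", "name of the pipeline to trigger")
	fs.StringVar(&a.TestImage, "test-image", "", "path to the image to send")
	if err := fs.Parse(args); err != nil {
		return a, err
	}
	if a.PipelineName == "" || a.TestImage == "" {
		return a, errors.New("both --pipeline-name and --test-image are required")
	}
	return a, nil
}

func main() {
	// A real program would pass os.Args[1:]; fixed arguments keep the sketch deterministic.
	a, err := parseTriggerArgs([]string{
		"--pipeline-name", "hello-pipeline",
		"--test-image", "dog.jpg",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Sending the image to the pipeline is omitted because the API is not described here.
	fmt.Printf("pipeline=%s image=%s\n", a.PipelineName, a.TestImage)
}
```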
Create a pipeline with your own models
Please follow the guide "Prepare your own model to deploy on VDP". Based on the sample code above, you can then deploy a prepared model and create your own pipeline.
Clean up
To clean up all running services:
$ make prune
Documentation
The gRPC protocols defined in protobufs provide the single source of truth for the VDP APIs. To view the generated OpenAPI spec at http://localhost:3000, run:
$ make doc
Community support
For general help using VDP, you can use one of these channels:
- GitHub (bug reports, feature requests, project discussions and contributions)
- Discord (live discussion with the community and the Instill AI Team)
License
See the LICENSE file for licensing information.