Go sqlite3 http vfs: query sqlite databases over http with range headers

sqlite3vfshttp: a Go sqlite VFS for querying databases over http(s)

sqlite3vfshttp is a sqlite3 VFS for querying remote databases over http(s). This allows you to perform queries without needing to download the complete database first.

Your database must be hosted on a webserver that supports HTTP range requests (such as Amazon S3).

Example

See sqlitehttpcli/sqlitehttpcli.go for a simple CLI tool that is able to query a remotely hosted sqlite database.

Usage

	vfs := sqlite3vfshttp.New(*url)

	err := sqlite3vfs.RegisterVFS("httpvfs", vfs)
	if err != nil {
		log.Fatalf("register vfs err: %s", err)
	}

	db, err := sql.Open("sqlite3", "not_a_read_name.db?vfs=httpvfs&mode=ro")
	if err != nil {
		log.Fatalf("open db err: %s", err)
	}

Querying a database in S3

The original purpose of this library was to allow an AWS Lambda function to be able to query a sqlite database stored in S3 without downloading the entire database.

This is possible even for private files stored in S3 by generating a presigned URL and passing that to this library. That allows the client to make HTTP Get range requests without it needing to know how to sign S3 requests.

Building a loadable extension for the sqlite3 cli

The sqlite3 cli supports runtime loadable extensions. We can build sqlite3vfshttp as a shared library, and then load it from the sqlite3 cli to interactively query sqlite databases over http connections. The shared library code is located in the sqlite3http-ext directory. See the sqlite3http-ext/README.md for more details.

Demo

I've uploaded a 30MB sqlite database to a publicly accessible webserver for testing, based on "Balance of payments international investment position: March 2021 quarter – CSV" from https://www.stats.govt.nz/large-datasets/csv-files-for-download/. The schema is:

CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1);

You can query this dataset from the sqlite3 cli tool using the shared library extension:

$ cd sqlite3http-ext

# build httpvfs.so shared library
$ make
go build -tags SQLITE3VFS_LOADABLE_EXT -o sqlite3http_ext.a -buildmode=c-archive sqlite3http_ext.go
gcc -g -fPIC -shared -o httpvfs.so sqlite3http_ext.c sqlite3http_ext.a

# set url of sqlite3 db as environment variable SQLITE3VFSHTTP_URL:
$ export SQLITE3VFSHTTP_URL='https://www.sanford.io/demo.db'

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> -- load extention
sqlite> .load ./httpvfs
sqlite> -- open db using vfs=httpvfs, note you must use the sqlite uri syntax which starts with file://
sqlite> .open file:///foo.db?vfs=httpvfs
sqlite> -- query from remote db
sqlite> SELECT * from csv where period > '2010' limit 10;
series_reference      period      data_value  suppressed  status      units       magntude    subject                    grp                                                   series_title_1
--------------------  ----------  ----------  ----------  ----------  ----------  ----------  -------------------------  ----------------------------------------------------  --------------
BOPQ.S06AC000000000A  2010.03     17463                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.06     17260                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.09     15419                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.12     17088                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.03     18516                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.06     18835                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.09     16390                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.12     18748                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.03     18477                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.06     18270                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual

Alternatively, you can query this dataset using the sqlitehttpcli example tool:

# query the sqlite schema table
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query 'select * from main.sqlite_master'
row: [table csv csv 2 CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1)]

# get 10 rows from the dataset
./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv limit 10"
row: [BOPQ.S06AC000000000A 1971.06 426  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.09 435  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.12 360  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.03 417  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.06 528  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.09 471  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.12 437  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.03 607  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.06 666  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.09 578  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

# get 10 rows where the period is after 2010
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv where period > '2010' limit 10"
row: [BOPQ.S06AC000000000A 2010.03 17463  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.06 17260  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.09 15419  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.12 17088  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.03 18516  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.06 18835  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.09 16390  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.12 18748  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.03 18477  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.06 18270  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

Similar Resources

A tool I made to quickly store bug bounty program scopes in a local sqlite3 database

GoScope A tool I made to quickly store bug bounty program scopes in a local sqlite3 database. Download or copy a Burpsuite configuration file from the

Nov 18, 2021

Go-Postgresql-Query-Builder - A query builder for Postgresql in Go

Postgresql Query Builder for Go This query builder aims to make complex queries

Nov 17, 2022

SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call.

SQL API SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call. The request contains the DB credenti

Dec 2, 2022

Convert data exports from various services to a single SQLite database

Convert data exports from various services to a single SQLite database

Bionic Bionic is a tool to convert data exports from web apps to a single SQLite database. Bionic currently supports data exports from Google, Apple H

Dec 9, 2022

Pure Go SQLite file reader

Package SQLittle provides pure Go, read-only, access to SQLite (version 3) database files. What SQLittle reads SQLite3 tables and indexes. It iterates

Oct 12, 2022

Low-level Go interface to SQLite 3

zombiezen.com/go/sqlite This package provides a low-level Go interface to SQLite 3. It is a fork of crawshaw.io/sqlite that uses modernc.org/sqlite, a

Dec 21, 2022

Streaming replication for SQLite.

Litestream Litestream is a standalone streaming replication tool for SQLite. It runs as a background process and safely replicates changes incremental

Jan 9, 2023

Experimental implementation of a SQLite backend for go-mysql-server

go-mysql-sqlite-server This is an experimental implementation of a SQLite backend for go-mysql-server from DoltHub. The go-mysql-server is a "frontend

Dec 23, 2022

Golang database driver for SQLite

go-sqlite Golang database driver for SQLite. Does not use cgo. This driver is based on pure-Go SQLite implementation (https://gitlab.com/cznic/sqlite)

Dec 30, 2022
Comments
  • Update roundTripper in sqlitehttpcli.go to support file:// URIs

    Update roundTripper in sqlitehttpcli.go to support file:// URIs

    For example:

    > ~/go/bin/go1.17rc2 run sqlitehttpcli/sqlitehttpcli.go -url file:///usr/local/data/sfomuseum-architecture.db -query 'SELECT id FROM geojson LIMIT 10'
    row: [1159157037]
    row: [1159157039]
    row: [1159157041]
    row: [1159157045]
    row: [1159157047]
    row: [1159157049]
    row: [1159157051]
    row: [1159157053]
    row: [1159157055]
    row: [1159157057]
    

    I considered moving roundTripper in to its own public-facing package but decided against it for this PR. "One thing at a time" and all that :-)

  • Improve gh workflow logic

    Improve gh workflow logic

    Use the same build tag that sqlite3vfs uses for extension builds for the loadable module. This allows us to ignore this package for normal builds via go build ./....

SQLite extension for accessing other SQL databases

dblite SQLite extension for accessing other SQL databases, in SQLite. Similar to how Postgres Foreign Data Wrappers enable access to other databases i

Dec 23, 2022
BQB is a lightweight and easy to use query builder that works with sqlite, mysql, mariadb, postgres, and others.

Basic Query Builder Why Simple, lightweight, and fast Supports any and all syntax by the nature of how it works Doesn't require learning special synta

Dec 7, 2022
Sqlair - SQLite Query Layer With Golang

sqlair SQLite Query Layer Creates an abstract over the go sql package to provide

Feb 18, 2022
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

OctoSQL OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases, streaming sources and file formats using

Dec 29, 2022
Use SQL to instantly query instances, networks, databases, and more from Scaleway. Open source CLI. No DB required.
Use SQL to instantly query instances, networks, databases, and more from Scaleway. Open source CLI. No DB required.

Scaleway Plugin for Steampipe Use SQL to query infrastructure servers, networks, databases and more from your Scaleway project. Get started → Document

Nov 16, 2022
Use SQL to query databases, logs and more from PlanetScale

Use SQL to instantly query PlanetScale databases, branches and more. Open source CLI. No DB required.

Sep 30, 2022
This is the code example how to use SQL to query data from any relational databases in Go programming language.

Go with SQL example This is the code example how to use SQL to query data from any relational databases in Go programming language. To start, please m

Mar 12, 2022
Create key value sqlite3 database from tabular data, fast.
Create key value sqlite3 database from tabular data, fast.

Turn tabular data into a lookup table using sqlite3. This is a working PROTOTYPE with limitations, e.g. no customizations, the table definition is fixed, etc.

Apr 2, 2022
Make a sqlite3 database from tabular data, fast.
Make a sqlite3 database from tabular data, fast.

MAKTA make a database from tabular data Turn tabular data into a lookup table using sqlite3. This is a working PROTOTYPE with limitations, e.g. no cus

Apr 2, 2022