The lightweight, distributed relational database built on SQLite


rqlite is a lightweight, distributed relational database, which uses SQLite as its storage engine. Forming a cluster is very straightforward; rqlite gracefully handles leader elections and tolerates failures of machines, including the leader. rqlite is available for Linux, macOS, and Microsoft Windows.

Check out the rqlite FAQ.

Why?

rqlite gives you the functionality of a rock solid, fault-tolerant, replicated relational database, but with very easy installation, deployment, and operation. With it you've got a lightweight and reliable distributed relational data store. Think etcd or Consul, but with relational data modelling also available.

You could use rqlite as part of a larger system, as a central store for some critical relational data, without having to run larger, more complex distributed databases.

Finally, if you're interested in understanding how distributed systems actually work, rqlite is a good example to study. Much thought has gone into its design and implementation, with clear separation between the various components, including storage, distributed consensus, and API.

How?

rqlite uses Raft to achieve consensus across all the instances of the SQLite databases, ensuring that every change made to the system is made to a quorum of SQLite databases, or none at all. You can learn more about the design here.


Quick Start

Detailed documentation is available. You may also wish to check out the rqlite Google Group.

The quickest way to get running on macOS and Linux is to download a pre-built release binary. You can find these binaries on the GitHub releases page. If you prefer Windows, you can download the latest build here. Once installed, you can start a single rqlite node like so:

rqlited -node-id 1 ~/node.1

Setting -node-id isn't strictly necessary at this time, but it is highly recommended, as it makes cluster management much clearer.

This single node automatically becomes the leader. You can pass -h to rqlited to list all configuration options.
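Once the node is up, you can check its health and configuration via its diagnostics endpoint, which is shown later in this document. A sketch, assuming the node is listening on the default HTTP address of localhost:4001:

```shell
# Fetch status and diagnostics from the node started above.
# Assumes the default HTTP address, localhost:4001.
curl -s 'http://localhost:4001/status?pretty'
```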

Docker

Alternatively you can pull the latest release via docker pull rqlite/rqlite.

Forming a cluster

While not strictly necessary to run rqlite, running multiple nodes means you'll have a fault-tolerant cluster. Start two more nodes, allowing the cluster to tolerate failure of a single node, like so:

rqlited -node-id 2 -http-addr localhost:4003 -raft-addr localhost:4004 -join http://localhost:4001 ~/node.2
rqlited -node-id 3 -http-addr localhost:4005 -raft-addr localhost:4006 -join http://localhost:4001 ~/node.3

This demonstration shows all 3 nodes running on the same host. In a real deployment you would run each node on a different host, in which case you wouldn't need to pick different -http-addr and -raft-addr ports for each rqlite node.

With just these few steps you've now got a fault-tolerant, distributed relational database. For full details on creating and managing real clusters, including running read-only nodes, check out this documentation.
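You can verify cluster membership from any node via the nodes endpoint, which also appears in the troubleshooting discussion later in this document. A sketch, assuming the three local nodes started above; the jq filter simply picks out the current leader's API address:

```shell
# Ask any node for its view of the cluster.
curl -s 'http://localhost:4003/nodes?pretty'

# Extract the leader's API address for scripting (requires jq).
curl -s 'http://localhost:4003/nodes' |
  jq -r '.[] | select(.leader==true) | .api_addr'
```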

Cluster Discovery

There is also a rqlite Discovery Service, allowing nodes to automatically connect and form a cluster. This can be much more convenient, allowing clusters to be dynamically created. Check out the documentation for more details.

Inserting records

Let's insert some records via the rqlite CLI, using standard SQLite commands. Once inserted, these records will be replicated across the cluster, in a durable and fault-tolerant manner. Your 3-node cluster can suffer the failure of a single node without any loss of functionality or data.

$ rqlite
127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
0 row affected (0.000668 sec)
127.0.0.1:4001> .schema
+-----------------------------------------------------------------------------+
| sql                                                                         |
+-----------------------------------------------------------------------------+
| CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)               |
+-----------------------------------------------------------------------------+
127.0.0.1:4001> INSERT INTO foo(name) VALUES("fiona")
1 row affected (0.000080 sec)
127.0.0.1:4001> SELECT * FROM foo
+----+-------+
| id | name  |
+----+-------+
| 1  | fiona |
+----+-------+

Data API

rqlite has a rich HTTP API, allowing full control over writing to, and querying from, rqlite. Check out the documentation for full details. There are also client libraries available.
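As a quick sketch (assuming a node listening on localhost:4001 and the foo table from the Quick Start), a write goes to the execute endpoint as a JSON array of SQL statements, and a read goes to the query endpoint:

```shell
# Write: the execute endpoint takes a JSON array of SQL statements.
curl -XPOST 'http://localhost:4001/db/execute?pretty' \
  -H 'Content-Type: application/json' \
  -d '["INSERT INTO foo(name) VALUES(\"fiona\")"]'

# Read: pass the SELECT via the q parameter of the query endpoint.
curl -G 'http://localhost:4001/db/query?pretty&timings' \
  --data-urlencode 'q=SELECT * FROM foo'
```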

Performance

rqlite replicates SQLite for fault-tolerance. It does not replicate it for performance. In fact, performance is somewhat reduced by the network round-trips.

Depending on your machine (particularly its IO performance) and network, individual INSERT performance could be anything from 10 operations per second to more than 200 operations per second. However, by using the bulk API, transactions, or both, throughput will increase significantly, often by 2 orders of magnitude. This speed-up is due to the way Raft and SQLite work. So for high throughput, execute as many operations as possible within a single transaction.
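For example, the following sketch sends several INSERTs in a single request, wrapped in one transaction (it assumes a node on localhost:4001 and the foo table from the Quick Start):

```shell
# Bulk write: many statements in one request; the transaction query
# parameter wraps them in a single SQLite transaction.
curl -XPOST 'http://localhost:4001/db/execute?pretty&transaction' \
  -H 'Content-Type: application/json' \
  -d '[
    "INSERT INTO foo(name) VALUES(\"fiona\")",
    "INSERT INTO foo(name) VALUES(\"declan\")",
    "INSERT INTO foo(name) VALUES(\"aoife\")"
  ]'
```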

In-memory databases

By default rqlite uses an in-memory SQLite database to maximise performance. In this mode no actual SQLite file is created and the entire database is stored in memory. If you wish rqlite to use an actual file-based SQLite database, pass -on-disk to rqlite on start-up.

Does using an in-memory database put my data at risk?

No.

Since the Raft log is the authoritative store for all data, and it is written to disk by each node, an in-memory database can be fully recreated on start-up. Using an in-memory database does not put your data at risk.

Limitations

  • Only SQL statements that are deterministic are safe to use with rqlite, because statements are committed to the Raft log before they are sent to each node. In other words, rqlite performs statement-based replication. For example, the following statement could result in a different SQLite database under each node:
INSERT INTO foo (n) VALUES(random());
  • Technically this is not supported, but you can directly read the SQLite file under any node at any time, assuming you run in "on-disk" mode. However, there is no guarantee that the SQLite file reflects all the changes that have taken place on the cluster, unless you are sure the host node itself has received and applied all changes.
  • In case it isn't obvious, rqlite does not replicate any changes made directly to any underlying SQLite file, when run in "on-disk" mode. If you change the SQLite file directly, you will cause rqlite to fail. Only modify the database via the HTTP API.
  • SQLite dot-commands such as .schema or .tables are not directly supported by the API, but the rqlite CLI supports some very similar functionality. This is because those commands are features of the sqlite3 command, not SQLite itself.
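As a workaround for the first limitation above, evaluate any non-deterministic value on the client and send the literal result, so that every node applies an identical statement. A sketch (the table and endpoint are hypothetical, mirroring the examples above):

```shell
# Evaluate the random value client-side rather than in SQL, so the
# statement entering the Raft log is fully deterministic.
N=$RANDOM
curl -XPOST 'http://localhost:4001/db/execute' \
  -H 'Content-Type: application/json' \
  -d "[\"INSERT INTO foo (n) VALUES($N)\"]"
```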

Status and Diagnostics

You can learn how to check status and diagnostics here.

Backup and restore

Learn how to hot backup your rqlite cluster here. You can also load data directly from a SQLite dump file.

Security

You can learn about securing access, and restricting users' access, to rqlite here.

Google Group

There is a Google Group dedicated to discussion of rqlite.

Pronunciation?

How do I pronounce rqlite? For what it's worth, I try to pronounce it "ree-qwell-lite". But it seems most people, including me, often pronounce it "R Q lite".

Comments
  • rqlite response with "database or disk is full" on insert or create index

    I started rqlite using the default config as a cluster (3 nodes) on a server. The server has 8 cores and 32GB memory, and right now there is still 15GB memory available (free -mh), but I can not insert any more rows into the db; I get the error message "database or disk is full". I don't know why, or how to deal with this. Here is the status result:

    {"build":{"branch":"master","build_time":"2021-08-05T21:24:10-0400","commit":"7103d425c8a50a24ffa81812d85c45d5fc26b15d7","version":"v6.1.0"},"cluster":{"addr":"127.0.0.1:4002","api_addr":"localhost:4001","https":"false","timeout":"10s"},"http":{"addr":"127.0.0.1:4001","auth":"disabled"},"node":{"start_time":"2021-08-17T14:52:54.366931301+08:00","uptime":"26h53m16.453137467s"},"runtime":{"GOARCH":"amd64","GOMAXPROCS":8,"GOOS":"linux","num_cpu":8,"num_goroutine":24,"version":"go1.15"},"store":{"addr":"127.0.0.1:4002","apply_timeout":"10s","db_applied_index":135308,"db_conf":{"Memory":true},"dir":"/root/node.1","dir_size":1220944362,"election_timeout":"1s","fsm_index":135308,"heartbeat_timeout":"1s","leader":{"addr":"127.0.0.1:4002","node_id":"1"},"node_id":"1","nodes":[{"id":"1","addr":"127.0.0.1:4002","suffrage":"Voter"},{"id":"2","addr":"127.0.0.1:4004","suffrage":"Voter"},{"id":"3","addr":"127.0.0.1:4006","suffrage":"Voter"}],"raft":{"applied_index":135308,"commit_index":135308,"fsm_pending":0,"last_contact":0,"last_log_index":135308,"last_log_term":3,"last_snapshot_index":133380,"last_snapshot_term":3,"latest_configuration":"[{Suffrage:Voter ID:1 Address:127.0.0.1:4002} {Suffrage:Voter ID:2 Address:127.0.0.1:4004} {Suffrage:Voter ID:3 
Address:127.0.0.1:4006}]","latest_configuration_index":0,"log_size":807632896,"num_peers":2,"protocol_version":3,"protocol_version_max":3,"protocol_version_min":0,"snapshot_version_max":1,"snapshot_version_min":0,"state":"Leader","term":3},"request_marshaler":{"compression_batch":5,"compression_size":150,"force_compression":false},"snapshot_interval":30000000000,"snapshot_threshold":8192,"sqlite3":{"compile_options":["COMPILER=gcc-7.5.0","DEFAULT_WAL_SYNCHRONOUS=1","ENABLE_DBSTAT_VTAB","ENABLE_FTS3","ENABLE_FTS3_PARENTHESIS","ENABLE_JSON1","ENABLE_RTREE","ENABLE_UPDATE_DELETE_LIMIT","OMIT_DEPRECATED","OMIT_SHARED_CACHE","SYSTEM_MALLOC","THREADSAFE=1"],"conn_pool_stats":{"ro":{"max_open_connections":0,"open_connections":1,"in_use":0,"idle":1,"wait_count":0,"wait_duration":0,"max_idle_closed":0,"max_idle_time_closed":0,"max_lifetime_closed":0},"rw":{"max_open_connections":1,"open_connections":1,"in_use":0,"idle":1,"wait_count":0,"wait_duration":0,"max_idle_closed":0,"max_idle_time_closed":0,"max_lifetime_closed":0}},"db_size":1073741824,"path":":memory:","ro_dsn":"file:/aODMOaApCNMdLmBlgfKq?mode=ro\u0026vfs=memdb\u0026_txlock=deferred","rw_dsn":"file:/aODMOaApCNMdLmBlgfKq?mode=rw\u0026vfs=memdb\u0026_txlock=immediate","version":"3.36.0"},"trailing_logs":10240}}
    


  • Transactions?

    Hi,

    I'm interested in some sort of transaction functionality beyond what rqlite currently provides today. In particular, I'd like to be able to get the results of a query in the middle of a transaction, as opposed to just "process this set of queries", the primitive that rqlite provides today.

    We could do transactions as the leader, applying them to the sqlite database and accumulating the queries to apply until the transaction was committed, but I don't think that would actually guarantee correctness (someone else who isn't in the transaction could take e.g. a primary key ID that was allocated in the transaction, since we're not committing things to the raft logs as they're added to the tx).

    Do you have any ideas on how we'd go about doing this, or if it's even possible?

    Thanks

  • cannot run queries against cluster members - only the leader responds

    After a while (a couple of hours/days) our rqlite clusters (v5.11.1) reach a point where we can only query the leader:

    For example, running a query against any member other than the leader returns:

    bash-4.4$ /rqlite -H rqlite-0.rqlite
    Welcome to the rqlite CLI. Enter ".help" for usage hints.
    Connected to version v5.11.1
    rqlite-0.rqlite:4001> select * from rules
    ERR! server responded with 503 Service Unavailable: not leader
    
    rqlite-0.rqlite:4001> select * from  issues
    ERR! server responded with 503 Service Unavailable: not leader
    
    

    The status response from one of the followers is:

    
    curl http://rqlite-0.rqlite:4001/status?pretty
    
    
    {
        "build": {
            "branch": "master",
            "build_time": "2021-04-13T10:24:42-0400",
            "commit": "927611c82c72056a99e20cde3279fac7fdf51484",
            "version": "v5.11.1"
        },
        "http": {
            "addr": "10.233.122.246:4001",
            "auth": "disabled",
            "redirect": ""
        },
        "node": {
            "start_time": "2021-04-22T19:32:14.932309505Z",
            "uptime": "16h38m28.56345708s"
        },
        "runtime": {
            "GOARCH": "amd64",
            "GOMAXPROCS": 8,
            "GOOS": "linux",
            "num_cpu": 8,
            "num_goroutine": 17,
            "version": "go1.15.7"
        },
        "store": {
            "addr": "10.233.122.246:4002",
            "apply_timeout": "10s",
            "db_conf": {
                "DSN": "",
                "Memory": true
            },
            "dir": "/node",
            "dir_size": 14560570,
            "election_timeout": "5s",
            "heartbeat_timeout": "4s",
            "leader": {
                "addr": "10.233.92.133:4002",
                "node_id": "rqlite-4"
            },
            "metadata": {
                "rqlite-0": {
                    "api_addr": "rqlite-0.rqlite:4001",
                    "api_proto": "http"
                },
                "rqlite-1": {
                    "api_addr": "rqlite-1.rqlite:4001",
                    "api_proto": "http"
                },
                "rqlite-2": {
                    "api_addr": "rqlite-2.rqlite:4001",
                    "api_proto": "http"
                }
            },
            "node_id": "rqlite-0",
            "nodes": [
                {
                    "id": "rqlite-0",
                    "addr": "10.233.122.246:4002"
                },
                {
                    "id": "rqlite-1",
                    "addr": "10.233.89.103:4002"
                },
                {
                    "id": "rqlite-2",
                    "addr": "10.233.67.182:4002"
                },
                {
                    "id": "rqlite-3",
                    "addr": "10.233.100.143:4002"
                },
                {
                    "id": "rqlite-4",
                    "addr": "10.233.92.133:4002"
                }
            ],
            "raft": {
                "applied_index": 496115,
                "commit_index": 496115,
                "fsm_pending": 0,
                "last_contact": "21.858677ms",
                "last_log_index": 496115,
                "last_log_term": 28,
                "last_snapshot_index": 494286,
                "last_snapshot_term": 22,
                "latest_configuration": "[{Suffrage:Voter ID:rqlite-3 Address:10.233.100.143:4002} {Suffrage:Voter ID:rqlite-4 Address:10.233.92.133:4002} {Suffrage:Voter ID:rqlite-0 Address:10.233.122.246:4002} {Suffrage:Voter ID:rqlite-2 Address:10.233.67.182:4002} {Suffrage:Voter ID:rqlite-1 Address:10.233.89.103:4002}]",
                "latest_configuration_index": 0,
                "log_size": 8388608,
                "num_peers": 4,
                "protocol_version": 3,
                "protocol_version_max": 3,
                "protocol_version_min": 0,
                "snapshot_version_max": 1,
                "snapshot_version_min": 0,
                "state": "Follower",
                "term": 28
            },
            "request_marshaler": {
                "compression_batch": 5,
                "compression_size": 150,
                "force_compression": false
            },
            "snapshot_interval": 30000000000,
            "snapshot_threshold": 4096,
            "sqlite3": {
                "db_size": 22736896,
                "dsn": "",
                "fk_constraints": "disabled",
                "path": ":memory:",
                "version": "3.34.0"
            },
            "trailing_logs": 5120
        }
    } 
    
    

    Please notice that in the status metadata there are only 3 nodes; not all 5 members appear under:

    .store.metadata
    

    They do, however, appear under:

    .store.nodes
    

    The leader actually does not appear in .store.metadata.

    Can you please assist?

    Thanks in advance, L

  • Resuming instance from different network address with existing raft.db triggers unnecessary election and fails to join

    What version are you running? Tested 5.9.0 and 6.0.0 on OSX. Was also having the same issues with the rqlite/rqlite:6.0.0 docker image.

    What did you do? Started 3 rqlited's locally on different ports to form a 3 proc cluster, say node-ids rqlite-0, rqlite-1 and rqlite-2.

    My intent was to remove rqlite-0 running on port 4001+4002 and re-add it on another port, say 4007+4008 to simulate a different network address like one may find when pods migrate around in Kubernetes.

    What did you expect to happen?

    When rejoining the cluster by either using the network addresses inside the existing raft log or by using the new -join params provided it would connect to the other members and catch its log up to the latest point and continue along its merry way at the new network addresses without needing to replicate the entire raft log.

    What happened instead?

    No matter what I tried it would try to trigger a new vote that the other nodes would reject because there was already a healthy leader.

    The only way to get it to rejoin was by wiping out the raft.db for that process and having it join fresh.

    Please include the Status, Nodes, and Expvar output from each node (or at least the Leader!)

    Start 3 rqlited's. Have the 2nd and 3rd join the first to form the cluster.

    $ rqlited -node-id=rqlite-0 -http-addr=localhost:4001 -raft-addr=localhost:4002 0
    $ rqlited -node-id=rqlite-1 -http-addr=localhost:4003 -raft-addr=localhost:4004 -join localhost:4001 1
    $ rqlited -node-id=rqlite-2 -http-addr=localhost:4005 -raft-addr=localhost:4006 -join localhost:4001 2
    

    Check cluster state:

    $ curl "http://localhost:4003/nodes?pretty"
    {
        "rqlite-0": {
            "api_addr": "http://localhost:4001",
            "addr": "127.0.0.1:4002",
            "reachable": true,
            "leader": true
        },
        "rqlite-1": {
            "api_addr": "http://localhost:4003",
            "addr": "127.0.0.1:4004",
            "reachable": true,
            "leader": false
        },
        "rqlite-2": {
            "api_addr": "http://localhost:4005",
            "addr": "127.0.0.1:4006",
            "reachable": true,
            "leader": false
        }
    }
    

    Kill rqlite-0 and check to see that the leader has moved.

    $ curl "http://localhost:4003/nodes?pretty"
    {
        "rqlite-0": {
            "addr": "127.0.0.1:4002",
            "reachable": false,
            "leader": false
        },
        "rqlite-1": {
            "api_addr": "http://localhost:4003",
            "addr": "127.0.0.1:4004",
            "reachable": true,
            "leader": false
        },
        "rqlite-2": {
            "api_addr": "http://localhost:4005",
            "addr": "127.0.0.1:4006",
            "reachable": true,
            "leader": true
        }
    }
    

    Remove rqlite-0 from the cluster

    $ MASTER=$(curl -s "http://localhost:4001/nodes" "http://localhost:4003/nodes" "http://localhost:4005/nodes" | jq -r '.[] | select(.leader==true) | .api_addr' | head -n 1)
    $ curl -XDELETE -Lv $MASTER/remove -d "{\"id\": \"rqlite-0\"}"
    

    Check cluster state:

    $ curl "http://localhost:4003/nodes?pretty"
    {
      "rqlite-1": {
        "api_addr": "http://localhost:4003",
        "addr": "127.0.0.1:4004",
        "reachable": true,
        "leader": false
      },
      "rqlite-2": {
        "api_addr": "http://localhost:4005",
        "addr": "127.0.0.1:4006",
        "reachable": true,
        "leader": true
      }
    }
    

    Restart rqlite-0 and you'll see that it tries to trigger a vote forever

    [rqlited] 2021/06/16 10:11:48 rqlited starting, version 6, commit unknown, branch unknown
    [rqlited] 2021/06/16 10:11:48 go1.16.3, target architecture is amd64, operating system target is darwin
    [rqlited] 2021/06/16 10:11:48 launch command: rqlited -node-id=rqlite-0 -http-addr=localhost:4001 -raft-addr=localhost:4002 0
    [mux] 2021/06/16 10:11:48 received handler registration request for header 1
    [mux] 2021/06/16 10:11:48 received handler registration request for header 2
    [cluster] 2021/06/16 10:11:48 service listening on 127.0.0.1:4002
    [mux] 2021/06/16 10:11:48 mux serving on 127.0.0.1:4002, advertising 127.0.0.1:4002
    [rqlited] 2021/06/16 10:11:48 preexisting node state detected in /Users/brentaylor/rqlite-data/0
    [rqlited] 2021/06/16 10:11:48 no join addresses set
    [store] 2021/06/16 10:11:48 opening store with node ID rqlite-0
    [store] 2021/06/16 10:11:48 ensuring directory at /Users/brentaylor/rqlite-data/0 exists
    [store] 2021/06/16 10:11:48 0 preexisting snapshots present
    [store] 2021/06/16 10:11:48 first log index: 1, last log index: 4, last command log index: 0:
    2021-06-16T10:11:48.141-0700 [INFO]  raft: initial configuration: index=4 servers="[{Suffrage:Voter ID:rqlite-0 Address:127.0.0.1:4002} {Suffrage:Voter ID:rqlite-1 Address:127.0.0.1:4004} {Suffrage:Voter ID:rqlite-2 Address:127.0.0.1:4006}]"
    [store] 2021/06/16 10:11:48 no cluster bootstrap requested
    2021-06-16T10:11:48.141-0700 [INFO]  raft: entering follower state: follower="Node at 127.0.0.1:4002 [Follower]" leader=
    2021-06-16T10:11:50.046-0700 [WARN]  raft: heartbeat timeout reached, starting election: last-leader=
    2021-06-16T10:11:50.046-0700 [INFO]  raft: entering candidate state: node="Node at 127.0.0.1:4002 [Candidate]" term=3
    2021-06-16T10:11:50.239-0700 [INFO]  raft: entering follower state: follower="Node at 127.0.0.1:4002 [Follower]" leader=
    2021-06-16T10:11:52.016-0700 [WARN]  raft: heartbeat timeout reached, starting election: last-leader=
    2021-06-16T10:11:52.016-0700 [INFO]  raft: entering candidate state: node="Node at 127.0.0.1:4002 [Candidate]" term=5
    2021-06-16T10:11:53.850-0700 [WARN]  raft: Election timeout reached, restarting election
    2021-06-16T10:11:53.850-0700 [INFO]  raft: entering candidate state: node="Node at 127.0.0.1:4002 [Candidate]" term=6
    2021-06-16T10:11:55.282-0700 [WARN]  raft: Election timeout reached, restarting election
    2021-06-16T10:11:55.282-0700 [INFO]  raft: entering candidate state: node="Node at 127.0.0.1:4002 [Candidate]" term=7
    ... etc etc
    [rqlited] 2021/06/16 10:13:48 leader did not appear within timeout: timeout expired
    

    And the other nodes will reject the request for a vote since a leader is already present and healthy

    2021-06-16T10:13:15.361-0700 [WARN]  raft: rejecting vote request since we have a leader: from=127.0.0.1:4002 leader=127.0.0.1:4006
    

    Kill all the rqlited's and delete the raft state on disk. Restart the cluster fresh. Once cluster is formed kill rqlite-0, remove the node and start it on new ports and tell it to join the others:

    $ MASTER=$(curl -s "http://localhost:4001/nodes" "http://localhost:4003/nodes" "http://localhost:4005/nodes" | jq -r '.[] | select(.leader==true) | .api_addr' | head -n 1)
    $ curl -XDELETE -Lv $MASTER/remove -d "{\"id\": \"rqlite-0\"}"
    rqlited -node-id=rqlite-0 -http-addr=localhost:4007 -raft-addr=localhost:4008 -join localhost:4003,localhost:4005 0
    

    The same thing happens as before where rqlite-0 triggers a vote and the others reject. Eventually rqlite-0 fails. The other nodes never consider it joined.

    $ curl "http://localhost:4003/nodes?pretty"
    {
      "rqlite-1": {
        "api_addr": "http://localhost:4003",
        "addr": "127.0.0.1:4004",
        "reachable": true,
        "leader": true
      },
      "rqlite-2": {
        "api_addr": "http://localhost:4005",
        "addr": "127.0.0.1:4006",
        "reachable": true,
        "leader": false
      }
    }
    

    If we remove rqlite-0's state it can rejoin at the new address without issue; it just replicates the raft log from the other members. This does not require the node to be removed first; it gets replaced properly.

    $ rm -rf 0
    rqlited -node-id=rqlite-0 -http-addr=localhost:4007 -raft-addr=localhost:4008 -join localhost:4003,localhost:4005 0
    [rqlited] 2021/06/16 10:28:04 rqlited starting, version 6, commit unknown, branch unknown
    [rqlited] 2021/06/16 10:28:04 go1.16.3, target architecture is amd64, operating system target is darwin
    [rqlited] 2021/06/16 10:28:04 launch command: rqlited -node-id=rqlite-0 -http-addr=localhost:4007 -raft-addr=localhost:4008 -join localhost:4003,localhost:4005 0
    [mux] 2021/06/16 10:28:04 received handler registration request for header 1
    [mux] 2021/06/16 10:28:04 received handler registration request for header 2
    [cluster] 2021/06/16 10:28:04 service listening on 127.0.0.1:4008
    [mux] 2021/06/16 10:28:04 mux serving on 127.0.0.1:4008, advertising 127.0.0.1:4008
    [rqlited] 2021/06/16 10:28:04 no preexisting node state detected in /Users/brentaylor/rqlite-data/0, node may be bootstrapping
    [rqlited] 2021/06/16 10:28:04 join addresses specified, node is not bootstrapping
    [store] 2021/06/16 10:28:04 opening store with node ID rqlite-0
    [store] 2021/06/16 10:28:04 ensuring directory at /Users/brentaylor/rqlite-data/0 exists
    [store] 2021/06/16 10:28:04 0 preexisting snapshots present
    [store] 2021/06/16 10:28:04 first log index: 0, last log index: 0, last command log index: 0:
    2021-06-16T10:28:04.534-0700 [INFO]  raft: initial configuration: index=0 servers=[]
    [store] 2021/06/16 10:28:04 no cluster bootstrap requested
    [rqlited] 2021/06/16 10:28:04 join addresses are: [localhost:4003 localhost:4005]
    2021-06-16T10:28:04.534-0700 [INFO]  raft: entering follower state: follower="Node at 127.0.0.1:4008 [Follower]" leader=
    [rqlited] 2021/06/16 10:28:04 successfully joined cluster at http://localhost:4003/join
    2021-06-16T10:28:04.675-0700 [WARN]  raft: failed to get previous log: previous-index=7 last-index=0 error="log not found"
    [store] 2021/06/16 10:28:04 waiting for up to 2m0s for application of initial logs
    [rqlited] 2021/06/16 10:28:04 store has reached consensus
    [http] 2021/06/16 10:28:04 service listening on 127.0.0.1:4007
    [rqlited] 2021/06/16 10:28:04 node is ready
    
  • Problems querying from other nodes

    Hi. I'm having problems with the cluster connection. I'm following the steps in the documentation, and when I connect directly to the leader through the rqlite CLI and execute queries, everything is OK; but when I do the same thing from other nodes, the answer to all queries is: ERR! unexpected end of JSON input

    I think this is related to the recent changes, because when I use curl to make a query on a non-leader node, this is what I get:

    curl -G 'localhost:4003/db/query?pretty&timings' --data-urlencode 'q=.tables'
    Moved Permanently.

    btw, this is my configuration :

    rqlited ./node.1
    rqlited -http localhost:4003 -raft localhost:4004 -join http://localhost:4001 ./node.2
    rqlited -http localhost:4005 -raft localhost:4006 -join http://localhost:4001 ./node.3

    This is my first step on the leader node

    $ rqlite
    127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
    0 row affected (0.000668 sec)
    127.0.0.1:4001> .schema
    +-----------------------------------------------------------------------------+
    | sql                                                                         |
    +-----------------------------------------------------------------------------+
    | CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)               |
    +-----------------------------------------------------------------------------+
    127.0.0.1:4001> INSERT INTO foo(name) VALUES("fiona")
    1 row affected (0.000080 sec)
    127.0.0.1:4001> SELECT * FROM foo
    +----+-------+
    | id | name  |
    +----+-------+
    | 1  | fiona |
    +----+-------+

    When i try to do the same on a non-leader node, this is the answer

    $ rqlite -H localhost -p 4003
    localhost:4003> .schema
    ERR! unexpected end of JSON input

    I tried to make a curl request and this is what I get:

    $ curl -G 'localhost:4003/db/query?pretty&timings' --data-urlencode 'q=select * from foo'
    Moved Permanently.

    This behavior was not present in previous releases. I hope you can help me with this.

  • rqlite nodes crashing for no apparent reason

    What version are you running? version 6.0.1

    What did you do? ran a cluster of 5 rqlite nodes on k8s

    What did you expect to happen? rqlite nodes should be stable

    What happened instead? nodes crash with the following log:

    2021-07-21T18:04:09.076Z [WARN]  raft: heartbeat timeout reached, starting election: last-leader=10.244.250.158:4002
    2021-07-21T18:04:09.076Z [INFO]  raft: entering candidate state: node="Node at 10.244.242.39:4002 [Candidate]" term=3
    2021-07-21T18:04:09.082Z [INFO]  raft: election won: tally=2
    2021-07-21T18:04:09.082Z [INFO]  raft: entering leader state: leader="Node at 10.244.242.39:4002 [Leader]"
    2021-07-21T18:04:09.082Z [INFO]  raft: added peer, starting replication: peer=rqlite-0
    2021-07-21T18:04:09.082Z [INFO]  raft: added peer, starting replication: peer=rqlite-4
    2021-07-21T18:04:09.083Z [WARN]  raft: appendEntries rejected, sending older logs: peer="{Voter rqlite-0 10.244.250.158:4002}" next=1
    2021-07-21T18:04:09.084Z [INFO]  raft: pipelining replication: peer="{Voter rqlite-4 10.244.42.143:4002}"
    2021-07-21T18:04:09.352Z [INFO]  raft: pipelining replication: peer="{Voter rqlite-0 10.244.250.158:4002}"
    [rqlited] 2021/07/21 18:04:10 http: panic serving 10.244.193.211:50948: interface conversion: interface {} is nil, not *store.fsmExecuteResponse
    goroutine 2256 [running]:
    net/http.(*conn).serve.func1(0xc000250460)
    	/usr/local/go/src/net/http/server.go:1824 +0x153
    panic(0xbb7fa0, 0xc00037e000)
    	/usr/local/go/src/runtime/panic.go:971 +0x499
    github.com/rqlite/rqlite/store.(*Store).execute(0xc000240000, 0xc000ab80c0, 0x416c38, 0x48, 0xc1d520, 0xc00022a101, 0xc000226050)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/store/store.go:593 +0x4a9
    github.com/rqlite/rqlite/store.(*Store).Execute(0xc000240000, 0xc000ab80c0, 0x26000, 0xc000010008, 0x1, 0x1, 0x0)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/store/store.go:538 +0x78
    github.com/rqlite/rqlite/http.(*Service).handleExecute(0xc00016c0f0, 0xd6e0b0, 0xc000ab0000, 0xc000456100)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/http/service.go:740 +0x41c
    github.com/rqlite/rqlite/http.(*Service).ServeHTTP(0xc00016c0f0, 0xd6e0b0, 0xc000ab0000, 0xc000456100)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/http/service.go:262 +0x651
    net/http.serverHandler.ServeHTTP(0xc000ab01c0, 0xd6e0b0, 0xc000ab0000, 0xc000456100)
    	/usr/local/go/src/net/http/server.go:2887 +0xa3
    net/http.(*conn).serve(0xc000250460, 0xd6f660, 0xc000411100)
    	/usr/local/go/src/net/http/server.go:1952 +0x8cd
    created by net/http.(*Server).Serve
    	/usr/local/go/src/net/http/server.go:3013 +0x39b
    panic: interface conversion: driver.Stmt is nil, not *sqlite3.SQLiteStmt
    
    goroutine 37 [running]:
    github.com/rqlite/go-sqlite3.(*SQLiteConn).exec(0xc000218240, 0xd6f5f0, 0xc0000c8060, 0xc000a6c000, 0x1e1c6, 0x11628f8, 0x0, 0x0, 0xc00043b960, 0x47781b, ...)
    	/home/theog/go/pkg/mod/github.com/rqlite/[email protected]/sqlite3.go:803 +0x78e
    github.com/rqlite/go-sqlite3.(*SQLiteConn).Exec(0xc000218240, 0xc000a6c000, 0x1e1c6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0)
    	/home/theog/go/pkg/mod/github.com/rqlite/[email protected]/sqlite3.go:792 +0x138
    github.com/rqlite/rqlite/db.(*DB).Execute.func1(0xc00043baf0, 0xc9a400, 0xc0002005a0, 0xc0005b6c30, 0x1501, 0x0, 0x0)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/db/db.go:321 +0x2b4
    github.com/rqlite/rqlite/db.(*DB).Execute(0xc0002005a0, 0xc0005b6c30, 0xc000420101, 0x0, 0x0, 0x0, 0x0, 0xc00043bc18)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/db/db.go:356 +0xa5
    github.com/rqlite/rqlite/store.(*Store).Apply(0xc000240000, 0xc000490398, 0x0, 0x0)
    	/home/theog/go/src/rqlitesource/src/github.com/rqlite/rqlite/store/store.go:940 +0x3c7
    github.com/hashicorp/raft.(*Raft).runFSM.func1(0xc00058a5c0)
    	/home/theog/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:90 +0x2c6
    github.com/hashicorp/raft.(*Raft).runFSM.func2(0xc000376400, 0x1, 0x40)
    	/home/theog/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:113 +0x78
    github.com/hashicorp/raft.(*Raft).runFSM(0xc000272000)
    	/home/theog/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:216 +0x392
    github.com/hashicorp/raft.(*raftState).goFunc.func1(0xc000272000, 0xc000204280)
    	/home/theog/go/pkg/mod/github.com/hashicorp/[email protected]/state.go:146 +0x55
    created by github.com/hashicorp/raft.(*raftState).goFunc
    	/home/theog/go/pkg/mod/github.com/hashicorp/[email protected]/state.go:144 +0x66
    
    
  • "clustering failure: error reading DNS configuration: invalid character '\''" in Kubernetes StatefulSet

    What version are you running?

    [rqlited] 2022/02/03 22:12:15 rqlited starting, version v7.2.0, commit 73681a69f692f2a75b9d1e570261ac3e4a73631e, branch master, compiler gc

    What did you do?

    I have applied the headless Service and the StatefulSet below:

    apiVersion: v1
    kind: Service
    metadata:
      name: rqlite-svc
    spec:
      clusterIP: None 
      selector:
        app: rqlite
      ports:
        - protocol: TCP
          port: 4001
          targetPort: 4001
    
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: rqlite
    spec:
      selector:
        matchLabels:
          app: rqlite # has to match .spec.template.metadata.labels
      serviceName: "rqlite"
      replicas: 3 # by default is 1
      template:
        metadata:
          labels:
            app: rqlite # has to match .spec.selector.matchLabels
        spec:
          terminationGracePeriodSeconds: 10
          containers:
          - name: rqlite
            image: rqlite/rqlite
            args: ["-disco-mode=dns","-disco-config='{\"name\":\"rqlite-svc\"}'","-bootstrap-expect 3"]
            ports:
            - containerPort: 4001
              name: rqlite
            volumeMounts:
            - name: rqlite-file
              mountPath: /rqlite/file
      volumeClaimTemplates:
      - metadata:
          name: rqlite-file
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "standard"
          resources:
            requests:
              storage: 1Gi
    

    I get this error in the pod log, after which the pod enters CrashLoopBackOff:

    kubectl logs rqlite-0
    
                _ _ _
               | (_) |
      _ __ __ _| |_| |_ ___
     | '__/ _  | | | __/ _ \   The lightweight, distributed
     | | | (_| | | | ||  __/   relational database.
     |_|  \__, |_|_|\__\___|
             | |               www.rqlite.io
             |_|
    
    [rqlited] 2022/02/03 22:12:15 rqlited starting, version v7.2.0, commit 73681a69f692f2a75b9d1e570261ac3e4a73631e, branch master, compiler gc
    [rqlited] 2022/02/03 22:12:15 go1.17, target architecture is amd64, operating system target is linux
    [rqlited] 2022/02/03 22:12:15 launch command: /bin/rqlited -http-addr 0.0.0.0:4001 -raft-addr 0.0.0.0:4002 -http-adv-addr 10.244.1.3:4001 -raft-adv-addr 10.244.1.3:4002 -disco-mode=dns -disco-config='{"name":"rqlite-svc"}' -bootstrap-expect 3 /rqlite/file/data
    [rqlited] 2022/02/03 22:12:15 Raft TCP mux Listener registered with 1
    [rqlited] 2022/02/03 22:12:15 preexisting node state detected in /rqlite/file/data
    [store] 2022/02/03 22:12:15 opening store with node ID 10.244.1.3:4002
    [store] 2022/02/03 22:12:15 configured for an in-memory database
    [store] 2022/02/03 22:12:15 ensuring directory for Raft exists at /rqlite/file/data
    [mux] 2022/02/03 22:12:15 mux serving on [::]:4002, advertising 10.244.1.3:4002
    [store] 2022/02/03 22:12:15 0 preexisting snapshots present
    [store] 2022/02/03 22:12:15 first log index: 0, last log index: 0, last command log index: 0:
    [store] 2022/02/03 22:12:15 created in-memory database at open
    2022-02-03T22:12:15.338Z [INFO]  raft: initial configuration: index=0 servers=[]
    [cluster] 2022/02/03 22:12:15 service listening on 10.244.1.3:4002
    [rqlited] 2022/02/03 22:12:15 Cluster TCP mux Listener registered with 2
    [http] 2022/02/03 22:12:15 service listening on [::]:4001
    [rqlited] 2022/02/03 22:12:15 discovery mode: dns
    2022-02-03T22:12:15.338Z [INFO]  raft: entering follower state: follower="Node at 10.244.1.3:4002 [Follower]" leader=
    [rqlited] 2022/02/03 22:12:15 clustering failure: error reading DNS configuration: invalid character '\'' looking for beginning of value
    

    What did you expect to happen?

    The Pod should start

    What happened instead?

    The pod ends up in CrashLoopBackOff

    Please include the Status, Nodes, and Expvar output from each node (or at least the Leader!)

    See https://github.com/rqlite/rqlite/blob/master/DOC/DIAGNOSTICS.md

  • Full connection and transaction control

    This change sees the addition of a new endpoint /db/connections.

    A POST to this endpoint returns a new location, backed by a dedicated connection to the database. Transactions are supported on this connection, meaning commands such as BEGIN and COMMIT work as expected. A connection can be configured to abort any transaction that has been idle for a period of time, or to close altogether for the same reason.

  • Remove all cluster aspect from rqlite for single instance purpose

    Greetings everybody!

    I realized that I didn't need any of the distributed aspects of rqlite, so I decided to try to remove/implement the features I talked about with @otoolep.

    You can follow the latest exchanges in our discussion: https://github.com/rqlite/rqlite/pull/493

    I will need some more guidance from other people who know the code better than me.

    I hope we can modularize rqlite so that both can be provided as solutions.

    Cheers

  • DB always locked when running rqlite, even for reads.

    I have a case where I'd prefer to read from the local SQLite db file directly with my own SQLite JDBC driver. I want to do this mainly for speed, and because I'm using an ORM framework that makes reading, and constructing write statements for rqlite, easier.

    But whenever rqlite is running, it completely locks the db.sqlite file, even for reads.

    Here's the relevant part of the stack trace error I'm getting:

    Caused by: java.sql.SQLException: [SQLITE_BUSY]  The database file is locked (database is locked)
        at org.sqlite.DB.newSQLException(DB.java:383)
        at org.sqlite.DB.newSQLException(DB.java:387)
        at org.sqlite.DB.throwex(DB.java:374)
        at org.sqlite.NativeDB.prepare(Native Method)
        at org.sqlite.DB.prepare(DB.java:123)
        at org.sqlite.PrepStmt.<init>(PrepStmt.java:42)
    
  • I deployed rqlite by docker compose but it returned peer error

    Hi, when I deploy rqlite and a service that calls rqlite, I receive an error like this:

    tried all peers unsuccessfully. here are the results: peer #0: http://0.0.0.0:4001/status failed due to Get "http://0.0.0.0:4001/status": dial tcp 0.0.0.0:4001: connect: connection refused

    Here is how I wrote the Docker Compose file:

    rqlite:
        image: rqlite/rqlite:5.9.0
        container_name: rqlite
        ports:
          - 4001:4001
          - 4002:4002
        restart: always
        logging:
          driver: "json-file"
          options:
            max-file: "10"
            max-size: 20m
        healthcheck:
          test: ["CMD", "curl", "http://localhost:4001/status"]
    

    And the connection uri I used is: http://rqlite:4001

    Has anyone else encountered this issue, and how can it be fixed?
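
    The 0.0.0.0 in the peer list suggests the node is advertising its bind address, so clients that follow the peer list are handed an unreachable URL. rqlited has -http-adv-addr and -raft-adv-addr flags for exactly this situation. A possible Compose-level fix — a sketch only, assuming the image forwards extra command arguments to rqlited and that clients reach the container via the rqlite hostname — might look like:

```yaml
rqlite:
    image: rqlite/rqlite:5.9.0
    container_name: rqlite
    # Advertise the Compose-network hostname rather than the 0.0.0.0 bind
    # address, so peer URLs handed out to clients are actually reachable.
    command: ["-http-adv-addr", "rqlite:4001", "-raft-adv-addr", "rqlite:4002"]
    ports:
      - 4001:4001
      - 4002:4002
```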

  • Write throughput when globally distributed

    Using this as a place to further discuss the idea of having an option that allows rqlite to cache write transactions for ~100ms, in order to decrease the number of round trips per transaction (which could significantly improve write throughput).

    Are there any other thoughts since the last discussion about this?

    If we come up with a solution I’d be willing to fund the development.

  • linux/mips64 and linux/riscv64 are not supported

    What version are you running?

    6.6.0

    What did you do?

    Attempted to build for linux/mips64 and linux/riscv64.

    Stderr:

    # github.com/boltdb/bolt
    ../../../pkg/mod/github.com/boltdb/[email protected]/bolt_unix.go:62:14: [maxMapSize]byte used as value
    ../../../pkg/mod/github.com/boltdb/[email protected]/bolt_unix.go:62:15: undefined: maxMapSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/bucket.go:135:15: undefined: brokenUnaligned
    ../../../pkg/mod/github.com/boltdb/[email protected]/db.go:101:12: [maxMapSize]byte used as value
    ../../../pkg/mod/github.com/boltdb/[email protected]/db.go:101:13: undefined: maxMapSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/db.go:317:12: undefined: maxMapSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/db.go:335:10: undefined: maxMapSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/db.go:336:8: undefined: maxMapSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/freelist.go:169:18: [maxAllocSize]pgid used as value
    ../../../pkg/mod/github.com/boltdb/[email protected]/freelist.go:169:19: undefined: maxAllocSize
    ../../../pkg/mod/github.com/boltdb/[email protected]/freelist.go:169:18: too many errors
