A realtime distributed messaging platform


NSQ is a realtime distributed messaging platform designed to operate at scale, handling billions of messages per day.

It promotes distributed and decentralized topologies without single points of failure, enabling fault tolerance and high availability coupled with a reliable message delivery guarantee. See features & guarantees.

Operationally, NSQ is easy to configure and deploy (all parameters are specified on the command line and compiled binaries have no runtime dependencies). For maximum flexibility, it is agnostic to data format (messages can be JSON, MsgPack, Protocol Buffers, or anything else). Official Go and Python libraries are available out of the box (as well as many other client libraries) and, if you're interested in building your own, there's a protocol spec.

We publish binary releases for linux, darwin, freebsd and windows as well as an official Docker image.

NOTE: master is our development branch and may not be stable at all times.

In Production


Code of Conduct

Help us keep NSQ open and inclusive. Please read and follow our Code of Conduct.

Authors

NSQ was designed and developed by Matt Reiferson (@imsnakes) and Jehiah Czebotar (@jehiah) but wouldn't have been possible without the support of Bitly, maintainers (Pierce Lopez), and all our contributors.

Logo created by Wolasi Konu (@kisalow).

Comments
  • nsqadmin: refactoring

    nsqadmin is a Go web app, which has some tradeoffs. It's great because it's dead simple to deploy. It's terrible because it's a PITA writing Go based web apps.

    I think an ideal middle ground is:

    1. nsqadmin becomes a JSON API with convenience endpoints to query the cluster and aggregate data
    2. nsqadmin UI is a client-side app
    3. all the static assets for (2) are bundled into the binary so it's still easy to deploy (this is already the case, just stating the obvious)

    Simultaneously, it could use a facelift, but this refactoring alone makes things much easier to extend and for others to contribute.

    cc @visionmedia

  • nsqd: remove --worker-id, replace with --node-id

    This is one of the for-1.0 changes contemplated in #741

    This is the minimal-effort approach. If --worker-id is specified with an int, option parsing will fail, and the user will hopefully notice the change instead of just specifying "true" or "false" to work around the failure. Or is backwards compatibility (with just a "deprecated" warning) desired?

  • nsqd: support deferred PUB

    This came up on the mailing list: https://groups.google.com/forum/#!topic/nsq-users/HkoOMygaHK0 - just opening the issue here for others to weigh in and gauge interest.

    This would allow you to PUB a message that goes directly to the deferred priority queue with the specified delay.

    It's also complicated a bit by #34.

  • nsqd: track stats of end-to-end message processing time

    As per #268, this pull request adds the ability to track statistics about the time it takes for messages to get .finished(). This is done by maintaining a probabilistic percentile calculation over each channel (using the perks package) and merging them when topic quantiles are requested.

    This data is surfaced in:

    • [x] The /stats call
    • [x] statsd
    • [x] admin interface

    Still to do:

    • [x] Before/after timings
    • [x] Docs

    Sample data from /stats call:

    $ curl "localhost:4151/stats"
    nsqd v0.2.23 (built w/go1.1.2)
    
    [test           ] depth: 0     be-depth: 0     msgs: 591      e2e%: 13.5s, 13.5s, 683.9us, 375.1us, 269.6us
        [test_chan                ] depth: 0     be-depth: 0     inflt: 0    def: 0    re-q: 0     timeout: 0     msgs: 591      e2e%: 13.5s, 13.5s, 1.1ms, 394.7us, 268.8us
            [V2 muon:55712           ] state: 3 inflt: 0    rdy: 62   fin: 591      re-q: 0        msgs: 591      connected: 7m18s
    
  • up-to-date dockerfiles

    The official Docker builds have fallen out of date. It's also unfortunate that they can't take advantage of new Go speed improvements over time without manual intervention.

    Would it be possible to include a Dockerfile in the repo for each app, based off google/golang, that built the apps from source on top of it? That would, at least, keep the Dockerfiles in source control. It also permits you to use the latest version of Go automatically. It can also be built into an automated pipeline, then, that updates the Docker Hub when a release is made.
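A minimal sketch of what such a per-app Dockerfile might look like; the base-image layout and repo paths here are assumptions, not a tested build:

```dockerfile
# hypothetical Dockerfile for nsqd, rebuilt from source on each release
FROM google/golang
ADD . /gopath/src/github.com/nsqio/nsq
RUN go get github.com/nsqio/nsq/apps/nsqd
ENTRYPOINT ["/gopath/bin/nsqd"]
```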

  • nsqd: not properly creating channels registered in lookupd on seeing a new topic

    I have a bunch of topics that are produced on a variety of nodes. It's not always clear to me ahead of time which nodes will be producing which topics, but it is certainly some overlapping subset. I have two consumer channels: "archive," for nsq_to_file, and "events," for part of the application. The problem is bringing up a new node: I don't know exactly which topics to create in nsqd, but I do know which topics it might produce, and of those, which channels they should go to.

    As it stands, the first message on a topic will get to nsqd, and whichever consumer happens to find it first will see that message on its channel. Since nsqd doesn't yet know about the other channel, it will drop the message before the second consumer has connected.

    Some thoughts on how to potentially deal with this:

    • The lookupd HTTP APIs seemed promising (though missing some documentation). If information created there were to flow in reverse back to nsqd, say, when a topic/channel is created, all the nsqds peering with that lookupd could learn about that topic-channel association. A fresh nsqd won't register itself as publishing the topic, but if it does get a message to publish on that topic, it could create the known channels immediately to ensure a consumer on those channels will definitely get that first message.
    • As an alternative solution: on a new topic, when lookupd is configured, nsqd could wait until lookupd has (somehow) confirmed all known peers have seen the new producer information and had a chance to connect on any additional channels before dropping messages on the new topic.

    Let me know if there's something I'm misunderstanding! Thanks!
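As a workaround available today, a node can pre-create the channels it knows about via nsqd's HTTP API before any producer publishes; the addresses and names below are illustrative:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// createChannelURL builds the nsqd HTTP call that pre-creates a channel,
// so the first message on a new topic is copied to every known channel.
func createChannelURL(nsqdHTTPAddr, topic, channel string) string {
	v := url.Values{}
	v.Set("topic", topic)
	v.Set("channel", channel)
	return fmt.Sprintf("http://%s/channel/create?%s", nsqdHTTPAddr, v.Encode())
}

func main() {
	// when a node comes up, pre-create the channels you know about
	// before any producer publishes on the topic
	for _, ch := range []string{"archive", "events"} {
		u := createChannelURL("127.0.0.1:4151", "mytopic", ch)
		if _, err := http.Post(u, "", nil); err != nil {
			fmt.Println("pre-create failed:", err)
		}
	}
}
```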

  • nsqd authorization

    This is an attempt at creating an authorization protocol as originally requested in #224

    Specification:

    The nsqd TCP protocol will add an AUTH command with a JSON body that MUST be sent before SUB or PUB when nsqd has authorization enabled.

    nsqd will make authorization requests against the configured auth servers until it gets a valid response, passing along the client's metadata and remote address. The auth server replies with a list of topics/channels the connection is valid for, plus a TTL for how long the authorization is good. Topic and channel names are interpreted as regexes, which allows an auth server to give a blanket response in the form topic=.*, channels=[.*]. Responses are cached and re-requested when the TTL expires, ensuring that when an authorization is removed from an auth server it stops applying after the specified TTL.

    Authorization information will be stored on the connection for the life of the connection or until the TTL expires, and will be checked at SUB or PUB time.

    It is expected that when using an authorization server, the nsqd HTTP endpoints are not exposed to untrusted clients.

    A simple auth server will be included that gives a UI to manage authorizations based on login, tls, remote_ip and optionally an oauth2 endpoint used to lookup login values. It is expected that other auth servers will be written for different authorization backends (like LDAP, ssl certificate checking, etc). Connections between nsqd and the authorization server are expected to be over a trusted network.
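To make the spec concrete, here is a sketch of what a minimal auth server's response could look like in Go; the exact field names are inferred from the description above and should be checked against the final spec:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Authorization mirrors the response described above: topic/channel
// patterns are regexes, and permissions gate SUB and PUB.
type Authorization struct {
	Topic       string   `json:"topic"`
	Channels    []string `json:"channels"`
	Permissions []string `json:"permissions"`
}

// AuthResponse is cached by nsqd for TTL seconds.
type AuthResponse struct {
	TTL            int             `json:"ttl"`
	Identity       string          `json:"identity"`
	Authorizations []Authorization `json:"authorizations"`
}

// blanketAuth returns the wildcard grant mentioned above
// (topic=.*, channels=[.*]).
func blanketAuth(identity string) AuthResponse {
	return AuthResponse{
		TTL:      3600,
		Identity: identity,
		Authorizations: []Authorization{
			{Topic: ".*", Channels: []string{".*"}, Permissions: []string{"subscribe", "publish"}},
		},
	}
}

func main() {
	// a real server would serve this from an HTTP handler after checking
	// the client metadata nsqd passes along (secret, remote address, TLS)
	b, _ := json.Marshal(blanketAuth("anonymous"))
	fmt.Println(string(b))
}
```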

    cc: @mreiferson

  • nsqd: per-topic message IDs

    (Pushing up rebased branches I've had laying around locally for ages)

    This modifies message ID generation such that each topic maintains a monotonic/atomic counter to generate message IDs, a more scalable and performant approach.
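The core of the approach can be sketched as a per-topic atomic counter (illustrative types, not the actual nsqd code):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// topicIDGen sketches the per-topic counter: each topic owns its own
// sequence, so generating an ID is a single atomic add with no
// cross-topic coordination.
type topicIDGen struct {
	seq uint64
}

// NextID returns the next monotonically increasing ID for this topic.
func (g *topicIDGen) NextID() uint64 {
	return atomic.AddUint64(&g.seq, 1)
}

func main() {
	events := &topicIDGen{}
	clicks := &topicIDGen{}
	// sequences advance independently per topic
	fmt.Println(events.NextID(), events.NextID(), clicks.NextID()) // 1 2 1
}
```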

  • nsqadmin: work-around bug for raw ipv6 addresses

    IPv6 addresses in broadcast_address should be returned with the leading and trailing square brackets trimmed.

    When go-nsq connects to nsqd via lookupd's lookup API, it composes the address by calling net.JoinHostPort. net.JoinHostPort assumes the host contains no square brackets, because it adds them itself when it detects an IPv6 address.

  • nsqd: (optionally) require client certs

    This is a first pass at client certificate requirement, optional verification, and the ability to load a custom certificate authority (for self-signed certificates). Closing the client connection if it isn't upgraded to TLS still needs to be implemented, as does test coverage. I was hoping to get feedback on the current implementation (even if it's just a thumbs up) before I finish it out. Thanks!

  • nsqd: client error logged even when cleanly closed

    nsqd always prints logs like the following in our production environment:

    [nsqd] 2015/01/08 22:49:51.130538 TCP: new client(10.6.28.53:22505)
    [nsqd] 2015/01/08 22:49:51.130847 CLIENT(10.6.28.53:22505): desired protocol magic '  V2'
    [nsqd] 2015/01/08 22:49:51.137175 PROTOCOL(V2): [10.6.28.53:22505] exiting ioloop
    [nsqd] 2015/01/08 22:49:51.137273 ERROR: client(10.6.28.53:22505) - failed to read command - EOF
    [nsqd] 2015/01/08 22:49:51.137310 PROTOCOL(V2): [10.6.28.53:22505] exiting messagePump
    

    My environment is:

    CentOS release 6.6 (Final)                                                                                                 
    Kernel \r on an \m                                                                                                
    2.6.32-504.3.3.el6.x86_64
    nsqd v0.3.0 (built w/go1.3.3)
    

    My NSQ client uses nsqphp.

    Any ideas?

  • Ability to bind unix sockets for nsqd

    Hello, this PR provides unix socket support.

    nsqd may be started with:

    /build/nsqd --use-unix-sockets --tcp-address /var/run/nsqd.sock --http-address /var/run/nsqd-http.sock --data-path /var/local/nsqd/
    

    I've adapted pynsq and go-nsq to connect over unix sockets as well. I'll create those PRs after this one is resolved.

  • defer is not working as expected

    Based on the documentation, we only need to pass the defer parameter in milliseconds to achieve delayed message delivery: https://nsq.io/components/nsqd.html#post-pub

    • consumer
    const nsq = require('nsqjs')
    const moment = require('moment')
    
    // reader
    const reader = new nsq.Reader('sample_topic', 'sample_topic', {
      lookupdHTTPAddresses: 'nsqlookupd:4161'
    })
    
    reader.connect()
    
    reader.on('message', msg => {
      console.log('Received message [%s] [%s]: %s', moment.utc().format('YYYY-MM-DD HH:mm:ss:SSS'), msg.id, msg.body.toString())
      setTimeout(() => {
        msg.finish()
      }, 1000)
    });
    
    • producer, just call the API with postman http://localhost:4151/pub?defer=5000&topic=sample_topic

    The pre-request script generates the current time in ms:

    var moment = require('moment');
    var date = moment.utc().format('YYYY-MM-DD HH:mm:ss:SSS');
    pm.environment.set("now", date);
    

    in the request

    {
        "text": "some message",
        "message  ": [{{now}}]
    }
    
    • output
    nsq-demo-client-1      | Received message [2022-12-13 14:37:15:684] [11bc612bb5c88000]: {
    nsq-demo-client-1      |     "text": "some message",
    nsq-demo-client-1      |     "message  ": [2022-12-13 14:37:12:336]
    nsq-demo-client-1      | }
    

    Expected: the received message should arrive 5 seconds after the timestamp in the message body.

    Actual: the message is always received within the defer interval.

  • nsqd: optimize the performance of httpServer's doPub

    When requests are dense and individual messages are large, using ioutil.ReadAll hurts performance and can cause runaway memory growth; I have encountered serious cases that resulted in OOM. ioutil.ReadAll reads the entire body at once, so a large body forces the slice to grow repeatedly, and it also causes the buffer to escape to the heap, increasing the burden on the GC.

    Using io.Copy() avoids reading the whole message in one shot, and using a pool improves memory utilization and reduces GC pressure.

  • nsqd: ephemeral channels don't work with --mem-queue-size=0

    When --mem-queue-size=0 is set and messages are sent to an ephemeral channel (#ephemeral suffix), the messages can never be consumed. Is this because the mem-queue-size is exceeded and the messages are discarded directly?
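A sketch of the likely mechanism, assuming nsqd's put path for ephemeral channels drops whatever the memory queue won't accept (illustrative code, not nsqd's actual internals):

```go
package main

import "fmt"

// tryPut mimics the put path for an ephemeral channel: there is no disk
// backend, so anything the memory queue can't take is dropped.
func tryPut(memoryMsgChan chan string, msg string) bool {
	select {
	case memoryMsgChan <- msg:
		return true
	default:
		return false // dropped
	}
}

func main() {
	// --mem-queue-size=0 makes the channel unbuffered, so a put only
	// succeeds if a consumer is receiving at that exact instant
	memoryMsgChan := make(chan string, 0)
	fmt.Println("queued:", tryPut(memoryMsgChan, "msg")) // queued: false
}
```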

  • nsqadmin: non admin user can delete the topic on a node

    Hi Team

    On the nsqadmin UI page, a non-admin user can see the delete icon and successfully delete a topic on a node.

    tombstoneNodeForTopicHandler is not protected by isAuthorizedAdminRequest.

    Thanks /Joe

  • *: Support NSQ in ArgoLabs Dataflow

    In Dataflow, we currently have support for Kafka, Jetstream, NATS Streaming, and other sources and sinks. It would be amazing to add support for NSQ too. We know there are plenty of use cases in the OSS community, and this would get a lot of love.

    The Dataflow team are not NSQ experts. My ask is: would anyone be interested in implementing the changes needed?

    Here is the equivalent pull request for Jetstream. Breaking the work down:

    1. Add manifests to install a dev NSQ into Kubernetes cluster.
    2. Add APIs to specify a NSQ source and sink (e.g. the URL, authentication).
    3. Implement the source and sink.
    4. Write test infrastructure (e.g. check the right number of messages get written to a sink).
    5. Write tests (e.g. does the right number of message get written, even if the pod is deleted).
    6. Write examples.
    7. Write docs.

    I'll raise a ticket against the main repo, and the Go client repo (there might be engineers there with the exact skills needed).
