moss - a simple, fast, ordered, persistable, key-val storage library for golang

moss

moss provides a simple, fast, persistable, ordered key-val collection implementation as a 100% golang library.

moss stands for "memory-oriented sorted segments".

Features

  • ordered key-val collection API
  • 100% go implementation
  • key range iterators
  • snapshots provide for isolated reads
  • atomic mutations via a batch API
  • merge operations allow for read-compute-write optimizations for write-heavy use cases (e.g., updating counters); a merge-operator sketch follows this list
  • concurrent readers and writers don't block each other
  • child collections allow multiple related collections to be atomically grouped
  • optional, advanced APIs to avoid extra memory copying
  • optional lower-level storage implementation, called "mossStore", that uses an append-only design for writes and mmap() for reads, with configurable compaction policy; see: OpenStoreCollection()
  • mossStore supports navigating back through previous commit points in read-only fashion, and supports reverting to previous commit points.
  • optional persistence hooks to allow write-back caching to a lower-level storage implementation that advanced users may wish to provide (e.g., you can hook moss up to leveldb, sqlite, etc)
  • event callbacks allow the monitoring of asynchronous tasks
  • unit tests
  • fuzz tests via go-fuzz & smat (github.com/mschoch/smat); see README-smat.md
  • mossStore's diagnostic tool: mossScope
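
The merge-operations feature works through a user-supplied merge operator. Below is a minimal sketch of a counter-style operator; it assumes moss's MergeOperator interface (Name/FullMerge/PartialMerge), the CollectionOptions.MergeOperator field, and the Batch.Merge method, so treat it as illustrative and check the godoc for the exact signatures.

import "strconv"

// addMerge treats values and merge operands as decimal integer strings
// and sums them, so counters can be updated without a prior read.
type addMerge struct{}

func (addMerge) Name() string { return "decimal-add" }

func (addMerge) FullMerge(key, existingValue []byte,
    operands [][]byte) ([]byte, bool) {
    total := atoi(existingValue)
    for _, op := range operands {
        total += atoi(op)
    }
    return []byte(strconv.FormatInt(total, 10)), true
}

func (addMerge) PartialMerge(key, leftOperand,
    rightOperand []byte) ([]byte, bool) {
    return []byte(strconv.FormatInt(atoi(leftOperand)+atoi(rightOperand), 10)), true
}

func atoi(b []byte) int64 {
    n, _ := strconv.ParseInt(string(b), 10, 64) // missing value => 0.
    return n
}

// Usage sketch:
//   c, _ := moss.NewCollection(moss.CollectionOptions{MergeOperator: addMerge{}})
//   c.Start()
//   batch, _ := c.NewBatch(0, 0)
//   batch.Merge([]byte("hits"), []byte("1")) // increment without a prior Get.
//   c.ExecuteBatch(batch, moss.WriteOptions{})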

License

Apache 2.0

Example

import "github.com/couchbase/moss"

c, err := moss.NewCollection(moss.CollectionOptions{})
c.Start()
defer c.Close()

batch, err := c.NewBatch(0, 0)
defer batch.Close()

batch.Set([]byte("car-0"), []byte("tesla"))
batch.Set([]byte("car-1"), []byte("honda"))

err = c.ExecuteBatch(batch, moss.WriteOptions{})

ss, err := c.Snapshot()
defer ss.Close()

ropts := moss.ReadOptions{}

val0, err := ss.Get([]byte("car-0"), ropts) // val0 == []byte("tesla").
valX, err := ss.Get([]byte("car-not-there"), ropts) // valX == nil.

// A Get can also be issued directly against the collection
val1, err := c.Get([]byte("car-1"), ropts) // val1 == []byte("honda").
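
Key range iteration over a snapshot can look like the following sketch; it assumes Snapshot.StartIterator, Iterator.Current/Next, and the moss.ErrIteratorDone sentinel (check the godoc for the exact signatures, and note that error handling is elided as in the example above).

iter, err := ss.StartIterator(nil, nil, moss.IteratorOptions{}) // nil, nil == full key range.
defer iter.Close()

for {
    k, v, err := iter.Current()
    if err == moss.ErrIteratorDone {
        break
    }
    fmt.Printf("%s => %s\n", k, v) // e.g., car-0 => tesla, car-1 => honda.

    if err = iter.Next(); err == moss.ErrIteratorDone {
        break
    }
}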

For persistence, you can use...

store, collection, err := moss.OpenStoreCollection(directoryPath,
    moss.StoreOptions{}, moss.StorePersistOptions{})
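
The returned collection is used with the same Batch/Snapshot API shown above. A small usage sketch, closing the collection before the store when done:

defer store.Close()
defer collection.Close() // defers run LIFO, so the collection closes first.

batch, err := collection.NewBatch(0, 0)
batch.Set([]byte("car-2"), []byte("toyota"))
err = collection.ExecuteBatch(batch, moss.WriteOptions{})
batch.Close()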

Design

The design is similar to a (much) simplified LSM tree, with a stack of sorted, immutable key-val arrays or "segments".

To incorporate the next Batch of key-val mutations, the incoming key-val entries are first sorted into an immutable "segment", which is then atomically pushed onto the top of the stack of segments.

For readers, a higher segment in the stack will shadow entries of the same key from lower segments.
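
As an illustration of that shadowing rule (a toy model only; moss's real segments pack entries into flat arrays and also handle deletions), a lookup searches the stack top-down:

import (
    "bytes"
    "sort"
)

type entry struct{ key, val []byte }

type segment []entry // entries sorted by key

// get searches the newest (topmost) segment first, so a higher segment
// shadows entries with the same key in lower segments.
func get(stack []segment, key []byte) []byte {
    for i := len(stack) - 1; i >= 0; i-- {
        seg := stack[i]
        n := sort.Search(len(seg), func(j int) bool {
            return bytes.Compare(seg[j].key, key) >= 0
        })
        if n < len(seg) && bytes.Equal(seg[n].key, key) {
            return seg[n].val
        }
    }
    return nil // not found in any segment.
}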

Separately, an asynchronous goroutine (the "merger") will continuously merge N sorted segments to keep stack height low.

In the best case, a remaining, single, large sorted segment will be efficient in memory usage and efficient for binary search and range iteration.

Iterations when the stack height is > 1 are implemented using an N-way heap merge.
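
A toy version of that N-way heap merge, building on the entry/segment types from the sketch above and using container/heap (shadowing of duplicate keys is omitted for brevity):

import (
    "bytes"
    "container/heap"
)

// cursor tracks a position within one sorted segment.
type cursor struct {
    seg segment
    pos int
}

type cursorHeap []*cursor

func (h cursorHeap) Len() int      { return len(h) }
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h cursorHeap) Less(i, j int) bool {
    return bytes.Compare(h[i].seg[h[i].pos].key, h[j].seg[h[j].pos].key) < 0
}
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
    old := *h
    c := old[len(old)-1]
    *h = old[:len(old)-1]
    return c
}

// mergeIterate visits entries across all segments in ascending key order.
func mergeIterate(stack []segment, visit func(key, val []byte)) {
    h := &cursorHeap{}
    for _, seg := range stack {
        if len(seg) > 0 {
            heap.Push(h, &cursor{seg: seg})
        }
    }
    for h.Len() > 0 {
        c := heap.Pop(h).(*cursor)
        e := c.seg[c.pos]
        visit(e.key, e.val)
        c.pos++
        if c.pos < len(c.seg) {
            heap.Push(h, c)
        }
    }
}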

In this design, the stack of segments is treated as immutable via a copy-on-write approach whenever the stack needs to be "modified". So, multiple readers and writers won't block each other, and taking a Snapshot is also a similarly cheap operation by cloning the stack.
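
In the same toy model, copy-on-write means that snapshots and writers only ever copy the small slice of segment headers, never the segments themselves:

// snapshot copies just the stack's slice; the immutable segments are shared.
func snapshot(stack []segment) []segment {
    ss := make([]segment, len(stack))
    copy(ss, stack)
    return ss
}

// push builds a new stack with seg on top, leaving the old stack (and any
// snapshots still referencing it) untouched.
func push(stack []segment, seg segment) []segment {
    newStack := make([]segment, len(stack)+1)
    copy(newStack, stack)
    newStack[len(stack)] = seg
    return newStack
}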

See also the DESIGN.md writeup.

Limitations and considerations

NOTE: Keys in a Batch must be unique. That is, myBatch.Set("x", "foo"); myBatch.Set("x", "bar") is not supported. Applications that do not naturally meet this requirement might maintain their own map[key]val data structures to ensure this uniqueness constraint.
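
For example, a little application-side bookkeeping (hypothetical helper code, not part of moss, reusing the collection c from the example above) keeps the keys in a batch unique:

pending := map[string][]byte{} // collapses duplicate Sets before batching.
pending["x"] = []byte("foo")
pending["x"] = []byte("bar") // overwrites; only the last value survives.

batch, err := c.NewBatch(0, 0)
for k, v := range pending {
    batch.Set([]byte(k), v)
}
err = c.ExecuteBatch(batch, moss.WriteOptions{})
batch.Close()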

Max key length is 2^24 bytes (24 bits are used to track the key length).

Max val length is 2^28 bytes (28 bits are used to track the val length).

Metadata overhead for each key-val operation is 16 bytes.

Read performance characterization is roughly O(log N) for key-val retrieval.

Write performance characterization is roughly O(M log M), where M is the number of mutations in a batch when invoking ExecuteBatch().

Those performance characterizations, however, don't account for background, asynchronous processing for the merging of segments and data structure maintenance.

For example, a background merger task that is too slow can eventually stall the ingest of new batches. (See the CollectionOptions settings that limit segment stack height.)

As another example, one slow reader that holds onto a Snapshot or an Iterator for a long time can hold onto a lot of resources. In the worst case, the reader's Snapshot or Iterator may delay the reclamation of large, old segments whose entries have already been obsoleted by incoming mutations.

Error handling

Please note that the background goroutines of moss may run into errors, for example during optional persistence operations. To be notified of these cases, your application can provide an optional CollectionOptions.OnError callback func (highly recommended), which moss will invoke.
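
A minimal sketch of wiring up that callback, assuming OnError takes a func(error) (and using the standard library log package):

c, err := moss.NewCollection(moss.CollectionOptions{
    OnError: func(err error) {
        log.Printf("moss background error: %v", err) // surface async errors somewhere visible.
    },
})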

Logging

Please see the optional CollectionOptions.Log callback func and the CollectionOptions.Debug flag.
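
For example, a sketch that routes moss's debug logging to the standard library logger, assuming Log is a printf-style func and Debug is an integer verbosity level:

c, err := moss.NewCollection(moss.CollectionOptions{
    Debug: 2, // higher values emit more detail.
    Log: func(format string, a ...interface{}) {
        log.Printf(format, a...)
    },
})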

Performance

Please try go test -bench=. for some basic performance tests.

Each performance test will emit output that generally looks like...

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spec: {numItems:1000000 keySize:20 valSize:100 batchSize:100 randomLoad:false noCopyValue:false accesses:[]}
     open || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s
     load || time:   840 (ms) |  1190476 wop/s |   139508 wkb/s |        0 rop/s |        0 rkb/s || cumulative:  1190476 wop/s |   139508 wkb/s |        0 rop/s |        0 rkb/s
    drain || time:   609 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
    close || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
   reopen || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
     iter || time:    81 (ms) |        0 wop/s |        0 wkb/s | 12344456 rop/s |  1446616 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s | 12344456 rop/s |  1446616 rkb/s
    close || time:     2 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s | 12344456 rop/s |  1446616 rkb/s
total time: 1532 (ms)
file size: 135 (MB), amplification: 1.133
BenchmarkStore_numItems1M_keySize20_valSize100_batchSize100-8

There are various phases in each test...

  • open - opening a brand new moss storage instance
  • load - time to load N sequential keys
  • drain - additional time after load for persistence to complete
  • close - time to close the moss storage instance
  • reopen - time to reopen the moss storage instance (OS/filesystem caches are still warm)
  • iter - time to sequentially iterate through key-val items
  • access - time to perform various access patterns, like random or sequential reads and writes

The file size is measured after final compaction; the amplification figure is a naive calculation that compares the file size against the raw key-val size.
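
For the run above, that naive calculation is the file size divided by the raw key-val bytes: 135 MB / (1,000,000 items × (20 + 100) bytes) is roughly 1.13, which lines up with the reported 1.133 once rounding of the displayed file size is accounted for.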

Contributing changes

Please see the CONTRIBUTING.md document.

Comments
  • Correction to the example in readme

    The CollectionOptions{} needed prefixing with moss., and c.NewBatch() returns a Batch and an error, so the result can't be re-assigned to c (of type Collection).

  • Panic on Get

    I was trying to use moss as a backend store for Dgraph (https://github.com/dgraph-io/dgraph/tree/try/moss). But I faced this issue:

    panic: runtime error: slice bounds out of range
    
    goroutine 348 [running]:
    github.com/couchbase/moss.(*segment).FindStartKeyInclusivePos(0xc4201ee000, 0xc42a1cb7e0, 0x16, 0x16, 0x16)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment.go:313 +0x19b
    github.com/couchbase/moss.(*segmentStack).get(0xc4201cac30, 0xc42a1cb7e0, 0x16, 0x16, 0x1e, 0x0, 0x7f7f00, 0xc42006efc0, 0x1f, 0x2a, ...)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:90 +0x26b
    github.com/couchbase/moss.(*segmentStack).Get(0xc4201cac30, 0xc42a1cb7e0, 0x16, 0x16, 0xc4201cac00, 0x6, 0x6, 0x6, 0x0, 0x6)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:74 +0x75
    github.com/couchbase/moss.(*Footer).Get(0xc42006efc0, 0xc42a1cb7e0, 0x16, 0x16, 0x465600, 0x1, 0x6, 0xc424e85910, 0x465182, 0xfc7740)
    	/home/ashwin/go/src/github.com/couchbase/moss/store_footer.go:426 +0x8a
    github.com/couchbase/moss.(*snapshotWrapper).Get(0xc420192fc0, 0xc42a1cb7e0, 0x16, 0x16, 0x0, 0x0, 0xc4200928f0, 0x6, 0x0, 0xc424e85948)
    	/home/ashwin/go/src/github.com/couchbase/moss/wrap.go:94 +0x62
    github.com/couchbase/moss.(*segmentStack).get(0xc42a1f0d20, 0xc42a1cb7e0, 0x16, 0x16, 0xffffffffffffffff, 0x0, 0xc424e85a00, 0x1, 0x1, 0x6, ...)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:110 +0xa2
    github.com/couchbase/moss.(*segmentStack).Get(0xc42a1f0d20, 0xc42a1cb7e0, 0x16, 0x16, 0x0, 0x0, 0x414ea2, 0xc42a1f0cd0, 0x50, 0x48)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:74 +0x75
    github.com/dgraph-io/dgraph/posting.(*List).getPostingList(0xc42a1e9e00, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:190 +0x1ef
    github.com/dgraph-io/dgraph/posting.(*List).updateMutationLayer(0xc42a1e9e00, 0xc42a1f0cd0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:263 +0x125
    github.com/dgraph-io/dgraph/posting.(*List).addMutation(0xc42a1e9e00, 0xfdec00, 0xc425576e40, 0xc42226c660, 0x5, 0x5, 0xb090c8)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:340 +0xd7
    github.com/dgraph-io/dgraph/posting.(*List).AddMutationWithIndex(0xc42a1e9e00, 0xfdec00, 0xc425576e40, 0xc42226c660, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/index.go:171 +0x2da
    github.com/dgraph-io/dgraph/worker.runMutations(0x7f62912c2280, 0xc425576e40, 0xc4221e0000, 0x3e8, 0x400, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/mutation.go:50 +0x21a
    github.com/dgraph-io/dgraph/worker.(*node).processMutation(0xc42008c000, 0x2, 0x114, 0x0, 0xc420f4c000, 0x92c5, 0xa000, 0x0, 0x0, 0x0, ...)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:372 +0x13e
    github.com/dgraph-io/dgraph/worker.(*node).process(0xc42008c000, 0x2, 0x114, 0x0, 0xc420f4c000, 0x92c5, 0xa000, 0x0, 0x0, 0x0, ...)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:405 +0x36a
    created by github.com/dgraph-io/dgraph/worker.(*node).processApplyCh
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:444 +0x49b
    

    Any help would be appreciated. Thanks!

  • avoid deadlock by starting collection

    When running the Readme's example, you immediately run into an error since Start() hasn't been called on the collection. This will help get the first example running without having to look around.

  • Moss is using undefined type from ghistogram

    I was trying to use moss as a store for the bleve package, but I get the following error:

    ../../go/src/search/vendor/github.com/couchbase/moss/api.go:173:15: undefined: ghistogram.Histograms
    ./../go/src/search/vendor/github.com/couchbase/moss/collection.go:98:13: undefined: ghistogram.Histograms
    

    where ~/go/src/search is my client package that vendored bleve and moss. I see in moss/api.go:173 that it's using a struct from the ghistogram package called Histograms (plural), but this struct doesn't exist in the ghistogram package.

    Is there something I'm missing, or was that a typo, or has ghistogram changed over time and become incompatible with the latest moss version?

  • optimization idea: copy-on-write, ref-counted segmentStack's

    Currently, creating a snapshot copies the segment stack, which means memory allocations and copying of pointers. Instead, perhaps creating a snapshot should just bump a ref-count on some existing segmentStack, which should be treated as immutable (except for the ref-count).

    Whenever a writer (such as ExecuteBatch, the background Merger, etc) wants to modify the segmentStack, it should use copy-on-write.

    The existing moss implementation actually does parts of the above anyway, so we might be close to that already.

  • Buckets?

    Hi, from the description it is not obvious to me if moss supports buckets (like bolt)?

    There is the

    child collections allow multiple related collections to be atomically grouped

    which I'm not exactly sure is something like buckets, or just a bunch of selected records manually put together?

  • Q: Transaction performance

    Hi, this project looks very interesting. I have a lot of single-write transactions, rather than multiple writes per transaction.

    How many transaction writes of small data can moss handle per second on average ssd? For example with BoltDB it was only about 250, so I wonder if this project can perform better or if it is also limited by the file system.

  • Q on write amplification

    @mschoch I enjoyed your talk on Moss at GopherCon. At the end you pointed out a situation (I couldn't discern exactly when) where a small write led to a lot of copying/write-amplification.

    I just wanted to inquire if that issue had been addressed?

  • add a non-snapshot Collection.Get(key, ReaderOptions) API

    If a user just wants to lookup a single item in a collection, they have to first create a snapshot, then snapshot.Get(), then snapshot.Close().

    One issue is creating a snapshot means memory allocations (of the snapshot instance and taking a copy of the segment stack).

    A "onesie" API to just look up a single item, if it can be implemented efficiently (without undue memory allocations and without having to hold a lock for a long time), would be more convenient for folks to grok.

    (See also https://github.com/couchbase/moss/issues/14)

  • support for multiple collections

    Users can currently fake out multiple collections by explicitly adding a short collection name prefix to each of their keys. However, such a trick is suboptimal as it repeats a prefix for every key-val item.

    Instead, the proposal is to support multiple collections natively in moss by introducing a few additional methods to the API, so that we remain backwards compatible for current moss users.

    The idea is that the current Collection in moss now logically becomes a "top-most" collection of an optional hierarchy of child collections.

    To the Batch interface, the proposal is to add the methods...

    NewChildCollectionBatch(childCollectionName string, hints) (Batch, error)
    
    DelChildCollection(childCollectionName string) error
    

    When a batch is executed, the entire hierarchy of a top-level batch and its batches for any child collections will be committed atomically.

    Removing a child collection takes precedence over adding more child collection mutations.

    To the Snapshot interface, the proposal is to add the methods...

    ChildCollectionNames() ([]string, error)
    
    ChildCollectionSnapshot(childCollectionName string) (Snapshot, error)
    

    And, that's it.

    The proposed API allows for deeply nested child collections of child collections, but the initial implementation might just return an error if the user tries to have deep nesting.
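
    A hypothetical usage sketch of the proposal; the unspecified "hints" argument is assumed here to be a BatchOptions-style struct, so check the current moss godoc for what actually landed:

    topBatch, _ := collection.NewBatch(0, 0)
    childBatch, _ := topBatch.NewChildCollectionBatch("cars", moss.BatchOptions{}) // "hints" assumed to be BatchOptions.
    childBatch.Set([]byte("car-0"), []byte("tesla"))
    topBatch.Set([]byte("top-key"), []byte("top-val"))
    collection.ExecuteBatch(topBatch, moss.WriteOptions{}) // parent and child committed atomically.

    ss, _ := collection.Snapshot()
    names, _ := ss.ChildCollectionNames()                      // names == ["cars"].
    childSS, _ := ss.ChildCollectionSnapshot("cars")
    val, _ := childSS.Get([]byte("car-0"), moss.ReadOptions{}) // val == []byte("tesla").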

  • Iterator only returning one record

    I'm using this code to store and retrieve records from moss.

    But when I call GetStoredSQSMessages() it only seems to return the last entry, as opposed to all the entries.

    If I run strings data-0000000000000001.moss I can see all the records I'm expecting, so I know they're somewhere in the moss file, but I just can't get at them w/ the iterator.

    Can you take a look at my GetStoredSQSMessages method and see if I'm doing anything wrong?

    If nothing is obvious, should I try repro'ing this in a unit test? I'm storing the moss in a docker volume mount, so it's possible I'm doing something funny (but like I said I can see all the records with strings, so it seems to be an iterator problem)

  • graphplot deprecated?

    PS is there no go alternative for this?

    go test -run=TestMossDGM -outputToFile
    03:12:21 OpsSet 412470 numKeysRead 1811156 dbSize 37mb 75mb Memory 0mb MMap 0mb
    03:12:22 OpsSet 524287 numKeysRead 1356666 dbSize 85mb 173mb Memory 0mb MMap 0mb
    03:12:23 OpsSet 403496 numKeysRead 1131032 dbSize 122mb 300mb Memory 0mb MMap 0mb
    03:12:24 OpsSet 149049 numKeysRead 1677611 dbSize 136mb 191mb Memory 0mb MMap 0mb
    03:12:25 OpsSet 582009 numKeysRead 1313250 dbSize 189mb 293mb Memory 0mb MMap 0mb
    03:12:26 OpsSet 169398 numKeysRead 994228 dbSize 205mb 412mb Memory 0mb MMap 0mb
    03:12:27 OpsSet 619084 numKeysRead 1458883 dbSize 261mb 546mb Memory 0mb MMap 0mb
    03:12:28 OpsSet 584209 numKeysRead 1328463 dbSize 315mb 644mb Memory 0mb MMap 0mb
    03:12:29 OpsSet 597415 numKeysRead 1085736 dbSize 370mb 753mb Memory 0mb MMap 0mb
    Workers Stop...
    03:12:30 OpsSet 99802 numKeysRead 803970 dbSize 379mb 903mb Memory 0mb MMap 0mb
    03:12:30 - Closing Collections...Done 1.419
    03:12:37 - Closing Collections...Done 1.009
    PASS
    ok      _/Users/gert/Desktop/moss       17.650s
    
    python -m pip install --upgrade pandas
    
    python graph/dgm-moss-plot.py Results_024923_48983.json 
    Traceback (most recent call last):
      File "graph/dgm-moss-plot.py", line 309, in <module>
        main()
      File "graph/dgm-moss-plot.py", line 92, in main
        resultsFile['memused'] = (resultsFile['memtotal'] - (resultsFile['memfree']))/1024/1024
      File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
        indexer = self.columns.get_loc(key)
      File "/usr/local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
        return self._engine.get_loc(self._maybe_cast_indexer(key))
      File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
      File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
    KeyError: 'memtotal'
    
    cat Results_024923_48983.json 
    
    {"cfg_CompactionBufferPages":512,"cfg_CompactionLevelMaxSegments":9,"cfg_CompactionLevelMultiplier":3,"cfg_CompactionPercentage":0.65,"dbPath":"moss-test-data","diskMonitor":"sdc","keyLength":48,"keyOrder":"Random","memQuota":4294967296,"ncpus":8,"numReaders":1,"numWriters":1,"readBatchSize":100,"readBatchThinkTime":0,"runDescription":"-","runTime":10,"sampleFrequency":1000000000,"valueLength":48,"writeBatchSize":10000,"writeBatchThinkTime":0}
    {"intervaltime":"02:49:24","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1864177,"numKeysStart":0,"numKeysWrite":607959,"numReadBatches":39005,"numWriteBatches":60,"num_bytes_used_disk":73283936,"num_files":1,"num_segments":3,"processMem":0,"totalKeyBytes":17668224,"totalOpsDel":0,"totalOpsSet":368088,"totalValBytes":17668224,"total_compactions":1,"total_persists":19}
    {"intervaltime":"02:49:25","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1678772,"numKeysStart":0,"numKeysWrite":601788,"numReadBatches":34453,"numWriteBatches":60,"num_bytes_used_disk":168420704,"num_files":1,"num_segments":7,"processMem":0,"totalKeyBytes":42663744,"totalOpsDel":0,"totalOpsSet":520740,"totalValBytes":42663744,"total_compactions":0,"total_persists":12}
    {"intervaltime":"02:49:26","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1272640,"numKeysStart":0,"numKeysWrite":612793,"numReadBatches":26140,"numWriteBatches":62,"num_bytes_used_disk":291830838,"num_files":2,"num_segments":18,"processMem":0,"totalKeyBytes":66042192,"totalOpsDel":0,"totalOpsSet":487051,"totalValBytes":66042192,"total_compactions":0,"total_persists":19}
    {"intervaltime":"02:49:27","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1765483,"numKeysStart":0,"numKeysWrite":576301,"numReadBatches":36190,"numWriteBatches":57,"num_bytes_used_disk":174847246,"num_files":1,"num_segments":6,"processMem":0,"totalKeyBytes":69736416,"totalOpsDel":0,"totalOpsSet":76963,"totalValBytes":69736416,"total_compactions":1,"total_persists":5}
    {"intervaltime":"02:49:28","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1411938,"numKeysStart":0,"numKeysWrite":608265,"numReadBatches":28730,"numWriteBatches":61,"num_bytes_used_disk":279874336,"num_files":1,"num_segments":16,"processMem":0,"totalKeyBytes":95461344,"totalOpsDel":0,"totalOpsSet":535936,"totalValBytes":95461344,"total_compactions":0,"total_persists":18}
    {"intervaltime":"02:49:29","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1191213,"numKeysStart":0,"numKeysWrite":625228,"numReadBatches":24315,"numWriteBatches":63,"num_bytes_used_disk":382459295,"num_files":1,"num_segments":20,"processMem":0,"totalKeyBytes":128764464,"totalOpsDel":0,"totalOpsSet":693815,"totalValBytes":128764464,"total_compactions":0,"total_persists":20}
    {"intervaltime":"02:49:30","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":990468,"numKeysStart":0,"numKeysWrite":577997,"numReadBatches":20153,"numWriteBatches":58,"num_bytes_used_disk":578373280,"num_files":1,"num_segments":3,"processMem":0,"totalKeyBytes":115968048,"totalOpsDel":0,"totalOpsSet":0,"totalValBytes":115968048,"total_compactions":0,"total_persists":0}
    {"intervaltime":"02:49:31","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1462143,"numKeysStart":0,"numKeysWrite":605731,"numReadBatches":29763,"numWriteBatches":60,"num_bytes_used_disk":686256128,"num_files":1,"num_segments":16,"processMem":0,"totalKeyBytes":167349888,"totalOpsDel":0,"totalOpsSet":1070455,"totalValBytes":167349888,"total_compactions":0,"total_persists":21}
    {"intervaltime":"02:49:32","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1152540,"numKeysStart":0,"numKeysWrite":610852,"numReadBatches":23471,"numWriteBatches":61,"num_bytes_used_disk":795824128,"num_files":1,"num_segments":20,"processMem":0,"totalKeyBytes":197996448,"totalOpsDel":0,"totalOpsSet":638470,"totalValBytes":197996448,"total_compactions":0,"total_persists":20}
    {"intervaltime":"02:49:33","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":864135,"numKeysStart":0,"numKeysWrite":552955,"numReadBatches":17685,"numWriteBatches":56,"num_bytes_used_disk":938213376,"num_files":1,"num_segments":22,"processMem":0,"totalKeyBytes":213553104,"totalOpsDel":0,"totalOpsSet":324097,"totalValBytes":213553104,"total_compactions":0,"total_persists":10}
    {"tot_mapped":0,"tot_mhBlockDuration":0,"tot_mhBlocks":0,"tot_numKeysRead":13653509,"tot_numKeysStart":0,"tot_numKeysWrite":5980000,"tot_numReadBatches":279905,"tot_numWriteBatches":598,"tot_num_bytes_used_disk":1030427776,"tot_num_files":1,"tot_num_segments":6,"tot_processMem":0,"tot_totalKeyBytes":216281328,"tot_totalOpsDel":0,"tot_totalOpsSet":4505861,"tot_totalValBytes":216281328,"tot_total_compactions":2,"tot_total_persists":144}
    
  • StoreOptions parameter needed for both OpenStore and OpenCollection?

    Hi, for me it's not clear why both OpenStore and OpenCollection need a StoreOptions parameter? Is there a case for using different StoreOptions for OpenStore and store.OpenCollection? If they need to be the same, isn't it safer to only set StoreOptions on the OpenStore call, so no bugs can occur when StoreOptions get changed between the two calls?

    func OpenStoreCollection(dir string, options StoreOptions,
    	persistOptions StorePersistOptions) (*Store, Collection, error) {
    	store, err := OpenStore(dir, options)
    	if err != nil {
    		return nil, nil, err
    	}
    
    	coll, err := store.OpenCollection(options, persistOptions)
    	if err != nil {
    		store.Close()
    		return nil, nil, err
    	}
    
    	return store, coll, nil
    }
    
  • Moss panic with runaway disk usage

    Hey everyone, I'm seeing the following panic, which looks like it has run out of memory. However, I've attached a graph from Grafana which shows the disk usage by the Moss index (it's from a file walker, so it could have a race condition if Moss is generating lots of files while it is walking).

    fatal error: runtime: cannot allocate memory
    
    goroutine 103 [running]:
    runtime.systemstack_switch()
    	stdlib%/src/runtime/asm_amd64.s:311 fp=0xc000b75970 sp=0xc000b75968 pc=0x459c90
    runtime.persistentalloc(0xd0, 0x0, 0x27ad2b0, 0x7c4eac)
    	GOROOT/src/runtime/malloc.go:1142 +0x82 fp=0xc000b759b8 sp=0xc000b75970 pc=0x40c932
    runtime.newBucket(0x1, 0x4, 0x425f76)
    	GOROOT/src/runtime/mprof.go:173 +0x5e fp=0xc000b759f0 sp=0xc000b759b8 pc=0x42573e
    runtime.stkbucket(0x1, 0x33a000, 0xc000b75a98, 0x4, 0x20, 0xc000b75a01, 0x7f08c8658138)
    	GOROOT/src/runtime/mprof.go:240 +0x1aa fp=0xc000b75a50 sp=0xc000b759f0 pc=0x425a3a
    runtime.mProf_Malloc(0xc01298a000, 0x33a000)
    	GOROOT/src/runtime/mprof.go:344 +0xd6 fp=0xc000b75bc8 sp=0xc000b75a50 pc=0x425fd6
    runtime.profilealloc(0xc0026e8000, 0xc01298a000, 0x33a000)
    	GOROOT/src/runtime/malloc.go:1058 +0x4b fp=0xc000b75be8 sp=0xc000b75bc8 pc=0x40c6cb
    runtime.mallocgc(0x33a000, 0x14f3080, 0x1, 0xc008fb0000)
    	GOROOT/src/runtime/malloc.go:983 +0x46c fp=0xc000b75c88 sp=0xc000b75be8 pc=0x40bdac
    runtime.makeslice(0x14f3080, 0x0, 0x338b32, 0xc008fb0000, 0x0, 0x17ec0)
    	GOROOT/src/runtime/slice.go:70 +0x77 fp=0xc000b75cb8 sp=0xc000b75c88 pc=0x442c17
    vendor/github.com/couchbase/moss.newSegment(...)
    	vendor/github.com/couchbase/moss/segment.go:158
    vendor/github.com/couchbase/moss.(*segmentStack).merge(0xc005eaf180, 0xc000b75e01, 0xc007dec910, 0xc002d71a90, 0xc00004bc90, 0x10, 0xc00004bcb)
    	vendor/github.com/couchbase/moss/segment_stack_merge.go:73 +0x1bb fp=0xc000b75e48 sp=0xc000b75cb8 pc=0xcb199b
    vendor/github.com/couchbase/moss.(*collection).mergerMain(0xc0004b00c0, 0xc005eaf180, 0xc007dec910, 0x1, 0xc005eaf180)
    	vendor/github.com/couchbase/moss/collection_merger.go:248 +0x306 fp=0xc000b75ef0 sp=0xc000b75e48 pc=0xca6946
    vendor/github.com/couchbase/moss.(*collection).runMerger(0xc0004b00c0)
    	vendor/github.com/couchbase/moss/collection_merger.go:126 +0x2d0 fp=0xc000b75fd8 sp=0xc000b75ef0 pc=0xca5e30
    runtime.goexit()
    	stdlib%/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc000b75fe0 sp=0xc000b75fd8 pc=0x45bbf1
    created by vendor/github.com/couchbase/moss.(*collection).Start
    	vendor/github.com/couchbase/moss/collection.go:118 +0x62
    
    [screenshot: Grafana graph of the Moss index's disk usage, 2018-10-30]

    The disk usage grows for about 6 minutes and then implodes when, I assume, the disk is completely filled. The green line after the blip is our service restarting and our indexes being rebuilt.

  • Feature Request: Add ability to serialize and deserialize Batches and Collections

    I am working on a distributed cache that spreads Moss Collections across a cluster of nodes and while I have it working for basic Get, Set, and Delete operations, without the ability to serialize Batches, there isn't a really good way to replicate Batch operations. One solution would be to create my own batch implementation that can be serialized then "replay" the batch on the receiving node to create a moss.Batch, but it would be more convenient if a Batch could just be serialized directly and then deserialized on the receiving end.

    Similarly, I am using Raft for my replication and it would be nice if I could serialize an entire Collection so that I can create a Raft snapshot periodically. Currently, I am just iterating through all of the KVPs in the Collection and serializing them individually with my own serialization format, but this requires me to implement compaction and what-not myself and since Moss already has its own persistence format, as well as its own compaction algorithm, it would be nice to reuse this.

    I'm willing to implement both of these myself and submit PRs, but I was wondering if you had any pointers on doing this in a way that is backwards compatible and fits the overall vision and design goals of Moss.

  • optimization - same length keys

    If moss can somehow detect, for a batch or segment, that all the keys are exactly the same length, then one optimization might be to compress the kvs array, since there would be no need to store the key length.
