Mapreduce - A in-process MapReduce tool to help you to optimize service response time.

mapreduce

English | 简体中文

Go codecov Go Report Card Release License: MIT

Why we have this repo?

mapreduce is part of go-zero, but a few people asked if mapreduce can be used separately. But I recommend you to use go-zero for many more features.

Design ideas

Let's try to put ourselves in the author's shoes and sort out the possible business scenarios for the concurrency tool:

  1. querying product details: supporting concurrent calls to multiple services to combine product attributes, and supporting call errors that can be ended immediately.
  2. automatic recommendation of user card coupons on product details page: support concurrently verifying card coupons, automatically rejecting them if they fail, and returning all of them.

The above is actually processing the input data and finally outputting the cleaned data. There is a very classic asynchronous pattern for data processing: the producer-consumer pattern. So we can abstract the life cycle of data batch processing, which can be roughly divided into three phases.

  1. data production generate
  2. data processing mapper
  3. data aggregation reducer

Data producing is an indispensable stage, data processing and data aggregation are optional stages, data producing and processing support concurrent calls, data aggregation is basically a pure memory operation, so a single concurrent process can do it.

Since different stages of data processing are performed by different goroutines, it is natural to consider the use of channel to achieve communication between goroutines.

How can I terminate the process at any time?

It's simple, just listen to a global end channel or the given context in the goroutine.

A simple example

Calculate the sum of squares, simulating the concurrency.

package main

import (
    "fmt"
    "log"

    "github.com/kevwan/mapreduce"
)

func main() {
    val, err := mapreduce.MapReduce(func(source chan<- interface{}) {
        // generator
        for i := 0; i < 10; i++ {
            source <- i
        }
    }, func(item interface{}, writer mapreduce.Writer, cancel func(error)) {
        // mapper
        i := item.(int)
        writer.Write(i * i)
    }, func(pipe <-chan interface{}, writer mapreduce.Writer, cancel func(error)) {
        // reducer
        var sum int
        for i := range pipe {
            sum += i.(int)
        }
        writer.Write(sum)
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("result:", val)
}

References

go-zero: https://github.com/zeromicro/go-zero

Give a Star!

If you like or are using this project to learn or start your solution, please give it a star. Thanks!

Similar Resources

Spanner - A handy tool for visualising Datadog traces

Spanner - A handy tool for visualising Datadog traces

Spanner A minimal tool for visualising Datadog traces 🔧 Installation You can in

Jan 2, 2022

Prometheus Common Data Exporter can parse JSON, XML, yaml or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Prometheus Common Data Exporter can parse JSON, XML, yaml or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Prometheus Common Data Exporter Prometheus Common Data Exporter 用于将多种来源(如http响应报文、本地文件、TCP响应报文、UDP响应报文)的Json、xml、yaml或其它格式的数据,解析为Prometheus metric数据。

May 18, 2022

Goget will send a http request, and show the request time, status, response, and save response to a file

Goget will send a http request, and show the request time, status, response, and save response to a file

Feb 9, 2022

Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Nov 10, 2022

Squizit is a simple tool, that aim to help you get the grade you want, not the one you have learnt for.

Squizit is a simple tool, that aim to help you get the grade you want, not the one you have learnt for.

Squizit is a simple tool, that aim to help you get the grade you want, not the one you have learnt for. Screenshots First, input PIN Then enjoy! Hoste

Mar 11, 2022

Catalyst is an incident response platform / SOAR (Security Orchestration, Automation and Response) system.

Catalyst is an incident response platform / SOAR (Security Orchestration, Automation and Response) system.

Catalyst Speed up your reactions Website - The Catalyst Handbook (Documentation) - Try online (user: bob, password: bob) Catalyst is an incident respo

Jan 6, 2023

Lightweight CLI tool to programmatically rescale your Hetzner virtual server daily to optimize your budget spending

Lightweight CLI tool to programmatically rescale your Hetzner virtual server daily to optimize your budget spending

Nov 28, 2022

A Telegram bot that asks you a question and evaluate the response you provide.

A Telegram bot that asks you a question and evaluate the response you provide.

PiSquared A Telegram bot that asks you a question and evaluate the response you provide. Thanks to the labse_bert model, the evaluation of the answer

Nov 13, 2022

Run your MapReduce workloads as a single binary on a single machine with multiple CPUs and high memory. Pricing of a lot of small machines vs heavy machines is the same on most cloud providers.

gomap Run your MapReduce workloads as a single binary on a single machine with multiple CPUs and high memory. Pricing of a lot of small machines vs he

Sep 16, 2022

Golang implement of MapReduce

 Golang implement of MapReduce

mapreduce implement in Golang, EE447 Final Project

May 17, 2021

Optimize Windows's network/NIC driver settings for NewTek's NDI(Network-Device-Interface).

windows-ndi-optimizer[WIP] Optimize Windows's network/NIC driver settings for NewTek's NDI(Network-Device-Interface). How it works This is batchfile d

Apr 15, 2022

A Github action to codon optimize sequences.

codon-optimize A Github action to codon optimize sequences. codon-optimize is a Github Action that receives a path for an amino acid fasta file (faa),

Jul 28, 2022

A best practices Go source project with unit-test and integration test, also use skaffold & helm to automate CI & CD at local to optimize development cycle

Dependencies Docker Go 1.17 MySQL 8.0.25 Bootstrap Run chmod +x start.sh if start.sh script does not have privileged to run Run ./start.sh --bootstrap

Apr 4, 2022

Removes unnecessarily saved git objects to optimize the size of the .git directory.

Git Repo Cleaner Optimizes the size of the .git directory by removing all of the files that are unnecessarily-still-saved as part of the git history.

Mar 24, 2022

The MapReduce pattern with Goroutines and channels to count n-grams in a directory of text files

MapReduce Ngram This Golang program implements the MapReduce pattern with Goroutines and channels to count n-grams in a directory of text files. Usage

Dec 16, 2021

An image server which automatically optimize non webp and avif images to webp and avif images

go-imageserver go-imageserver is an image server which automatically optimize no

Apr 18, 2022

Green: a distribute key value system for optimize block chain data

Green: a distribute key value system for optimize block chain data

Introduce Green is a distribute key value system for optimize block chain data A

Jan 6, 2022

Search running process for a given dll/function. Exposes a bufio.Scanner-like interface for walking a process' PEB

Search running process for a given dll/function. Exposes a bufio.Scanner-like interface for walking a process' PEB

Apr 21, 2022

A system written in Golang to help ops team to automate the process of mapping Vault groups to LDAP Groups.

A system written in Golang to help ops team to automate the process of mapping Vault groups to LDAP Groups. This utility automatically adds LDAP Groups' members to the corresponding Vault Groups.

Nov 12, 2021
Comments
  • `mapreduce.Finish`中嵌套使用`mapreduce.MapReduce`会导致`Finish`变成非阻塞操作

    `mapreduce.Finish`中嵌套使用`mapreduce.MapReduce`会导致`Finish`变成非阻塞操作

    嵌套的MapReducereducer中如果不调用writer.Write方法,会产生一个ErrReduceNoOutput错误,Finish 中 worker 返回异常会直接结束 Finish 调用,但是Finish中调用的MapReduceVoid会吞掉ErrReduceNoOutput错误返回一个 nil,从最后结果看是没有异常的成功调用,实际其他的 worker 都还在异步运行

    例如下面这样的调用:

    func main(){
            err := mapreduce.Finish(func() error {
    		return worker1()
    	}, func() error {
    		val, err := mapreduce.MapReduce(func(source chan<- interface{}) {
    			for i := 0;i<10;i++{
    				source <- i
    			}
    		}, func(item interface{}, writer mapreduce.Writer, cancel func(error)) {
    			i := item.(int)
    			writer.Write(i * i)
    		}, func(pipe <-chan interface{}, writer mapreduce.Writer, cancel func(error)) {
    			var cnt int
    			for i := range pipe{
    				cnt += i.(int)
    			}
                             // 这里不调用Write 会导致当前这个 worker 任务返回一个异常
    			// writer.Write(cnt) 
    		})
                     // 收到一个异常 `ErrReduceNoOutput`
    		if err != nil {
    			return err
    		}
    		fmt.Println("result:", val)
    	})
           // 这里的 err 是 nil
           if err != nil {
               fmt.Println(err)
          }
    }
    
A tree like tool help you to explore data structures in your redis server
 A tree like tool help you to explore data structures in your redis server

Redis-view is a tree like tool help you explore data structures in your redis server

Mar 17, 2022
Custom generic HTTP handler providing automatic JSON decoding/encoding of HTTP request/response to your concrete types

gap Custom generic HTTP handler providing automatic JSON decoding/encoding of HTTP request/response to your concrete types. gap.Wrap allows to use the

Aug 28, 2022
Go implementation of the van Emde Boas tree data structure: Priority queue for positive whole numbers in O(log log u) time.

vEB Go implementation of the van Emde Boas tree data structure: Priority queue for positive whole numbers in O(log log u) time. Supports the following

Mar 7, 2022
The simplest sorting algorithm that sorts in quadratic time
The simplest sorting algorithm that sorts in quadratic time

Simplest sort Showcases the simplest sorting algorithm that works in quadratic time. From here. The pseudocode for this algo can be seen below (sorts

Oct 14, 2022
Golang examples of algorithms according to its time complexity.

big-o-notation-go Examples of algorithms and explanation for each Big O Notation category. Some examples are based in this video. If you didn't manage

Sep 1, 2022
The Go library that will drive you to AOP world!

Beyond The Golang library that will drive you to the AOP paradigm world! Check Beyond Documentation What's AOP? In computing, aspect-oriented programm

Dec 6, 2022
☔️ A complete Go cache library that brings you multiple ways of managing your caches
☔️ A complete Go cache library that brings you multiple ways of managing your caches

Gocache Guess what is Gocache? a Go cache library. This is an extendable cache library that brings you a lot of features for caching data. Overview He

Jan 1, 2023
Standardized Malware Analysis Tool

S.M.A.T Standardized Malware Analysis Toolkit Capabilities Unpac.me sample submission download results check if already submitted MWDB query for confi

Oct 25, 2022
Dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool.
Dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool.

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Jan 1, 2023
A Connected Graph Generator tool that construct graphs of some given size

graph graph is a Connected Graph Generator tool that construct graphs of some given size. Notice that it generates all possible connected, undirected

Nov 5, 2021