Open source training courses about distributed databases and distributed systems

Last update: Jan 3, 2023

Welcome to the Talent Plan courses!

Talent Plan is an open source training program initiated by PingCAP. It aims to create or combine open source learning materials for people interested in open source, distributed systems, Rust, Golang, and other infrastructure topics. To that end, it provides a series of courses focused on open source collaboration, Rust programming, and distributed databases and systems.

Note:

Each course is developed independently, so they vary in their presentation and their expectations from course-takers. Please see the individual course documentation for details.

Our Courses

Series 1: Open Source Collaboration

Open source collaboration includes a series of open source learning materials that give open source enthusiasts a basic understanding of what open source software is, how open source software licenses differ, how to participate in open source projects, and what a welcoming open source community looks like. Courses of this series are:

Series 2: Rust Programming

This series includes two courses:

TP 201: Practical Networked Applications in Rust. A series of projects that incrementally develop a single Rust project from the ground up into a high-performance, networked, parallel and asynchronous key/value store. Along the way, various real-world and practical Rust development topics are explored and discussed.
TP 202: Distributed Systems in Rust. Adapted from the MIT 6.824 distributed systems coursework, this course focuses on implementing important distributed algorithms, including the Raft consensus algorithm, and the Percolator distributed transaction protocol.

Series 3: Distributed Database

Two courses are included in this series, which are:

Series 4: Deep Dive into TiDB Ecosystems

TP 401: Deep Dive into TiDB (WIP)
TP 402: Deep Dive into TiKV (WIP)

See Courses for more details.

Contributing to Talent Plan

Contributions of any kind are welcome! Check out the Contributing Guide in this repository for more information on how you can contribute to Talent Plan.

We love our community and take great care to ensure it is fun, safe and rewarding. Please review our Code of Conduct for community expectations and guidelines for reporting concerns.

License

These courses may be freely used and modified for any purpose, under the terms of each course's individual license. See the courses for details.

Owner

PingCAP

The team behind TiDB and TiKV. TiDB is an open source, MySQL-compatible NewSQL HTAP database.

https://github.com/pingcap/talent-plan https://university.pingcap.com/talent-plan/

Comments

Create initial readme for Rust course and outline lesson plan

This PR describes a Rust training course called "Practical Networked Applications in Rust", and lays out a draft lesson plan.

Rendered README.

Rendered lesson plan.

The flow of this course is curious, with topics derived from the needs of the hands-on projects and the hands-on projects designed to teach useful and interesting topics, leaving some important topics not covered (at least yet). I believe the lesson plan proceeds quite differently from typical Rust intro/beginner material. I hope that the documentation conveys the rationale sufficiently.

It is not quite finished, but I think there's enough here to understand what it's about, enough to review and critique.

The structure of this course is inspired by the MIT 6.824 distributed systems course, which we are also adapting to Rust as a follow-on course to this one. I intend that the two courses ultimately share a common structure.

The lesson plan is intended to be an index, with links to the full lessons and projects. The descriptions within the lesson plan of the goals, topics etc are intended as a guide to writing the full lessons and project descriptions. Once those are written we might reduce the descriptive content in the lesson plan so that it's easier for the reader to take in the full lesson/project progression by skimming.

I very much want feedback. Are the goals and non-goals reasonable? Is the project interesting? Does the lesson plan make sense? Are the topics interesting? Within the given lesson plan, and considering the projects being implemented, what other topics should be included? As a new Rust programmer, does the content here excite you to take the course? What changes would make the course more exciting?

I'm also looking for people interested in helping build this course, since there's quite a lot of material to cover. If this looks like something you would like to participate in, please speak up.

Note that there is a file here called notes.md that contains notes I've taken about this course. It includes a list of potential subjects and links to other courses and draft training material. I intend the material to take inspiration, if not content, from https://github.com/ferrous-systems/rust-three-days-course, https://github.com/nrc/talks, and other existing training material. We'll do a survey of existing training material soon, with goals of understanding how this course can provide unique value, and how other courses progress from topic to topic.

I was hoping that this PR would fill out some of the material's structure, like how the slides and text are organized, an example full project description, how the projects' tests work; but I haven't made it that far yet.

cc @siddontang PTAL. I hope that the projects are aligned with your expectations. I suspect the volume of lesson material here is more ambitious than you expected, and is very probably more than can be completed in Q1. We can deprioritize certain lessons for an MVP. Give me all the feedback you've got. We haven't discussed how this course is licensed or who owns it, but I've preemptively put a CC-BY notice at the bottom of the readme so that there's some intent conveyed as to the license.

cc @Hoverbear @breeswish @overvenus @rleungx @Hijiao @huachaohuang @nrc if you have the time I'd be grateful for your feedback.

cc @skade I'd be grateful for any review you can offer. Hoverbear and I have been inspired by your rust-three-days-course and you've got lots of experience in Rust training.

./rust code update and ./dss/raft description update

Practical Networked Applications in Rust

Update: BufReaderWithPos ~~and BufWriterWithPos~~.

Fix: Temp_dir is dropped during the test.

Distributed Systems in Rust

Update: README.md

Fix: wrong code in the comment.

Add tests and an example solution to Rust training Project 1

This PR is based on #7.

The structopt extension is not included yet.

I don't know how to test the CLI in integration tests, so I only wrote tests for KvStore.

The clap get_matches function reads the process arguments (std::env::args_os), which I cannot easily override.

Another way is to use get_matches_from_safe and pass the args in, but that would make the signature of our main function daunting for beginners.

assert_cli or assert_cmd should work too. But then I cannot see the advantage of placing the user interface in the library.

More Rust training content

It's been a long time since I've submitted anything, so here's my current branch.

The main things to note here are that project 1 has more text now, and that I've removed gRPC and tokio from the plan (from discussion with Liu we agreed they were too ambitious for these 5 projects and could come in a later project).

Rendered

Tomorrow I will create an issue outlining the next steps so that others might help.

PTAL @overvenus @nolouch

rust/projects: specify tests

To accomplish #118

Specify which tests should pass in each part.

Before modifying the code in subsequent projects, I want to hear from @sticnarf and @brson. If you think it's OK, I will add commits to change the code in these projects.

I'm unsure about the following points:

In project 2, we need to implement set/rm in part 2, but some tests for set/rm use get, which is only implemented in part 3.

In project 3, I only specify the tests in part 4 and part 6. I wonder if there is a more suitable place for these tests.

Outline Rust project 4

Fixes https://github.com/pingcap/talent-plan/issues/38

I put this together quickly after our discussion @sticnarf. I do have a few concerns:

there is apparently no production-quality concurrent map type we can incorporate :-/

like in the previous project, this abstracts a thing and then compares two implementations. I don't know if it is valuable to do the same type of exercise again. I'd prefer that we find a new angle on this, so I've suggested we do a compile-time abstraction using cargo features, but I'm open to any ideas.

TiDB week 1 homework: single-threaded merge sort already faster than sort.Slice

On my computer, sort.Slice takes about 15s, while a single-threaded merge sort takes about 9s. I then wrote a basic quick sort by hand in Go, and it takes only about 7s in the benchmark (is sort.Slice slower than the hand-written quick sort because of reflection?). Has anybody seen the same phenomenon?

The implementations are as follows (Go version: 1.12.2):

merge sort:

// MergeSort performs the merge sort algorithm.
// Please supplement this function to accomplish the home work.
func MergeSort(src []int64) {
	dst := make([]int64, len(src))
	copy(dst, src)
	mSort(dst, src, 0, len(src))
}

// mSort sorts the range [low, high) by merging from src into dest.
func mSort(src []int64, dest []int64, low int, high int) {
	if high-low <= 1 {
		return
	}

	mid := (low + high) >> 1
	mSort(dest, src, low, mid)
	mSort(dest, src, mid, high)

	for i, p, q := low, low, mid; i < high; i++ {
		if q >= high || (p < mid && src[p] < src[q]) {
			dest[i] = src[p]
			p++
		} else {
			dest[i] = src[q]
			q++
		}
	}
}

My hand-written quick sort:

// quickSort sorts arr in place.
func quickSort(arr []int64) {
	qSort(arr, 0, len(arr)-1)
}

// qSort sorts the inclusive range arr[low..high].
func qSort(arr []int64, low int, high int) {
	if low < high {
		pivot := partition(arr, low, high)
		qSort(arr, low, pivot-1)
		qSort(arr, pivot+1, high)
	}
}

// partition uses arr[low] as the pivot, moves smaller elements to its left
// and larger ones to its right, and returns the pivot's final index.
func partition(arr []int64, low int, high int) int {
	pivotKey := arr[low]
	for low < high {
		for low < high && arr[high] >= pivotKey {
			high--
		}
		arr[low] = arr[high]
		for low < high && arr[low] <= pivotKey {
			low++
		}
		arr[high] = arr[low]
	}
	arr[low] = pivotKey

	return low
}

benchmark:

func BenchmarkMergeSort(b *testing.B) {
	numElements := 16 << 20
	src := make([]int64, numElements)
	original := make([]int64, numElements)
	prepare(original)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		copy(src, original)
		b.StartTimer()
		MergeSort(src)
	}
}


func BenchmarkNormalSort(b *testing.B) {
	// 16M integers
	numElements := 16 << 20
	src := make([]int64, numElements)
	original := make([]int64, numElements)
	prepare(original)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		copy(src, original)
		b.StartTimer()
		sort.Slice(src, func(i, j int) bool { return src[i] < src[j] })
	}
}

func BenchmarkQuickSort(b *testing.B) {
	// 16M integers
	numElements := 16 << 20
	src := make([]int64, numElements)
	original := make([]int64, numElements)
	prepare(original)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		copy(src, original)
		b.StartTimer()
		quickSort(src)
	}
}

The benchmark results are as follows:

goos: linux
goarch: amd64
pkg: pingcap/talentplan/tidb/mergesort
BenchmarkMergeSort-12                 1         9503371329 ns/op
BenchmarkNormalSort-12                1       15261800118 ns/op
BenchmarkQuickSort-12                  1         7552893791 ns/op
PASS
ok      pingcap/talentplan/tidb/mergesort       37.860s

We can see that merge sort is faster than sort.Slice, and the hand-written quick sort is faster than merge sort.

I also benchmarked sort.Sort and found it slower than sort.Slice.

If so, would it be more meaningful to compare the parallel merge sort against the hand-written quick sort instead of sort.Slice?

Outline project 3

This builds on https://github.com/pingcap/talent-plan/pull/36.

This project has been in flux. I made a bunch of calls here on what it is about, and I think it has become pretty substantial. Anybody who completes this course is going to be on their way to writing a real database.

Closes #37.

rust: remove example solution

Recently, I found that some students turn to the example solution directly before they finish their own project. So I think we need to remove the example solution from the repo.

Note that this does not mean we don't trust students. Probably, they just don't read the lesson plan carefully and click into the solution by mistake.

There is another important reason to remove it. The distributed systems course in this repo adopts the approach of filling in a provided framework, and it is better to stay consistent across courses. (One student took the dss course first. He then assumed the Rust course worked the same way, so he cloned the repo and modified the existing code to finish the course.)

kvraft: generic_test with unreliable and crash args set to true

Hi, I am working on the kvraft lab, and my code passes or fails some tests at random. I spent a long time debugging it (using println! 🙂), which led me to the tests and the RPC framework. I am not sure whether this is a bug in my implementation or in the tests, but here is what I found. Within the generic_test function, if the crash argument is set to true, the tester isolates every server's network, so the servers can reach neither the other servers nor the clients, and the tester also takes a clone of every server's persister. Here is the problem: the RPC framework only checks the network connection and sets up the network environment (to simulate a bad network) when a request comes in and is handed to the poller to simulate the network. It does not check the network connection again when the simulation finishes; it processes the reply based on the network environment that was set up when the request came in. Coming back to the test: when the tester calls shutdown_server on a server, the situation mentioned in the definition of shutdown_server may still happen:

// disable client connections to the server.
// it's important to do this before creating
// the new Persister in saved[i], to avoid
// the possibility of the server returning a
// positive reply to an Append but persisting
// the result in the superseded Persister.
self.net.delete_server(&format!("{}", i));

Specifically, a client sends a PutAppend request to the leader, the leader sends the append_entries request to a majority of the followers, and then the tester calls shutdown_server on all servers and creates a new Persister based on the Persister of each server. But because the RPC framework does not check the network on reply, the majority of followers can still respond to the leader, the leader can still return a positive reply to the client, and the leader and followers persist the result in the superseded Persister. This happens only when the tester calls shutdown_server between these two events:

the leader sends the append_entries request to a majority of the followers

the RPC framework finishes simulating the network transfer and dispatches the request to the followers for processing

Usually this interval is very small, but when the unreliable argument to generic_test is set to true there is a short_delay in this interval, which increases the probability. Also, when lease read is implemented, all log entries are PutAppend commands, which increases the probability further. In my implementation of kvraft, the tests pass tens of times for every failure, mainly on test_persist_concurrent_unreliable_3a and test_snapshot_unreliable_recover_3b, with errors like: panicked at '1 missing element "x 1 31 y" in Append result ... That is what I found. I am not really sure whether it is just a bug in my implementation. As I am new to Rust, I may simply have some naive misunderstanding of the tests and the RPC framework, and I would really appreciate help with it. Thanks in advance.
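
To make the suspected race concrete, here is a minimal sketch in Go of a simulated network that checks connectivity both when a request is accepted and again before the reply is delivered. It is only an illustration of the behavior described above, not the course's RPC framework: the Network type, its methods, and ErrDisconnected are all hypothetical names.

```go
package main

import (
	"errors"
	"sync"
)

// ErrDisconnected is returned when the target server is not reachable.
var ErrDisconnected = errors.New("server disconnected")

// Network is a hypothetical in-memory stand-in for a simulated RPC network.
// It is not the framework used by the labs.
type Network struct {
	mu      sync.Mutex
	enabled map[string]bool
}

// NewNetwork returns a network with no reachable servers.
func NewNetwork() *Network {
	return &Network{enabled: make(map[string]bool)}
}

// SetEnabled connects or disconnects a server, the way a tester might
// when it shuts a server down.
func (n *Network) SetEnabled(server string, on bool) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.enabled[server] = on
}

func (n *Network) isEnabled(server string) bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	return n.enabled[server]
}

// Call runs handler on behalf of server. Connectivity is checked before
// the request is dispatched and again before the reply is handed back,
// so a reply from a server disconnected mid-request is dropped.
func (n *Network) Call(server string, handler func() string) (string, error) {
	if !n.isEnabled(server) {
		return "", ErrDisconnected
	}
	reply := handler()
	// Second check: the server may have been disconnected in flight.
	if !n.isEnabled(server) {
		return "", ErrDisconnected
	}
	return reply, nil
}
```

With only the first check, a handler that is still running when the tester disconnects its server would have its positive reply delivered, which is the scenario the shutdown_server comment warns about.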

KVRaft: RPC Seems to Block the Whole RPC Network

Hi, I came across a problem when implementing KVRaft 3A.

I can pass test_basic_3a, where there is only one client.

But in test_concurrent_3a, a weird blocking problem appears. I have simplified the code to make the problem more obvious. Here is what I did:

Two clients call the put_append RPC on the same KVServer (leader or follower) at the same time;

KVServer's put_append handler does one thing and only one thing: thread::sleep(5 Seconds);

So the KVServer is sleeping, but in my opinion its Raft module should keep working as normal (i.e. keep sending heartbeats to maintain its authority). Instead, after a Raft election timeout (measured from the time the KVServer receives these two RPCs), all the other Raft modules become candidates and request votes from each other, and as far as I observed, no one receives any RPC from anyone.

The above is a simplified version. In my real implementation (which passed the single-client test_basic_3a), the problem still exists. This is how it happens:

Like before, two clients call the put_append RPC on the KVServer (leader) at the same time;

Node is locked, so I can focus on dealing with one RPC at a time.

KVServer's put_append handler calls Raft.start(&message) once, which sends AppendEntries RPCs to peers;

If what I suspect is correct, the two client RPC calls are blocking the whole RPC network, so no AppendEntries RPC is received by the peers. The KVServer (leader) therefore polls apply_ch but never gets a response, and waits forever.

Then, after a Raft election timeout, this leader loses its leadership.

The cycle repeats.

I hope someone can help me with this problem. Is this a common problem, or has anyone completed the KVRaft task without encountering the blocking problem?

Below is part of my simplified code, in case someone is interested in trying it themselves.

In client.rs

impl Clerk {
    fn put_append(&self, op: Op) {
        let reply = self.servers[leader_i].put_append(&Op_args)
            .map_err(Error::Rpc).wait();
    }
}

In server.rs

impl Node {
    pub fn new(kv: KvServer) -> Node {
        Node {
            server: Arc::new(Mutex::new(kv)),
        }
    }
}

impl KvService for Node {
    fn get(&self, arg: GetRequest) -> RpcFuture<GetReply> {
        unimplemented!()
    }

    fn put_append(&self, arg: PutAppendRequest) -> RpcFuture<PutAppendReply> {
        thread::sleep(5000 * MILLIS);
        unimplemented!()
    }
}

I think I can work around this problem by letting the KVServer refuse the second RPC from a client, so that there is only one unanswered RPC at a time, but that can't be the correct way, right?
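
If the suspicion above is right and handlers share the threads that also deliver other RPCs (an assumption, not something confirmed in this thread), one alternative to refusing requests is a handler that never blocks: it registers the request and returns at once, and a separate apply loop completes the reply later. A minimal sketch in Go, where Server, Submit, and Apply are hypothetical names and not part of the course framework:

```go
package main

import "sync"

// Server is a hypothetical key/value front end, used only to illustrate
// a handler that does not block while waiting for consensus.
type Server struct {
	mu      sync.Mutex
	log     []string
	pending map[int]chan string // log index -> waiter for that entry
}

// NewServer returns a server with an empty log and no pending requests.
func NewServer() *Server {
	return &Server{pending: make(map[int]chan string)}
}

// Submit appends cmd to the log and returns immediately with the entry's
// index (starting at 1) and a channel that will receive the result.
// It never sleeps or waits, so the calling thread stays free for other
// RPCs such as heartbeats.
func (s *Server) Submit(cmd string) (int, <-chan string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.log = append(s.log, cmd)
	index := len(s.log)
	ch := make(chan string, 1) // buffered so Apply never blocks
	s.pending[index] = ch
	return index, ch
}

// Apply is meant to be called by an (assumed) apply loop once the entry
// at index is committed. It completes the matching pending request and
// reports whether one was waiting.
func (s *Server) Apply(index int, result string) bool {
	s.mu.Lock()
	ch, ok := s.pending[index]
	delete(s.pending, index)
	s.mu.Unlock()
	if ok {
		ch <- result
	}
	return ok
}
```

In the Rust code above, the analogue would be for put_append to return a future that resolves when the apply channel reports the entry, rather than sleeping or polling inside the handler.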

courses/tp102: Update outdated links for GitHub learning material

What problem does this PR solve?

Brief description of the problem: The current links to the learning materials for the first 4 chapters are invalid (404 error).

What is changed and how it works?

Update the links to the new URLs of the repositories under GitHub Skills.

Check List

Tests

No code

Added TalentPlan GitHub course for suggestion

Generally maintained the same format.


[Update] bump up version of rustc and dependencies in PNA Rust

What problem does this PR solve?

Issue number: close #434

Brief description of the problem: the CLI tutorial is out of date and does not work well with the current clap ecosystem.

What is changed and how it works?

I added some help documentation for the bb1 exercise and bumped the rustc version of project 1. I also think the exercise no longer matches the current Rust ecosystem, but I don't know which tutorial could replace it. Maybe this one is not a bad choice: https://medium.com/@ukpaiugochi0/building-a-cli-from-scratch-with-clapv3-fb9dc5938c82

Bump thread_local from 1.0.1 to 1.1.4 in /courses/dss

Bumps thread_local from 1.0.1 to 1.1.4.

Commits

4a54e57 Bump version to 1.1.4

ebf8b45 Merge pull request #34 from ibraheemdev/patch-1

3d69afa Fix memory ordering in RawIter::next

c7d8dcd Bump version to 1.1.3

5e8bbf2 Merge pull request #30 from Marwes/fix_drop

a44b836 fix: Drop the value in the ThreadLocal on drop

322cf34 Bump version to 1.1.2

dca4007 Merge pull request #29 from Kestrer/raw-iter

33ad405 Add #[inline] to non-generic functions

810c043 Implement iterator logic in RawIter

Additional commits viewable in compare view

