Modern Job Scheduler

AJ Bahnken

Last update: Dec 21, 2022

Comments: 16

Kala

Kala is a simple, modern, and performant job scheduler written in Go. Features:

Single binary
No dependencies
JSON over HTTP API
Job Stats
Configurable Retries
Scheduling with ISO 8601 Date and Interval notation
Dependent Jobs
Persistent with several database drivers
Web UI

Note that it is not battle-tested. Use at your own risk.

Kala was inspired by the desire for a simpler Chronos (developed by Airbnb). Kala is Chronos for the rest of us.

If you need fault tolerance, distributed features, or massive scale, then I recommend checking out Chronos. Kala is designed to be the Chronos for start-ups.

Installing Kala

Kala uses Go Modules

Get Kala
```
go get github.com/ajvb/kala
```
Run Kala
```
kala serve
```

Getting Started

Once you have installed Kala on the machine you would like to use, follow the steps below to start using it.

To run Kala as a server:

$ kala serve
INFO[0000] Preparing cache
INFO[0000] Starting server on port :8000

$ kala serve -p 2222
INFO[0000] Preparing cache
INFO[0000] Starting server on port :2222

Kala uses BoltDB by default for the job database. To select it explicitly, use the jobdb and boltpath params:

kala serve --jobdb=boltdb --boltpath=/path/to/dir

Use Redis by setting the jobdb and jobdb-address params:

kala serve --jobdb=redis --jobdb-address=127.0.0.1:6379

Use Consul by setting the jobdb and jobdb-address params:

kala serve --jobdb=consul --jobdb-address=127.0.0.1:8500

Use Mongo by setting the jobdb, jobdb-address, jobdb-username, and jobdb-password params:

kala serve --jobdb=mongo --jobdb-address=server1.example.com,server2.example.com --jobdb-username=admin --jobdb-password=password

Use Postgres by setting the jobdb, jobdb-address, jobdb-username, and jobdb-password params:

kala serve --jobdb=postgres --jobdb-address=server1.example.com/kala --jobdb-username=admin --jobdb-password=password

Use MariaDB or MySQL by setting the jobdb, jobdb-address, jobdb-username, and jobdb-password params, plus the jobdb-tls-capath, jobdbTlsCertPath, jobdb-tls-keypath, and jobdb-tls-servername params for TLS:

kala serve --jobdb=mariadb --jobdb-address=(server1.example.com)/kala --jobdb-username=admin --jobdb-password=password

kala serve --jobdb=mysql --jobdb-address="tcp(server1.example.com:3306)/kala?tls=custom" --jobdb-username=admin --jobdb-password=password --jobdb-tls-capath=/path/to/server-ca.pem --jobdbTlsCertPath=/path/to/client-cert.pem --jobdb-tls-keypath=/path/to/client-key.pem --jobdb-tls-servername=server1.example.com

Kala runs on 127.0.0.1:8000 by default. You can easily test it out by curling the metrics path.

$ curl http://127.0.0.1:8000/api/v1/stats/
{"Stats":{"ActiveJobs":2,"DisabledJobs":0,"Jobs":2,"ErrorCount":0,"SuccessCount":0,"NextRunAt":"2015-06-04T19:25:16.82873873-07:00","LastAttemptedRun":"0001-01-01T00:00:00Z","CreatedAt":"2015-06-03T19:58:21.433668791-07:00"}}

Once it's up and running, you can use curl or the official Go client to interact with Kala. Also check out the examples directory.

Examples of Usage

There are more examples in the examples directory within this repo. Currently it's pretty messy. Feel free to submit a new example if you have one.

Deployment

Supervisord

After installing supervisord, open its config file (usually /etc/supervisor/supervisord.conf) and add something like:

[program:kala]
command=kala serve
autorestart=true
stdout_logfile=/var/log/kala.stdout.log
stderr_logfile=/var/log/kala.stderr.log

Docker

If you have Docker installed, you can build the Dockerfile in this directory with `docker build -t kala .` and run it as a daemon with `docker run -it -d -p 8000:8000 kala`.

API v1 Docs

All routes have a prefix of /api/v1

Client Libraries

Official:

Go - Docs: http://godoc.org/github.com/ajvb/kala/client
```
go get github.com/ajvb/kala/client
```

Contrib:

Node.js
```
npm install kala-node
```

Python

pip install git+https://github.com/dmajere/kala-python.git

Job Data Struct

Docs can be found here

Things to Note

If schedule is omitted, the job will run immediately.

Job JSON Example

{
        "name":"test_job",
        "id":"93b65499-b211-49ce-57e0-19e735cc5abd",
        "command":"bash /home/ajvb/gocode/src/github.com/ajvb/kala/examples/example-kala-commands/example-command.sh",
        "owner":"",
        "disabled":false,
        "dependent_jobs":null,
        "parent_jobs":null,
        "schedule":"R2/2015-06-04T19:25:16.828696-07:00/PT10S",
        "retries":0,
        "epsilon":"PT5S",
        "success_count":0,
        "last_success":"0001-01-01T00:00:00Z",
        "error_count":0,
        "last_error":"0001-01-01T00:00:00Z",
        "last_attempted_run":"0001-01-01T00:00:00Z",
        "next_run_at":"2015-06-04T19:25:16.828794572-07:00"
}

Breakdown of schedule string. (ISO 8601 Notation)

Example schedule string:

R2/2017-06-04T19:25:16.828696-07:00/PT10S

This string can be split into three parts:

Number of times to repeat/Start Datetime/Interval Between Runs

Number of times to repeat

This is designated with a number, prefixed with an R. Leave out the number if it should repeat forever.

Examples:

R - Will repeat forever
R1 - Will repeat once
R231 - Will repeat 231 times.
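As a rough illustration of the first two steps, the sketch below splits a schedule string into its three parts and reads the repeat count. It is not Kala's own parser: `splitSchedule`, `parseRepeat`, and the `RepeatForever` sentinel are hypothetical names, and the count is returned exactly as written after the R.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RepeatForever is a sentinel chosen for this sketch to represent a bare "R".
const RepeatForever = -1

// splitSchedule splits "R2/<start>/<interval>" into its three parts.
func splitSchedule(s string) (repeat, start, interval string, err error) {
	parts := strings.Split(s, "/")
	if len(parts) != 3 {
		return "", "", "", fmt.Errorf("expected 3 parts separated by '/', got %d", len(parts))
	}
	return parts[0], parts[1], parts[2], nil
}

// parseRepeat returns the number that follows "R", or RepeatForever for "R" alone.
func parseRepeat(r string) (int, error) {
	if !strings.HasPrefix(r, "R") {
		return 0, fmt.Errorf("repeat part %q must start with R", r)
	}
	if r == "R" {
		return RepeatForever, nil
	}
	n, err := strconv.Atoi(r[1:])
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid repeat count in %q", r)
	}
	return n, nil
}

func main() {
	repeat, start, interval, err := splitSchedule("R2/2017-06-04T19:25:16.828696-07:00/PT10S")
	if err != nil {
		fmt.Println("bad schedule:", err)
		return
	}
	n, err := parseRepeat(repeat)
	if err != nil {
		fmt.Println("bad repeat:", err)
		return
	}
	fmt.Println("repeat:", n, "start:", start, "interval:", interval)
}
```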

Start Datetime

This is the datetime for the first time the job should run.

Kala will return an error if the start datetime has already passed.

Examples:

2017-06-04T19:25:16
2017-06-04T19:25:16.828696
2017-06-04T19:25:16.828696-07:00
2017-06-04T19:25:16-07:00

To Note: It is recommended to include a timezone within your schedule parameter.

Interval Between Runs

This is defined by the ISO8601 Interval Notation.

It starts with a P, then you can specify years, months, or days, then a T, followed by hours, minutes, and seconds.

Let's break down a long interval: P1Y2M10DT2H30M15S

P - Starts the notation
1Y - One year
2M - Two months
10D - Ten days
T - Starts the time section
2H - Two hours
30M - Thirty minutes
15S - Fifteen seconds

Now, there is one alternative. You can optionally use just weeks. The week designator cannot be combined with any other unit. For example, an interval of every two weeks is P2W.

Examples:

P1DT1M - Interval of one day and one minute
P1W - Interval of one week
PT1H - Interval of one hour.
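To make the notation concrete, here is a small sketch that breaks an interval string into its components with a regular expression. It is not the parser Kala uses: the `Interval` type and `parseInterval` function are hypothetical, and only whole numbers are handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Interval holds the components of an ISO 8601 interval (hypothetical type).
type Interval struct {
	Years, Months, Weeks, Days, Hours, Minutes, Seconds int
}

// Either weeks alone, or date units followed by an optional T and time units.
var intervalRE = regexp.MustCompile(
	`^P(?:(\d+)W|(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?)$`)

// parseInterval converts strings such as "P1Y2M10DT2H30M15S" or "P2W".
func parseInterval(s string) (Interval, error) {
	m := intervalRE.FindStringSubmatch(s)
	if m == nil {
		return Interval{}, fmt.Errorf("invalid interval %q", s)
	}
	var fields [7]int
	empty := true
	for i, text := range m[1:] {
		if text == "" {
			continue
		}
		n, err := strconv.Atoi(text)
		if err != nil {
			return Interval{}, fmt.Errorf("invalid number in %q: %v", s, err)
		}
		fields[i] = n
		empty = false
	}
	if empty {
		return Interval{}, fmt.Errorf("interval %q has no components", s)
	}
	return Interval{
		Weeks: fields[0], Years: fields[1], Months: fields[2], Days: fields[3],
		Hours: fields[4], Minutes: fields[5], Seconds: fields[6],
	}, nil
}

func main() {
	for _, s := range []string{"P1Y2M10DT2H30M15S", "P1DT1M", "P1W", "PT1H", "P1W2D"} {
		iv, err := parseInterval(s)
		if err != nil {
			fmt.Println(s, "->", err)
			continue
		}
		fmt.Printf("%s -> %+v\n", s, iv)
	}
}
```

The position relative to T is what distinguishes months from minutes: P2M is two months, PT2M is two minutes.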

More Information on ISO8601

Wikipedia's Article

Overview of routes

| Task | Method | Route |
| --- | --- | --- |
| Creating a Job | POST | /api/v1/job/ |
| Getting a list of all Jobs | GET | /api/v1/job/ |
| Getting a Job | GET | /api/v1/job/{id}/ |
| Deleting a Job | DELETE | /api/v1/job/{id}/ |
| Deleting all Jobs | DELETE | /api/v1/job/all/ |
| Getting metrics about a certain Job | GET | /api/v1/job/stats/{id}/ |
| Starting a Job manually | POST | /api/v1/job/start/{id}/ |
| Disabling a Job | POST | /api/v1/job/disable/{id}/ |
| Enabling a Job | POST | /api/v1/job/enable/{id}/ |
| Getting app-level metrics | GET | /api/v1/stats/ |

/job

This route accepts both a GET and a POST. Performing a GET request will return a list of all currently running jobs. Performing a POST (with the correct JSON) will create a new Job.

Note: When creating a Job, the only required fields are Name and Command. However, if you omit the Schedule field, the job will be run immediately.

Example:

$ curl http://127.0.0.1:8000/api/v1/job/
{"jobs":{}}
$ curl http://127.0.0.1:8000/api/v1/job/ -d '{"epsilon": "PT5S", "command": "bash /home/ajvb/gocode/src/github.com/ajvb/kala/examples/example-kala-commands/example-command.sh", "name": "test_job", "schedule": "R2/2017-06-04T19:25:16.828696-07:00/PT10S"}'
{"id":"93b65499-b211-49ce-57e0-19e735cc5abd"}
$ curl http://127.0.0.1:8000/api/v1/job/
{
    "jobs":{
        "93b65499-b211-49ce-57e0-19e735cc5abd":{
            "name":"test_job",
            "id":"93b65499-b211-49ce-57e0-19e735cc5abd",
            "command":"bash /home/ajvb/gocode/src/github.com/ajvb/kala/examples/example-kala-commands/example-command.sh",
            "owner":"",
            "disabled":false,
            "dependent_jobs":null,
            "parent_jobs":null,
            "schedule":"R2/2017-06-04T19:25:16.828696-07:00/PT10S",
            "retries":0,
            "epsilon":"PT5S",
            "success_count":0,
            "last_success":"0001-01-01T00:00:00Z",
            "error_count":0,
            "last_error":"0001-01-01T00:00:00Z",
            "last_attempted_run":"0001-01-01T00:00:00Z",
            "next_run_at":"2017-06-04T19:25:16.828794572-07:00"
        }
    }
}

/job/{id}

This route accepts both a GET and a DELETE, and is based off of the id of the Job. Performing a GET request will return a full JSON object describing the Job. Performing a DELETE will delete the Job.

Example:

$ curl http://127.0.0.1:8000/api/v1/job/93b65499-b211-49ce-57e0-19e735cc5abd/
{"job":{"name":"test_job","id":"93b65499-b211-49ce-57e0-19e735cc5abd","command":"bash /home/ajvb/gocode/src/github.com/ajvb/kala/examples/example-kala-commands/example-command.sh","owner":"","disabled":false,"dependent_jobs":null,"parent_jobs":null,"schedule":"R2/2017-06-04T19:25:16.828696-07:00/PT10S","retries":0,"epsilon":"PT5S","success_count":0,"last_success":"0001-01-01T00:00:00Z","error_count":0,"last_error":"0001-01-01T00:00:00Z","last_attempted_run":"0001-01-01T00:00:00Z","next_run_at":"2017-06-04T19:25:16.828737931-07:00"}}
$ curl http://127.0.0.1:8000/api/v1/job/93b65499-b211-49ce-57e0-19e735cc5abd/ -X DELETE
$ curl http://127.0.0.1:8000/api/v1/job/93b65499-b211-49ce-57e0-19e735cc5abd/

/job/stats/{id}

Example:

$ curl http://127.0.0.1:8000/api/v1/job/stats/5d5be920-c716-4c99-60e1-055cad95b40f/
{"job_stats":[{"JobId":"5d5be920-c716-4c99-60e1-055cad95b40f","RanAt":"2017-06-03T20:01:53.232919459-07:00","NumberOfRetries":0,"Success":true,"ExecutionDuration":4529133}]}

/job/start/{id}

Example:

$ curl http://127.0.0.1:8000/api/v1/job/start/5d5be920-c716-4c99-60e1-055cad95b40f/ -X POST

/job/disable/{id}

Example:

$ curl http://127.0.0.1:8000/api/v1/job/disable/5d5be920-c716-4c99-60e1-055cad95b40f/ -X POST

/job/enable/{id}

Example:

$ curl http://127.0.0.1:8000/api/v1/job/enable/5d5be920-c716-4c99-60e1-055cad95b40f/ -X POST
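The per-job routes differ only in HTTP method and path, so a client can build them from a small lookup table. The sketch below does that with the routes listed in the overview; `newJobRequest` and the action names are hypothetical, and nothing is sent over the network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// route pairs an HTTP method with a path template, following the route overview.
type route struct {
	method string
	format string
}

var jobRoutes = map[string]route{
	"get":     {http.MethodGet, "/api/v1/job/%s/"},
	"delete":  {http.MethodDelete, "/api/v1/job/%s/"},
	"stats":   {http.MethodGet, "/api/v1/job/stats/%s/"},
	"start":   {http.MethodPost, "/api/v1/job/start/%s/"},
	"disable": {http.MethodPost, "/api/v1/job/disable/%s/"},
	"enable":  {http.MethodPost, "/api/v1/job/enable/%s/"},
}

// newJobRequest builds the request for one action on one job id.
func newJobRequest(baseURL, action, id string) (*http.Request, error) {
	r, ok := jobRoutes[action]
	if !ok {
		return nil, fmt.Errorf("unknown action %q", action)
	}
	return http.NewRequest(r.method, baseURL+fmt.Sprintf(r.format, url.PathEscape(id)), nil)
}

func main() {
	id := "5d5be920-c716-4c99-60e1-055cad95b40f"
	for _, action := range []string{"start", "disable", "enable", "stats", "delete"} {
		req, err := newJobRequest("http://127.0.0.1:8000", action, id)
		if err != nil {
			fmt.Println(err)
			continue
		}
		// Pass req to http.DefaultClient.Do to execute it against a running server.
		fmt.Println(req.Method, req.URL)
	}
}
```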

/stats

Example:

$ curl http://127.0.0.1:8000/api/v1/stats/
{"Stats":{"ActiveJobs":2,"DisabledJobs":0,"Jobs":2,"ErrorCount":0,"SuccessCount":0,"NextRunAt":"2017-06-04T19:25:16.82873873-07:00","LastAttemptedRun":"0001-01-01T00:00:00Z","CreatedAt":"2017-06-03T19:58:21.433668791-07:00"}}

Debugging Jobs

Kala has a command called run, which immediately runs a command the way Kala would run it live and then reports whether it succeeded. This allows for easier and quicker debugging of commands.

$ kala run "ruby /home/user/ruby/my_ruby_script.rb"
Command Succeeded!
$ kala run "ruby /home/user/other_dir/broken_script.rb"
FATA[0000] Command Failed with err: exit status 1

Dependent Jobs

How to add a dependent job

Check out this example for how to add dependent jobs within a python script.

Notes on Dependent Jobs

Dependent jobs follow a First In, First Out rule.
A child always waits until its parent job finishes before it runs.
A child will not run if its parent job does not.
If a child job is disabled, its parent job will still run, but the child will not.
If a child job is deleted, its parent job will stay around.
If a parent job is deleted, its child jobs are deleted as well, unless they have another parent.

Original Contributors and Contact

Original Author and Core Maintainer:

AJ Bahnken / @ajvbahnken / [email protected]

Original Reviewers:

Sam Dolan / @samdolan
Steve Phillips / @elimisteve

Owner

AJ Bahnken

Security @mozilla / @mozilla-services

https://github.com/ajvb/kala

Comments

Prevent deleted job to run

This PR solves #237.

In the original code, the job runs first and only then is it checked for existence in the cache. With this change, nothing is done if the job has been deleted: it is neither run nor rescheduled.

Remote support

Added support for jobType, as discussed in #73, with a remote job type.

basically it adds the following support:

if user passed job_type: 1 in the create job json, it allows her/him to specify remote_properties:

url, allows to configure to what url to send the request

method, specifying the method to use

body, allows adding a body to the http request

headers, a list of headers as JSON objects ({"key": "k", "value": "v"})

timeout, after how long to declare the request as failure if there is no response

expected_response_codes, a list of ints to specify which response codes are declared as success

all params have default values and all except for url are optional
Prevent deleted job to run (with test)

Following the comments in https://github.com/ajvb/kala/pull/245, this PR cherry-picks the code and adds a test. Now that https://github.com/ajvb/kala/pull/249 is merged, the lint should pass.

Feature/storage consul

Added a storage option for Consul. It probably needs more thorough testing around exceptions, and I had a couple of issues with godeps dependencies. Please let me know what additional changes need to be done! Thanks

config file implementation

Switched from codegangsta/cli to spf13/cobra and spf13/viper.

Removed the deprecated Godep/_workspaces folder in favor of the vendor/ folder.

Added a gulpfile for a faster Go development environment.

cannot work on windows && raspberry pi 3b

I ran Kala with this command:

kala run --port="8091"

then created a new job with:

$ curl -H "'Content-Type':'application/json'" -d '{"schedule":"R/2020-01-31T23:05:35/PT5M","name":"start_ebookdownloader_job","command":"/home/pi/gowork/src/github.com/sndnvaps/ebookdownloader/ebookdownloader_cli --bookid=91_91911 --txt --mobi"}' -X POST http://192.168.13.100:8091/api/v1/job/

and got this crash:

ERRO[0454] Error occured when unmarshalling data: invalid character 's' looking for beginning of object key string 
[negroni] PANIC: runtime error: invalid memory address or nil pointer dereference
goroutine 100 [running]:
github.com/codegangsta/negroni.(*Recovery).ServeHTTP.func1(0x6632f048, 0x24cf9c0, 0x24d6460)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/recovery.go:34 +0x9c
panic(0x4c22a0, 0x8e5480)
        /usr/local/go/src/runtime/panic.go:679 +0x194
runtime/internal/atomic.goXadd64(0x24d8654, 0x1, 0x0, 0x24c6d01, 0xe5)
        /usr/local/go/src/runtime/internal/atomic/atomic_arm.go:103 +0x1c
github.com/cornelk/hashmap.(*List).insertAt(0x24d8650, 0x24cfc80, 0x0, 0x0, 0x0)
        /home/pi/gowork/pkg/mod/github.com/cornelk/[email protected]/list.go:129 +0x7c
github.com/cornelk/hashmap.(*List).AddOrUpdate(0x24d8650, 0x24cfc80, 0x0, 0x24c6d60)
        /home/pi/gowork/pkg/mod/github.com/cornelk/[email protected]/list.go:65 +0x118
github.com/cornelk/hashmap.(*HashMap).insertListElement(0x24d8660, 0x24cfc80, 0xa5072301, 0x73aec6a8)
        /home/pi/gowork/pkg/mod/github.com/cornelk/[email protected]/hashmap.go:152 +0x64
github.com/cornelk/hashmap.(*HashMap).Set(0x24d8660, 0x4a60c0, 0x24d2690, 0x257a580)
        /home/pi/gowork/pkg/mod/github.com/cornelk/[email protected]/hashmap.go:131 +0xa4
github.com/ajvb/kala/job.(*LockFreeJobCache).Set(0x2538140, 0x257a580, 0x24, 0x661800e0)
        /home/pi/gowork/src/github.com/ajvb/kala/job/cache.go:258 +0x50
github.com/ajvb/kala/job.(*Job).Init(0x257a580, 0x5e4dc0, 0x2538140, 0x0, 0x0)
        /home/pi/gowork/src/github.com/ajvb/kala/job/job.go:192 +0x184
github.com/ajvb/kala/api.HandleAddJob.func1(0x6632f048, 0x24cf9c0, 0x25b6000)
        /home/pi/gowork/src/github.com/ajvb/kala/api/api.go:139 +0xa4
net/http.HandlerFunc.ServeHTTP(0x24ce300, 0x6632f048, 0x24cf9c0, 0x25b6000)
        /usr/local/go/src/net/http/server.go:2007 +0x34
github.com/gorilla/mux.(*Router).ServeHTTP(0x24d0150, 0x6632f048, 0x24cf9c0, 0x25b6000)
        /home/pi/gowork/pkg/mod/github.com/gorilla/[email protected]/mux.go:100 +0x23c
github.com/codegangsta/negroni.Wrap.func1(0x6632f048, 0x24cf9c0, 0x25b6000, 0x24d6550)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:41 +0x3c
github.com/codegangsta/negroni.HandlerFunc.ServeHTTP(0x24d64b0, 0x6632f048, 0x24cf9c0, 0x25b6000, 0x24d6550)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:24 +0x3c
github.com/codegangsta/negroni.middleware.ServeHTTP(0x5e04f8, 0x24d64b0, 0x24d64e0, 0x6632f048, 0x24cf9c0, 0x25b6000)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:33 +0x84
github.com/ajvb/kala/api/middleware.(*Logger).ServeHTTP(0x255db00, 0x6632f048, 0x24cf9c0, 0x25b6000, 0x24d6540)
        /home/pi/gowork/src/github.com/ajvb/kala/api/middleware/logger.go:21 +0xe0
github.com/codegangsta/negroni.middleware.ServeHTTP(0x5df958, 0x255db00, 0x24d64d0, 0x6632f048, 0x24cf9c0, 0x25b6000)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:33 +0x84
github.com/codegangsta/negroni.(*Recovery).ServeHTTP(0x24d6460, 0x6632f048, 0x24cf9c0, 0x25b6000, 0x24d6530)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/recovery.go:45 +0x70
github.com/codegangsta/negroni.middleware.ServeHTTP(0x5df988, 0x24d6460, 0x24d64c0, 0x6632f048, 0x24cf9c0, 0x25b6000)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:33 +0x84
github.com/codegangsta/negroni.(*Negroni).ServeHTTP(0x24ced60, 0x5e3688, 0x2552120, 0x25b6000)
        /home/pi/gowork/pkg/mod/github.com/codegangsta/[email protected]/negroni.go:73 +0xc4
net/http.serverHandler.ServeHTTP(0x2552090, 0x5e3688, 0x2552120, 0x25b6000)
        /usr/local/go/src/net/http/server.go:2802 +0x88
net/http.(*conn).serve(0x248c4e0, 0x5e4148, 0x25a4480)
        /usr/local/go/src/net/http/server.go:1890 +0x7e0
created by net/http.(*Server).Serve
        /usr/local/go/src/net/http/server.go:2927 +0x2f0

Create a more generic "JobRunner" that will allow for different job types.
Need to flesh this out more.

Basic idea is that Kala should be able to support different Job Types, but all of them must be able to be passed to Kala via JSON over HTTP.

A good example of such (taken from Chronos) is the ability to run a container as a job.

Some notes:

Make JobRunner more generic

Change JobRunner.runCmd() ( https://github.com/ajvb/kala/blob/master/job/runner.go#L86 ) to something more generic that inspects the Job to figure out what Job type it should run.

Add in JobRunner.runDocker()

Add needed fields to Job

Needed Fields:

job_type as proposed by @falschparker82 within #96

container similar to the field of the same name within Chronos - https://mesos.github.io/chronos/docs/api.html#adding-a-docker-job

Repetition count in schedule string should be decremented by 1

Looking at your README, the schedule string section says that R1 - Will repeat once.

So I was expecting just one run on the job that I scheduled with R1 repetition, but it happened twice. Then I investigated and I noticed that basically every job that was scheduled with a fixed repetition count has been executed 1 time more than expected.

How to reproduce the problem. These are two examples that I ran to figure it out:

1 - I set up a job to run with 1 repetition

curl http://127.0.0.1:8000/api/v1/job/ -d '{"command": "bash echo A >> ~/Desktop/tmp_log_kala", "name": "test_job2", "schedule": "R1/2016-05-03T15:00:00+01:00/PT10S"}'

and this is the response from Kala: {"id":"8affc3e9-2837-415f-4aa0-0ef601611b16"} But looking at the GET job API after the end of the execution, I found two executions of it:

{
   "job" : {
      "next_run_at" : "2016-05-03T15:00:10.002875991+01:00",
      "parent_jobs" : null,
      "retries" : 0,
      "schedule" : "R1/2016-05-03T15:00:00+01:00/PT10S",
      "epsilon" : "",
      "metadata" : {
         "last_error" : "2016-05-03T15:00:10.016369805+01:00",
         "last_success" : "0001-01-01T00:00:00Z",
         "error_count" : 2,
         "success_count" : 0,
         "last_attempted_run" : "2016-05-03T15:00:10.008073338+01:00"
      },
      "disabled" : false,
      "id" : "8affc3e9-2837-415f-4aa0-0ef601611b16",
      "owner" : "",
      "command" : "bash echo A >> ~/Desktop/tmp_log_kala",
      "name" : "test_job2",
      "stats" : [
         {
            "number_of_retries" : 0,
            "job_id" : "8affc3e9-2837-415f-4aa0-0ef601611b16",
            "ran_at" : "2016-05-03T15:00:00.002876963+01:00",
            "execution_duration" : 11777780,
            "success" : false
         },
         {
            "success" : false,
            "job_id" : "8affc3e9-2837-415f-4aa0-0ef601611b16",
            "execution_duration" : 8295465,
            "ran_at" : "2016-05-03T15:00:10.00807454+01:00",
            "number_of_retries" : 0
         }
      ],
      "dependent_jobs" : null
   }
}

2 - Then I tried forcing a repetition count of 0 for my job (which in theory makes no sense)

curl http://127.0.0.1:8000/api/v1/job/ -d '{"command": "bash echo A >> ~/Desktop/tmp_log_kala", "name": "test_job2", "schedule": "R0/2016-05-03T15:00:00+01:00/PT10S"}'

and the response was a correctly created job: {"id":"f44e86bf-feb3-4e4d-5580-d9850583d623"} But looking at the execution, I found that it was executed once:

{
   "job" : {
      "owner" : "",
      "metadata" : {
         "last_success" : "0001-01-01T00:00:00Z",
         "last_attempted_run" : "2016-05-03T15:00:00.002873922+01:00",
         "success_count" : 0,
         "error_count" : 1,
         "last_error" : "2016-05-03T15:00:00.013881925+01:00"
      },
      "schedule" : "R0/2016-05-03T15:00:00+01:00/PT10S",
      "parent_jobs" : null,
      "retries" : 0,
      "id" : "f44e86bf-feb3-4e4d-5580-d9850583d623",
      "next_run_at" : "2016-05-03T15:00:00.00000066+01:00",
      "stats" : [
         {
            "ran_at" : "2016-05-03T15:00:00.002874753+01:00",
            "number_of_retries" : 0,
            "execution_duration" : 11007273,
            "job_id" : "f44e86bf-feb3-4e4d-5580-d9850583d623",
            "success" : false
         }
      ],
      "command" : "bash echo A >> ~/Desktop/tmp_log_kala",
      "epsilon" : "",
      "dependent_jobs" : null,
      "disabled" : false,
      "name" : "test_job2"
   }
}

Is it the documentation that is wrong, or Kala itself? At the moment I'm forcing my jobs to be repeated n-1 times, but I'd like to know whether I will have to revert this hack in the future or not...

Regards, Marco

JobStats cleanup

I have a job that is scheduled to run every 5 minutes. After a few weeks of executions, the job stats have accumulated quite a few entries, which causes the API api/v1/job/{id} to return a lot of data. I am looking for ways to clean up these stats, since I am only interested in the status of the last run. I am thinking about changing the Kala code in db.go to add a few more methods for cleaning up old stats entries (e.g. removing items more than 1 week old). I need some pointers to see if this is the right thing to do. I can contribute the code back once I am done. Please let me know. Thanks, -Quan
This fix issue #118

Fixes a bug where a job with no repetition count and no interval causes a panic if its scheduled time passed while Kala was off. Issue #118

Added a check to the GetWaitDuration() function, along with a test.
Fix postgres persistence and upgrade bolt library
Hi, first, thanks for developing and sharing this project.

We are using Kala with Postgres 10.6, and we noticed some problems.

First, deleting a job failed (I don't have the error message at hand) because the query used job -> 'id'. As documented, '{"a": {"b":"foo"}}'::json->'a' gives {"b":"foo"}, a JSON value, but we need to filter by the id of the job as text: '{"a":1,"b":2}'::json->>'b' gives 2. We replaced -> with ->> in get and delete.

Second, save needs to check whether the job already exists in order to either update or insert it (unlike MySQL, where REPLACE can be used; here it is Postgres with JSON).

If I have more time, I'll make another PR that uses https://github.com/ory/dockertest for testing.

Thanks again
update lib/pq version to v1.10.6

Updated the lib/pq version, since the old version does not support SCRAM-SHA-256 authentication (a feature available from Postgres 10 onwards).

Wrong unmarshalNewJob when create a remote job by webui build-in

I created a remote job through the built-in web UI and got the error log below.

INFO[0000] Job test-create-job-by-web-ui-001:6a57c82e-45e9-40fc-5029-e3bcfd08ad7c added to cache. 
INFO[0000] Starting server on port :8000                
INFO[0028] {"name":"test023","type":1,"owner":"[email protected]","schedule":"R/2022-10-12T20:12:16.828696-08:00/PT10S","retries":0,"url":"http://127.0.0.1:8888/shell/run","method":"GET","headers":{},"body":"{}","timeout":0,"expected_response_codes":[]} 
ERRO[0028] Invalid Remote Job. Job's must contain a Name and a url field 
ERRO[0028] Error occurred when initializing the job: Invalid Remote Job. Job's must contain a Name and a url field 
ERRO[0029] Error occurred when trying to get the job you requested.

The structure of the JSON sent by the web UI doesn't match the Job struct below:

type Job struct {
        Name string `json:"name"`
        Id   string `json:"id"`

        // Command to run
        // e.g. "bash /path/to/my/script.sh"
        Command string `json:"command"`

        // Email of the owner of this job
        // e.g. "[email protected]"
        Owner string `json:"owner"`

        // Is this job disabled?
        Disabled bool `json:"disabled"`

        // Jobs that are dependent upon this one will be run after this job runs.
        DependentJobs []string `json:"dependent_jobs"`

        // List of ids of jobs that this job is dependent upon.
        ParentJobs []string `json:"parent_jobs"`

        // Job that gets run after all retries have failed consecutively
        OnFailureJob string `json:"on_failure_job"`

        // ISO 8601 String
        // e.g. "R/2014-03-08T20:00:00.000Z/PT2H"
        Schedule     string `json:"schedule"`
        scheduleTime time.Time
        // ISO 8601 Duration struct, used for scheduling
        // job after each run.
        delayDuration *iso8601.Duration

        // Number of times to schedule this job after the
        // first run.
        timesToRepeat int64

        // Number of times to retry on failed attempt for each run.
        Retries uint `json:"retries"`

        // Duration in which it is safe to retry the Job.
        Epsilon         string `json:"epsilon"`
        epsilonDuration *iso8601.Duration

        jobTimer  clock.Timer
        NextRunAt time.Time `json:"next_run_at"`

        // Templating delimiters, the left & right separated by space,
        // for example `{{ }}` or `${ }`.
        //
        // If this field is non-empty, then each time this
        // job is executed, Kala will template its main
        // content as a Go Template with the job itself as data.
        //
        // The Command is templated for local jobs,
        // and Url and Body in RemoteProperties.
        TemplateDelimiters string

        // The clock for this job; used to mock time during tests.
        clk Clock

        // If the job is disabled (or the system inoperative) and we pass
        // the scheduled run point, when the job becomes active again,
        // normally the job will run immediately.
        // With this setting on, it will not run immediately, but will wait
        // until the next scheduled run time comes along.
        ResumeAtNextScheduledTime bool `json:"resume_at_next_scheduled_time"`

        // Meta data about successful and failed runs.
        Metadata Metadata `json:"metadata"`

        // Type of the job
        JobType jobType `json:"type"`

        // Custom properties for the remote job type
        RemoteProperties RemoteProperties `json:"remote_properties"`

        // Collection of Job Stats
        Stats []*JobStat `json:"stats"`

        lock sync.RWMutex

        // Says if a job has been executed right numbers of time
        // and should not been executed again in the future
        IsDone bool `json:"is_done"`

        // The job will send on this channel when it's done running; used for tests.
        // Note that if the job should be rescheduled, it will send on this channel
        // when it's done rescheduling rather than when the job is done running.
        // That's most useful for testing the scheduling aspect of jobs.
        ranChan chan struct{}

        // Used for testing schedules.
        succeedInstantly bool
}
...
// RemoteProperties Custom properties for the remote job type
type RemoteProperties struct {
        Url    string `json:"url"`
        Method string `json:"method"`

        // A body to attach to the http request
        Body string `json:"body"`

        // A list of headers to add to http request (e.g. [{"key": "charset", "value": "UTF-8"}])
        Headers http.Header `json:"headers"`

        // A timeout property for the http request in seconds
        Timeout int `json:"timeout"`

        // A list of expected response codes (e.g. [200, 201])
        ExpectedResponseCodes []int `json:"expected_response_codes"`
}
...

The JSON for a remote job should look like this:

{"name":"test023","type":1,"owner":"[email protected]","schedule":"R/2022-10-12T20:12:16.828696-08:00/PT10S","retries":0,"remote_properties":{"url":"http://127.0.0.1:8888/shell/run","method":"GET","headers":{},"body":"{}","timeout":0,"expected_response_codes":[]}}
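The mismatch can be reproduced with a few lines of Go. The sketch below decodes both payload shapes into a trimmed-down copy of the structs quoted in this report (the lowercase `job` and `remoteProperties` types and the `remoteURL` helper are hypothetical). With encoding/json, unknown top-level keys such as "url" are silently ignored, so the flat payload ends up with an empty url.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// remoteProperties and job are trimmed-down copies of the quoted structs.
type remoteProperties struct {
	URL    string `json:"url"`
	Method string `json:"method"`
}

type job struct {
	Name             string           `json:"name"`
	Type             int              `json:"type"`
	RemoteProperties remoteProperties `json:"remote_properties"`
}

// remoteURL decodes a payload and returns the url the job would end up with.
func remoteURL(payload string) (string, error) {
	var j job
	if err := json.Unmarshal([]byte(payload), &j); err != nil {
		return "", err
	}
	return j.RemoteProperties.URL, nil
}

func main() {
	flat := `{"name":"test023","type":1,"url":"http://127.0.0.1:8888/shell/run","method":"GET"}`
	nested := `{"name":"test023","type":1,"remote_properties":{"url":"http://127.0.0.1:8888/shell/run","method":"GET"}}`
	for _, payload := range []string{flat, nested} {
		u, err := remoteURL(payload)
		if err != nil {
			fmt.Println("decode failed:", err)
			continue
		}
		fmt.Printf("url=%q\n", u)
	}
}
```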

Health points for kubernetes deployment (liveness & readyness)
If you consider it interesting, we could implement two endpoints:

liveness: only checks that the service is up

readiness: checks that the database connection is up and running

What do you think?
Add Sample Schedule Strings to Docs
More documentation is needed, in general.

More specifically, there should be samples of schedule strings. For example, a job with no schedule will run immediately, but a one-off job would look like this:

R0/2014-03-08T20:00:00Z/

A list of examples should be drawn up to show various scenarios in a concise fashion.
Upgrade MySQL "Real" Mocks

Currently the MySQL storage driver uses https://github.com/lestrrat-go/test-mysqld to perform some tests with a real database (in addition to the fully mocked ones provided by the datadog library).

However, the lestrrat library is a frequent source of trouble. A potential replacement is https://github.com/src-d/go-mysql-server

Review.
