JobBuzz

Brunei open source job search database and alert notification

Development

Requirements

  1. Go 1.17 or higher
  2. MySQL 8

Running locally

  1. Copy the .env.example file in the repository root to .env.
  2. Update the contents of .env with your database access details.
  3. Change working directory to cmd/jobbuzz-api and run go run . to start the API server.
  4. Change working directory to cmd/jobbuzz-scraper and run go run . to start the web scraper program.
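Assuming a Unix-like shell, steps 1–4 look like this (run the two programs in separate terminals):

```shell
cp .env.example .env   # step 1; then edit .env with your DB details (step 2)

# terminal 1: API server
cd cmd/jobbuzz-api && go run .

# terminal 2: web scraper
cd cmd/jobbuzz-scraper && go run .
```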

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Comments
  • Add zerolog as logger

    I added zerolog so that we can use leveled logging.

    For gin, if it is an error we would still just use panic(err) so that the middleware can handle it. But for errors that don't cause a panic, as well as informational and debug messages, leveled logging lets us know what type each log entry is.

    Another useful thing about this is that debug logs will show when we are in development but not when in production.

    Fixes #9
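    A minimal sketch of the leveled-logging idea (the writer and field names here are made up for illustration):

    ```go
    package main

    import (
    	"bytes"
    	"fmt"

    	"github.com/rs/zerolog"
    )

    func main() {
    	var buf bytes.Buffer
    	// In production the level would be Info, so debug entries are dropped.
    	logger := zerolog.New(&buf).Level(zerolog.InfoLevel)

    	logger.Debug().Msg("fetching page") // suppressed at Info level
    	logger.Error().Str("url", "https://example.com").Msg("scrape failed")

    	fmt.Print(buf.String())
    }
    ```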

  • Add user registration

    User registration only

    Example query

    mutation {
      registerAccount(input: {email: "user@example.com", password: "helloworld"}) {
        __typename
        ... on LoginResult {
          accessToken
        }
      }
    }
    

    Contributes to #17

  • Add GraphQL endpoint

    Added the GraphQL schema generator and an API server endpoint. ~~I have left the folder structure as default.~~

    The general workflow for developing with the gqlgen tool is as follows:

    1. Make changes in pkg/graph/schema.graphqls.
    2. Run make generate in repo root.
    3. It should update the pkg/graph/schema.resolvers.go file.
    4. If you deleted anything in step 1, make generate will probably throw an error. In that case, open pkg/graph/schema.resolvers.go, find the orphaned resolver code at the bottom, confirm it is no longer required, delete it, and run make generate again.
    5. Edit pkg/graph/schema.resolvers.go to add the database code or business logic.
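    For example, exposing a new query in step 1 might look like this in pkg/graph/schema.graphqls (type and field names are hypothetical, not the project's actual schema):

    ```graphql
    type Job {
      id: ID!
      title: String!
      company: String!
    }

    type Query {
      jobs: [Job!]!
    }
    ```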
  • Changing scraper logic to be more resilient

    Currently with go-colly, there is a potential issue: if one page fails to load, the whole data set is considered corrupted, because we need the complete set of data to determine which job listings are active or inactive.

    go-colly has no retry functionality, and its error handling is not very useful.

    I think it might be better for us to fetch the HTML as a string (where we can have our own retry logic) and then use an HTML parser to process the data instead.

    This will be more similar to the logic of the scraper in the .NET version.

    Get an HTML node in Go with a CSS selector: https://github.com/PuerkitoBio/goquery

    Retry: https://github.com/avast/retry-go
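    The retry half of that idea can be sketched in plain Go (avast/retry-go adds backoff options on top of the same pattern, and goquery would then parse the returned HTML):

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"time"
    )

    // fetchWithRetry calls fetch up to attempts times, sleeping between tries,
    // and returns the first successful result.
    func fetchWithRetry(attempts int, delay time.Duration, fetch func() (string, error)) (string, error) {
    	var lastErr error
    	for i := 0; i < attempts; i++ {
    		html, err := fetch()
    		if err == nil {
    			return html, nil
    		}
    		lastErr = err
    		time.Sleep(delay)
    	}
    	return "", fmt.Errorf("all %d attempts failed: %w", attempts, lastErr)
    }

    func main() {
    	calls := 0
    	html, _ := fetchWithRetry(3, 0, func() (string, error) {
    		calls++
    		if calls < 3 {
    			return "", errors.New("transient network error")
    		}
    		return "<html>jobs</html>", nil
    	})
    	fmt.Println(html, calls) // <html>jobs</html> 3
    }
    ```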

  • JSON returned is capitalised

    If we use the following

    type Job struct {
        gorm.Model
        Title     string         
        Company   string     
        Salary    string         
        Location  string 
    }
    

    It will return the following JSON from the API:

    {
      "ID" : 1,
      "CreatedAt" : "foo",
      "UpdatedAt" : "foo",
      "DeletedAt": "foo",
      "Title" : "foo",
      "Company" : "foo",
      "Salary" : "foo",
      "Location" : "foo"
    }
    

    Annotating the struct with json:"field" tags won't lowercase the fields that come from gorm.Model:

    type Job struct {
        gorm.Model
        Title     string `json:"title"` 
        Company   string `json:"company"`
        Salary    string `json:"salary"`
        Location  string `json:"location"`
    }
    

    It will return the following JSON from the API:

    {
      "ID" : 1,
      "CreatedAt" : "foo",
      "UpdatedAt" : "foo",
      "DeletedAt": "foo",
      "title" : "foo",
      "company" : "foo",
      "salary" : "foo",
      "location" : "foo"
    }
    

    To lowercase all the fields we can't use gorm.Model; we have to declare its fields ourselves:

    type Job struct {
    	ID        uint           `gorm:"primarykey" json:"id"`
    	CreatedAt time.Time      `json:"created_at"`
    	UpdatedAt time.Time      `json:"updated_at"`
    	DeletedAt gorm.DeletedAt `gorm:"index" json:"deleted_at"`
    	Title     string         `json:"title"`
    	Company   string         `json:"company"`
    	Salary    string         `json:"salary"`
    	Location  string         `json:"location"`
    }
    

    It will return the following JSON from the API:

    {
      "id" : 1,
      "created_at" : "foo",
      "updated_at" : "foo",
      "deleted_at": "foo",
      "title" : "foo",
      "company" : "foo",
      "salary" : "foo",
      "location" : "foo"
    }
    

    I prefer the above JSON field names since lowercase snake_case is the common convention in API responses. What do you think?
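    The tag behaviour can be checked with encoding/json alone (values made up):

    ```go
    package main

    import (
    	"encoding/json"
    	"fmt"
    )

    type Job struct {
    	Title   string `json:"title"`
    	Company string `json:"company"`
    }

    func main() {
    	b, _ := json.Marshal(Job{Title: "Developer", Company: "Acme"})
    	fmt.Println(string(b)) // {"title":"Developer","company":"Acme"}
    }
    ```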

  • Job uniqueness

    When we run the scraper cmd, it automatically creates new jobs based on the jobs returned by the scrapers. Sometimes a job already exists in the DB; how do we prevent it from being inserted again?

    One idea: store the job links in the DB as well, then query by link before inserting the job.
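    The "query by link first" idea, sketched at the application level (the real fix would pair this with a UNIQUE index on the link column so concurrent inserts are also caught):

    ```go
    package main

    import "fmt"

    type Job struct {
    	Title string
    	Link  string
    }

    // dedupeByLink keeps only the first job seen for each link.
    func dedupeByLink(jobs []Job) []Job {
    	seen := make(map[string]bool)
    	var out []Job
    	for _, j := range jobs {
    		if seen[j.Link] {
    			continue // already in the DB (or in this batch); skip the insert
    		}
    		seen[j.Link] = true
    		out = append(out, j)
    	}
    	return out
    }

    func main() {
    	jobs := []Job{
    		{Title: "Developer", Link: "https://example.com/jobs/1"},
    		{Title: "Developer (repost)", Link: "https://example.com/jobs/1"},
    		{Title: "Designer", Link: "https://example.com/jobs/2"},
    	}
    	fmt.Println(len(dedupeByLink(jobs))) // 2
    }
    ```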

  • Job notification subscription feature

    • User should be able to specify filter parameters

      • keywords
      • location
      • salary
    • Notify daily?

    • Notification medium

      • web push
      • email
      • app push (our own standalone app or something like pushover?)
    • [x] Create database schema

    • [ ] Create system architecture

    • [x] Create sequence diagram

    • [ ] Implement

  • DB Seeder

    It's not an enjoyable development experience to keep scraping the websites, especially when you have an unreliable internet connection. The suggestion is to have another cmd programme that seeds data into the DB. We can export an SQL file from existing data and create a new cmd programme that imports the SQL file.
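    A minimal sketch of that workflow with the standard MySQL client tools (database name and user are assumptions):

    ```shell
    # export seed data from an existing database
    mysqldump -u jobbuzz -p jobbuzz > seed.sql

    # load it into a fresh local database
    mysql -u jobbuzz -p jobbuzz < seed.sql
    ```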

  • Anonymous salary report

    Salary reporting similar to glassdoor.

    This would be a fairly big feature so the details should be discussed before development.

    • Allow users to report salary anonymously.
    • Include information such as sector, industry, years of experience, and full-time/part-time status.
    • Show infographic with filters.

    References:

    • https://www.tokyodev.com/insights/2020-developer-survey/
    • https://www.payscale.com/research/SG/Job=Software_Developer/Salary
    • https://www.levels.fyi/