Warhammer40K faction scraper written in Golang, powered by colly.

Wascra

Description

Wascra is a tool written in Golang, which lets you extract all relevant Datasheet info from a Warhammer40K (9th edition) faction from Wahapedia.ru and is powered by the colly framework.

Wascra searches for faction(s) via command-line arguments and supports terminal output as well as output to file.

For dependencies/ module information please see go.mod

Note that this module is WIP.


Usage

For a full list of provided factions from the wiki, please refer to the official site

Synopsis

To scrape the desired faction(s), just execute the following command in a terminal:

go run wascra.go [{-flags}] {faction1} [faction2]

Flags

Supported flags are:

-v - for verbose output while scraping (shows URLs of visited models)

-w - writes data to an auto-generated file to ./factions/ (filename is a concatenation of given arguments/factions)

-wv - verbose writing

-h - lists additional help

Example

go run wascra.go -v space-marines

Note: Faction arguments can be lower or uppercase, while whitespaces must be replaced with hyphens -.


Future improvements / planned fixes

  • - fix selector strings

  • - serialize Model struct for JSON export

  • - add JSON support

  • - concurrent scraping for multiple factions

Similar Resources

Best Room Price Scraper from Booking.com

Best Room Price Scraper from Booking.com This repo is a tutorial of Large Scale

Nov 11, 2022

A simple scraper to export data from buildkite to honeycomb using opentelemetry SDK

A simple scraper to export data from buildkite to honeycomb using opentelemetry SDK

A quick scraper program that let you export builds on BuildKite as OpenTelemetry data and then send them to honeycomb.io for slice-n-dice high cardinality analysis.

Jul 7, 2022

Pholcus is a distributed high-concurrency crawler software written in pure golang

Pholcus is a distributed high-concurrency crawler software written in pure golang

Pholcus Pholcus(幽灵蛛)是一款纯 Go 语言编写的支持分布式的高并发爬虫软件,仅用于编程学习与研究。 它支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/kafka/csv/excel等

Dec 30, 2022

Gospider - Fast web spider written in Go

Gospider - Fast web spider written in Go

GoSpider GoSpider - Fast web spider written in Go Painless integrate Gospider into your recon workflow? Enjoying this tool? Support it's development a

Dec 31, 2022

Go-site-crawler - a simple application written in go that can fetch contentfrom a url endpoint

Go Site Crawler Go Site Crawler is a simple application written in go that can f

Feb 5, 2022

View reddit memes and posts from ur terminal with golang and webscraping

goddit View reddit memes and posts from your terminal with golang and webscraping Installation run the following commands on your terminal to install

Feb 22, 2021

[爬虫框架 (golang)] An awesome Go concurrent Crawler(spider) framework. The crawler is flexible and modular. It can be expanded to an Individualized crawler easily or you can use the default crawl components only.

go_spider A crawler of vertical communities achieved by GOLANG. Latest stable Release: Version 1.2 (Sep 23, 2014). QQ群号:337344607 Features Concurrent

Jan 6, 2023

DorkScout - Golang tool to automate google dork scan against the entiere internet or specific targets

DorkScout - Golang tool to automate google dork scan against the entiere internet or specific targets

dorkscout dokrscout is a tool to automate the finding of vulnerable applications or secret files around the internet throught google searches, dorksco

Nov 21, 2022

New World Auction House Crawler In Golang

New-World-Auction-House-Crawler Goal of this library is to have a process which grabs New World auction house data in the background while playing the

Sep 7, 2022
Related tags
A crawler/scraper based on golang + colly, configurable via JSON

Super-Simple Scraper This a very thin layer on top of Colly which allows configuration from a JSON file. The output is JSONL which is ready to be impo

Aug 21, 2022
Ratemyprof scraper - Ratemyprof scraper with golang

ratemyprof scraper visit https://ratemyprof-api.vercel.app/api/getProf to try ou

Jan 18, 2022
DataHen Till is a standalone tool that instantly makes your existing web scraper scalable, maintainable, and more unblockable, with minimal code changes on your scraper.
DataHen Till is a standalone tool that instantly makes your existing web scraper scalable, maintainable, and more unblockable, with minimal code changes on your scraper.

DataHen Till is a standalone tool that instantly makes your existing web scraper scalable, maintainable, and more unblockable, with minimal code changes on your scraper.

Dec 14, 2022
Collyzar - A distributed redis-based framework for colly.

Collyzar A distributed redis-based framework for colly. Collyzar provides a very simple configuration and tools to implement distributed crawling/scra

Nov 22, 2022
Elegant Scraper and Crawler Framework for Golang

Colly Lightning Fast and Elegant Scraping Framework for Gophers Colly provides a clean interface to write any kind of crawler/scraper/spider. With Col

Jan 9, 2023
Golang based web site opengraph data scraper with caching
Golang based web site opengraph data scraper with caching

Snapper A Web microservice for capturing a website's OpenGraph data built in Golang Building Snapper building the binary git clone https://github.com/

Oct 5, 2022
Web Scraper in Go, similar to BeautifulSoup

soup Web Scraper in Go, similar to BeautifulSoup soup is a small web scraper package for Go, with its interface highly similar to that of BeautifulSou

Jan 9, 2023
Simple price scraper with HTTP server/exporter for use with Prometheus

priceserver v0.3 Simple price scraper with HTTP server/exporter for use with Prometheus Currently working with Bitrue.com exchange but easily adaptabl

Nov 16, 2021
Scraper to download school attendance data from the DfE's statistics website
Scraper to download school attendance data from the DfE's statistics website

?? Simple to use. Scrape attendance data with a single command! ?? Super fast. A

Mar 31, 2022
A cli scraper of gocomics.com made in go

goComic goComic is a cli tool written in go that scrapes your favorite childhood favorite comic from gocomics.com. It will give you a single days comi

Dec 24, 2021