Reactor Crawler

A simple CLI content crawler for Joyreactor. It finds all media content on the page you provide and saves it. If the page is paginated, it goes through every page as well, unless you tell it not to with -o (--single-page).

Quick start

Here's the quickest way to download something and test the crawler:

1. Download the build for your OS.
2. Pick a URL from Joyreactor.
3. Run the crawler: $ reactor-crw -p "http://joyreactor.cc/tag/digital+art"

What else

Several optional flags give you more control over the crawler.

$ reactor-crw --help

Allows to quickly download all content by its direct url or entire tag or fandom from joyreactor.cc.
Example: reactor-crw -d "." -p "http://joyreactor.cc/tag/someTag/all" -w 2 -c "cookie-string"

Usage:
  reactor-crw [flags]

Flags:
  -c, --cookie string        User's cookie. Some content may be unavailable without it
  -d, --destination string   Save path for content. Default value is a user's home folder
                             (example C:\Users\username for Windows) (default "/home/avpretty")
  -h, --help                 help for reactor-crw
  -p, --path string          Provide a full page URL
  -s, --search string        A comma separated list of content types that should be downloaded.
                             Possible values: image,gif,webm,mp4. Example: -s "image,webm" (default "image,gif")
  -o, --single-page          Crawl only one page
  -w, --workers int          Amount of workers (default 1)

Of all the flags, only -p (--path) is required. Any other flag can be omitted, in which case its default value is used.
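Because only -p is required, the crawler is easy to wrap in a small script. The following is a minimal sketch, not part of the project: it crawls several tag pages in a row using only flags from the help output above. The tag list, the ./downloads directory, the DRY_RUN variable, and the crawl_tag function are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: crawl several tag pages using only the documented flags.
# Assumptions (not from the project): the tag list, the destination root,
# and the DRY_RUN switch are illustrative.

tags="digital+art someTag"
dest_root="./downloads"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Build the crawler command for one tag, then print or run it.
crawl_tag() {
  tag="$1"
  set -- -p "http://joyreactor.cc/tag/${tag}" \
         -d "${dest_root}/${tag}" \
         -s "image,webm" \
         -w 2
  if [ "$DRY_RUN" = "1" ]; then
    echo "reactor-crw $*"
  else
    # Create the destination up front in case the crawler expects it to exist.
    mkdir -p "${dest_root}/${tag}"
    reactor-crw "$@"
  fi
}

for tag in $tags; do
  crawl_tag "$tag"
done
```

By default the script only prints the commands it would run, which makes it easy to check the URLs and paths first. Running it with DRY_RUN=0 executes them, assuming reactor-crw is on your PATH.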

Here's another example:

$ reactor-crw -p "http://joyreactor.cc/post/000000" -d "." -s "mp4" -o -c "cookies from joyreactor"

This one downloads only mp4 content from the post and saves it to the current directory. -o means that only the current page is parsed, and -c passes the user's cookie to the crawler.

Note: some content is available only when a user's cookie is provided.
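If you use a cookie often, one option is to keep it in a file and pass it through -c rather than pasting it each time. The sketch below assumes a hypothetical file location (~/.joyreactor-cookie) and a hypothetical helper named read_cookie. Neither is something the crawler itself knows about; the script only fills in the documented -c flag.

```shell
#!/bin/sh
# Sketch: read the cookie string from a file and pass it with -c.
# Assumption: the file path and the read_cookie helper are illustrative only.

COOKIE_FILE="${COOKIE_FILE:-$HOME/.joyreactor-cookie}"

# Print the file's contents with line breaks stripped; print nothing if
# the file is missing or unreadable.
read_cookie() {
  if [ -r "$1" ]; then
    tr -d '\r\n' < "$1"
  fi
}

cookie=$(read_cookie "$COOKIE_FILE")

# Add -c only when a cookie was actually found.
if [ -n "$cookie" ]; then
  set -- -p "http://joyreactor.cc/post/000000" -o -c "$cookie"
else
  set -- -p "http://joyreactor.cc/post/000000" -o
fi

if command -v reactor-crw >/dev/null 2>&1; then
  reactor-crw "$@"
else
  echo "reactor-crw was not found on PATH" >&2
fi
```

Keeping the cookie in a file with restrictive permissions (for example chmod 600) also keeps it out of your shell history.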
