HTTP-FETCH

This is a small tool designed to scrape one or more URLs given as command arguments.

Usage

http-fetch [--metadata] ...URLs

Output files are written to a folder named "output" at the root of this project's directory structure.

Example

http-fetch --metadata https://www.google.com

Docker Usage

docker build -t http-fetch .
docker run -v /Users/daniel/http-fetch-output:/app/output http-fetch --metadata https://www.google.com

In the run command above, you may replace the /Users/daniel/http-fetch-output portion with any path on your local filesystem.

TODO

  • Add tests based around a mocked web server. Tests would ensure that file contents match between an on-disk sample and the scraped results
  • Add support for nested resources loaded via JS and CSS
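
The first TODO item could be approached along these lines. This is a minimal sketch, assuming a Go code base; fetchBody is a hypothetical stand-in for the tool's real scraping step, and the sample is an in-memory byte slice rather than an on-disk file. It uses the standard library's net/http/httptest package to serve known content and compares the fetched bytes against it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchBody is a hypothetical stand-in for the tool's scraping step.
func fetchBody(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// scrapedMatchesSample serves sample from a mock web server and reports
// whether the bytes fetched from that server are identical to it.
func scrapedMatchesSample(sample []byte) (bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(sample)
	}))
	defer srv.Close()

	got, err := fetchBody(srv.URL)
	if err != nil {
		return false, err
	}
	return bytes.Equal(got, sample), nil
}

func main() {
	// In a real test the sample would likely be read from a file on disk.
	sample := []byte("<html><body>hello</body></html>")
	ok, err := scrapedMatchesSample(sample)
	fmt.Println(ok, err)
}
```

In an actual test file the same comparison would sit inside a func TestXxx(t *testing.T) and report mismatches through t.Fatalf.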

Similar Resources

HTTP Echo is a Go web server that echoes back the arguments given to it.

This is especially useful for demos or a more extensive "hello world" application in Docker or Kubernetes.

Jan 3, 2023

Link converter service converts URLs to deeplinks or deeplinks to URLs.

Link converter Link converter service converts URLs to deeplinks or deeplinks to URLs. The service responds to the incoming request and first checks w

Dec 23, 2021

A tool that creates requests with the given urls and converts its response to md5 hash.

Response Converter A tool that creates requests with the given urls and converts its response to md5 hash. Prerequisites Before you begin you must hav

Nov 20, 2022

You had one job, or more than one, which can be done in steps

Leprechaun Leprechaun is tool where you can schedule your recurring tasks to be performed over and over. In Leprechaun tasks are recipes, lets observe

Nov 23, 2022

A tool to enumerate all the command-line arguments used to start a Linux process written in Go.

ranwith A tool to enumerate all the command-line arguments used to start a Linux process written in Go. ranwith uses the Linux /proc directory to obta

Jun 30, 2022

Small program that takes in commands and moves one or more robots around the surface of Mars!

Mars Rover Build and Run the Image Build image from current directory: docker build -t marsrover . Run image interactively: docker run -i marsrover

Jan 2, 2022

Parse any web page for URLs and return the HTTP response code of each one.

ParseWebPage - Fully Functional WebPage Parser Parse any web page for URLs and return the HTTP response code of each one. Creators 👤 Steven Williams

Oct 25, 2021

argv - Go library to split command line string as arguments array using the bash syntax.

Argv Argv is a library for Go to split command line string into arguments array. Documentation Documentation can be found at Godoc Example func TestAr

Nov 19, 2022

A command-line arguments parser that will make you smile.

docopt-go An implementation of docopt in the Go programming language. docopt helps you create beautiful command-line interfaces easily: package main

Jan 7, 2023

Package osargs provides functions to parse command line arguments

osargs About Package osargs provides functions to parse command line arguments. It is published on https://github.com/vbsw/osargs and https://gitlab.c

May 8, 2022

Go cmd utility that prints its command line arguments using strings.Join

Results This is an exercise of the book The Go Programming Language, by Alan A. A. Donovan and Brian Kernighan. Comparison between different versions

Dec 18, 2021

A proxy that authorizes and enforces a given label in a given PromQL query

prom-authzed-proxy prom-authzed-proxy is a proxy for Prometheus that authorizes the request's Bearer Token with Authzed and enforces a label in a Prom

Jul 19, 2022

Bump-version - Bump a given semantic version, following a given version fragment

bump-version Bump a given semantic version, following a given version fragment.

Feb 7, 2022

Nomad-driver-await-dependency - A Nomad driver that acts as blocker for subsequent task until a given Consul service has reached a given state

Nomad Skeleton Driver Plugin Skeleton project for Nomad task driver plugins. Thi

Feb 12, 2022

Scrape the Twitter Frontend API without authentication with Golang.

Twitter Scraper Twitter's API is annoying to work with, and has lots of limitations — luckily their frontend (JavaScript) has its own API, which I re

Dec 29, 2022

Scrape the web in the eink era. Convert websites into books.

Dec 29, 2022

🤖 Automatically scrape PortableApps.com (or official release page) and convert into Edgeless plugin package

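Edgeless Automatic Plugin Bot 2 Introduction This project reimplements the Edgeless automatic plugin bot in Golang. Features (WIP) Fully compatible with the Edgeless automatic plugin bot, including Tasks, for seamless migration Faster builds Better code structure Greater extensibility Progress As of 2021/11/28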

Sep 12, 2022

Car guardian - web scrape used cars

🎂 [PROJECTNAME] 🎂 Short description of the project. 💾 ABOUT Are you tired of repetitive searching for used cars? Let me fix your problem. This is k

Oct 28, 2022

Go-Yahoo-Finance-Daily-Actives - Scrape for the daily actives on yh Finance and save the data to a CSV, and optionally send it to yourself as an email

Dec 13, 2022

A Golang library to scrape lyrics from musixmatch.com (WIP)

Aug 5, 2022

WebWalker - Fast script to walk the web and find URLs

WebWalker sends an HTTP request to a URL to collect all the URLs it contains, then sends HTTP requests to those URLs, and so on. WebWalker can find 10,000 URLs in 10 seconds.

Nov 28, 2021

Go program that fetches URLs concurrently and handles timeouts

fetchalltimeout This is an exercise of the book The Go Programming Language, by

Dec 18, 2021

Go program that fetches URLs and prepends http:// if missing

fetchautoprefix This is an exercise of the book The Go Programming Language, by

Dec 18, 2021

Fast golang web crawler for gathering URLs and JavaScript file locations.

This is basically a simple implementation of the awesome Gocolly library.

Sep 24, 2022

DataHen Till is a standalone tool that instantly makes your existing web scraper scalable, maintainable, and more unblockable, with minimal code changes on your scraper.

Dec 14, 2022

Cirno-go A tool for downloading books from hbooker in Go.

Cirno-go A tool for downloading books from hbooker in Go. Features Login your own account Search books by book name Download books as txt and epub fil

Oct 25, 2022

DorkScout - Golang tool to automate google dork scan against the entire internet or specific targets

dorkscout dorkscout is a tool to automate the finding of vulnerable applications or secret files around the internet through google searches, dorksco

Nov 21, 2022

🦙 acao(阿草), the tool man for data scraping of https://asoul.video/.

🦙 acao acao(阿草), the tool man for data scraping of https://asoul.video/. Deploy to Aliyun serverless function with Raika update_member Update A-SOUL

Jul 25, 2022

A simple go program which checks if your websites are running and runs forever (stop it with ctrl+c). It takes two optional arguments, comma separated string with urls and an interval.

uptime A simple go program which checks if your websites are running and runs forever (stop it with ctrl+c). It takes two optional arguments: -interva

Dec 15, 2022