A download manager package for Go

grab

Downloading the internet, one goroutine at a time!

$ go get github.com/cavaliercoder/grab

Grab is a Go package for downloading files from the internet with the following rad features:

  • Monitor download progress concurrently
  • Auto-resume incomplete downloads
  • Guess filename from content header or URL path
  • Safely cancel downloads using context.Context
  • Validate downloads using checksums
  • Download batches of files concurrently
  • Apply rate limiters
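
As a rough illustration of the filename-guessing feature, the sketch below uses only the standard library. It is not grab's implementation: guessFilename is a hypothetical helper, and the example.com URL is a placeholder. It prefers the filename parameter of a Content-Disposition header value and otherwise falls back to the last element of the URL path.

```go
package main

import (
	"fmt"
	"mime"
	"net/url"
	"path"
)

// usable reports whether a path element can serve as a plain file name.
func usable(name string) bool {
	return name != "" && name != "." && name != ".." && name != "/"
}

// guessFilename is a hypothetical helper, not part of grab's API.
// It returns "" when neither the header nor the URL yields a usable name.
func guessFilename(contentDisposition, rawURL string) string {
	if contentDisposition != "" {
		if _, params, err := mime.ParseMediaType(contentDisposition); err == nil {
			// Keep only the final slash-separated element of the header value.
			if name := path.Base(params["filename"]); usable(name) {
				return name
			}
		}
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	if name := path.Base(u.Path); usable(name) {
		return name
	}
	return ""
}

func main() {
	fmt.Println(guessFilename(`attachment; filename="report.pdf"`, "http://example.com/download?id=1"))
	fmt.Println(guessFilename("", "http://www.golang-book.com/public/pdf/gobook.pdf"))
}
```

A real client would probably also want to handle backslash separators and names that are invalid on the target file system.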

Requires Go v1.7+

Example

The following example downloads a PDF copy of the free eBook, "An Introduction to Programming in Go" into the current working directory.

resp, err := grab.Get(".", "http://www.golang-book.com/public/pdf/gobook.pdf")
if err != nil {
	log.Fatal(err)
}

fmt.Println("Download saved to", resp.Filename)

The following, more complete example allows for more granular control and periodically prints the download progress until it is complete.

The second time you run the example, it will auto-resume the previous download and exit sooner.

package main

import (
	"fmt"
	"os"
	"time"

	"github.com/cavaliercoder/grab"
)

func main() {
	// create client
	client := grab.NewClient()
	req, _ := grab.NewRequest(".", "http://www.golang-book.com/public/pdf/gobook.pdf")

	// start download
	fmt.Printf("Downloading %v...\n", req.URL())
	resp := client.Do(req)
	fmt.Printf("  %v\n", resp.HTTPResponse.Status)

	// start UI loop
	t := time.NewTicker(500 * time.Millisecond)
	defer t.Stop()

Loop:
	for {
		select {
		case <-t.C:
			fmt.Printf("  transferred %v / %v bytes (%.2f%%)\n",
				resp.BytesComplete(),
				resp.Size(),
				100*resp.Progress())

		case <-resp.Done:
			// download is complete
			break Loop
		}
	}

	// check for errors
	if err := resp.Err(); err != nil {
		fmt.Fprintf(os.Stderr, "Download failed: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Download saved to ./%v \n", resp.Filename)

	// Output:
	// Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
	//   200 OK
	//   transferred 42970 / 2893557 bytes (1.49%)
	//   transferred 1207474 / 2893557 bytes (41.73%)
	//   transferred 2758210 / 2893557 bytes (95.32%)
	// Download saved to ./gobook.pdf
}

Design trade-offs

The primary use case for Grab is to concurrently download thousands of large files from remote file repositories where the remote files are immutable. Examples include operating system package repositories or ISO libraries.

Grab aims to provide robust, sane defaults. These are usually determined using the HTTP specifications, or by mimicking the behavior of common web clients like cURL, wget and common web browsers.

Grab aims to be stateless. The only state that exists is the remote files you wish to download and the local copy, which may be completed, partially completed or not yet created. The advantage of this is that the local file system is not cluttered unnecessarily with additional state files (like a .crdownload file). The disadvantage of this approach is that grab must make assumptions about the local and remote state; specifically, that they have not been modified by another program.

If the local or remote file is modified outside of grab, and you download the file again with resuming enabled, the local file will likely become corrupted. In this case, you might consider making remote files immutable, or disabling resume.
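
To see why a stateless resume can corrupt a file, here is a minimal standard-library sketch. It is an assumption about the general technique (an HTTP Range request starting at the local file size), not a copy of grab's code. The resume and demo helpers, the test server, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// remote is the content served by the test server (made-up sample data).
const remote = "0123456789ABCDEFGHIJ"

// resume appends whatever the server sends for "Range: bytes=<local size>-".
// It never compares the bytes already on disk with the remote file.
func resume(url, filename string) error {
	f, err := os.OpenFile(filename, os.O_WRONLY|os.O_APPEND|os.O_CREATE, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return err
	}

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", info.Size()))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	_, err = io.Copy(f, resp.Body)
	return err
}

// demo seeds a local file with localPrefix, resumes it, and returns the result.
func demo(localPrefix string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// http.ServeContent honors Range requests.
		http.ServeContent(w, r, "file.bin", time.Time{}, strings.NewReader(remote))
	}))
	defer srv.Close()

	dir, err := os.MkdirTemp("", "resume-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "file.bin")
	if err := os.WriteFile(name, []byte(localPrefix), 0o644); err != nil {
		return "", err
	}
	if err := resume(srv.URL, name); err != nil {
		return "", err
	}
	b, err := os.ReadFile(name)
	return string(b), err
}

func main() {
	// "01234" is a genuine partial download; "XXXXX" has the same size but other bytes.
	for _, prefix := range []string{"01234", "XXXXX"} {
		got, err := demo(prefix)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("local prefix %q -> %q (matches remote: %v)\n", prefix, got, got == remote)
	}
}
```

In the second case the server still answers 206 Partial Content and the final size is correct, but the contents are wrong, because only the size of the local file was consulted.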

Grab aims to enable best-in-class functionality for more complex features through extensible interfaces, rather than reimplementation. For example, you can provide your own Hash algorithm to compute file checksums, or your own rate limiter implementation (with all the associated trade-offs) to rate limit downloads.
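
The value of accepting an interface here can be sketched with the standard hash.Hash type: any algorithm that implements it can be used without changing the verification code. The verify and sha256Matches helpers below are hypothetical names for illustration and are not grab functions.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"hash"
	"io"
	"strings"
)

var errChecksumMismatch = errors.New("checksum mismatch")

// verify streams r through h and compares the resulting digest with want.
// Any hash.Hash works: sha256.New(), sha512.New(), or a custom implementation.
func verify(r io.Reader, h hash.Hash, want []byte) error {
	if _, err := io.Copy(h, r); err != nil {
		return err
	}
	if !bytes.Equal(h.Sum(nil), want) {
		return errChecksumMismatch
	}
	return nil
}

// sha256Matches is a small convenience wrapper used by the demo below.
func sha256Matches(data, wantHex string) bool {
	want, err := hex.DecodeString(wantHex)
	if err != nil {
		return false
	}
	return verify(strings.NewReader(data), sha256.New(), want) == nil
}

func main() {
	// SHA-256 digest of the three bytes "abc".
	const sum = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
	fmt.Println(sha256Matches("abc", sum))
	fmt.Println(sha256Matches("abd", sum))
}
```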

Owner
Ryan Armstrong
Cloud wrangler, Kernel prober, packet filterer.
Comments
  • Download is not resumed after killing the application with ctrl + c

    I made an application that uses your library. When I hit Ctrl+C and then run it again, the download starts from 0, even though I'm downloading the same URL to the same path.

  • nil pointer dereference @ response.go:81

    After update:

    panic: runtime error: invalid memory address or nil pointer dereference
    [signal 0xb code=0x1 addr=0x4 pc=0xc6974]

    goroutine 34 [running]:
    panic(0x346eb0, 0x1070a038)
            /root/.gvm/gos/go1.6/src/runtime/panic.go:464 +0x330
    sync/atomic.loadUint64(0x1080a1d4, 0x0, 0x0)
            /root/.gvm/gos/go1.6/src/sync/atomic/64bit_arm.go:10 +0x54
    github.com/cavaliercoder/grab.(*Response).BytesTransferred(0x1080a180, 0x4, 0x386878)
            /root/.gvm/pkgsets/go1.6/global/src/github.com/cavaliercoder/grab/response.go:81 +0x40
    github.com/cavaliercoder/grab.(*Client).do(0x107b79c0, 0x107f80f0, 0x0, 0x0, 0x0)
            /root/.gvm/pkgsets/go1.6/global/src/github.com/cavaliercoder/grab/client.go:222 +0x3c0
    github.com/cavaliercoder/grab.(*Client).DoAsync.func1(0x107b79c0, 0x107f80f0, 0x10802380)
            /root/.gvm/pkgsets/go1.6/global/src/github.com/cavaliercoder/grab/client.go:94 +0x24
    created by github.com/cavaliercoder/grab.(*Client).DoAsync
            /root/.gvm/pkgsets/go1.6/global/src/github.com/cavaliercoder/grab/client.go:102 +0x60

  • readme: fix syntax error in example

    grab.Response.Size is an int64 value, not a function.

    I noticed this problem when testing the example in the readme.

    Hopefully this could help others...

  • DeleteOnError does not seem to work (at least on Windows)

    // create download request
    req, err := NewRequest("", "http://example.com/example.zip")
    if err != nil {
        panic(err)
    }
    
    // set request checksum
    sum, err := hex.DecodeString("33daf4c03f86120fdfdc66bddf6bfff4661c7ca11c5da473e537f4d69b470e57")
    if err != nil {
        panic(err)
    }
    req.SetChecksum(sha256.New(), sum, true)
    
    // download and validate file
    resp := DefaultClient.Do(req)
    if err := resp.Err(); err != nil {
        panic(err)
    }
    

    Although there is a checksum mismatch, the file will not be removed:

    panic: checksum mismatch
    
    goroutine 1 [running]:
    main.downloadOpenjdk()
            C:/Users/path/to/main.go:88 +0x1b6
    main.main()
            C:/Users/path/to/main.go:94 +0x2c
    exit status 2
    
    

    Apart from this, how can I prevent the file from being downloaded anyway when there is a checksum mismatch?

    I think that one of the first improvements could be adding error handling to the os.Remove call at https://github.com/cavaliercoder/grab/blob/925bcfe56bc16868f1a398af4231cd4ffa07276f/client.go#L294 to ensure that at least an error message is returned. In my opinion, the error should not be silently dropped.

  • Response from github is 403 when downloading a release file.

    Not able to download release artifact from GitHub.

    package main
    
    import (
    	"fmt"
    	"log"
    
    	"github.com/cavaliercoder/grab"
    )
    
    func main() {
    	client := grab.NewClient()
    	req, err := grab.NewRequest("", "https://github.com/minishift/minishift-centos-iso/releases/download/v1.12.0/minishift-centos7.iso")
    	if err != nil {
    		log.Fatal(err)
    	}
    	resp := client.Do(req)
    	fmt.Printf("Response is: %v\n", resp.HTTPResponse.Status)
    }
    

    Unexpected one.

    ==== Output ====
    $ go run test.go
    Response is: 403 Forbidden
    
  • Synchronized access to Response.transfer and Response.bytesResumed

    I've been getting "race condition detected" errors during tests. I display a progress bar during the download from another goroutine, so those fields need to be protected for concurrent access.
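
One common way to make a byte counter safe to read from a progress-bar goroutine is sync/atomic. The sketch below is a generic illustration, not the change made in grab: the progress type and the simulate function are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// progress is a hypothetical stand-in for a transfer's byte counter.
// Keeping the int64 as the first field keeps it 64-bit aligned on
// 32-bit platforms, which the sync/atomic functions require.
type progress struct {
	n int64 // accessed only through sync/atomic
}

func (p *progress) add(delta int64) { atomic.AddInt64(&p.n, delta) }
func (p *progress) load() int64     { return atomic.LoadInt64(&p.n) }

// simulate "transfers" chunks of size bytes in one goroutine while the
// calling goroutine polls the counter, much like a progress bar would.
func simulate(chunks int, size int64) int64 {
	var p progress
	done := make(chan struct{})

	go func() {
		defer close(done)
		for i := 0; i < chunks; i++ {
			p.add(size)
		}
	}()

	for {
		select {
		case <-done:
			return p.load()
		default:
			_ = p.load() // concurrent read, safe under the race detector
		}
	}
}

func main() {
	fmt.Println("bytes transferred:", simulate(1000, 4096))
}
```

Running this with go run -race should report no races, because every access to the counter goes through an atomic operation.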

  • Corrupted contents when downloading on top of wrong file

    Grab looked like a good fit for a project I'm working on so I gave it a spin. I found that it downloaded a file perfectly and when asked to download the same file again managed to avoid downloading all the bytes again, which was just what I was looking for.

    I then overwrote the downloaded file with completely different contents and then downloaded again using grab. The message:

      206 Partial Content
    

    was emitted and the download was apparently successful. The downloaded file even had the same number of bytes as the original, but unfortunately the contents were corrupted.

    Fortunately, this problem is easily reproduced using the example program in the README:

    $ go run main.go
    Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
      200 OK
    Download saved to ./gobook.pdf
    $ mv gobook.pdf gobook.pdf.good
    $ cp main.go gobook.pdf
    $ go run main.go
    Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
      206 Partial Content
    Download saved to ./gobook.pdf
    $ diff gobook.pdf gobook.pdf.good
    Binary files gobook.pdf and gobook.pdf.good differ
    $ ls -l
    total 11320
    -rw-r--r--  1 gnormington  staff  2893557  1 Jan  1970 gobook.pdf
    -rw-r--r--  1 gnormington  staff  2893557  1 Jan  1970 gobook.pdf.good
    -rw-r--r--  1 gnormington  staff     1139  3 Nov 11:10 main.go
    

    The environment is go version go1.9.2 darwin/amd64 on macOS 10.13.1.

    In case the README changes, the contents of main.go above is:

    package main
    
    import (
    	"fmt"
    	"os"
    	"time"
    
    	"github.com/cavaliercoder/grab"
    )
    
    func main() {
    	// create client
    	client := grab.NewClient()
    	req, _ := grab.NewRequest(".", "http://www.golang-book.com/public/pdf/gobook.pdf")
    
    	// start download
    	fmt.Printf("Downloading %v...\n", req.URL())
    	resp := client.Do(req)
    	fmt.Printf("  %v\n", resp.HTTPResponse.Status)
    
    	// start UI loop
    	t := time.NewTicker(500 * time.Millisecond)
    	defer t.Stop()
    
    Loop:
    	for {
    		select {
    		case <-t.C:
    			fmt.Printf("  transferred %v / %v bytes (%.2f%%)\n",
    				resp.BytesComplete(),
    				resp.Size,
    				100*resp.Progress())
    
    		case <-resp.Done:
    			// download is complete
    			break Loop
    		}
    	}
    
    	// check for errors
    	if err := resp.Err(); err != nil {
    		fmt.Fprintf(os.Stderr, "Download failed: %v\n", err)
    		os.Exit(1)
    	}
    
    	fmt.Printf("Download saved to ./%v \n", resp.Filename)
    
    	// Output:
    	// Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
    	//   200 OK
    	//   transferred 42970 / 2893557 bytes (1.49%)
    	//   transferred 1207474 / 2893557 bytes (41.73%)
    	//   transferred 2758210 / 2893557 bytes (95.32%)
    	// Download saved to ./gobook.pdf
    }
    
  • Make buffer size configurable

    Hey, I'm downloading to an external hard drive, and with the default buffer of 4096 bytes I'm getting really poor performance (<1M/s). I manually increased the buffer size to 4096*1024 bytes and now I'm getting 6.2M/s (the maximum my internet connection offers).
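
A plain read/write loop shows why the buffer size matters: each iteration issues one write to the destination, so a small buffer means many small writes. This is a sketch with in-memory readers and writers. The copyWithBuffer and countWrites names are hypothetical, and real network reads return whatever data is available, so actual counts will differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copyWithBuffer copies src to dst through a buffer of bufSize bytes.
// It returns the number of bytes written and the number of Write calls made.
func copyWithBuffer(dst io.Writer, src io.Reader, bufSize int) (int64, int, error) {
	if bufSize <= 0 {
		bufSize = 4096 // fall back to a small default
	}
	buf := make([]byte, bufSize)
	var written int64
	writes := 0
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			m, werr := dst.Write(buf[:n])
			written += int64(m)
			writes++
			if werr != nil {
				return written, writes, werr
			}
			if m < n {
				return written, writes, io.ErrShortWrite
			}
		}
		if rerr == io.EOF {
			return written, writes, nil
		}
		if rerr != nil {
			return written, writes, rerr
		}
	}
}

// countWrites copies size bytes of in-memory data and reports the Write calls.
func countWrites(size, bufSize int) int {
	src := strings.NewReader(strings.Repeat("x", size))
	var dst bytes.Buffer
	_, writes, _ := copyWithBuffer(&dst, src, bufSize)
	return writes
}

func main() {
	const size = 1 << 20 // 1 MiB of sample data
	fmt.Println("writes with 4096-byte buffer:", countWrites(size, 4096))
	fmt.Println("writes with 4096*1024-byte buffer:", countWrites(size, 4096*1024))
}
```

On a slow device such as an external drive, fewer and larger writes can reduce per-call overhead, which is consistent with the speedup reported above.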

  • Calling CancelFunc in context doesn't cancel download

    @oliverpool As discussed in PR #73, creating a request with req.WithContext(ctx), where ctx comes from context.WithCancel, and then calling the returned CancelFunc doesn't stop the file from downloading.

    The file being downloaded is around 160 MB, which takes ~18s to download with a 100Mbps connection.

    Network activity can still be observed after calling CancelFunc, and the code doesn't return context.Canceled.
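
For comparison, this is how cancellation behaves with plain net/http, sketched against a local test server that never finishes its response. The cancelMidDownload helper is hypothetical and does not use grab. On recent Go versions the read that follows the cancel is expected to fail with an error matching context.Canceled, but treat the exact error value as an assumption about the Go version in use.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// cancelMidDownload reads the first chunk of a never-ending response,
// cancels the request context, and returns the error from the next read.
func cancelMidDownload() error {
	const first = "first chunk"

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, first)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
		// Hold the response open until the client goes away (or a safety timeout).
		select {
		case <-r.Context().Done():
		case <-time.After(5 * time.Second):
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// Consume the part of the body that has already arrived.
	buf := make([]byte, len(first))
	if _, err := io.ReadFull(resp.Body, buf); err != nil {
		return err
	}

	cancel() // abort the transfer
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	err := cancelMidDownload()
	fmt.Println("error after cancel:", err)
	fmt.Println("matches context.Canceled:", errors.Is(err, context.Canceled))
}
```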

  • Data race

    I just ran the example on the start page with the race detector.

    Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
      200 OK
    ==================
    WARNING: DATA RACE
    Read at 0x00c420136298 by main goroutine:
      github.com/cavaliercoder/grab.(*Response).BytesComplete()
          /home/rkaufmann/go/src/github.com/cavaliercoder/grab/response.go:133 +0x43
      main.main()
          /home/rkaufmann/Downloads/grab.go:30 +0x442
    
    Previous write at 0x00c420136298 by goroutine 14:
      [failed to restore the stack]
    
    Goroutine 14 (running) created at:
      github.com/cavaliercoder/grab.(*Client).Do()
          /home/rkaufmann/go/src/github.com/cavaliercoder/grab/client.go:81 +0x451
      main.main()
          /home/rkaufmann/Downloads/grab.go:18 +0x325
    ==================
    ==================
    WARNING: DATA RACE
    Write at 0x00c420076180 by main goroutine:
      sync/atomic.CompareAndSwapInt32()
          /usr/local/go/src/runtime/race_amd64.s:293 +0xb
      sync.(*Mutex).Lock()
          /usr/local/go/src/sync/mutex.go:74 +0x4d
      github.com/cavaliercoder/grab.(*transfer).N()
          /home/rkaufmann/go/src/github.com/cavaliercoder/grab/transfer.go:74 +0x4a
      github.com/cavaliercoder/grab.(*Response).BytesComplete()
          /home/rkaufmann/go/src/github.com/cavaliercoder/grab/response.go:133 +0x58
      main.main()
          /home/rkaufmann/Downloads/grab.go:30 +0x442
    
    Previous write at 0x00c420076180 by goroutine 14:
      [failed to restore the stack]
    
    Goroutine 14 (running) created at:
      github.com/cavaliercoder/grab.(*Client).Do()
          /home/rkaufmann/go/src/github.com/cavaliercoder/grab/client.go:81 +0x451
      main.main()
          /home/rkaufmann/Downloads/grab.go:18 +0x325
    ==================
      transferred 1156706 / 0 bytes 2893557 bps (39.98%)
      transferred 2617418 / 0 bytes 2893557 bps (90.46%)
    Download saved to ./gobook.pdf
    Found 2 data race(s)
    exit status 66
    
    
  • Downloading Text file gives an issue while in resume mode.

    I am trying to download a text file from a URL.

    Grab downloads it perfectly the first time.

    But if I make a small change to the source text file and download it again, Grab appends that change to the end of the local file, and the downloaded text file is then of no use.

    Can you please help with this?

  • Fix head break connection

    Problem

    I tried grab and tested a download from the server https://speed.hetzner.de. When I retried the download onto a partially saved file, grab failed with a nil response and the error Head EOF.

    I researched the source of the problem. The remote server breaks the connection on a HEAD request but handles a GET request fine. So I think that in this case grab should not fail after the HEAD request, but go on and try the GET.

    This PR fixes this.

    Solution

    Test

    I've added a test, WithHeadRequestBreak, that emulates a connection break on a HEAD request.

    Fix

    In the headRequest state function, in the branch that handles a response error, grab no longer closes the response and instead proceeds to the GET request.

    client.Do(req) returns a nil response along with the error, so there is no response to close.
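
The behavior described in this PR can be sketched outside grab with the standard library: a test server drops the connection on HEAD, and the client treats the failed probe as non-fatal and continues with GET. The newFlakyServer, fetch and demo names are hypothetical and this is not the PR's code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newFlakyServer returns a test server that breaks the connection on HEAD
// requests but serves body for every other method.
func newFlakyServer(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodHead {
			if hj, ok := w.(http.Hijacker); ok {
				if conn, _, err := hj.Hijack(); err == nil {
					conn.Close() // drop the connection without replying
				}
			}
			return
		}
		io.WriteString(w, body)
	}))
}

// fetch probes the URL with HEAD, then downloads it with GET even if the
// probe failed. It reports whether the HEAD request failed.
func fetch(url string) (string, bool, error) {
	headFailed := false
	if resp, err := http.Head(url); err != nil {
		headFailed = true // no response to close when err != nil
	} else {
		resp.Body.Close()
	}

	resp, err := http.Get(url)
	if err != nil {
		return "", headFailed, err
	}
	defer resp.Body.Close()

	b, err := io.ReadAll(resp.Body)
	return string(b), headFailed, err
}

// demo runs fetch against the flaky test server.
func demo() (string, bool, error) {
	srv := newFlakyServer("payload")
	defer srv.Close()
	return fetch(srv.URL)
}

func main() {
	body, headFailed, err := demo()
	fmt.Printf("HEAD failed: %v, GET body: %q, error: %v\n", headFailed, body, err)
}
```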

    Other fixes

    I've added one more commit to the PR with linter fixes and an update of the Go version.

  • Update project on golang.org

    Hi!

    I can't update my grab module from v2 to v3 because https://pkg.go.dev/github.com/cavaliercoder/grab lists only Version: v2.0.0+incompatible as Latest.

    Could you update it please?

  •  http: ContentLength=111 with Body length 0

    I'm receiving an error while trying to download a file using POST. I've done this successfully with regular net/http, but net/http has issues with larger files, which is why I'm trying to move to this package. The error occurs on a file I was able to download successfully with net/http. I saw another Stack Overflow discussion that seems similar to my issue, but it concerns another package, and I'm not sure how to solve it in this one. Any help is appreciated. Link to the other issue: https://stackoverflow.com/questions/52429036/getting-error-on-put-body-length-0-using-net-http

    The error: http: ContentLength=111 with Body length 0

    My code:

    	transport := &http.Transport{
    		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
    	}
    
    	client := &http.Client{
    		Transport: transport,
    	}
    
    
    	site := fmt.Sprintf("https://%s/seg/api/v3/matrix/data/0/services-export", FSApplianceFQDN)
    	method := "POST"
    
    	//payload := strings.NewReader(`{"srcZoneId":"g_8973766297000773843","dstZoneId":"g_3554460426726078343","shouldOnlyShowPolicyViolation":false}`)
    	payloadFormat := fmt.Sprintf(`{"srcZoneId":"%s","dstZoneId":"%s","shouldOnlyShowPolicyViolation":false}`, SRCZone, DSTZone)
    	payload := strings.NewReader(payloadFormat)
    	req, err := http.NewRequest(method, site, payload)
    
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	req.Header.Add("Host", FSApplianceFQDN)
    	req.Header.Add("Accept", "application/json, text/plain, */*")
    	req.Header.Add("Sec-Ch-Ua-Mobile", "?0")
    	req.Header.Add("Content-Type", "application/json;charset=UTF-8")
    	req.Header.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36")
    	req.Header.Add("Sec-Fetch-Site", "same-origin")
    	req.Header.Add("Sec-Fetch-Mode", "cors")
    	req.Header.Add("Sec-Fetch-Dest", "empty")
    	req.Header.Set("referer", fmt.Sprintf("https://%s/forescout-client/", FSApplianceFQDN))
    	req.Header.Add("Accept-Encoding", "gzip, deflate")
    	req.Header.Add("Accept-Language", "en-US,en;q=0.9")
    	req.Header.Add("Connection", "close")
    	user := fmt.Sprintf("%%22%s%%22", FSusername)
    	Cookies := fmt.Sprintf("JSESSIONID=%v; user=%v; XSRF-TOKEN=%v", JSESSIONID, user, XSRFTOKEN)
    	req.Header.Set("Cookie", Cookies)
    	req.Header.Set("X-Xsrf-Token", XSRFTOKEN)
    
    
    	grabclient := &grab.Client{HTTPClient: client}
    	grabreq := &grab.Request{HTTPRequest: req}
    	filedownload := grabclient.Do(grabreq)
    	if err := filedownload.Err(); err != nil {
    		log.Fatal(err)
    	}
    
  • cannot call non-function resp.Size (type int64)

    Apologies if this is some rookie mistake - I'm fairly new to Go :-)

    When trying the basic grab example (straight from the homepage), I'm getting an error: cannot call non-function resp.Size (type int64)

    This is with Go 1.16.3 (reproducible both on macOS and Raspbian; go env output attached). go mod reports github.com/cavaliercoder/grab@v2.0.0+incompatible.

    Any help would be greatly appreciated.

    goenv.txt main.go.txt

  • rate limiter is not accurate

    The time to wait should be the expected time for a download chunk minus the actual time the chunk took, plus the adjustment carried over from the last sleep as compensation, because the actual sleep time is sometimes not exactly equal to the desired sleep time.
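
The formula in this comment can be written out as a small sketch. It is one reading of the proposal, not grab's rate limiter: nextSleep, the ms constant, and the chunk timings in main are hypothetical. The adjustment is taken to be the desired sleep minus the sleep actually observed in the previous round, so it is negative after oversleeping.

```go
package main

import (
	"fmt"
	"time"
)

const ms = time.Millisecond

// nextSleep returns how long to pause after a chunk:
// expected chunk time - actual chunk time + adjustment from the last sleep.
// A negative result means the transfer is already behind, so no pause is needed.
func nextSleep(expected, actual, adjust time.Duration) time.Duration {
	d := expected - actual + adjust
	if d < 0 {
		return 0
	}
	return d
}

func main() {
	expected := 20 * ms // target time per chunk at the desired rate (assumed)
	var adjust time.Duration

	for i := 0; i < 3; i++ {
		start := time.Now()
		time.Sleep(5 * ms) // stand-in for transferring one chunk
		actual := time.Since(start)

		want := nextSleep(expected, actual, adjust)
		sleepStart := time.Now()
		time.Sleep(want)
		slept := time.Since(sleepStart)

		// Carry the sleep error into the next round (negative if we overslept).
		adjust = want - slept
		fmt.Printf("chunk %d: wanted %v, slept %v, next adjustment %v\n", i, want, slept, adjust)
	}
}
```

For instance, with an expected chunk time of 100ms, an actual chunk time of 30ms, and a previous oversleep of 5ms, the next pause comes out at 65ms.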
