🐸

go-hop-exchange


An IPFS bytes exchange to allow any IPFS node to become a Filecoin retrieval provider and retrieve content from Filecoin

Highlights

  • IPFS exchange interface like Bitswap
  • Turn any IPFS node into a Filecoin retrieval provider (YES, that means you will earn FIL when we launch on mainnet!)
  • New content is dispatched via Gossipsub and stored if enough space is available
  • IPFS Plugin to wrap the default Bitswap implementation and fetch blocks from Filecoin if not available on the public IPFS network
  • Upload and retrieve directly from Filecoin if no secondary providers cache the content (Coming Soon)

Background

To speed up data retrieval from Filecoin, a secondary market allows clients to publish their content IDs to a network of providers in order to retrieve the content faster, more often, and at a cheaper price. This does not guarantee data availability, so it should be used in addition to a regular storage deal. You can think of it as the CDN layer of Filecoin. This library is still very experimental and at the prototype stage, so feel free to open an issue if you have any suggestions or would like to contribute!

Install

As a library:

$ go get github.com/myelnet/go-hop-exchange

As an IPFS plugin:

Please follow the instructions in the plugin repo

Library Usage

  1. Import the package.
package main

import (
	hop "github.com/myelnet/go-hop-exchange"
)
  2. Initialize a blockstore, graphsync, libp2p host and gossipsub subscription.
var ctx context.Context
var bstore blockstore.Blockstore
var ps *pubsub.PubSub
var host libp2p.Host
var ds datastore.Batching
var gs graphsync.GraphExchange
var ks keystore.Keystore

exch, err := hop.NewExchange(
		ctx,
		hop.WithBlockstore(bstore),
		hop.WithPubSub(ps),
		hop.WithHost(host),
		hop.WithDatastore(ds),
		hop.WithGraphSync(gs),
		hop.WithRepoPath("ipfs-repo-path"),
		hop.WithKeystore(ks),
		hop.WithFilecoinAPI(
			"wss://filecoin.infura.io",
			http.Header{
				"Authorization": []string{"Basic "},
			},
		),
	)

blocks := bserv.New(bstore, exch)
dagserv := dag.NewDAGService(blocks)

WithFilecoinAPI is optional; if it is not provided, the node only supports free transfers and will charge a price of 0 per byte for the content it serves. This is mostly for testing purposes.

  3. When getting content from the DAG, the exchange will automatically query the network for any blocks not available locally.
var dag ipld.DAGService
var ctx context.Context
var root cid.Cid

node, err := dag.Get(ctx, root)
  4. Clients can announce a new deal they made so the content is propagated to providers. This is also called when adding a block with dag.Add.
var ctx context.Context
var root cid.Cid

err := exch.Announce(ctx, root)
  5. We also expose convenience methods to transfer funds or import keys to the underlying wallet.
var ctx context.Context
var to address.Address

from, err := exch.Wallet().DefaultAddress() 

err = exch.Wallet().Transfer(ctx, from, to, "12.5")

Design principles

  • Composable: Hop is highly modular and can be combined with any IPFS, data-transfer, Filecoin or other exchange systems.
  • Lightweight: we try to limit the size of the build as we are aiming to bring this exchange to mobile devices. We do not import core implementations such as go-ipfs or lotus directly but rely on shared packages.
  • Do one thing well: there are many problems to solve in the decentralized storage space. This package only focuses on routing and retrieving content from peers in the most optimal way possible.
  • KISS: Keep it simple, stupid. We take a naive approach to everything and try not to reinvent the wheel. Filecoin is already complex enough as it is.
Owner

Myel: community powered content delivery network

Comments
  • feat: enhance logging

    feat: enhance logging

    We can now run the program and specify the log level via an arg or env var:

    • LOG=debug go run . start
    • go run . -log=debug start

    Three levels are available: trace, debug and info

    • trace & debug for dev purposes, with slow but clean logs
    • info for prod, with fast JSON-encoded logs

    Also, this PR replaces every fmt.Print-based log with the zerolog lib, and logs all the previously silenced errors, at the cost that during tests, for example, too many lines might be displayed.
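    The flag-over-env precedence described above can be sketched as follows; the flag and env var names mirror the PR description but are assumptions about the actual implementation:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveLevel picks the log level: the -log flag wins over the LOG env
// var, falling back to "info". Names are illustrative, not the repo's
// exact implementation.
func resolveLevel(flagVal, envVal string) string {
	if flagVal != "" {
		return flagVal
	}
	if envVal != "" {
		return envVal
	}
	return "info" // prod default: fast JSON-encoded logs
}

func main() {
	lvl := flag.String("log", "", "log level: trace, debug or info")
	flag.Parse()
	fmt.Println(resolveLevel(*lvl, os.Getenv("LOG")))
}
```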

  • Use a Myel node for bootstrap and gate connections from IPFS nodes

    Use a Myel node for bootstrap and gate connections from IPFS nodes

    • Confirm that a Myel node can be used as a bootstrap node (I think DHT server mode is enabled by default, but I might be wrong, so you can double-check by connecting 2 nodes to a 3rd Myel node and seeing if they discover each other).
    • Use a connection gater to prevent Myel nodes from connecting to IPFS nodes. Two ways to go about it would be either checking the user agent header or the protocols a peer supports.
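    The gating decision could be sketched as below, assuming identify info (agent version and supported protocols) is available for the peer; the agent prefix and protocol ID are hypothetical, not real identifiers from the repo:

```go
package main

import (
	"fmt"
	"strings"
)

// shouldGate reports whether to close a connection from a peer, based on
// its identify info. The "myel" agent prefix and the protocol ID are
// assumptions for illustration; real values come from the libp2p
// identify service.
func shouldGate(agentVersion string, protocols []string) bool {
	if strings.HasPrefix(agentVersion, "myel") {
		return false // our own nodes are always allowed
	}
	for _, p := range protocols {
		if p == "/myel/pop/routing/0.1" { // hypothetical protocol ID
			return false
		}
	}
	return true // likely a vanilla IPFS node: gate it
}

func main() {
	fmt.Println(shouldGate("go-ipfs/0.9.0", []string{"/ipfs/bitswap/1.2.0"}))
}
```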
  • Generate SSL certificates for Myel providers

    Generate SSL certificates for Myel providers

    How could Myel providers who run a Pop on their home devices easily generate an SSL certificate so clients can retrieve over WSS?

    Benefits: each swarm could use its own SSL certificates to ensure:

    1. Possible communication between peers using web browsers and the providers
    2. A safe communication channel

    Problems:

    1. Dealing with the ACME DNS challenge
    2. Might be too much of a single point of failure

    Hints:

  • Implement logging strategy

    Implement logging strategy

    Currently there are very few logs, and most are implemented with fmt.Print*. We need a performant logging solution that can also save logs to disk when running on remote machines, and that supports different levels of logging. https://github.com/ipfs/go-log might be convenient; otherwise https://github.com/go-kit/kit/tree/master/log is nice as well. We can discuss.

  • Smart chunking

    Smart chunking

    When it is possible to detect the file type from the extension, we should select an appropriate chunking strategy. This improves deduplication and data transfer speed, and makes the network more efficient overall. Here are some general guidelines:

    • Audio and video content should use a trickle layout and 1MB chunk sizes.
    • Images and compressed archives (.zip etc.): size splitter with 1MB chunks, balanced layout.
    • Text, JSON etc.: Buzhash chunker with balanced layout and 16KB chunks for best deduplication. We can probably experiment with different params, but this seems a reasonable default.
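    The guidelines above could be encoded as a lookup along these lines; the type and function names are illustrative, not the actual go-hop-exchange API:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// chunkParams holds a chunking strategy; field names are illustrative.
type chunkParams struct {
	Splitter string // "size" or "buzhash"
	Size     int    // chunk size in bytes
	Layout   string // "balanced" or "trickle"
}

// strategyFor maps a file name's extension to the guidelines above.
func strategyFor(name string) chunkParams {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".mp3", ".mp4", ".mkv", ".wav":
		// streamable media: trickle layout, 1MB chunks
		return chunkParams{"size", 1 << 20, "trickle"}
	case ".jpg", ".png", ".zip", ".gz":
		return chunkParams{"size", 1 << 20, "balanced"}
	default:
		// text, JSON etc: small buzhash chunks for deduplication
		return chunkParams{"buzhash", 16 << 10, "balanced"}
	}
}

func main() {
	fmt.Println(strategyFor("movie.mp4").Layout)
}
```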
  • Create nodes with “testing” roles

    Create nodes with “testing” roles

    We need nodes with "testing" roles in that they are capable of autonomously:

    • Sending files to be cached by other peers.
    • Retrieving files from peers at random time intervals.
    • Logging and displaying stats from these actions.

    The nodes should be capable of running multiple test scenarios.

    e.g. nodes send File A 99% of the time to emulate the coverage of a "popular" / oft-requested file. They send File B 1% of the time, then gather stats that compare coverage and retrieval performance for A and B.

    e.g. nodes have a library of files in order of increasing size. They collect stats on pushing / retrieval times relative to file size.

  • refactor: different way to wait for data transfer events

    refactor: different way to wait for data transfer events

    This seems like a better way of waiting for a transfer without having to do gymnastics to check that it is the correct transfer. @gallexis maybe try this way on the replication dispatch (replication.go:141)

  • feat: benchmark JS client

    feat: benchmark JS client

    This adds a new cli tool for running benchmarks against a JS client running in a headless chrome browser. The cli offers 2 different test modes:

    • e2e: starts a pop node as provider, adds all the content in the given directory and retrieves it all in parallel
    • daemon: starts a service worker client in a headless browser and a cli command server listening for get commands. Each get command navigates to the given URL loading the content from the service worker client during a shared browser session.
  • feat: import car file and replicate to the network

    feat: import car file and replicate to the network

    @alexander-camuto you will need to set the replication factor to the exact number of nodes you'd like to dispatch to. Make sure they're connected, otherwise it might hang for a while.

  • feat: add upgrade handler to the pop server

    feat: add upgrade handler to the pop server

    Now if a GitHub secret is provided, the server will listen for webhook requests and trigger the upgrade handler. This means you can send requests to <name.myel.zone>/upgrade and, if activated, the node should auto-upgrade. TODO:

    • [x] We're still missing a function to make the process restart itself. @alexander-camuto I've seen some libraries do that if you wanna look into it.
  • feat: k8s capabilities for pop

    feat: k8s capabilities for pop

    • allows for the deployment of a global CDN
    • a core number of nodes run on regular EC2 on-demand instances -- these form the backbone of the CDN
    • scaling the CDN is then done using volatile AWS spot / excess compute
    • master node autoscales the CDN as requests increase, i.e. deploys more or fewer worker nodes
    • master node provides a monitoring dashboard to track usage / performance
    • worker nodes all share same FIL private key
  • Make requests to IPFS gateways using bcli

    Make requests to IPFS gateways using bcli

    • Add the ability to make requests for CIDs from arbitrary IPFS gateways, e.g. bcli get ipfs.com cid
    • Log TTFB and transfer time, as is done when fetching from pop nodes
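    TTFB and total transfer time can be measured with net/http/httptrace; a minimal sketch, in which the gateway URL layout is an assumption:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
	"time"
)

// fetchWithTimings GETs a URL and reports time-to-first-byte and total
// transfer time, mirroring the stats logged for pop nodes.
func fetchWithTimings(url string) (ttfb, total time.Duration, err error) {
	start := time.Now()
	trace := &httptrace.ClientTrace{
		GotFirstResponseByte: func() { ttfb = time.Since(start) },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, 0, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	// Drain the body so total covers the full transfer, not just headers.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return ttfb, time.Since(start), err
	}
	return ttfb, time.Since(start), nil
}

func main() {
	// Hypothetical gateway URL layout: https://<gateway>/ipfs/<cid>
	ttfb, total, err := fetchWithTimings("https://ipfs.io/ipfs/bafkqaaa")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("TTFB=%v total=%v\n", ttfb, total)
}
```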
  • Decentralized hole punching

    Decentralized hole punching

    • We currently use ngrok as a NAT hole punching solution. Although easy to use, it introduces a myriad of setbacks:
      • ngrok servers are not distributed geographically, so we take a real hit in performance
      • ngrok code is proprietary, so it's hard to figure out what exactly is going on behind the scenes
      • ngrok relays / servers can't act as providers on our network -- which is a missed opportunity

    In light of the PL project flare developments we should consider rolling out their relay-circuit v2 implementation for the public nodes on our networks (i.e those not behind a NAT).

    Some brief notes on how to implement this:

    • Use autonat to determine if a node is behind a NAT or not -- if not, automatically promote a node to a relay.
    • Multi-address can contain information as to whether a peer requires a connection via a relay or not -- and through which relay.
    • Relays can see requests for content and would determine if they themselves should cache the content to boost performance and avoid the messaging roundtrips hole-punching requires -- this would be a first step in introducing a performance boosting hierarchy to the network.
    • Baking in support for WebRTC would remove the need for the DNS records we have to maintain atm -- which would make it easier to onboard new providers (as WebRTC is currently the only protocol that can perform hole-punching browser side)
  • Separate hashing function when responding to caching requests

    Separate hashing function when responding to caching requests

    • Currently providers can claim they already have a CID locally when receiving a caching request. This provides a vector of attack whereby malicious providers could lie about the content they have locally to intentionally reduce the replication factor / redundancy of specific pieces of content.

    • A simple solution to this is for the provider to back up this claim using a new hash of the content DAG (eg. keccak / sha-3) that the CID alone wouldn't provide -- serving as a simple proof that the provider does have the content.

      • the problem is that the provider can hold content, hash it, and subsequently delete it, but store the hash to respond to subsequent requests.
      • a potential solution is to use a keyed hash function, whereby a node sending caching requests also includes a randomly generated key as a payload. The nodes responding then have to hash the DAG using that key, providing a proof that, at least at that given point in time, they did actually hold the content.
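    The keyed-hash challenge can be sketched with an HMAC; the issue suggests keccak / sha-3, but sha256 is used here only to keep the sketch within the Go standard library, and the function names are illustrative:

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// proveContent computes a keyed hash over the raw content using the
// challenger's random key, so the proof can't be precomputed and stored.
func proveContent(key, content []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(content)
	return mac.Sum(nil)
}

// verifyProof recomputes the proof on the requesting side and compares
// in constant time.
func verifyProof(key, content, proof []byte) bool {
	return hmac.Equal(proveContent(key, content), proof)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil { // fresh challenge per request
		panic(err)
	}
	content := []byte("serialized DAG bytes")
	proof := proveContent(key, content)
	fmt.Println(verifyProof(key, content, proof))
}
```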
  • TestMultiTx is racy

    TestMultiTx is racy

    Failed in CI with output:

    {"level":"error","error":"No state for /1626167599029: datastore: key not found","time":"2021-07-13T09:13:20Z","message":"attempting to configure data store"}
    2021-07-13T09:13:20.035Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.035Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    {"level":"error","error":"No state for /1626167599030: datastore: key not found","time":"2021-07-13T09:13:20Z","message":"attempting to configure data store"}
    2021-07-13T09:13:20.051Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.051Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `0`, event `4`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
    2021-07-13T09:13:20.053Z	ERROR	fsm	fsm/fsm.go:92	Executing event planner failed: Invalid transition in queue, state `6`, event `27`:
        github.com/filecoin-project/go-statemachine/fsm.eventProcessor.Apply
            /home/runner/go/pkg/mod/github.com/filecoin-project/[email protected]/fsm/eventprocessor.go:137
        tx_test.go:415: could not finish gtx2
    
  • Improve payment channel settlement on the provider side

    Improve payment channel settlement on the provider side

    • Context: in the case of a simple retrieval, the provider can redeem the received vouchers and settle the channel once the transfer is finished.
    • Problem: if the client wishes to reuse a payment channel a while longer, say for progressively retrieving parts of a DAG, it becomes harder for the provider to know when is a good time to settle the channel.
    • Naive solution: wait for the client to call settle, so the provider knows the client no longer needs the channel and can redeem all the vouchers at once. This is nice because the provider needn't pay the settle gas costs.
    • Caveat: if the client disappears for whatever reason without calling settle, the provider must still have a way to collect their earnings. It also means the provider must subscribe to chain events.
    • Enhanced solution: the client sets a MinSettleHeight param on the vouchers, which guarantees no one can call settle before that height. The provider reads the value and can decide to update the payment channel using the voucher, or just wait, knowing more transfers might be coming. If the client doesn't call settle by that chain height, the provider can simply redeem the vouchers and settle the channel.
    • Security consideration: providers shouldn't accept vouchers with a TimelockMin value more than 12h over the MinSettleHeight, as it would mean the client can call settle and collect back their funds before the provider can redeem the vouchers.
    • Additional improvements: subscribing to chain epochs from a lotus node RPC puts too much strain and dependency on 3rd party infrastructure. Nodes should connect to a few lotus peers directly and subscribe to the gossip topic announcing new blocks. This could be a good start for enabling pushing blocks directly in the future.
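    The settlement logic above can be sketched as two small predicates; the field names and the 12h-in-epochs constant (assuming Filecoin's 30s epoch time) are illustrative:

```go
package main

import "fmt"

// epochsPer12h assumes Filecoin's 30s epoch time: 12*60*60/30 = 1440.
const epochsPer12h = 1440

// acceptVoucher rejects vouchers whose TimelockMin sits more than ~12h
// past MinSettleHeight, since the client could then settle and reclaim
// funds before the provider can redeem.
func acceptVoucher(timelockMin, minSettleHeight int64) bool {
	return timelockMin <= minSettleHeight+epochsPer12h
}

// shouldSettle tells the provider when to redeem all vouchers and settle:
// either the client already called settle, or the chain passed
// MinSettleHeight without further transfers.
func shouldSettle(currentHeight, minSettleHeight int64, clientSettled bool) bool {
	return clientSettled || currentHeight >= minSettleHeight
}

func main() {
	fmt.Println(acceptVoucher(2000, 1000), shouldSettle(900, 1000, false))
}
```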