Filecoin sector recover

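Sector Recovery

During sealing or mining, a Filecoin sector can lose its data. When that happens, either the FIL pledged at PreCommit is burned, or the sector has to be terminated, at a cost of up to 90 days of that sector's earnings. Sector recovery rebuilds the lost files to reduce or avoid these losses.

Causes of sector loss

1. Storage disk failure

To cut sealing costs and stay competitive, miners often store sectors directly on bare disks without redundancy. A 16T disk holds more than 130 32GiB sectors. If one disk fails and its data cannot be recovered, those sectors must be terminated, at a maximum loss of 90 days of the network-average earnings per sector.

2. NVMe cache disk failure

In this case, a sector in either of the following two states incurs a loss:

  • The sector has submitted its PreCommit message but does not submit the ProveCommit message within 30 days, so the FIL pledged at PreCommit is burned;
  • With FinalizeEarly=false, ProveCommit is submitted before the sector is moved to storage, so losing the cache is equivalent to losing the sector, and the sector must be terminated.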
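
The 30-day ProveCommit window can be expressed as an epoch budget. The sketch below is only an illustration: the function name and the example epochs are hypothetical, it assumes 30-second epochs (2880 per day), and it uses the 30-day figure quoted above without any extra protocol-level grace period.

```go
package main

import "fmt"

const (
	// One epoch every 30 seconds gives 2880 epochs per day.
	epochsPerDay = 2880
	// Assumption: the plain 30-day window from the text, with no grace period added.
	proveCommitWindow = 30 * epochsPerDay
)

// epochsLeft returns how many epochs remain before the ProveCommit deadline
// for a sector whose PreCommit landed at preCommitEpoch.
// A result of zero or less means the window has closed.
func epochsLeft(preCommitEpoch, currentEpoch int64) int64 {
	return preCommitEpoch + proveCommitWindow - currentEpoch
}

func main() {
	// Hypothetical epochs, chosen only for illustration.
	left := epochsLeft(1000000, 1010000)
	fmt.Printf("%d epochs (about %.1f days) left to recover and ProveCommit\n",
		left, float64(left)/epochsPerDay)
}
```

A recovery run only helps in the first case if it finishes while this budget is still positive.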

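How sector recovery works

From the causes above, every sector that needs recovery has already submitted its PreCommit message. Once its data is lost, the only option is to reassemble the inputs of the original sealing and seal the sector again.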

p1o, err := ffi.SealPreCommitPhase1(
    sector.ProofType,
    paths.Cache,
    paths.Unsealed,
    paths.Sealed,
    sector.ID.Number,
    sector.ID.Miner,
    ticket,
    pieces,
)

p2...

To reseal a CC sector (one without deals), pieces can be generated the official default way; the only additional inputs needed are ticket and ProofType.
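
As a rough illustration of what the default pieces look like for a CC sector, the sketch below computes the size of the single filler piece. It assumes that the default generation is one zero-filled piece spanning the whole sector (an assumption about lotus behaviour, not something stated here) and relies on the Fr32 padding rule that 1 byte in every 128 is overhead. The function name is hypothetical.

```go
package main

import "fmt"

// unpaddedPieceSize returns the number of raw bytes that fit into a sector of
// the given padded size: Fr32 padding leaves 127 usable bytes in every 128.
func unpaddedPieceSize(sectorSize uint64) uint64 {
	return sectorSize - sectorSize/128
}

func main() {
	const gib = 1 << 30
	for _, s := range []uint64{32 * gib, 64 * gib} {
		fmt.Printf("sector %d GiB -> one filler piece of %d unpadded bytes\n",
			s/gib, unpaddedPieceSize(s))
	}
}
```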

This is the PreCommit message body recorded on chain:

{
  "RegisteredProof": 9,
  "SectorNumber": 1322006,
  "SealedCid": "bagboea4b5abcasqanjadumno7blvgx4k5pk765cki6vurnpgs2q3trt2trkznhj3",
  "SealRandEpoch": 925221,
  "DealIds": [],
  "Expiration": 2480426,
  "ReplaceCapacity": false,
  "ReplaceSectorDeadline": 0,
  "ReplaceSectorPartition": 0,
  "ReplaceSector": 0
}

ProofType differs across network versions and sector sizes (see the code for details). Taking RegisteredProof directly from the message body is more convenient.
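
A minimal sketch of pulling the recovery inputs out of such a message body. The struct and function names are hypothetical, and the JSON shape is assumed to match the example above exactly (in particular, SealedCid as a plain string); a real lotus API response may encode these fields differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// preCommitInfo mirrors only the fields of the message body that recovery needs.
type preCommitInfo struct {
	RegisteredProof int64
	SectorNumber    uint64
	SealedCid       string
	SealRandEpoch   int64
	DealIds         []uint64
}

// parsePreCommit decodes the JSON form of a PreCommit message body.
func parsePreCommit(raw []byte) (preCommitInfo, error) {
	var info preCommitInfo
	err := json.Unmarshal(raw, &info)
	return info, err
}

func main() {
	raw := []byte(`{
	  "RegisteredProof": 9,
	  "SectorNumber": 1322006,
	  "SealedCid": "bagboea4b5abcasqanjadumno7blvgx4k5pk765cki6vurnpgs2q3trt2trkznhj3",
	  "SealRandEpoch": 925221,
	  "DealIds": []
	}`)
	info, err := parsePreCommit(raw)
	if err != nil {
		panic(err)
	}
	// An empty DealIds list marks a CC sector, the only kind handled so far.
	fmt.Println("proof:", info.RegisteredProof, "ticket epoch:", info.SealRandEpoch,
		"CC sector:", len(info.DealIds) == 0)
}
```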

Using SealRandEpoch, the epoch of the sector's ticket, fetch the randomness from the chain node again:

ticket, err := fullNodeApi.ChainGetRandomnessFromTickets(ctx, ts.Key(), crypto.DomainSeparationTag_SealRandomness, ticketEpoch, buf.Bytes())
if err != nil {
    return nil, nil, err
}

Compare the SealedCid in the on-chain PreCommit message body with the storage.SectorCids computed by the recovery program's PreCommit2. If the two CIDs match, the recovery succeeded.

Go

To build filecoin-sealer-recover, you need Go 1.16.4 or higher installed:

wget -c https://golang.org/dl/go1.16.4.linux-amd64.tar.gz -O - | sudo tar -xz -C /usr/local

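Build

The build downloads a number of Go modules. These are usually hosted on GitHub, and bandwidth to GitHub from China is low. To work around this, set the following variable before building so that a local proxy is used: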

export GOPROXY=https://goproxy.cn,direct  

Build and install

make clean all

sudo make install

Usage

help

sealer-recover -h

Run:

export FULLNODE_API_INFO="<token of the chain node>"
sealer-recover --miner=f01000 \
    --sectorNum=0 \
    --sector-size=32GiB \
    --sealing-result=/sector \
    --sealing-temp=/temp

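Parameters

| Parameter | Description | Notes |
| --- | --- | --- |
| miner | Miner ID of the sector to recover | Required |
| sectorNum | Number of the sector to recover | Required |
| sector-size | Size of the sector to recover | Default: 32GiB |
| sealing-result | Output path for the recovered sector | Default: ~/sector |
| sealing-temp | Path for intermediate files produced during recovery; needs a lot of space, an NVMe disk is recommended | Default: ~/temp. Minimum free space: more than 512GiB for a 32GiB sector, more than 1024GiB for a 64GiB sector |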
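
The minimum-space requirement for sealing-temp can be checked before a run starts. The sketch below only encodes the two documented thresholds; the function names are hypothetical, and obtaining the actual free space of the target path is left to the caller.

```go
package main

import "fmt"

const gib = uint64(1) << 30

// minTempBytes returns the documented lower bound for sealing-temp free space.
func minTempBytes(sectorSize string) (uint64, error) {
	switch sectorSize {
	case "32GiB":
		return 512 * gib, nil
	case "64GiB":
		return 1024 * gib, nil
	default:
		return 0, fmt.Errorf("no documented minimum for sector size %q", sectorSize)
	}
}

// hasEnoughTempSpace reports whether freeBytes strictly exceeds the minimum,
// matching the "more than" wording of the requirement.
func hasEnoughTempSpace(sectorSize string, freeBytes uint64) (bool, error) {
	need, err := minTempBytes(sectorSize)
	if err != nil {
		return false, err
	}
	return freeBytes > need, nil
}

func main() {
	// 600GiB free is a hypothetical measurement of the sealing-temp disk.
	ok, err := hasEnoughTempSpace("32GiB", 600*gib)
	fmt.Println(ok, err)
}
```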

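Optimization

Building against your own modified lotus can speed up recovery.

TODO

  • Support sectors with deals: look up each deal by its on-chain deal ID and regenerate pieces.
  • Run recovery for multiple sectors in parallel batches.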

Contributing

PRs, bug reports, and suggestions are welcome! For major changes, please open an issue first so that compatibility and benefits can be discussed.

Other

License

Licensed under Apache 2.0

Owner
FrogHub
The most professional filecoin solution provider.
Comments
  • Unable to export information

    What causes the following error? Thanks.

    2022-01-18T18:28:54.794+0800 ERROR export export/export.go:76 Getting sector (4996) precommit info error: Getting sector PreCommit info err:%!(EXTRA *jsonrpc.respError=(get sset) failed to load miner actor: load state tree: failed to load state tree bafy2bzaceals5hf2qsm5luniq73dg3hl3psvrytzzflscrzl3ftjjdqelh5lo: failed to load hamt node: blockstore: block not found)

  • PreCommitInfo error

    Calling StateSectorPreCommitInfo fails. This sector went on chain several months ago, and both OnChain and Active currently show Yes.

    EXTRA *jsonrpc.respError=(get sset) failed to load miner actor: load state tree: failed to load state tree bafy2bzacec3cvflydfyy64uifinqyrtt6rjovnlxuhp7c7r6ifddojgc7ujdi: failed to load hamt node: blockstore: block not found

    I suspect the data synced by the daemon was pruned, so blocks from several months ago can no longer be found, but I am not sure.

  • Which kind of lotus daemon can be used with filecoin-sealer-recover?

    I wonder if sealer-recover should be connected to just any lotus daemon (empty isolated node without wallets), or exactly to that daemon that served the miner affected by sectors' corruption?

  • The sealer-recover process is always killed by the system

    time="2021-11-16T09:44:56+08:00" level=info msg="Start sealer recovery!"
    time="2021-11-16T09:44:56+08:00" level=info msg="Start recover sector(1132569,165), registeredSealProof: 8, ticket: 968633efae73dfed06237073ec1d19ff538380a449323ac1ce49ae4602ad10ce"
    time="2021-11-16T09:44:56+08:00" level=info msg="Start running AP, sector (165)"
    2021-11-16T09:45:53.664 INFO filcrypto::proofs::api > generate_data_commitment: start
    2021-11-16T09:45:53.664 INFO filecoin_proofs::api::seal > compute_comm_d:start
    2021-11-16T09:45:53.664 INFO filecoin_proofs::pieces > verifying 8192 pieces
    2021-11-16T09:45:53.666 INFO filecoin_proofs::api::seal > compute_comm_d:finish
    2021-11-16T09:45:53.666 INFO filcrypto::proofs::api > generate_data_commitment: finish
    time="2021-11-16T09:45:53+08:00" level=info msg="Complete AP, sector (165)"
    time="2021-11-16T09:45:53+08:00" level=info msg="Start running PreCommit1, sector (165)"
    2021-11-16T09:45:53.667 INFO filcrypto::proofs::api > seal_pre_commit_phase1: start
    2021-11-16T09:45:53.667 INFO filecoin_proofs::api::seal > seal_pre_commit_phase1:start: SectorId(165)
    2021-11-16T09:46:20.258 INFO filecoin_proofs::api::seal > building merkle tree for the original data
    2021-11-16T09:47:16.556 INFO filecoin_proofs::api::seal > verifying pieces
    2021-11-16T09:47:16.556 INFO filecoin_proofs::pieces > verifying 1 pieces
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase1
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::graph > using parent_cache[2048 / 1073741824]
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: opening /nvme0n1/filecoin-proof-parent/v28-sdr-parent-21981246c370f9d76c7a77ab273d94bde0ceb4e938292334960bce05585dc117.cache, verify enabled: false
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::proof > single core replication
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::create_label::single > generate labels
    2021-11-16T09:47:16.556 INFO storage_proofs_porep::stacked::vanilla::create_label::single > generating layer: 1

    The log ends here, and then the sealer-recover process is killed by the system. The command used:

    nohup sealer-recover --miner=f01133568 --sectors=165 --sectors=174 --sealing-result=/nvme1n1/recover --sealing-temp=/nvme1n1/recover/tmp > log.txt &
    echo $! > service.pid

  • Can recover support two Nvidia Card?

    https://github.com/froghub-io/filecoin-sealer-recover/blob/3ba9e489a92c431815ffd827e57e3874a92bc36d/recovery/recover.go#L248

    According to code below, It seems P2Lock is single, So only one P2 work to do at once?

    can two card run two P2 work at same time?

  • Where did the old command format go?

    It was easy and convenient to work in the old command format. Why was it necessary to split the sector recovery into two operations (data export and recovery itself)?

  • Getting sector (8684) precommit info error

    I encounter this issue:

    $ sealer-recover export --miner=f01178141 8684
    2022-03-04T11:32:14.749+0800 ERROR export export/export.go:76 Getting sector (8684) precommit info error: Getting sector PreCommit info err:%!(EXTRA *jsonrpc.respError=(get sset) failed to load miner actor: load state tree: failed to load state tree bafy2bzacea2hh3exmmcxj2vppvo2t4t7suagjhrdpt6mv3aimbqpd6viw57dc: failed to load hamt node: blockstore: block not found)
    export 0 sectors, failt sectors: [] , elapsed: 32.386442ms

    how to solve?

  • Sector (No.) , running PreCommit2  error: sealed cid mismatching!!!

    The following error occurs on some recovered sectors:

    2021-12-25T12:02:35.621 INFO storage_proofs_core::data > dropping data /media/sn750/recover-57853956948449/sealed/s-t01222595-5785
    2021-12-25T12:02:38.745 INFO filecoin_proofs::api::seal > seal_pre_commit_phase2:finish
    2021-12-25T12:02:38.745 INFO filcrypto::proofs::api > seal_pre_commit_phase2: finish
    INFO[2021-12-25T12:02:38+03:00] Complete PreCommit2, sector ({1222595 5785})
    ERRO[2021-12-25T12:02:38+03:00] Sector (5785) , running PreCommit2 error: sealed cid mismatching!!! (sealedCID: bagboea4b5abcb42beroeboobypwj2xvsg2ryc5xttathfol3tre3cioidvpjyhb5, newSealedCID: bagboea4b5abcbtgdq542zczhlmvif7ca5brg374gg5oo5hxfbc6xyjgiuazrpuao)
    INFO[2021-12-25T12:05:20+03:00] Complete sector (5785)

    What causes it and how to fix it?

  • Successfully restored and moved to storage, notifying miner to declare the sector.

    If the sector is restored successfully, the miner will be notified to declare the sector as soon as possible, then the failure time of this sector can be used. Need to use MINER_API_INFO to quickly declare the new location of the sector by calling the interface StorageDeclareSector.
