Checklist
- [X] This is not a security-related bug/issue. If it is, please follow the security policy.
- [X] This is not a question or a support request. If you have any lotus-related questions, please ask in the lotus forum.
- [X] This is not a new feature request. If it is, please file a feature request instead.
- [X] This is not an enhancement request. If it is, please file an improvement suggestion instead.
- [X] I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
- [X] I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
- [X] I did not make any code changes to lotus.
Lotus component
- [ ] lotus daemon - chain sync
- [ ] lotus miner - mining and block production
- [X] lotus miner/worker - sealing
- [ ] lotus miner - proving (WindowPoSt)
- [ ] lotus miner/market - storage deal
- [ ] lotus miner/market - retrieval deal
- [ ] lotus miner/market - data transfer
- [ ] lotus client
- [ ] lotus JSON-RPC API
- [ ] lotus message management (mpool)
- [ ] Other
Lotus Version
`lotus-miner version`:

```
Daemon: 1.16.0-rc3+mainnet+git.824da5ea5+api1.5.0
Local: lotus-miner version 1.16.0-rc3+mainnet+git.824da5ea5
```
`lotus-worker info`:

```
Worker version: 1.6.0
CLI version: lotus-worker version 1.16.0-rc3+mainnet+git.824da5ea5
Session: c4a57112-1d67-481d-8806-fde3a9adf1b5
Enabled: true
Hostname: sealer
CPUs: 128; GPUs: [GeForce RTX 3090]
RAM: 222.2 GiB/251.6 GiB; Swap: 201.4 GiB/526 GiB
Task types: FIN GET FRU C1 C2 PC2 PC1 PR1 PR2 RU AP DC GSK
536385c7-a3d1-479f-affc-9961eb10c052:
	Weight: 10; Use: Seal
	Local: /seal2/worker
c45a6c01-b823-4813-92b9-fa2841ea3523:
	Weight: 10; Use: Seal
	Local: /seal/worker
```
Describe the Bug
The scheduler is very inefficient now. My sealing worker has 256 GB of RAM and used to run 4xPC1, or 3xPC1 plus other tasks (AP, GET, PC2, or C2), concurrently.
Since the upgrade, the worker runs 4xPC1 and then, when one of the PC1s completes, a single C2 and nothing else. That C2 now takes 60 minutes instead of 12. Once all of the PC1s finish, the worker will still only handle a single C2 or PC2, while other tasks such as AP and PC1 queue up.
Notice in the logging section below that the scheduler claims there are not enough threads, even though only 6 of the 128 threads are in use (Threadripper 3990X):
```
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
```
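The arithmetic in that message already looks wrong: the scheduler reports needing 0 threads against a target of 0, so a worker with anything in flight can never pass the check. Below is a minimal sketch of the kind of budget comparison that would produce exactly this behavior; the function name, signature, and logic are my assumptions for illustration, not the actual lotus code in `sched_resources.go`.

```go
package main

import "fmt"

// canHandleRequest is an illustrative sketch (an assumption, not the real
// lotus implementation) of a per-worker thread-budget check: refuse a task
// when the threads already in use plus the task's requirement exceed the
// target budget. If a resource lookup yields need=0 and target=0, as in the
// log line above, then 6+0 > 0 and every assignment is rejected, even with
// 122 of 128 hardware threads idle.
func canHandleRequest(needThreads, inUse, target int) bool {
	if inUse+needThreads > target {
		fmt.Printf("sched: not enough threads, need %d, %d in use, target %d\n",
			needThreads, inUse, target)
		return false
	}
	return true
}

func main() {
	canHandleRequest(0, 6, 0) // mirrors the observed log line
}
```

If the real check behaves like this, the bug would be in whatever computes the zero target for this worker, not in the comparison itself.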
`lotus-miner sealing jobs`:
ID | Sector | Worker | Hostname | Task | State | Time
-- | -- | -- | -- | -- | -- | --
7bc24c52 | 4546 | c4a57112 | sealer | PC2 | running | 3m31.7s
00000000 | 4547 | c4a57112 | sealer | PC2 | prepared | 3m31.6s
00000000 | 4548 | c4a57112 | sealer | PC2 | assigned(1) | 32m30.6s
00000000 | 4549 | c4a57112 | sealer | PC2 | assigned(2) | 4h31m28.5s
Boost GUI tasks waiting
Start | Deal ID | Size | Client | State
-- | -- | -- | -- | --
41m | 45acb7b1… | 29 GiB | f144zep4… | Adding to Sector
2h | 21413f38… | 29.8 GiB | f144zep4… | Adding to Sector
2h | 9925c70e… | 25.9 GiB | f144zep4… | Adding to Sector
3h | 215a35ed… | 29.3 GiB | f144zep4… | Adding to Sector
5h | 8d4c13e6… | 31.6 GiB | f144zep4… | Adding to Sector
6h | d7ec5bfe… | 30.8 GiB | f144zep4… | Adding to Sector
7h | 4785e707… | 349.3 MiB | f3vnq2cm… | Adding to Sector
7h | aa651bc3… | 422.3 MiB | f3vnq2cm… | Announcing
8h | 15b60750… | 22.5 GiB | f144zep4… | Sealer: PreCommit1
9h | ab5cba7a… | 438.2 MiB | f3vnq2cm… | Sealer: AddPiece
9h | 95605188… | 288.4 MiB | f3vnq2cm… | Sealer: AddPiece
10h | bab58273… | 31 GiB | f144zep4… | Sealer: PreCommit2
12h | 2dc49072… | 31.3 GiB | f144zep4… | Sealer: PreCommit2
15h | c54738cb… | 30.7 GiB | f144zep4… | Sealer: PreCommit2
15h | e05f6f04… | 31.4 GiB | f144zep4… | Sealer: PreCommit2
15h | 4a220393… | 23.6 GiB | f144zep4… | Sealer: PreCommit2
16h | a8bbbd74… | 31.5 GiB | f144zep4… | Sealer: PreCommit2
18h | 7b2449bd… | 20.7 GiB | f144zep4… | Sealer: WaitSeed
19h | 2a068d3f… | 29 GiB | f144zep4… | Sealer: WaitSeed
`lotus-miner sealing sched-diag`:
```json
{
  "CallToWork": {
    "1278-4545-8fe67f8d-4dea-4796-98f3-783e9f57c638": "seal/v0/precommit/2(8e6dd3fb4aec651f18b989424e1260d577661ddc0231f36046840a6519dd59d9)"
  },
  "EarlyRet": [
    "1278-4543-1e25e027-3c22-43e0-aef3-09944b9204e0"
  ],
  "ReturnedWork": null,
  "SchedInfo": {
    "OpenWindows": [
      "2f7504d6-3059-4cc5-93b6-0df5ef990431",
      "2f7504d6-3059-4cc5-93b6-0df5ef990431"
    ],
    "Requests": [
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4549 }, "TaskType": "seal/v0/precommit/2" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4550 }, "TaskType": "seal/v0/precommit/2" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4552 }, "TaskType": "seal/v0/precommit/2" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4553 }, "TaskType": "seal/v0/precommit/1" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4551 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4554 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4555 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4556 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4557 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4558 }, "TaskType": "seal/v0/addpiece" },
      { "Priority": 1024, "Sector": { "Miner": 1278, "Number": 4559 }, "TaskType": "seal/v0/addpiece" }
    ]
  },
  "Waiting": [
    "seal/v0/precommit/2(8e6dd3fb4aec651f18b989424e1260d577661ddc0231f36046840a6519dd59d9)"
  ]
}
```
Logging Information
```
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:458 SCHED Acceptable win: [[2] [2] [2] [2] [2] [2] [2] [2] [2] [2] [2]]
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:0 sector 4549 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:521 SCHED ASSIGNED {"sqi": 0, "sector": "4549", "task": "seal/v0/precommit/2", "window": 2, "worker": "c4a57112-1d67-481d-8806-fde3a9adf1b5", "utilization": 3}
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:1 sector 4550 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:2 sector 4552 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:3 sector 4553 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:4 sector 4551 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:5 sector 4554 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:6 sector 4555 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:7 sector 4556 to window 2 (awi:0)
2022-06-26T01:33:08.597Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:8 sector 4557 to window 2 (awi:0)
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:9 sector 4558 to window 2 (awi:0)
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched.go:480 SCHED try assign sqi:10 sector 4559 to window 2 (awi:0)
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:98 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for schedAssign; not enough threads, need 0, 6 in use, target 0
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:104 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for compactWindows; GPU(s) in use
2022-06-26T01:33:08.598Z DEBUG advmgr sector-storage/sched_resources.go:104 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for startPreparing; GPU(s) in use
2022-06-26T01:33:08.615Z INFO sectors storage-sealing/states_sealing.go:413 submitting precommit for sector 4545 (deposit: 110136904108491213):
2022-06-26T01:33:08.655Z DEBUG advmgr sector-storage/sched_resources.go:104 sched: not scheduling on worker c4a57112-1d67-481d-8806-fde3a9adf1b5 for withResources; GPU(s) in use
```
Repo Steps
- Update the lotus miner and worker to 1.16.0-rc3 from a pre-1.16 release
- Queue several sealing tasks (PC1/PC2/C2/AP) on a single worker that has plenty of free CPU threads and RAM
- Run `lotus-miner sealing jobs` and observe tasks stuck in `prepared`/`assigned` while the worker executes only one task at a time
- With debug logging enabled for the scheduler, see the error `not enough threads, need 0, 6 in use, target 0`