|
Re: dmg pool operation stuck
Did you manage to get the engine log with DD_MASK=all? That will give us more information about why the engine is not completing start-up (and why you don’t have any joined ranks reported by "dmg
|
By
Nabarro, Tom
·
#1499
·
|
|
Re: dmg pool operation stuck
Hi Tom,
I noticed an error in the engine log.
DAOS[11610/11614] bio DBUG src/bio/bio_xstream.c:662 load_blobstore() load blobstore failed -1025
Is it because of this? And what does "-1025" mean?
Some
|
By
Allen
·
#1498
·
|
|
Re: dmg pool operation stuck
Hi Tom,
Please see the attachment. The rerun prints the following message in the terminal.
daos_debug@sw2:~$ dmg -i storage format
Format Summary:
Hosts SCM Devices NVMe Devices
----- -----------
|
By
Allen
·
#1497
·
|
|
Re: dmg pool operation stuck
Hi Tom,
I'm very sorry, it was my mistake. I set the entry in "/etc/hosts" incorrectly. It should be "172.20.148.244", but I wrote it as "127.20.148.244".
|
By
Allen
·
#1496
·
|
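The typo reported in #1496 is easy to miss because all of 127.0.0.0/8 is loopback space, so the mistyped entry still resolves but points back at the local host instead of the intended interface. A minimal sketch using Python's standard `ipaddress` module to flag such entries; the helper name `find_loopback_entries` and the sample hosts text are illustrative assumptions, not part of DAOS:

```python
import ipaddress

def find_loopback_entries(hosts_text):
    """Return (address, names) pairs from /etc/hosts-style text whose address is loopback."""
    flagged = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            # Not a parseable IP address; skip the line.
            continue
        if addr.is_loopback:
            flagged.append((str(addr), fields[1:]))
    return flagged

# Hypothetical hosts content reproducing the typo from the thread.
sample = """
127.0.0.1 localhost
127.20.148.244 sw2
"""

for address, names in find_loopback_entries(sample):
    print(address, names)
```

Any hostname other than `localhost` that shows up in the flagged list is worth a second look.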
|
Re: dmg pool operation stuck
The engine is actually not starting. In the server log you should see a message containing "…started on rank 0", and then if you run "dmg system query [--verbose]" it should report at least one
|
By
Nabarro, Tom
·
#1493
·
|
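The check described in #1493 (looking for "started on rank" in the server log) can be scripted. This is only a sketch: the full format of the log line is not shown in the thread, so the pattern matches just the quoted fragment, and the function name `started_ranks` is a hypothetical helper:

```python
import re

# Matches only the fragment quoted in the thread; the rest of the log line is unknown.
RANK_STARTED = re.compile(r"started on rank (\d+)")

def started_ranks(log_lines):
    """Return the rank numbers found in lines containing 'started on rank <N>'."""
    ranks = []
    for line in log_lines:
        match = RANK_STARTED.search(line)
        if match:
            ranks.append(int(match.group(1)))
    return ranks

def started_ranks_in_file(path):
    """Scan a server log file; the path comes from your own server configuration."""
    with open(path, errors="replace") as log:
        return started_ranks(log)
```

An empty result suggests the engine never completed start-up, which matches the symptom of no joined ranks.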
|
Re: dmg pool operation stuck
This is unusual; normally the name resolution works.
After talking to a colleague (Mike), we suspect that the IsLocalAddr() test is failing to match the AP address with a local address, with the
|
By
Nabarro, Tom
·
#1492
·
|
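To illustrate the kind of mismatch suspected in #1492: if the access point hostname resolves to an address that is not assigned to any local interface, a comparison against the local addresses fails. The sketch below is not DAOS's IsLocalAddr() and makes no claim about how it works; `matches_local_address` and the list of local addresses are assumptions for illustration, using the two addresses mentioned elsewhere in this thread:

```python
import ipaddress

def matches_local_address(resolved, local_addresses):
    """Return True if the resolved address equals one of the given local addresses."""
    target = ipaddress.ip_address(resolved)
    return any(target == ipaddress.ip_address(addr) for addr in local_addresses)

# Assumed interface addresses for the host in the thread.
local = ["127.0.0.1", "172.20.148.244"]

print(matches_local_address("172.20.148.244", local))  # intended address
print(matches_local_address("127.20.148.244", local))  # mistyped address
```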
|
Re: dmg pool operation stuck
Hi Tom,
A new question.
$ dmg -i pool create -z 100GB
Creating DAOS pool with automatic storage allocation: 100 GB NVMe + 6.00% SCM
ERROR: dmg: pool create failed: rpc error: code = Unknown desc =
|
By
Allen
·
#1491
·
|
|
Re: dmg pool operation stuck
Hi Tom,
I think I know why it timed out when creating the pool. I had set 'access_points:' in daos_server.yml to my hostname 'sw2'; it should be set to 'localhost'. If it is set to hostname,
|
By
Allen
·
#1490
·
|
|
Re: Where should I to look for release notes?
Thanks Michael,
I checked the DUG and the roadmap; it seems the self-healing feature is delivered in release 1.2. I would like to confirm the status of self-healing since the README.md in
|
By
cheneydeng@...
·
#1489
·
|
|
Re: dmg pool operation stuck
Hi Tom,
The same issue still exists after setting nr_xs_helpers: 0.
$ ps aux|grep daos_engine | grep -v grep
daos_de+ 5301 394 0.1 135622300 771552 pts/0 RLl+ 11:45 5:43
|
By
Allen
·
#1488
·
|
|
Re: Where should I to look for release notes?
Hi,
For roadmap and feature updates, you can check out the slides and recordings from the DAOS User Group in November (http://dug.daos.io).
The release notes and other documentation will be
|
By
Hennecke, Michael
·
#1487
·
|
|
Re: dmg pool operation stuck
Can you try with nr_xs_helpers: 0 in the config, please? You will need to reformat.
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of allen.zhuo@...
Sent: Wednesday, December 1, 2021 10:40
|
By
Nabarro, Tom
·
#1486
·
|
|
Re: dmg pool operation stuck
Hi Tom,
Please see the attachment.
|
By
Allen
·
#1485
·
|
|
Re: dmg pool operation stuck
Hello,
The format is completing and the engine process is being spawned. Now we need to look at the engine log, which is specified in the server config file (consult the admin guide for more
|
By
Nabarro, Tom
·
#1483
·
|
|
Where should I to look for release notes?
Hi DAOS,
I found that the release notes on GitHub are very brief. I can't find a place that shows the progress of the roadmap, so I can't figure out whether the features I need are implemented.
|
By
cheneydeng@...
·
#1482
·
|
|
Re: dmg pool operation stuck
On 01/12/2021 at 10:23, allen.zhuo@... wrote:
Hi Allen,
With my Docker integration, I only have 1 daos_server with 1 HDD.
ps aux|grep -i daos
|
By
PATEYRON Sacha
·
#1481
·
|
|
Re: dmg pool operation stuck
Hi Tom,
I added some debugging code. I found that dmg pool create failed because the Database replicaAddr.get() returned <nil>, so the Database CheckReplica returned an error.
// CheckReplica returns an
|
By
Allen
·
#1480
·
|
|
Re: dmg pool operation stuck
Hi Tom,
The same issue still exists after changing the hugepagesize to 2MB.
When running dmg pool create, daos_server did not print any message. I think this is abnormal. So, can we add some debugging code?
$
|
By
Allen
·
#1478
·
|
|
Re: dmg pool operation stuck
On 30/11/2021 at 12:06, Nabarro, Tom wrote:
Hi,
In case this helps:
Test OK in Docker Ubuntu 20.04.
sysctl -a | grep vm.nr_hugepages
|
By
PATEYRON Sacha
·
#1477
·
|
|
Re: dmg pool operation stuck
Can you please try with 2M hugepagesize? I’m not sure we have much test coverage using 1G hugepages, and there may be some built-in assumptions that might cause problems if using them.
|
By
Nabarro, Tom
·
#1476
·
|
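To confirm which hugepage size is in effect before retrying, as discussed in #1476 to #1478, the values can be read from `/proc/meminfo` on Linux. A minimal sketch; the function names are hypothetical, and only the standard `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields are read:

```python
def parse_hugepage_info(meminfo_text):
    """Extract hugepage counters and the default hugepage size (in kB) from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        if key in ("HugePages_Total", "HugePages_Free"):
            info[key] = int(rest.split()[0])
        elif key == "Hugepagesize":
            # Reported in kB: 2048 for 2M pages, 1048576 for 1G pages.
            info["Hugepagesize_kB"] = int(rest.split()[0])
    return info

def read_hugepage_info(path="/proc/meminfo"):
    """Read and parse the live values on a Linux host."""
    with open(path) as meminfo:
        return parse_hugepage_info(meminfo.read())
```

A `Hugepagesize_kB` of 2048 corresponds to the 2M setting recommended in the thread.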