Re: Cannot start DAOS - failure in hwloc


Jordan Henderson
 

Hi Johann,

This is on Slackware Linux (https://en.wikipedia.org/wiki/Slackware). It's still actively maintained, but it's a very old distribution, so more modern components are often missing. In any case, thank you for the quick fix!

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann via groups.io <johann.lombardi@...>
Sent: Tuesday, May 5, 2020 1:56 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Cannot start DAOS - failure in hwloc
 

Thanks for the report Jordan. I have just pushed a fix: https://github.com/daos-stack/daos/pull/2625

No, I’ve never seen this fail. Could you please tell me which distribution you are running?

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Jordan Henderson <jhenderson@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 5 May 2020 at 02:24
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Cannot start DAOS - failure in hwloc

 

Hi all,

 

I recently updated my out-of-date DAOS installation to commit a7f093db2fc96aa1dc20cc8c293d44274474ef62. (I'm sure this wasn't the commit that changed this behavior, but it's good to have for reference.) Since the update, I can no longer start the DAOS I/O server because of a failure in hwloc_set_membind (https://github.com/daos-stack/daos/blob/master/src/iosrv/srv.c#L392).

 

I get the following lines in my log:

 

05/04-18:52:41.20 Talos DAOS[14193/14197] server ERR  src/iosrv/srv.c:395 dss_srv_handler() failed to set memory affinity: 38
05/04-18:52:41.20 Talos DAOS[14193/14193] server ERR  src/iosrv/init.c:521 server_init() DAOS cannot be initialized using the configured path (/mnt/daos_fs).   Please ensure it is on a PMDK compatible file system and writeable by the current user.

 

which I think corresponds to a return value of ENOSYS from hwloc_set_membind, probably due to some form of missing support for the type of memory binding requested. Has the DAOS team seen this issue on other machines? Removing the call from the DAOS source allows me to start the server once again, but of course that's just a temporary workaround for now.
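For reference, here is a minimal sketch of how the "38" in the log can be decoded. On Linux, errno 38 is ENOSYS ("Function not implemented"). The helper names membind_unsupported and describe_errno are hypothetical and are not part of DAOS or hwloc; the sketch only assumes that the logged number is the errno value left behind by the failed hwloc_set_membind call.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a DAOS or hwloc function): returns true when
 * a saved errno value means the requested operation is not implemented
 * on this system.
 */
static bool membind_unsupported(int saved_errno)
{
    return saved_errno == ENOSYS;
}

/*
 * Hypothetical helper: formats "<code> (<message>)" into buf so that a
 * log line carries the human-readable meaning as well as the number.
 * Returns buf for convenience.
 */
static const char *describe_errno(int saved_errno, char *buf, size_t len)
{
    snprintf(buf, len, "%d (%s)", saved_errno, strerror(saved_errno));
    return buf;
}
```

A caller could save errno right after the failing call, pass it to describe_errno for the log message, and use membind_unsupported to decide whether to warn and continue without memory binding rather than abort startup. Whether that is the right policy for the I/O server is a design decision for the DAOS developers.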

 

Thanks in advance!
