Re: Timeouts/DAOS rendered useless when running IOR with SX/default object class
Rosenzweig, Joel B <joel.b.rosenzweig@...>
Sure thing. Unless you say otherwise, I’m planning to submit it against 1.2 and 2.0 branches.
https://github.com/daos-stack/daos/pull/5246
From: Lombardi, Johann <johann.lombardi@...>
Sent: Tuesday, March 30, 2021 3:19 PM
To: daos@daos.groups.io; Rosenzweig, Joel B <joel.b.rosenzweig@...>
Subject: Re: [daos] Timeouts/DAOS rendered useless when running IOR with SX/default object class
Hi Steffen,
Good catch! It sounds like we need to add a “LimitNOFILE” entry to our daos_server’s systemd unit file. @Rosenzweig, Joel B could you please take care of this? Thanks in advance.
Cheers, Johann
From: <daos@daos.groups.io> on behalf of Steffen Christgau <christgau@...>
A final "Hi" on that topic,
we have discovered the reason for the issue: the open-file ulimit on the _server_ side was too low, and it differs between regular users and daemons like the DAOS server. For the latter it was set to soft 1024/hard 4096. We raised both the soft and the hard limit to 50000 by modifying the service/unit file. With that, we did multiple IOR runs with up to 48 processes and the SX object class from a single client node without any errors.
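For anyone who wants to check what limit a running daemon actually has (as opposed to what `ulimit -n` reports in a user shell), here is a minimal Python sketch. It assumes a Linux host, where `/proc/<pid>/limits` contains a "Max open files" line; the function names are made up for illustration and are not part of DAOS.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the text of /proc/<pid>/limits.

    Values are kept as strings because the kernel may report "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: "Max", "open", "files", <soft>, <hard>, "files"
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' line found")


def read_process_nofile(pid):
    """Read the open-file limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())
```

Calling `read_process_nofile(pid)` with the PID of the server process shows the limits the daemon really runs with, which is where the soft 1024/hard 4096 values would show up.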
We noted that the coredump and memlock limits are already "increased" in the server's unit file. Maybe it is a good idea to increase the file limit as well by default, although the limit may depend on the provider in use.
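As a rough sketch of such a change, a systemd drop-in along the following lines raises the limit without editing the packaged unit file. The unit name `daos_server.service` and the drop-in file name are assumptions here, and 50000 is simply the value that worked in our runs, not a recommended default.

```ini
# Assumed location: /etc/systemd/system/daos_server.service.d/limits.conf
# (unit name and file name are assumptions, adjust to your installation)
[Service]
# A single value sets both the soft and the hard limit.
LimitNOFILE=50000
```

After adding the file, `systemctl daemon-reload` followed by a restart of the service is needed for the new limit to take effect.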
Regards, Steffen