Re: daos_test failing with Infiniband
Oganezov, Alexander A
Hi Peter,
I saw something similar a while ago when our MPI-based applications ended up compiling against a ‘bad’ version of MPI, or more specifically an MPI that links a bad UCX (UCX provides libucs). There appears to be a bug in some UCX versions that causes this segfault (e.g. https://github.com/open-mpi/ompi/issues/6789).
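As a rough way to check whether a given binary actually pulls in UCX, the `ldd` output can be filtered for the UCX libraries (libucs, libucp, libuct, libucm). This is only a sketch: the helper name `ucx_libs` is made up for illustration, and the binary shown in the usage comment is an assumption about what is on your PATH.

```shell
# Hypothetical helper: keep only UCX-related lines from `ldd` output on stdin.
# Prints nothing (and still succeeds) when no UCX library is linked.
ucx_libs() {
    grep -E 'libuc[spmt]\.so' || true
}

# Example usage (assumes daos_test is on PATH; adjust to your install):
#   ldd "$(command -v daos_test)" | ucx_libs
```

If this prints a libucs line, the path on the right-hand side shows which UCX installation is being picked up.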
One thing to try is to check which MPIs you have installed and compile against a different one from the one you are currently using.
“module avail” will list the installed MPI packages. You can then use “module load <package>” and after that recompile DAOS via: scons -c ; scons -c install; scons MPI_PKG=any -j 12 install
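The steps above could be wrapped up roughly as follows. This is a sketch, not a tested build script: the `run` and `rebuild_daos` helper names and the `DRY_RUN` switch are my own additions for illustration, and the module name you pass in has to come from your own “module avail” output. It is meant to be sourced from the top of the DAOS source tree.

```shell
# Hypothetical wrapper: with DRY_RUN=1 only print each command, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Load the chosen MPI module, clean the previous build, then rebuild and install.
# $1 is the module name, taken from `module avail` on your system.
rebuild_daos() {
    mpi_module=$1
    run module load "$mpi_module" &&
    run scons -c &&
    run scons -c install &&
    run scons MPI_PKG=any -j 12 install
}

# Example (placeholder module name):
#   DRY_RUN=1; rebuild_daos <package>
```

Running it once with DRY_RUN=1 shows the exact commands before anything is cleaned or rebuilt.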
Let me know if this helps any.
Thanks, ~~Alex.
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Lombardi, Johann
Sent: Tuesday, December 15, 2020 12:00 AM
To: daos@daos.groups.io
Subject: Re: [daos] daos_test failing with Infiniband
I see, then maybe libucs is somehow used under the hood. Are you using the MOFED stack? Maybe you could try to reduce FI_UNIVERSE_SIZE to 512 (i.e. export FI_UNIVERSE_SIZE=512).
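A minimal sketch of both checks, to be run in the shell that launches the server and the tests. It assumes that `ofed_info`, the version tool that ships with MOFED, is on PATH when MOFED is installed; the echoed messages are only illustrative.

```shell
# Report the MOFED version if the MOFED tools are present.
if command -v ofed_info >/dev/null 2>&1; then
    ofed_info -s
else
    echo "ofed_info not found; MOFED may not be installed"
fi

# Reduce the libfabric universe size hint for processes started from this shell.
export FI_UNIVERSE_SIZE=512
echo "FI_UNIVERSE_SIZE=${FI_UNIVERSE_SIZE}"
```

The variable only affects processes started from the shell where it is exported, so it needs to be set in the same environment that launches both the server and daos_test.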
Cheers, Johann
From: <daos@daos.groups.io> on behalf of Peter <magpiesaresoawesome@...>
I have specified ofi+verbs;ofi_rxm