Re: Timeouts/DAOS rendered useless when running IOR with SX/default object class


Steffen Christgau
 

Hi again,

In the meantime, we tested the 'tcp' and the 'verbs' providers.

With 'tcp' we also see the timeouts, and the DAOS system is unusable afterwards.

With 'verbs' (on an OmniPath network) we see Mercury errors about failed memory registrations:

03/29-12:36:21.95 bdaos15 DAOS[308011/308012] pool ERR src/pool/srv_pool.c:1899 transfer_map_buf() 4810a635: remote pool map buffer (4128) < required (5472)
03/29-12:36:50.65 bdaos15 DAOS[308011/308089] external ERR # HG -- error -- /builddir/build/BUILD/mercury-2.0.1rc1/src/mercury_bulk.c:846
# hg_bulk_register(): NA_Mem_register() failed (NA_PROTOCOL_ERROR)
03/29-12:36:50.65 bdaos15 DAOS[308011/308089] external ERR # HG -- error -- /builddir/build/BUILD/mercury-2.0.1rc1/src/mercury_bulk.c:762
# hg_bulk_create_na_mem_descs(): Could not register segment
03/29-12:36:50.65 bdaos15 DAOS[308011/308089] external ERR # HG -- error -- /builddir/build/BUILD/mercury-2.0.1rc1/src/mercury_bulk.c:626
# hg_bulk_create(): Could not create NA mem descriptors
03/29-12:36:50.65 bdaos15 DAOS[308011/308089] external ERR # HG -- error -- /builddir/build/BUILD/mercury-2.0.1rc1/src/mercury_bulk.c:2516
# HG_Bulk_create(): Could not create bulk handle

All providers in use are at version '111.10', on both the client and the server side.

Maybe this helps a little with further investigation.

Regards, Steffen
