Re: Client application single value KV Put high latency using multiple threads (pthread)


Lombardi, Johann
 

Hi Ping,

 

Sorry, I should have provided more details in my previous email. After switching to ofi+tcp;ofi_rxm in the config file, you will have to reformat and restart the agent, since we don’t support live provider changes yet. It would be great if you could send me the output of “daos pool autotest” with both ofi+sockets and ofi+tcp;ofi_rxm so that I can compare it with the results I have on my side with 40Gbps.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "ping.wong via groups.io" <ping.wong@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday 3 February 2021 at 08:13
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Client application single value KV Put high latency using multiple threads (pthread)

 

[Edited Message Follows]

Hi Johann,

The control interface is on 10Gbps Ethernet and the data plane interface is on 100Gbps Ethernet.
 
Per your recommendation, I tried ofi+tcp;ofi_rxm; however, the client application failed (the relevant lines are marked with ******).

 

Server1 - connected ofi+tcp;ofi_rxm

 

DEBUG 01:02:25.452378 mgmt_system.go:183: processing 1 join requests

DEBUG 01:02:25.458189 mgmt_system.go:255: updated system member: rank 0, uri ofi+tcp;ofi_rxm://11.11.200.46:31416, Joined->Joined

daos_io_server:0 DAOS I/O server (v1.1.2.1) process 215563 started on rank 0 with 4 target, 2 helper XS, firstcore 0, host test46.autocache.com.

 

Server2 - connected ofi+tcp;ofi_rxm

 

DEBUG 01:02:09.275423 raft.go:204: no known peers, aborting election:

DEBUG 01:02:09.911677 instance_drpc.go:66: DAOS I/O Server instance 0 drpc ready: uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 drpcListenerSock:"/tmp/daos_sockets/daos_io_server_28178.sock" ntgts:4

DEBUG 01:02:09.914435 system.go:155: DAOS system join request: sys:"daos_server" uuid:"e32fcef5-c6c4-491f-a25b-f21ae4d3a75f" rank:1 uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 addr:"0.0.0.0:10001" srvFaultDomain:"/test48.sdmsl.net"

DEBUG 01:02:09.915330 rpc.go:213: request hosts: [test46:10001 test48:10001 test62:10001]

daos_io_server:0 DAOS I/O server (v1.1.2.1) process 28178 started on rank 1 with 4 target, 2 helper XS, firstcore 1, host test48.sdmsl.net.

 

Client Failed

=================

DAOS Flat KV test..

=================

[==========] Running 1 test(s).

setup: creating pool, SCM size=4 GB, NVMe size=16 GB

setup: created pool a9177073-f014-477b-9ad1-5fe36d334f07

setup: connecting to pool

daos_pool_connect failed, rc: -1020                                *******************************

[  FAILED  ] GROUP SETUP

[  ERROR   ] DAOS KV API tests

state not set, likely due to group-setup issue

[==========] 0 test(s) run.

[  PASSED  ] 0 test(s).

daos_fini() failed with -1001               

 

This is part of the client log (with errors):

 

02/03-01:04:24.76 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:192 fill_sys_info() GetAttachInfo Provider: ofi+tcp;ofi_rxm, Interface: enp24s0f0, Domain: enp24s0f0,CRT_CTX_SHARE_ADDR: 0, CRT_TIMEOUT: 0

                                                                 ...

 

02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # NA -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/na/na_ofi.c:3431

 # na_ofi_addr_lookup(): Unrecognized provider type found from: sockets://11.11.200.48:31416

02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:1220

 # hg_core_addr_lookup(): Could not lookup address ofi+sockets://11.11.200.48:31416 (NA_INVALID_ARG)

02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:3850

 # HG_Core_addr_lookup2(): Could not lookup address

02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury.c:1490

 # HG_Addr_lookup2(): Could not lookup ofi+sockets://11.11.200.48:31416 (HG_INVALID_ARG) ************************************************************************************************

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1038 crt_req_hg_addr_lookup() HG_Addr_lookup2() failed. uri=ofi+sockets://11.11.200.48:31416, hg_ret=11 **********************************

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1133 crt_req_send_internal() crt_req_hg_addr_lookup() failed, rc -1020, opc: 0x1010003.

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1234 crt_req_send(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] crt_req_send_internal() failed, DER_HG(-1020): 'Transport layer mercury error'

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1580 timeout_bp_node_exit(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] exiting the timeout binheap.

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:629 crt_req_timeout_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 4.

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:1017 crt_context_req_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 3.

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_context.c:309 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] failed, DER_HG(-1020): 'Transport layer mercury error'

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:316 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] Invoking RPC callback (rank 1 tag 0) rc: DER_HG(-1020): 'Transport layer mercury error'

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:321 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 2.

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1260 crt_req_send(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 1.

02/03-01:04:32.78 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:808 dc_mgmt_get_pool_svc_ranks() a9177073: daos_rpc_send_wait() failed, DER_HG(-1020): 'Transport layer mercury error'

02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:537 crt_req_decref(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 0.

02/03-01:04:32.78 test62 DAOS[29842/29842] hg   DBUG src/cart/crt_hg.c:971 crt_hg_req_destroy(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] destroying

 

02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:537 crt_finalize() cannot finalize, current ctx_num(1).    ***********************************

02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:596 crt_finalize() crt_finalize failed, rc: -1001.

02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/event.c:147 daos_eq_lib_fini() failed to shutdown crt: DER_NO_PERM(-1001): 'Operation not permitted'

02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/init.c:267 daos_fini() failed to finalize eq: DER_NO_PERM(-1001): 'Operation not permitted' ******************
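
In the client log above, GetAttachInfo reports the provider ofi+tcp;ofi_rxm, while the failing lookups use ofi+sockets URIs for 11.11.200.48:31416. A mismatch like this can be picked out of a long log mechanically. The following is a minimal sketch, assuming log lines shaped like the ones above; the function name `find_provider_mismatches`, the regular expressions, and the abbreviated sample lines are illustrative and not part of DAOS.

```python
import re

# Provider announced to the client, e.g. "GetAttachInfo Provider: ofi+tcp;ofi_rxm,"
ATTACH_RE = re.compile(r"GetAttachInfo Provider: ([^,\s]+)")
# Fully qualified URIs such as "ofi+sockets://11.11.200.48:31416"
URI_RE = re.compile(r"(ofi\+[\w;+]+)://([\d.]+:\d+)")


def find_provider_mismatches(log_lines):
    """Return (expected_provider, mismatches).

    mismatches is a set of (provider, address) pairs for every URI seen
    after the GetAttachInfo line whose provider differs from the one
    GetAttachInfo reported.
    """
    expected = None
    mismatches = set()
    for line in log_lines:
        attach = ATTACH_RE.search(line)
        if attach:
            expected = attach.group(1)
        if expected is None:
            continue  # nothing to compare against yet
        for provider, addr in URI_RE.findall(line):
            if provider != expected:
                mismatches.add((provider, addr))
    return expected, mismatches


# Two lines abbreviated from the client log above (prefixes shortened).
sample = [
    "mgmt DBUG fill_sys_info() GetAttachInfo Provider: ofi+tcp;ofi_rxm, "
    "Interface: enp24s0f0, Domain: enp24s0f0",
    "rpc  ERR  crt_req_hg_addr_lookup() HG_Addr_lookup2() failed. "
    "uri=ofi+sockets://11.11.200.48:31416, hg_ret=11",
]
expected, stale = find_provider_mismatches(sample)
print("expected provider:", expected)
for provider, addr in sorted(stale):
    print("mismatched URI:", provider + "://" + addr)
```

On the real log this would flag rank 1's ofi+sockets address. The script only shows that the client was handed a URI with the old provider; it does not say where that URI is stored.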


I cannot find any documentation in the Deployment Guide about the ofi+tcp;ofi_rxm settings for the server side or the client side. Perhaps I missed a setting in one of the .yml files.


Thanks
Ping

 
