Huge latency observed by DAOS client


KP (Kedar) Patwardhan <k.patwardhan@...>
 

Hello DAOS community,

 

 

I have a rather strange observation when working with DAOS. The DAOS client reports much higher latency than I anticipated, and I am wondering whether this is a known problem.

 

 

The diagram above depicts my test setup.

 

1. The client issues a single Put request (16-byte key, 4 KB value) to the DAOS server by invoking daos_kv_put().

2. The DAOS server (master) responds to the Put request in about 0.28 msec.

3. However, the client reports an overall latency of 1 msec. Client latency is the time taken by daos_kv_put() to complete.
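For reference, the client-side latency in step 3 is the wall-clock time around one blocking call. A minimal sketch of that kind of measurement follows, assuming a monotonic clock is used. The names timed_op_fn, elapsed_ms, time_op_ms and fake_put are hypothetical helpers for illustration and are not part of the DAOS API; in the real test the wrapped operation would be the daos_kv_put() call.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type standing in for the operation under test
 * (in the real test this would wrap the blocking daos_kv_put() call). */
typedef int (*timed_op_fn)(void *arg);

/* Elapsed time between two clock samples, in milliseconds. */
static double elapsed_ms(const struct timespec *start,
                         const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1e3 +
           (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Run op(arg) once, store its wall-clock latency in *ms_out,
 * and return whatever op returned. */
static int time_op_ms(timed_op_fn op, void *arg, double *ms_out)
{
    struct timespec t0, t1;
    int rc;

    assert(op != NULL && ms_out != NULL);

    /* CLOCK_MONOTONIC is not affected by wall-clock adjustments. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = op(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *ms_out = elapsed_ms(&t0, &t1);
    return rc;
}

/* Placeholder operation (assumption): does nothing and reports success. */
static int fake_put(void *arg)
{
    (void)arg;
    return 0;
}
```

Calling time_op_ms(fake_put, NULL, &ms) fills ms with the elapsed milliseconds; replacing fake_put with a wrapper around the real Put gives the client-side number quoted above.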

 

Observations:

1. Sometimes daos_kv_put() returns before the dc_rw_cb() callback is called. Is that normal?

2. Often dc_rw_cb() gets called quickly, but the client does some work related to the Mercury module before returning from daos_kv_put(). Any idea what the client is doing after dc_rw_cb() has been called?

               2.1 My observation above, of the server responding in 0.28 msec but the client returning from daos_kv_put() after 1 msec, relates to this case. I believe dc_rw_cb() gets called in response to the DAOS server finishing the Put request, but the client still does some work afterwards and I don’t understand why.

 

The client seems to call some Mercury functions between the time the server responds to it and the time daos_kv_put() returns. Is that expected? Also, any idea what is going on in between that increases the overall latency from 0.28 msec to 1 msec?
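To put a number on the gap, here is a small sketch of the arithmetic, assuming the 0.28 msec server figure and the 1 msec client figure come from comparable clocks. The struct and function names are hypothetical. The remainder lumps together everything outside the server's handling of the request, so it may include network time as well as client-side processing.

```c
/* Hypothetical breakdown of one Put's end-to-end latency. */
struct latency_split {
    double server_ms;     /* time the server took to respond */
    double remainder_ms;  /* everything else seen by the client */
    double remainder_pct; /* remainder as a share of the total */
};

/* Split total client-observed latency into server time and remainder.
 * Assumes total_ms > 0 and server_ms <= total_ms. */
static struct latency_split split_latency(double total_ms, double server_ms)
{
    struct latency_split s;

    s.server_ms = server_ms;
    s.remainder_ms = total_ms - server_ms;
    s.remainder_pct = 100.0 * s.remainder_ms / total_ms;
    return s;
}
```

With the numbers above, split_latency(1.0, 0.28) leaves about 0.72 msec, roughly 72% of the total, unaccounted for by the server.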

 

Thanks in advance. I look forward to root-causing the problem soon.

 

Regards,

Kedar
