
Huge latency observed by DAOS client


k.patwardhan@...
 

Hello DAOS community,

I have a rather strange observation when working with DAOS. The DAOS client reports much higher latency than I anticipated, and I am wondering whether this is a known problem.

The attached diagram depicts my test setup.

1. The client issues a single Put request (16-byte key, 4 KB value) to the DAOS server by invoking daos_kv_put().

2. The DAOS server (master) responds to the Put request in about 0.28 msec.

3. However, the client reports an overall latency of 1 msec. Client latency is the time taken by daos_kv_put() to complete.
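For reference, client-side latency of this kind is typically measured by reading a monotonic clock immediately before and after the blocking call. The following is a minimal sketch of that measurement, not the code used in the test above. The names `blocking_op_fn`, `time_blocking_op` and `fake_put` are hypothetical, and `fake_put` merely sleeps for about 1 msec so that the sketch runs without a DAOS installation. In a real client the callback would wrap the daos_kv_put() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the blocking operation under test. In a real
 * client this would wrap the daos_kv_put() call. */
typedef int (*blocking_op_fn)(void *arg);

/* Nanoseconds from a monotonic clock, unaffected by wall-clock adjustments. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Run op once and store its latency in *elapsed_ns. Returns op's own rc. */
static int time_blocking_op(blocking_op_fn op, void *arg, uint64_t *elapsed_ns)
{
    uint64_t start = now_ns();
    int rc = op(arg);

    *elapsed_ns = now_ns() - start;
    return rc;
}

/* Dummy operation (an assumption, for illustration only): sleeps ~1 msec. */
static int fake_put(void *arg)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 };

    (void)arg;
    nanosleep(&delay, NULL);
    return 0;
}
```

Calling `time_blocking_op(fake_put, NULL, &ns)` fills `ns` with the elapsed time in nanoseconds. Repeating the measurement many times and looking at the distribution, rather than a single sample, helps separate one-off start-up costs from steady-state latency.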

 

Observations:

1. Sometimes daos_kv_put() returns before the dc_rw_cb() callback is called. Is that normal?

2. Often dc_rw_cb() gets called quickly, but the client does some work related to the Mercury module before returning from daos_kv_put(). Any idea what the client is doing even after dc_rw_cb() has been called?

               2.1 My observation above (the server responding in 0.28 msec but the client returning from daos_kv_put() after 1 msec) relates to this case. I believe dc_rw_cb() gets called in response to the DAOS server finishing the Put request, but the client still does some work that I don't understand.

The client seems to call some Mercury functions between the time the server responds and the time daos_kv_put() returns. Is that expected? Also, any idea what is going on in between that increases the overall latency from 0.28 msec to 1 msec?

 

Thanks in advance and looking forward to root causing the problem soon.

 

Regards,

Kedar


Lombardi, Johann
 

Hi Kedar,

 

I assume that you are passing event = NULL to daos_kv_put() and only have one operation in flight, right?

In the “normal” case (i.e. no membership change, no restart), the bulk of the work has been done once dc_rw_cb() completes, and daos_kv_put() should return shortly after that. Only some minor clean-ups (e.g. freeing allocated memory) remain, and this should definitely not take 0.72 ms. Could you please profile the application and find out where the bulk of the time is spent?
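One lightweight way to see where the time goes, short of a full profiler, is to record labelled timestamps at a few points and print the gaps between them. The sketch below is an illustration under assumptions, not part of DAOS: `phase_timer`, `phase_mark`, `phase_delta_ns` and `phase_report` are hypothetical helpers. Where exactly to place the marks in the client (for example before the put, at the point dc_rw_cb() runs, and after daos_kv_put() returns) is left to the person instrumenting the code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define MAX_MARKS 16

/* One labelled timestamp; the labels are whatever the caller chooses. */
struct mark {
    const char *label;
    uint64_t    t_ns;
};

struct phase_timer {
    struct mark marks[MAX_MARKS];
    int         count;
};

/* Nanoseconds from a monotonic clock. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Record a timestamp. Returns 0 on success, -1 if the buffer is full. */
static int phase_mark(struct phase_timer *pt, const char *label)
{
    if (pt->count >= MAX_MARKS)
        return -1;
    pt->marks[pt->count].label = label;
    pt->marks[pt->count].t_ns = now_ns();
    pt->count++;
    return 0;
}

/* Time between mark i-1 and mark i in nanoseconds (0 if i is out of range). */
static uint64_t phase_delta_ns(const struct phase_timer *pt, int i)
{
    if (i <= 0 || i >= pt->count)
        return 0;
    return pt->marks[i].t_ns - pt->marks[i - 1].t_ns;
}

/* Print one line per phase: "<from> -> <to>: <microseconds> usec". */
static void phase_report(const struct phase_timer *pt, FILE *out)
{
    for (int i = 1; i < pt->count; i++)
        fprintf(out, "%s -> %s: %.3f usec\n",
                pt->marks[i - 1].label, pt->marks[i].label,
                (double)phase_delta_ns(pt, i) / 1000.0);
}
```

A caller would zero-initialise a `struct phase_timer`, call `phase_mark()` at each point of interest, and call `phase_report(&pt, stderr)` once the operation has returned. Keeping the marks in memory and printing only at the end avoids adding I/O cost to the path being measured.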

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "k.patwardhan via groups.io" <k.patwardhan@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Saturday 6 February 2021 at 01:37
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Huge latency observed by DAOS client

 

