High latency in metadata write


shadow_vector@...
 

Hello everyone:
  Recently, I did some testing with the DAOS array interface and found that, after the NVMe bio finishes, writing metadata and releasing resources takes a long time. The total latency of a single-channel I/O is about 80us, and vos_update_end takes about 30us (about 17us in the dkey/akey update via SCM). I thought that I/O via SCM should be very fast, but in this test it is slower than the NVMe I/O. Do the evtree operations take too long, or are there other reasons?
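
For reference, the figures quoted above can be turned into a rough breakdown. This is only a sketch of the arithmetic: it assumes the 17us dkey/akey update is part of the 30us spent in vos_update_end, which the message suggests but does not state outright, and the variable names are made up for illustration.

```python
# Figures quoted in the message above, in microseconds.
total_us = 80.0            # end-to-end latency of one single-channel I/O
vos_update_end_us = 30.0   # time spent in vos_update_end
key_update_us = 17.0       # dkey/akey update over SCM (assumed to be inside vos_update_end)


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


breakdown = {
    "vos_update_end / total": share(vos_update_end_us, total_us),
    "dkey+akey update / vos_update_end": share(key_update_us, vos_update_end_us),
    "dkey+akey update / total": share(key_update_us, total_us),
}

# Time not accounted for by the two measured pieces.
rest_of_io_us = total_us - vos_update_end_us
rest_of_update_end_us = vos_update_end_us - key_update_us

for name, pct in breakdown.items():
    print(f"{name}: {pct:.1f}%")
print(f"outside vos_update_end: {rest_of_io_us:.0f}us")
print(f"vos_update_end minus key update: {rest_of_update_end_us:.0f}us")
```

Under that assumption, a bit more than a third of the I/O time is spent in vos_update_end, and a bit more than half of that is the dkey/akey update.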

Best regards!


Lombardi, Johann
 

Hi there,

 

Could you please tell us more about your HW configuration and the test case (e.g. I/O size, read or write, object class, …)? We definitely see lower latency in our internal testing.

 

Cheers,

Johann

 


---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


shadow_vector@...
 


Here is my HW configuration:

CPU: Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz
SCM: 2 SCM, 256GB
I used the array interface in this test and set the object class to OC_RP_3GX, with 32GB configured for hugepages. The I/O size is 4K, and only write requests were sent.


Thanks for the response.


shadow_vector@...
 

Thanks for responding. Could you share the latency you see in your internal testing?

Best Regards!


Lombardi, Johann
 

Hi there,

 

I don’t think that we have ever run any benchmarks with only 2x Optane DIMMs. Maybe you could start by running vos_perf and compare to the results we have in the online documentation: https://docs.daos.io/v2.0/admin/performance_tuning/#daos_perf-vos_perf

 

Cheers,

Johann

 




shadow_vector@...
 

Hi Johann:

Thanks for the information. I have checked the latency in the online documentation: about 12us for update, much lower than in my test. But I expected SCM to be much faster, given that the 4K write latency using NVMe is about 10us. Does the SCM just resolve the WA problem?

Best Regards!


Lombardi, Johann
 

Hi,

 

A DAOS update operation actually results in several sequential latency-sensitive operations over SCM to locate the object/dkey/akey (and create the associated trees if those don’t exist already) that you want to update. Once this is resolved, the 4K data buffer will be stored in either SCM or NVMe. DAOS uses SCM internally to accelerate the sequential metadata operations.

 

Cheers,

Johann

 




shadow_vector@...
 

Hi Johann:

So you mean that although a single operation over SCM is fast, several sequential operations over SCM may result in high latency?
Here is the test on my server with vos_perf:


Using the array type, the latency is much higher (worse than in my test). Does the type "array" here refer to the array in the DAOS interface? Is there something wrong?

Another question: DAOS writes the data buffer before the SCM operation in VOS. I think the leader waits until the followers complete all of their operations and reply, and this results in higher latency. Is there some problem with my understanding?
I'm confused by the array update flow now.


Best Regards!

Jan 


Zhen, Liang
 

Hi, in the test, it writes 1K to the DAOS server, and the engine actually does the following:

  1. Search the dkey in SCM
  2. Create index for the dkey if it does not exist (b+tree stored in SCM)
  3. Do the same for akey
  4. Copy 1K data to SCM
  5. All the above writes to SCM are in the same PMDK transaction, which has its own cost.

This is the reason that VOS write latency is higher than a single SCM write. Array write latency should be roughly the same as for a single value; we will run some benchmarks and check whether there is any issue.
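
The steps above can be sketched as a toy model. This is not DAOS code: the ToyVos class, its step labels and the nested dictionaries are hypothetical stand-ins for the dkey/akey trees and the surrounding transaction, and are only meant to show that one update is a chain of dependent steps rather than a single SCM write.

```python
class ToyVos:
    """Toy stand-in for the update path; nested dicts play the role of the dkey/akey trees."""

    def __init__(self):
        self.dkeys = {}  # dkey -> {akey -> {offset: bytes}}

    def update(self, dkey, akey, offset, data):
        """Store data under dkey/akey and return the list of steps performed."""
        steps = ["begin transaction"]  # all writes below belong to one transaction

        steps.append("search dkey")
        if dkey not in self.dkeys:
            self.dkeys[dkey] = {}
            steps.append("create dkey index")

        akeys = self.dkeys[dkey]
        steps.append("search akey")
        if akey not in akeys:
            akeys[akey] = {}
            steps.append("create akey index")

        akeys[akey][offset] = bytes(data)
        steps.append("copy data")

        steps.append("commit transaction")
        return steps


vos = ToyVos()
first = vos.update("dkey0", "akey0", 0, b"x" * 1024)      # keys do not exist yet
second = vos.update("dkey0", "akey0", 1024, b"y" * 1024)  # keys already exist

print(len(first), first)
print(len(second), second)
```

In this toy model the first write to a new dkey/akey pair goes through more steps than a later write to the same keys, because the indexes only have to be created once. It says nothing about how long each step takes on real hardware.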

 

Liang

 



shadow_vector@...
 

Hi Liang:

   Thank you for the attention and the explanation. I understand the reason now. So I think the metadata write in the array update interface would take less time than in the earlier vos_perf test, because less data would be written to SCM. Is there something wrong with my understanding? Looking forward to the benchmark result.

Best Regards!


shadow_vector@...
 

Hi Liang:
 
  Are there any results from the array write test? Is there something wrong with my test?


Best Regards!