Re: DPI_SPACE query after extending pool
Niu, Yawei
Could you double-check whether creating a container causes the NVMe free space to drop? If it does, please open a ticket for further investigation. For now, I can’t think of why container creation would consume NVMe space.
Thanks -Niu
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Tuffli, Chuck <chuck.tuffli@...>
If the space change isn't caused by reservation, what else might be causing this? What other things might I check?
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Niu, Yawei <yawei.niu@...>
Hi, Chuck
The reserved space is per pool and is unrelated to container creation, so I don’t think the space change you observed after container creation is caused by space reservation.
FYI, we’ve just changed the space reservation a bit: the NVMe reservation has been removed from current master and 2.2, and only the SCM reservation is kept.
As for the space query, the current client space query reports only total space and free space, which I think is common practice for most systems. It could be improved in the future to report detailed usage, such as how much space is used for reservation. Thanks for the input.
Thanks -Niu
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Tuffli, Chuck <chuck.tuffli@...>
Thank you, Niu
After thinking about what you said and running some additional experiments, I believe everything is working as you described.
I created a 500 GB pool and added a POSIX container. After creating the container, the free NVMe space dropped from 470 GB to 435 GB which roughly lines up with the 2 GB reserved per NVMe drive (this pool has 16 drives).
The free space dropped to 406 GB after writing 27 GB of file data to the container. After extending the pool, the free space increased to 815 GB (roughly linear). Following WangDi's suggestion, I waited several minutes and afterwards, observed the free space climb to 841 GB. This last number matches my expectation that free space should more than double after extending the pool with an additional storage node.
Can clients query DAOS to figure out how much storage it is using itself (e.g. reserved space)? As I can see now, reporting used space based on the difference between total and free doesn't convey quite the right message to consumers of this storage.
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Niu, Yawei <yawei.niu@...>
Hi, Chuck
The “used space” (total – free) is essentially OP (over-provisioning): the DAOS server has to reserve some space on both SCM and NVMe so that punch, container/object destroy, GC, and aggregation do not fail with ENOSPACE.
The size of this system reservation is roughly: SCM: 5% of SCM total space; NVMe: 2% of NVMe total space; the minimum reservation size is 2 GB per pool target (for both SCM and NVMe). The reservation is disabled if the pool is tiny (when each pool target’s SCM and NVMe size is less than 5 GB, we regard it as a tiny pool, which is usually used for testing), so the operations mentioned above could fail on a tiny pool when it is running short of space.
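As a rough illustration of the rule of thumb above, here is a minimal Python sketch for a single storage tier. The function name, the decimal GB unit, and the way the percentage and the 2 GB-per-target minimum are combined (a plain max) are assumptions made for this example; the thread only gives the figures "roughly", and the real server logic may differ. The 16-target figure is inferred from ntarget=32 reported after the pool was doubled.

```python
GB = 10**9  # decimal gigabytes, as used for the sizes quoted in this thread


def estimate_tier_reservation(total_bytes, n_targets, percent, tiny_pool=False):
    """Rough per-tier system reservation, following the rule of thumb above.

    Assumption (not confirmed by the thread): the percentage rule and the
    2 GB-per-target minimum combine as a simple max().
    """
    if tiny_pool:
        # The thread states that reservation is disabled for tiny pools.
        return 0
    by_percent = total_bytes * percent // 100
    by_minimum = 2 * GB * n_targets
    return max(by_percent, by_minimum)


# NVMe tier of the pool discussed here, before the extend: 470 GB total.
# 16 targets is an inference from ntarget=32 after the pool was doubled.
nvme_reserved = estimate_tier_reservation(470 * GB, n_targets=16, percent=2)
print(nvme_reserved / GB)  # 32.0
```

Under these assumptions the per-target minimum (32 GB) dominates the 2% rule (9.4 GB) for a pool of this size, which is in the same range as the roughly 35 GB gap between total and free NVMe space reported in this thread.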
There is an open ticket for reducing the OP, but it’s not on our schedule yet.
Thanks -Niu
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Wang, Di <di.wang@...>
Hello Chuck
That is strange. According to the output of dmg pool query, 83 objects were deleted after the extend, so some space should be reclaimed.
“dmg pool query kiddie ……… Rebuild done, 83 objs, 0 recs”
before extend: s_total(30000021504,470000000000) s_free(29994849224,434968621056)
after extend: s_total(60000043008,940000000000) s_free(59994841512,869937672192)
Hmm, it seems SCM is OK; only the NVMe used space doubled after the extend. If you do not see the NVMe free space come back after a few minutes, you probably need to create a ticket.
Thanks WangDi
On 6/1/22, 10:56 AM, "daos@daos.groups.io on behalf of Tuffli, Chuck" <daos@daos.groups.io on behalf of chuck.tuffli@...> wrote:
Wangdi
When I checked this morning, the pool had been idle four days, but the values from daos_pool_query() have not changed.
As for object class, I'm not sure. Pool creation didn't specify a class. Here is the container query output:
# daos cont query
ERROR: daos: pool and container ID must be specified if --path not used
# daos cont query kiddie whiz
Container UUID              : 5c61770a-2b56-4922-b95b-d025fa4d0527
Container Label             : whiz
Container Type              : POSIX
Pool UUID                   : 700bf1b6-38b8-467e-9f91-7131138210ba
Number of snapshots         : 0
Latest Persistent Snapshot  : 0x0
Container redundancy factor : 0
Object Class                : UNKNOWN
Chunk Size                  : 1.0 MiB
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Wang, Di <di.wang@...>
Hello Chuck
Pool extend might migrate data to the new pool targets; the original data is then deleted asynchronously, so that space might be reclaimed a few minutes later if the system is not busy.
You should probably run your daos_pool_query() a bit later. By the way, what are the object classes in your pool?
Thanks Wangdi
On 5/25/22, 2:54 PM, "daos@daos.groups.io on behalf of Tuffli, Chuck" <daos@daos.groups.io on behalf of chuck.tuffli@...> wrote:
# dmg pool query kiddie
Pool 700bf1b6-38b8-467e-9f91-7131138210ba, ntarget=32, disabled=0, leader=0, version=18
Pool space info:
- Target(VOS) count:32
- Storage tier 0 (SCM):
  Total size: 60 GB
  Free: 60 GB, min:1.9 GB, max:1.9 GB, mean:1.9 GB
- Storage tier 1 (NVMe):
  Total size: 940 GB
  Free: 870 GB, min:27 GB, max:27 GB, mean:27 GB
Rebuild done, 83 objs, 0 recs
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nabarro, Tom <tom.nabarro@...>
Hello Chuck,
Could you please run `dmg pool query` on the pool and show the results? This will give you a bit more info on pool usage.
Regards, Tom
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Tuffli, Chuck
I've been experimenting with extending a pool but don't quite understand the results. Any insights would be most appreciated.
The cluster is running DAOS v2.0.2 and consists of a client and a pair of servers/storage nodes. To simulate adding a server to the cluster, I created a pool by specifying the ranks associated with one of the servers, i.e.:
# dmg system query --verbose
Rank UUID                                 Control Address Fault Domain State  Reason
---- ----                                 --------------- ------------ -----  ------
0    654345f9-249c-48b1-b6dc-ec08dbf2aded x.150.0.3:10001 /d006        Joined
1    b384771a-ddbc-491a-8807-8d86544d7c2f x.150.0.4:10001 /d010        Joined
2    01c672cf-3365-476f-87ec-41a15a44e946 x.150.0.4:10001 /d010        Joined
3    93a8d382-b970-408a-9c21-e01c35265e77 x.150.0.3:10001 /d006        Joined
# dmg pool create --ranks=0,3 --size=500G kiddie
I used the pool extend command to simulate adding a server:
# dmg pool extend --ranks=1,2 kiddie
My application queried the pool size before and after the extension using daos_pool_query( ... DPI_SPACE ...). The numbers below are the info.pi_space.ps_space values for (DAOS_MEDIA_SCM, DAOS_MEDIA_NVME).
before extend: s_total(30000021504,470000000000) s_free(29994849224,434968621056)
after extend: s_total(60000043008,940000000000) s_free(59994841512,869937672192)
The total pool size doubled (good), but the used space (i.e., s_total - s_free) also doubled. Naively, I expected the used space to remain the same as the pool has a redundancy factor of zero. Doing some arithmetic on the above works out to the used space being 35.036 GB before the expansion and 70.068 GB after. Note that, for the moment, I'm choosing to ignore that the used size is several orders of magnitude bigger than the data written (~600 KB).
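For readers who want to reproduce the arithmetic, the following is a small Python sketch that breaks the used space down per tier. It only reuses the (SCM, NVMe) byte counts quoted in the message; the dictionary layout and the helper name used_per_tier are illustrative choices for this example, not part of the DAOS API.

```python
# (SCM, NVMe) byte counts reported by daos_pool_query() in the message above.
before = {"s_total": (30000021504, 470000000000),
          "s_free": (29994849224, 434968621056)}
after = {"s_total": (60000043008, 940000000000),
         "s_free": (59994841512, 869937672192)}


def used_per_tier(sample):
    """Return (SCM used, NVMe used) in bytes, computed as s_total - s_free."""
    return tuple(t - f for t, f in zip(sample["s_total"], sample["s_free"]))


for label, sample in (("before", before), ("after", after)):
    scm_used, nvme_used = used_per_tier(sample)
    total_gb = (scm_used + nvme_used) / 1e9
    print(f"{label}: SCM {scm_used} B, NVMe {nvme_used} B, total {total_gb:.2f} GB")
```

Split this way, the SCM difference stays at about 5.2 MB in both samples, while the NVMe difference goes from about 35.03 GB to about 70.06 GB, so essentially all of the doubling is on the NVMe tier.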
Where did I goof in this methodology? TIA.
--chuck
|
|
Re: DPI_SPACE query after extending pool
Tuffli, Chuck
If the space change isn't caused by reservation, what else might be causing this? What other things might I check?
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Niu, Yawei <yawei.niu@...>
Sent: Wednesday, June 8, 2022 5:55 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] DPI_SPACE query after extending pool
Hi, Chuck
The reserved space is per pool and is unrelated to container creation, so I don’t think the space change you observed after container creation is caused by space reservation.
FYI, we’ve just changed the space reservation a bit: the NVMe reservation has been removed from current master and 2.2, and only the SCM reservation is kept.
As for the space query, the current client space query reports only total space and free space, which I think is common practice for most systems. It could be improved in the future to report detailed usage, such as how much space is used for reservation. Thanks for the input.
Thanks -Niu
|
|
Re: DPI_SPACE query after extending pool
Niu, Yawei
Hi, Chuck
The reserved space is per pool and is unrelated to container creation, so I don’t think the space change you observed after container creation is caused by space reservation.
FYI, we’ve just changed the space reservation a bit: the NVMe reservation has been removed from current master and 2.2, and only the SCM reservation is kept.
As for the space query, the current client space query reports only total space and free space, which I think is common practice for most systems. It could be improved in the future to report detailed usage, such as how much space is used for reservation. Thanks for the input.
Thanks -Niu
|
|
Re: DPI_SPACE query after extending pool
Tuffli, Chuck
Thank you, Niu
After thinking about what you said and running some additional experiments, I believe everything is working as you described.
I created a 500 GB pool and added a POSIX container. After creating the container, the free NVMe space dropped from 470 GB to 435 GB which roughly lines up with the 2 GB reserved per NVMe drive (this pool has 16 drives).
The free space dropped to 406 GB after writing 27 GB of file data to the container. After extending the pool, the free space increased to 815 GB (roughly linear). Following WangDi's suggestion, I waited several minutes and afterwards, observed the free space climb to 841 GB. This last number matches my expectation that free space should more than double after extending the pool with an additional storage node.
Can clients query DAOS to figure out how much storage it is using itself (e.g. reserved space)? As I can see now, reporting used space based on the difference between total and free doesn't convey quite the right message to consumers of this storage.
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Niu, Yawei <yawei.niu@...>
Sent: Wednesday, June 1, 2022 6:16 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] DPI_SPACE query after extending pool
Hi, Chuck
The “used space” (total – free) is essentially OP (over-provisioning): the DAOS server has to reserve some space on both SCM and NVMe so that punch, container/object destroy, GC, and aggregation do not fail with ENOSPACE.
The size of this system reservation is roughly: SCM: 5% of SCM total space; NVMe: 2% of NVMe total space; the minimum reservation size is 2 GB per pool target (for both SCM and NVMe). The reservation is disabled if the pool is tiny (when each pool target’s SCM and NVMe size is less than 5 GB, we regard it as a tiny pool, which is usually used for testing), so the operations mentioned above could fail on a tiny pool when it is running short of space.
There is an open ticket for reducing the OP, but it’s not on our schedule yet.
Thanks -Niu
|
|
Re: Slack Invite
Kevin Zhao
Hi Lombardi, Yes finally I got the invite, thanks!
On Wed, 1 Jun 2022 at 18:38, Lombardi, Johann <johann.lombardi@...> wrote:
--
Best Regards Kevin Zhao Tech Lead, LDCG Cloud Infrastructure Linaro Vertical Technologies IRC(freenode): kevinz Slack(kubernetes.slack.com): kevinz kevin.zhao@... | Mobile/Direct/Wechat: +86 18818270915
|
|
DAOS Community Update / June'22
Lombardi, Johann
Hi there,
Please find below the DAOS community newsletter for June 2022.
Past Events
Accelerating Data-driven Workflows with DAOS Johann Lombardi (Intel)
DAOS Next Generation Storage https://www.exascaleproject.org/event/ecp-community-bof-days-2022/ (registration is open) Kevin Harms (ANL) Mohamad Chaarawi (Intel) Johann Lombardi (Intel)
Advanced Storage and Memory Hierarchy in AI and HPC with DAOS Storage Andrey Kudryavtsev (Intel)
Advanced Storage and Memory Hierarchy in AI and HPC with DAOS Storage Andrey Kudryavtsev (Intel)
Accelerating AI with DAOS Storage Johann Lombardi (Intel)
Accelerating HPC and AI with DAOS Storage https://app.swapcard.com/widget/event/isc-high-performance-2022/planning/UGxhbm5pbmdfODYxMTYx Kevin Harms (ANL) Adrian Jackson (EPCC) Michael Hennecke (Intel) Mohamad Chaarawi (Intel) Johann Lombardi (Intel)
DAOS Features for Next Generation Platforms https://www.ixpug.org/events/isc22-ixpug-workshop Mohamad Chaarawi (Intel)
DAOS demonstration, Fireside chats, …
Upcoming Events
DAOS: Nextgen Storage Stack for HPC and AI https://sites.google.com/view/essa-2022/ Johann Lombardi (Intel)
One big happy family: sharing the S3 layer between Ceph, CORTX, and DAOS
https://iosea-project.eu/event/emoss-22-workshop/
Release
R&D
News
|
|
Re: DPI_SPACE query after extending pool
Niu, Yawei
Hi, Chuck
The “used space” (total – free) is essentially OP (over-provisioning): the DAOS server has to reserve some space on both SCM and NVMe so that punch, container/object destroy, GC, and aggregation do not fail with ENOSPACE.
The size of this system reservation is roughly: SCM: 5% of SCM total space; NVMe: 2% of NVMe total space; the minimum reservation size is 2 GB per pool target (for both SCM and NVMe). The reservation is disabled if the pool is tiny (when each pool target’s SCM and NVMe size is less than 5 GB, we regard it as a tiny pool, which is usually used for testing), so the operations mentioned above could fail on a tiny pool when it is running short of space.
There is an open ticket for reducing the OP, but it’s not on our schedule yet.
Thanks -Niu
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Wang, Di <di.wang@...> Hello Chunk
That is strange. According the output of dmg pool query, 83 objects were deleted after extend, so some space should be reclaimed.
“dmg pool query kiddie ……… Rebuild done, 83 objs, 0 recs “
before extend: s_total(30000021504,470000000000) s_free(29994849224,434968621056) after extend: s_total(60000043008,940000000000) s_free(59994841512,869937672192)
Hmm It seems SCM is ok, only NVME space are doubled after extend. If you do not see NVME free space get back after a few mins, probably need create a ticket.
Thanks WangDi
|
|
Re: DPI_SPACE query after extending pool
Wang, Di
Hello Chuck
That is strange. According to the output of dmg pool query, 83 objects were deleted after the extend, so some space should have been reclaimed.
“dmg pool query kiddie ……… Rebuild done, 83 objs, 0 recs”
before extend: s_total(30000021504,470000000000) s_free(29994849224,434968621056)
after extend: s_total(60000043008,940000000000) s_free(59994841512,869937672192)
Hmm, it seems SCM is OK; only the NVMe used space doubled after the extend. If you do not see the NVMe free space come back after a few minutes, you probably need to create a ticket.
Thanks WangDi
On 6/1/22, 10:56 AM, "daos@daos.groups.io on behalf of Tuffli, Chuck" <daos@daos.groups.io on behalf of chuck.tuffli@...> wrote:
Wangdi
When I checked this morning, the pool had been idle four days, but the values from daos_pool_query() have not changed.
As for object class, I'm not sure. Pool creation didn't specify a class. Here is the container query output:
# daos cont query
ERROR: daos: pool and container ID must be specified if --path not used
# daos cont query kiddie whiz
Container UUID : 5c61770a-2b56-4922-b95b-d025fa4d0527
Container Label : whiz
Container Type : POSIX
Pool UUID : 700bf1b6-38b8-467e-9f91-7131138210ba
Number of snapshots : 0
Latest Persistent Snapshot : 0x0
Container redundancy factor: 0
Object Class : UNKNOWN
Chunk Size : 1.0 MiB
|
|
Re: DPI_SPACE query after extending pool
Tuffli, Chuck
Wangdi
When I checked this morning, the pool had been idle four days, but the values from daos_pool_query() have not changed.
As for object class, I'm not sure. Pool creation didn't specify a class. Here is the container query output:
# daos cont query
ERROR: daos: pool and container ID must be specified if --path not used
# daos cont query kiddie whiz
Container UUID : 5c61770a-2b56-4922-b95b-d025fa4d0527
Container Label : whiz
Container Type : POSIX
Pool UUID : 700bf1b6-38b8-467e-9f91-7131138210ba
Number of snapshots : 0
Latest Persistent Snapshot : 0x0
Container redundancy factor: 0
Object Class : UNKNOWN
Chunk Size : 1.0 MiB
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Wang, Di <di.wang@...>
Sent: Wednesday, May 25, 2022 4:33 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] DPI_SPACE query after extending pool
Hello Chuck
Pool extend might migrate data to the new pool targets; the original data is then deleted asynchronously, so that space might be reclaimed a few minutes later if the system is not busy.
You should probably run your daos_pool_query() a bit later. Btw: what are the object classes in your pool?
Thanks Wangdi
|
|
Re: Slack Invite
Lombardi, Johann
Hi Kevin,
I see you on slack, so I assume that you eventually got the invite.
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of Kevin Zhao <kevin.zhao@...>
Hi DAOS,
I've followed the guide to join the DAOS mailing list, but I didn't receive the automatic invite to join Slack. Could you help by sending me the invite? Thanks!
-- Best Regards Kevin Zhao Tech Lead, LDCG Cloud Infrastructure Linaro Vertical Technologies IRC(freenode): kevinz Slack(kubernetes.slack.com): kevinz kevin.zhao@... | Mobile/Direct/Wechat: +86 18818270915
|
|
Slack Invite
Kevin Zhao
Hi DAOS,
I've followed the guide to join the DAOS mailing list, but I didn't receive the automatic invite to join Slack. Could you help by sending me the invite? Thanks!
Best Regards
Kevin Zhao
Tech Lead, LDCG Cloud Infrastructure
Linaro Vertical Technologies
IRC(freenode): kevinz
Slack(kubernetes.slack.com): kevinz
kevin.zhao@... | Mobile/Direct/Wechat: +86 18818270915
|
|
Re: mmap support with a dfuse-mounted POSIX container
Faccini, Bruno
So you may want to use mmap() without MAP_SHARED when using the DFuse "--disable-caching" option. What do you think?
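A minimal sketch of that suggestion, assuming the failure shows up as ENODEV as described in this thread: try a shared mapping first and fall back to a private one. map_file() is a hypothetical helper name, not part of DAOS or DFuse. Note that writes to a MAP_PRIVATE mapping are not carried through to the file, so the fallback only suits callers that can live with that.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map the first len bytes of fd.
 * Tries MAP_SHARED first; if the file system refuses shared mappings
 * with ENODEV, retries with MAP_PRIVATE.
 * On return *flags_used (if not NULL) holds the flag of the last attempt.
 * Returns the mapping address, or MAP_FAILED with errno set. */
void *map_file(int fd, size_t len, int prot, int *flags_used)
{
    int flags = MAP_SHARED;
    void *p = mmap(NULL, len, prot, flags, fd, 0);

    if (p == MAP_FAILED && errno == ENODEV) {
        /* Private mappings are copy-on-write: changes stay in memory. */
        flags = MAP_PRIVATE;
        p = mmap(NULL, len, prot, flags, fd, 0);
    }
    if (flags_used != NULL)
        *flags_used = flags;
    return p;
}
```

A caller can inspect *flags_used to learn whether it received a shared mapping or the private fallback, and decide whether to write data back with write() or pwrite() instead.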
From: <daos@daos.groups.io> on behalf of Bruno Faccini <bruno.faccini@...>
Ok, I can reproduce the behaviour you have reported.
Having looked at the related source code, this may come from the fact that, when caching is disabled, DFuse automatically and silently switches to direct-io mode, which may then trigger the current FUSE kernel module limitation that causes shared mmap() mappings to be unsupported (ENODEV).
|
|
Re: mmap support with a dfuse-mounted POSIX container
Faccini, Bruno
Ok, I can reproduce the behaviour you have reported.
Having looked at the related source code, this may come from the fact that, when caching is disabled, DFuse automatically and silently switches to direct-io mode, which may then trigger the current FUSE kernel module limitation that causes shared mmap() mappings to be unsupported (ENODEV).
|
|
Re: mmap support with a dfuse-mounted POSIX container
shmatsuu@...
Hi Bruno,
Thank you very much for checking! I'm using DAOS v2.0.1.
One thing I forgot to mention previously is that I mounted the POSIX container with the dfuse "--disable-caching" option and got an error from mmap. Yes, without the "--disable-caching" option, mmap succeeds with no errors in my environment as well. If you mount with the --disable-caching option in your environment, does mmap succeed?
I wanted to know whether this is expected behavior, and to evaluate the effectiveness of caching in dfuse.
Thank you very much for your help,
---
Shohei
|
|
Re: mmap support with a dfuse-mounted POSIX container
Faccini, Bruno
Well, I just tried your test case on my test bed and your program works as expected, with no error from mmap():
=============================================================================================
[bfaccini@wolf-5 ~]$ ps -ef | grep daos
bfaccini  1632  5549  0 May26 pts/0 00:00:30 daos_agent -i -o /home/bfaccini/daos_agent.yml start
bfaccini 36536  5549  0 09:35 pts/0 00:00:00 daos_server --config=/home/bfaccini/daos_server.yml start
bfaccini 36603 36536 99 09:36 pts/0 00:21:54 /home/bfaccini/daos/install/bin/daos_engine -t 8 -x 2 -g daos_server -d /var/run/daos_server -T 1 -I 0 -r 0 -H 0 -s /mnt/daos/
bfaccini 37141  5549  0 09:54 pts/0 00:00:00 grep --color=auto daos
[bfaccini@wolf-5 ~]$ ps -ef | grep dfuse
bfaccini 36714     1  0 09:39 pts/0 00:00:00 dfuse --mountpoint /mnt/dfuse/ --pool 757c8377-7bf9-4cb1-a9d0-05a86901c9dd --cont a76ad1a2-c0f1-4b40-9d8e-6a042204198f
bfaccini 37143  5549  0 09:54 pts/0 00:00:00 grep --color=auto dfuse
[bfaccini@wolf-5 ~]$ dmg pool list -i -v
Label UUID                                 SvcReps SCM Size SCM Used SCM Imbalance NVME Size NVME Used NVME Imbalance Disabled
----- ----                                 ------- -------- -------- ------------- --------- --------- -------------- --------
foo   757c8377-7bf9-4cb1-a9d0-05a86901c9dd 0       4.0 GB   21 MB    0%            0 B       0 B       0%             0/8
[bfaccini@wolf-5 ~]$ daos pool list-cont foo
UUID                                 Label
----                                 -----
a76ad1a2-c0f1-4b40-9d8e-6a042204198f container_label_not_set
[bfaccini@wolf-5 ~]$
[bfaccini@wolf-5 ~]$ cat test_mmap.c
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <string.h>

#define FILE_SIZE 4096

int main(int argc, char *argv[]) {
    int fd;
    char *map;
    size_t map_size;

    fd = open(argv[1], O_CREAT | O_RDWR, 0665);
    if(fd < 0) {
        printf("Error : can't open file\n");
        return -1;
    }

    map_size=FILE_SIZE;

    map = (char*)mmap(NULL, map_size, PROT_WRITE, MAP_SHARED, fd, 0);

    if(map == MAP_FAILED) {
        printf("Error : mmap failed\n");
        return -1;
    }
}

[bfaccini@wolf-5 ~]$ gcc -o test_mmap test_mmap.c
[bfaccini@wolf-5 ~]$
[bfaccini@wolf-5 ~]$ ls -la /mnt/dfuse
total 19075
-rwxrwxr-x 1 bfaccini bfaccini 19531936 May 28 15:18 daos_admin
[bfaccini@wolf-5 ~]$ df /mnt/dfuse/daos_admin
Filesystem 1K-blocks  Used Available Use% Mounted on
dfuse        3906272 20526   3885747   1% /mnt/dfuse
[bfaccini@wolf-5 ~]$
[bfaccini@wolf-5 ~]$ ./test_mmap /mnt/dfuse/daos_admin
[bfaccini@wolf-5 ~]$ echo $?
0
[bfaccini@wolf-5 ~]$
[bfaccini@wolf-5 ~]$ strace -o /tmp/test_mmap.strace ./test_mmap /mnt/dfuse/daos_admin
[bfaccini@wolf-5 ~]$
[bfaccini@wolf-5 ~]$ tail -10 /tmp/test_mmap.strace
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
mprotect(0x7ff27cb54000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ) = 0
mprotect(0x7ff27cd80000, 4096, PROT_READ) = 0
munmap(0x7ff27cd67000, 97934) = 0
open("/mnt/dfuse/daos_admin", O_RDWR|O_CREAT, 0665) = 3
mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, 3, 0) = 0x7ff27cd7e000
exit_group(2094522368) = ?
+++ exited with 0 +++
[bfaccini@wolf-5 ~]$
=============================================================================================
So, can you double-check your setup? Also, which DAOS version are you running?
Have a good day!
Bruno.
|
|
mmap support with a dfuse-mounted POSIX container
shmatsuu@...
Hi,
I have a very quick question about the section below on POSIX compliance. From a single DAOS client node, is mmap with MAP_SHARED supported against a file on a DAOS POSIX container that is mounted with dfuse on that client?
https://docs.daos.io/v2.0/user/filesystem/
I've tried, but I get an error return from mmap. Below is the portion of my test code; the page size is 4KB. Thanks in advance!
====
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <string.h>
#define FILE_SIZE 4096
int main(int argc, char *argv[]) {
    int fd;
    char *map;
    size_t map_size;
    /* argv[1] is the path of the file to map, e.g. a file under the dfuse mount point */
    fd = open(argv[1], O_CREAT | O_RDWR, 0665);
    if(fd < 0) {
        printf("Error : can't open file\n");
        return -1;
    }
    map_size=FILE_SIZE;
    /* request a shared, writable mapping of the first page of the file */
    map = (char*)mmap(NULL, map_size, PROT_WRITE, MAP_SHARED, fd, 0);
    if(map == MAP_FAILED) {
        printf("Error : mmap failed\n");
        return -1;
    }
    return 0;
}
===
|
|
Re: New to DAOS: Storage Pools
JACKSON Adrian
It's also possible to define a pool that uses only one particular storage class (i.e. a pool of just SCM or a pool on NVMe), but as Kevin says, it's up to you to put data in a specific pool.
cheers
adrianj
On 26/05/2022 14:59, Harms, Kevin via groups.io wrote:
--
Tel: +44 131 6506470 skype: remoteadrianj
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
|
|
Re: New to DAOS: Storage Pools
That's an important differentiating point, Kevin, thank you, and it does make a lot of sense to me.
Thank you, David
|
|
Re: New to DAOS: Storage Pools
Harms, Kevin
In my opinion, I would not say that DAOS has tiers. It does support a configuration for small writes landing in PMEM and then a later aggregation service will batch them into the NVME. However, there is no general data movement service inside DAOS that moves data between different tiers of storage. There are data mover tools, but those are run manually in order to move data in and out of DAOS.
kevin
________________________________________
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of TheCTEgroup <david@...>
Sent: Wednesday, May 25, 2022 1:12 PM
To: daos@daos.groups.io
Subject: [daos] New to DAOS: Storage Pools
As I was going through the documentation reading about the storage pools, I saw the diagram that shows storage pools 1, 2, and 3, all a combination of SCM/PMEM or NVMe, but I couldn’t find any detail on whether those pools have a tier priority, which I assume they do, or on whether they all have to be DAOS storage containers or whether one could be an archive tier via S3 (cloud perhaps?) or some NAS. I know anything is possible; I am just looking at whether it would be practical, based on front-end performance requirements balanced with long-term retention on a potentially lower-cost tier of storage. Thank you.
|
|
Re: DPI_SPACE query after extending pool
Wang, Di
Hello Chuck
Pool extend might migrate data to the new pool targets; the original data is then deleted asynchronously, so that space might be reclaimed a few minutes later if the system is not busy.
You should probably run your daos_pool_query() a bit later. Btw: what are the object classes in your pool?
Thanks Wangdi
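One way to "query a bit later" is to poll until the reported free space stops changing. The sketch below is only an illustration under stated assumptions: wait_free_stable(), free_query_fn, struct replay and replay_query() are hypothetical names, and the callback stands in for an application wrapper around daos_pool_query() that returns the free byte count of one tier. A stable value only means the number has stopped moving, not that space was actually reclaimed.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical callback type: store the current free byte count in
 * *free_bytes and return 0, or return non-zero on error. */
typedef int (*free_query_fn)(void *arg, uint64_t *free_bytes);

/* Poll until two consecutive samples agree.
 * Returns 0 when stable, 1 if max_tries polls passed without agreement,
 * or the callback's non-zero code on error. *last gets the final sample. */
int wait_free_stable(free_query_fn query, void *arg, unsigned interval_s,
                     unsigned max_tries, uint64_t *last)
{
    uint64_t prev = 0, cur = 0;
    unsigned i;
    int rc = query(arg, &prev);

    if (rc != 0)
        return rc;
    for (i = 0; i < max_tries; i++) {
        sleep(interval_s);
        rc = query(arg, &cur);
        if (rc != 0)
            return rc;
        if (cur == prev) {
            *last = cur;
            return 0;
        }
        prev = cur;
    }
    *last = prev;
    return 1;
}

/* Stand-in query for trying the loop without a DAOS system: replays a
 * fixed list of samples and repeats the last one once exhausted. */
struct replay {
    const uint64_t *samples;
    size_t n;
    size_t pos;
};

int replay_query(void *arg, uint64_t *free_bytes)
{
    struct replay *r = arg;

    if (r->n == 0)
        return -1;
    *free_bytes = r->samples[r->pos < r->n ? r->pos : r->n - 1];
    r->pos++;
    return 0;
}
```

In a real client, the callback would call daos_pool_query() with DPI_SPACE and copy out the s_free value for the tier of interest; an interval of tens of seconds is probably more appropriate than a tight loop.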
On 5/25/22, 2:54 PM, "daos@daos.groups.io on behalf of Tuffli, Chuck" <daos@daos.groups.io on behalf of chuck.tuffli@...> wrote:
# dmg pool query kiddie
Pool 700bf1b6-38b8-467e-9f91-7131138210ba, ntarget=32, disabled=0, leader=0, version=18
Pool space info:
- Target(VOS) count:32
- Storage tier 0 (SCM):
  Total size: 60 GB
  Free: 60 GB, min:1.9 GB, max:1.9 GB, mean:1.9 GB
- Storage tier 1 (NVMe):
  Total size: 940 GB
  Free: 870 GB, min:27 GB, max:27 GB, mean:27 GB
Rebuild done, 83 objs, 0 recs
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nabarro, Tom <tom.nabarro@...>
Hello Chuck,
Could you please run `dmg pool query` on the pool and show the results? This will give you a bit more info on pool usage.
Regards, Tom
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Tuffli, Chuck
I've been experimenting with extending a pool but don't quite understand the results. Any insights would be most appreciated.
The cluster is running with DAOS v2.0.2 and consists of a client and a pair of servers/storage nodes. To simulate adding a server to the cluster, I created a pool by specifying the ranks associated with one of the servers. I.e.:
# dmg system query --verbose
Rank UUID                                 Control Address Fault Domain State  Reason
---- ----                                 --------------- ------------ -----  ------
0    654345f9-249c-48b1-b6dc-ec08dbf2aded x.150.0.3:10001 /d006        Joined
1    b384771a-ddbc-491a-8807-8d86544d7c2f x.150.0.4:10001 /d010        Joined
2    01c672cf-3365-476f-87ec-41a15a44e946 x.150.0.4:10001 /d010        Joined
3    93a8d382-b970-408a-9c21-e01c35265e77 x.150.0.3:10001 /d006        Joined
# dmg pool create --ranks=0,3 --size=500G kiddie
I used the pool extend command to simulate adding a server:
# dmg pool extend --ranks=1,2 kiddie
My application queried the pool size before and after the extension using daos_pool_query( ... DPI_SPACE ...). The numbers below are the info.pi_space.ps_space values for (DAOS_MEDIA_SCM, DAOS_MEDIA_NVME).
before extend: s_total(30000021504,470000000000) s_free(29994849224,434968621056)
after extend: s_total(60000043008,940000000000) s_free(59994841512,869937672192)
The total pool size doubled (good), but the used space (i.e., s_total - s_free) also doubled. Naively, I expected the used space to remain the same, as the pool has a redundancy factor of zero. Doing some arithmetic on the above works out to the used space being 35.036 GB before the expansion and 70.068 GB after. Note that, for the moment, I'm choosing to ignore that the used size is several orders of magnitude bigger than the data written (~600 KB).
Where did I goof in this methodology? TIA.
--chuck
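The used-space arithmetic in the message above can be reproduced with a few lines of C. This is a minimal sketch: struct space_sample is a hypothetical stand-in that only mirrors the (SCM, NVMe) pairs quoted in the message, not the DAOS structure itself.

```c
#include <stdint.h>

/* Hypothetical container for one sample of the quoted values.
 * Index 0 is SCM and index 1 is NVMe, matching the order in the message. */
struct space_sample {
    uint64_t total[2];
    uint64_t free_[2];
};

/* Used bytes on one media type: total minus free. */
uint64_t used_bytes(const struct space_sample *s, int media)
{
    return s->total[media] - s->free_[media];
}

/* Used bytes summed over both media types. */
uint64_t used_total(const struct space_sample *s)
{
    return used_bytes(s, 0) + used_bytes(s, 1);
}
```

With the "before extend" values, used_total() gives 35036551224 bytes (35.036 GB); with the "after extend" values it gives 70067529304 bytes (70.068 GB). In both samples NVMe accounts for all but about 5 MB of the used space.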
|
|