
Re: Question about Pool Size Expansion

Harms, Kevin
 

Shohei,

You can add additional servers; see this section of the manual:

https://docs.daos.io/admin/pool_operations/#pool-extension
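For reference, the extension is driven by `dmg pool extend`; the sketch below only assembles the command for review rather than running it (the pool UUID and rank numbers are made up for illustration — substitute your own, and check `dmg pool extend --help` for the exact flags in your DAOS version):

```shell
# Hypothetical pool UUID and the ranks of the newly added servers.
POOL_UUID="9f8c7a6b-1234-4cde-9abc-0123456789ab"
NEW_RANKS="4,5"

# Build the dmg invocation so it can be inspected before it is run.
CMD="dmg pool extend --pool=$POOL_UUID --ranks=$NEW_RANKS"
echo "$CMD"
```

See the linked page for how existing data is redistributed onto the new ranks after the extension.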

kevin

________________________________________
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of shmatsuu@yahoo-corp.jp <shmatsuu@yahoo-corp.jp>
Sent: Saturday, August 21, 2021 9:47 AM
To: daos@daos.groups.io
Subject: [daos] Question about Pool Size Expansion

Hi,

I have a question about expanding the size of an existing pool online in DAOS v1.2. Is there any way to do it?

Thank you,
---
Shohei


unable to configure raft service; no dRPC client set (data plane not started?)

21960347@...
 

Hello DAOS developers and users

I am trying to start a DAOS server with real SCM and NVMe devices.

After formatting the storage devices, the server complains: "instance 0 exited: failed to start system db: unable to configure raft service: invalid database" and "no dRPC client set (data plane not started?)".

I've pasted the daos_server.log and the .yaml below. Thanks!

localhost.localdomain INFO 2021/08/23 16:17:11 DAOS Control Server v1.2 (pid 60280) listening on 0.0.0.0:10001
localhost.localdomain INFO 2021/08/23 16:17:11 Checking DAOS I/O Engine instance 0 storage ...
localhost.localdomain INFO 2021/08/23 16:17:13 Metadata format required on instance 0
localhost.localdomain INFO 2021/08/23 16:17:57 Formatting scm storage for DAOS I/O Engine instance 0 (reformat: true)
localhost.localdomain INFO 2021/08/23 16:17:57 Instance 0: starting format of SCM (dcpm:/mnt/daos1)
localhost.localdomain INFO 2021/08/23 16:18:00 Instance 0: finished format of SCM (dcpm:/mnt/daos1)
localhost.localdomain INFO 2021/08/23 16:18:00 Formatting nvme storage for DAOS I/O Engine instance 0
localhost.localdomain INFO 2021/08/23 16:18:00 Instance 0: starting format of nvme block devices [0000:87:00.0 0000:88:00.0 0000:da:00.0 0000:db:00.0]
localhost.localdomain INFO 2021/08/23 16:18:08 Instance 0: finished format of nvme block devices [0000:87:00.0 0000:88:00.0 0000:da:00.0 0000:db:00.0]
localhost.localdomain INFO 2021/08/23 16:18:08 DAOS I/O Engine instance 0 storage ready
localhost.localdomain INFO 2021/08/23 16:18:08 instance 0 exited: failed to start system db: unable to configure raft service: invalid database
localhost.localdomain ERROR 2021/08/23 16:18:08 removing socket file: removing instance 0 socket file: no dRPC client set (data plane not started?)
localhost.localdomain INFO 2021/08/23 16:18:08 &&& RAS EVENT id: [engine_status_down] ts: [2021-08-23T16:18:08.508477+0800] host: [localhost.localdomain] type: [STATE_CHANGE] sev: [ERROR] msg: [DAOS rank exited unexpectedly] pid: [60280]





# For a single-server system
 
name: daos_server
access_points: ['localhost']
provider: ofi+sockets
nr_hugepages: 4096
control_log_file: /tmp/daos_server.log
transport_config:
   allow_insecure: true


engines:
-
  targets: 1
  pinned_numa_node: 0
  nr_xs_helpers: 0
  fabric_iface: eno1
  fabric_iface_port: 31416
  log_file: /tmp/daos_engine.0.log
 
  env_vars:
  - DAOS_MD_CAP=1024
  - CRT_CTX_SHARE_ADDR=0
  - CRT_TIMEOUT=30
  - FI_SOCKETS_MAX_CONN_RETRY=1
  - FI_SOCKETS_CONN_TIMEOUT=2000

  scm_mount: /mnt/daos1 # map to -s /mnt/daos
  scm_class: dcpm
  scm_list: [/dev/pmem0]
 
  bdev_class: nvme
  bdev_list: ["0000:87:00.0","0000:88:00.0","0000:da:00.0","0000:db:00.0"]





 
 
 
 


Questions about Daos consistency

段世博
 

  DAOS uses two-phase commit to ensure consistency between replicas. According to "src/vos/readme.md", read requests can be sent to any replica, but if that replica server is behind a network partition (i.e., the client uses an old pool map to access the stale replica server), then the latest committed data cannot be seen. How does DAOS avoid this situation?

thanks~


Question about Pool Size Expansion

shmatsuu@...
 

Hi,

I have a question about expanding the size of an existing pool online in DAOS v1.2. Is there any way to do it?

Thank you,
---
Shohei


Re: Increasing FIO performance

Lombardi, Johann
 

Hi Eeheet,

 

You can use the fio DAOS engine (https://daos-stack.github.io/devbranch/admin/performance_tuning/#fio).

 

If you want to stick to dfuse, then I would advise using the posixaio engine instead of pvsync. pvsync is synchronous/blocking and won’t be able to submit more than one I/O at a time (in the “IO depths” section of your fio output, you have “1 = 100%”). Please also note that the FUSE kernel module takes a per-file mutex on every write (but not on read), so all writes to a file are effectively serialized. I haven’t checked whether this has been improved in recent Linux kernels, though.
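Concretely, the posted command with only the engine swapped might look like this (a sketch; every flag other than --ioengine is taken from the original post, and the command is only assembled here, not executed):

```shell
# Same fio job as in the post, with pvsync replaced by posixaio so that
# --iodepth=16 can actually keep multiple I/Os in flight.
FIO_CMD="fio --name=random-write --ioengine=posixaio --rw=randwrite \
--bs=4k --size=128M --nrfiles=4 --directory=/tmp/daos_test1 \
--numjobs=8 --iodepth=16 --runtime=60 --time_based --direct=1 \
--buffered=0 --randrepeat=0 --norandommap --refill_buffers"
echo "$FIO_CMD"
```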

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Hayer, Eeheet" <eeheet.hayer@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 27 July 2021 at 17:57
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Increasing FIO performance

 

Hi,

 

I’m an intern on the HPC team, and I’m trying to increase the performance of FIO on my server (wolf-169), admin/client (wolf-57). I attached a screencap of what I’m currently getting and am told I should be getting much higher numbers.

 

My server has 2 SSDs, and some non-optane pmem.

 

The command I’m running:

$ /usr/bin/fio --name=random-write --ioengine=pvsync --rw=randwrite --bs=4k --size=128M --nrfiles=4 --directory=/tmp/daos_test1 --numjobs=8 --iodepth=16 --runtime=60 --time_based --direct=1 --buffered=0 --randrepeat=0 --norandommap --refill_buffers

 

 

 

-Eeheet

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: Question about DFS

Lombardi, Johann
 

Hi,

 

The DFS layer does not track openers, so the other client will be able to proceed and delete the file. The original opener will thus suddenly see an empty file.

Please note that if the two processes run on the same client node and use the same dfuse mountpoint, then open-unlink files will work.
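For readers unfamiliar with the term, the open-unlink behaviour in question is the usual POSIX one, shown below on a plain local filesystem (the sketch only illustrates the semantics; as noted above, DFS preserves it only for processes sharing a single dfuse mountpoint):

```shell
# An open descriptor keeps a file's data readable after it is unlinked.
tmp=$(mktemp)
printf 'still here\n' > "$tmp"
exec 3< "$tmp"          # one process opens the file and holds fd 3
rm "$tmp"               # another process unlinks it
IFS= read -r line <&3   # the open descriptor still sees the data
exec 3<&-
echo "$line"            # prints: still here
```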

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of 段世博 <duanshibo.d@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday 2 August 2021 at 07:02
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Question about DFS

 

[Edited Message Follows]

While a file is opened by one client, can other clients delete the file? If it cannot be deleted, how is this guaranteed?

Thanks~



Increasing FIO performance

Hayer, Eeheet <eeheet.hayer@...>
 

Hi,

 

I’m an intern on the HPC team, and I’m trying to increase the performance of FIO on my server (wolf-169), admin/client (wolf-57). I attached a screencap of what I’m currently getting and am told I should be getting much higher numbers.

 

My server has 2 SSDs, and some non-optane pmem.

 

The command I’m running:

$ /usr/bin/fio --name=random-write --ioengine=pvsync --rw=randwrite --bs=4k --size=128M --nrfiles=4 --directory=/tmp/daos_test1 --numjobs=8 --iodepth=16 --runtime=60 --time_based --direct=1 --buffered=0 --randrepeat=0 --norandommap --refill_buffers

 

 

 

-Eeheet


Question about DFS

段世博
 

While a file is opened by one client, can other clients delete the file? If it cannot be deleted, how is this guaranteed?

Thanks~


Re: FIO Results & Running IO500

Lombardi, Johann
 

Hi Peter,

 

A few things to try/explore:

  • I don’t think that we have ever tested with 2x pmem DIMMs per socket. Maybe you could try with dram instead of pmem to see whether the performance increases.
  • 16x targets might be too much for 2x pmem DIMMs. You could try to reduce it to 8x targets and set “nr_xs_helpers” to 0.
  • It sounds like you run the benchmark (fio, IO500) and the DAOS engine on the same node. There might be interference between the two. You could try changing the affinity of the benchmark so it runs on CPU cores not used by the DAOS engine (e.g. with taskset(1) or mpirun arguments).
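For the last point, taskset is the simplest tool; the sketch below pins a placeholder command to core 0 purely to show the mechanics (the core numbers are illustrative — choose cores outside the set the engine is pinned to):

```shell
# Pin a command to a chosen CPU core; in practice, replace `echo` with the
# fio or IO500 invocation and use a core range the DAOS engine does not use,
# e.g. `taskset -c 16-31 fio ...`.
taskset -c 0 echo "benchmark runs here"
```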

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Peter <magpiesaresoawesome@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 13 July 2021 at 03:16
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] FIO Results & Running IO500

 

Hello again,

I've tried some more things to improve these results: different DAOS versions (including the YUM repo), different MPI versions, different DAOS configurations, etc.

I'm still unable to diagnose the issue; IOPS performance remains low for both FIO and IO500.

Does anyone have any input on how I can try to debug or resolve this issue?

Thanks for your help.



Re: Creating a POSIX container first and opening it inside my code

Lombardi, Johann
 

Hi Lipeng,

 

Assuming that you are using 1.2, you can create a POSIX container with the daos utility:

$ daos cont create --pool <POOL_UUID> --type POSIX

 

You get back a container UUID that you can then pass to daos_cont_open() to open the container and then mount with dfs_mount() (see https://github.com/daos-stack/daos/blob/master/src/include/daos_fs.h#L121).

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "wanl via groups.io" <wanl@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 6 July 2021 at 17:00
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Creating a POSIX container first and opening it inside my code

 

Hi,

How to create a POSIX container first by calling "daos cont create" and then open it inside my code using "daos_cont_open()"? Should I pass some environment variables into my code?

Thanks,
Lipeng

 



Re: Update to privileged helper

Nabarro, Tom
 

Try running the utils/setup_daos_admin.sh script with sudo.

 

The admin binary should be moved to /usr/bin/ with the setuid bit set.

The executable bit should be removed from install/bin/daos_admin.

Both of these steps are performed by the script.
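In shell terms, the script's effect is roughly the following (a sketch assuming a source-tree layout with the binary at install/bin/daos_admin; the privileged steps are shown as comments since they need root, and the setuid-bit mechanics are demonstrated on a scratch file instead):

```shell
# What utils/setup_daos_admin.sh does, approximately (requires root):
#   cp install/bin/daos_admin /usr/bin/daos_admin
#   chown root:root /usr/bin/daos_admin
#   chmod 4755 /usr/bin/daos_admin      # setuid root
#   chmod a-x install/bin/daos_admin
# The setuid bit itself, demonstrated without root on a scratch file:
f=$(mktemp)               # created with mode 0600
chmod u+s,a+x "$f"        # add execute bits and the setuid bit
stat -c '%a' "$f"         # prints 4711 on GNU stat
```

Note that the setuid bit only grants root privileges when the file is owned by root, which is why the chown (itself requiring root) cannot be skipped.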

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Ethan Mallove
Sent: Tuesday, July 13, 2021 9:24 PM
To: daos@daos.groups.io
Subject: Re: [daos] Update to privileged helper

 

If you are running DAOS from source, and you want to run as a non-root user, then you will need to perform some manual setup steps on every server in order to ensure that the privileged helper has the correct permissions in order to perform privileged tasks

What are the manual steps? I tried using setuid and checking the immutable bit, but I still get an error that the privileged helper (daos_admin) does not have root permissions, e.g.:

# chmod u+s install/bin/daos_admin

$ ls -ltrd install/bin/daos_admin

-rwsrwxr-x 1 emallovx emallovx 21838880 Jul 13 18:54 install/bin/daos_admin

# chown -R root:root install/bin/daos_admin

 

chown: changing ownership of ‘install/bin/daos_admin’: Operation not permitted

$ lsattr daos_admin

lsattr: Inappropriate ioctl for device While reading flags on daos_admin


Regards,
Ethan


Re: Update to privileged helper

Ethan Mallove
 

If you are running DAOS from source, and you want to run as a non-root user, then you will need to perform some manual setup steps on every server in order to ensure that the privileged helper has the correct permissions in order to perform privileged tasks

What are the manual steps? I tried using setuid and checking the immutable bit, but I still get an error that the privileged helper (daos_admin) does not have root permissions, e.g.:

# chmod u+s install/bin/daos_admin

$ ls -ltrd install/bin/daos_admin

-rwsrwxr-x 1 emallovx emallovx 21838880 Jul 13 18:54 install/bin/daos_admin

# chown -R root:root install/bin/daos_admin

 

chown: changing ownership of ‘install/bin/daos_admin’: Operation not permitted

$ lsattr daos_admin

lsattr: Inappropriate ioctl for device While reading flags on daos_admin


Regards,
Ethan


Re: FIO Results & Running IO500

Peter
 

Thank you for the reply,

Yes, I have mounted the Optane modules as an ext4 filesystem; a quick FIO test is able to achieve > 1 MIOPS.

My thought was that it would be network related; however, I've run some MPI benchmarks and the scores line up with EDR InfiniBand.
Also, the 12.4 GB/s we get in the FIO sequential-read test shows we can do better than local-only performance.

The documentation mentions daos_test and daos_perf; are these still supported?


Re: FIO Results & Running IO500

JACKSON Adrian
 

Hi,

Have you tried benchmarking the hardware directly, rather than through
DAOS? I.e., running some benchmarks just on an ext4 filesystem mounted on
the Optane on a single node, just to check that it gives you the expected
performance.
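For example, something along these lines against the pmem-backed mount (a sketch; the mountpoint /mnt/pmem0 and the job parameters are assumptions, and the command is only assembled here, not executed):

```shell
# Baseline 4k random-write run directly on the ext4-on-Optane mount,
# bypassing DAOS entirely (mountpoint and sizes are illustrative).
BASELINE="fio --name=pmem-baseline --ioengine=libaio --direct=1 \
--rw=randwrite --bs=4k --iodepth=16 --numjobs=8 --size=1G \
--directory=/mnt/pmem0 --runtime=60 --time_based"
echo "$BASELINE"
```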

cheers

adrianj
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.


Re: FIO Results & Running IO500

Peter
 

Hello again,

I've tried some more things to improve these results: different DAOS versions (including the YUM repo), different MPI versions, different DAOS configurations, etc.

I'm still unable to diagnose the issue; IOPS performance remains low for both FIO and IO500.

Does anyone have any input on how I can try to debug or resolve this issue?

Thanks for your help.


Creating a POSIX container first and opening it inside my code

wanl@...
 

Hi,

How to create a POSIX container first by calling "daos cont create" and then open it inside my code using "daos_cont_open()"? Should I pass some environment variables into my code?

Thanks,
Lipeng

 


Re: Questions about DFS

Chaarawi, Mohamad
 

Hi,

 

  1. Permission checks when using the DFS API are done only on the pool and container ACLs, during dfs_mount().
  2. The DAOS/DFS client does not cache any data on the client side, so if you look up a full path every time, you will do the path traversal for each lookup. In DAOS master, we have added a new API (dfs_sys) where a DFS mount can be done with a caching option that caches the looked-up paths of parent directories to avoid such overhead: https://github.com/daos-stack/daos/blob/master/src/include/daos_fs_sys.h#L49

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of 段世博 <duanshibo.d@...>
Date: Saturday, July 3, 2021 at 3:44 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] Questions about DFS

Hi~, I have two questions about DFS metadata:

   1. When the client opens a file or directory, will it check the permissions for each directory on the path, or will it only check at dfs_mount()?

   2. Will the client cache the metadata of the directory to speed up path traversal? Or the metadata of this directory will be saved on the client after the directory is opened until it is closed?

 

thanks.


Questions about DFS

段世博
 

Hi~, I have two questions about DFS metadata:
   1. When the client opens a file or directory, will it check the permissions for each directory on the path, or will it only check at dfs_mount()?
   2. Will the client cache the metadata of the directory to speed up path traversal? Or the metadata of this directory will be saved on the client after the directory is opened until it is closed?
 
thanks.


Re: FIO Results & Running IO500

JACKSON Adrian
 

Actually, it's been pointed out to me I was confusing engines and
targets. So ignore me. :)

On 29/06/2021 10:09, JACKSON Adrian wrote:
It would be sensible to increase the number of engines per node. For our
system, where we have 48 cores per node, we're running 12 engines per
socket, 24 per node. This might be too many, but I think 1 engine per
node is too few.

cheers

adrianj

On 29/06/2021 08:32, Peter wrote:
This email was sent to you by someone outside the University.
You should only click on links or attachments if you are certain that
the email is genuine and the content is safe.
Johann, my configuration is as follows:


4 nodes, 4 engines in total (1 per node)
2 x 128 GB Optane per socket, 2 x sockets per node (DAOS is currently
only using 1 socket per node)
(We also have 1x 1.5 TB NVMe drive / node, that we plan to eventually
configure DAOS to use)
Nodes are using Mellanox EDR Infiniband
Cent OS 7.9, have tried various MPI distributions.

The yaml file (was generated via the auto conf)

***********
port: 10001
transport_config:
  allow_insecure: true
  server_name: server
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
servers: []
engines:
- targets: 16
  nr_xs_helpers: 3
  first_core: 0
  name: daos_server
  socket_dir: /var/run/daos_server
  log_file: /tmp/daos_engine.0.log
  scm_mount: /mnt/daos0
  scm_class: dcpm
  scm_list:
  - /dev/pmem0
  bdev_class: nvme
  provider: ofi+verbs;ofi_rxm
  fabric_iface: ib0
  fabric_iface_port: 31416
  pinned_numa_node: 0
disable_vfio: false
disable_vmd: true
nr_hugepages: 0
set_hugepages: false
control_log_mask: INFO
control_log_file: /tmp/daos_server.log
helper_log_file: ""
firmware_helper_log_file: ""
recreate_superblocks: false
fault_path: ""
name: daos_server
socket_dir: /var/run/daos_server
provider: ofi+verbs;ofi_rxm
modules: ""
access_points:
- 172.23.7.3:10001
fault_cb: ""
hyperthreads: false
path: ../etc/daos_server.yml
*******

And yes, I am running FIO from one of the nodes.

Is there anything you see that I should modify or investigate? Thank you
very much for the help!
--
Tel: +44 131 6506470 skype: remoteadrianj




--
Tel: +44 131 6506470 skype: remoteadrianj


Re: FIO Results & Running IO500

JACKSON Adrian
 

It would be sensible to increase the number of engines per node. For our
system, where we have 48 cores per node, we're running 12 engines per
socket, 24 per node. This might be too many, but I think 1 engine per
node is too few.

cheers

adrianj

On 29/06/2021 08:32, Peter wrote:
Johann, my configuration is as follows:


4 nodes, 4 engines in total (1 per node)
2 x 128 GB Optane per socket, 2 x sockets per node (DAOS is currently
only using 1 socket per node)
(We also have 1x 1.5 TB NVMe drive / node, that we plan to eventually
configure DAOS to use)
Nodes are using Mellanox EDR Infiniband
Cent OS 7.9, have tried various MPI distributions.

The yaml file (was generated via the auto conf)

***********
port: 10001
transport_config:
  allow_insecure: true
  server_name: server
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
servers: []
engines:
- targets: 16
  nr_xs_helpers: 3
  first_core: 0
  name: daos_server
  socket_dir: /var/run/daos_server
  log_file: /tmp/daos_engine.0.log
  scm_mount: /mnt/daos0
  scm_class: dcpm
  scm_list:
  - /dev/pmem0
  bdev_class: nvme
  provider: ofi+verbs;ofi_rxm
  fabric_iface: ib0
  fabric_iface_port: 31416
  pinned_numa_node: 0
disable_vfio: false
disable_vmd: true
nr_hugepages: 0
set_hugepages: false
control_log_mask: INFO
control_log_file: /tmp/daos_server.log
helper_log_file: ""
firmware_helper_log_file: ""
recreate_superblocks: false
fault_path: ""
name: daos_server
socket_dir: /var/run/daos_server
provider: ofi+verbs;ofi_rxm
modules: ""
access_points:
- 172.23.7.3:10001
fault_cb: ""
hyperthreads: false
path: ../etc/daos_server.yml
*******

And yes, I am running FIO from one of the nodes.

Is there anything you see that I should modify or investigate? Thank you
very much for the help!
--
Tel: +44 131 6506470 skype: remoteadrianj
