DAOS configuration
Steffen Christgau
Dear all,
I successfully installed DAOS v0.6 on a small test-bed. I was able to create a pool that uses both SCM and emulated NVMe storage. Although this is quite nice for the moment, the configuration files of server, agent and client contain options which are not that clear to me. Here are some questions:

* access_points

The server configuration from utils/config states that "DAOS will need a quorum of access point nodes to be available [in order to operate] [...]". In other words: Is the list of hosts in the "access_points" configuration simply the list of hosts running DAOS server instances that form the DAOS system according to the storage model [1]? If that is the case and I have n nodes on which I run daos_server, do all n hostnames have to be listed in the "access_points" variable? In addition: Should the value of access_points be identical on all nodes and also identical for the server, agent and client configuration?

* hostlist

In the client and agent configuration there is also a "hostlist" configuration variable. What is the actual meaning of this parameter? Does it have different semantics for the agent and for the client? From my experiments I assume that the daos_shell (a client) connects to the hosts specified in the hostlist from daos.yml. But what is then the meaning of access_points for the client in that case?

* targets in daos_server.yml

According to the storage model document "a target is the unit of fault" and "the number of target[s] exported by a DAOS server instance is configurable and depends on the underlying hardware (i.e. number of SCM modules)". For a system with six NVDIMMs per socket, is six the correct value for the "targets" setting, since each DIMM may fail?

* multiple NVDIMM namespaces/NUMA configuration

The nodes of the test-bed are dual-socket ones where each socket has NVDIMMs attached to it. Two namespaces have been created for each socket with ipmctl, formatted (ext4) and mounted (dax) under /mnt/daos/pmem{0,1}. How do I configure the daos_server for such a system? The comments in the example server configuration mention a "single server instance per config file for now", and the scm_mount variable is a single string (a list does not work). However, scm_list is actually a list of namespaces/device files, but "currently only one per server [is] supported". So for the NUMA case, do I have to create a second server configuration for the second NUMA domain and launch it together with the existing configuration (e.g. using orterun's appfile facility)?

* Configuration for dmg

This is a minor issue: Is there any configuration file for dmg? It always writes its log file to /tmp/daos.log. The client configuration file appears to be ignored and strace shows no indication of a config file being read. In addition, I had to set the OFI_PORT and OFI_INTERFACE environment variables according to to get it working. It's no real problem, but it would be convenient to have these settings persisted.

Some clarifications on these points would be quite helpful.

Regards,
Steffen

[1] https://github.com/daos-stack/daos/blob/v0.6/doc/storage_model.md
Answers inline below
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
-----Original Message-----
From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Steffen Christgau
Sent: Thursday, August 15, 2019 4:12 PM
To: daos@daos.groups.io
Subject: [daos] DAOS configuration

> Dear all,
>
> I successfully installed DAOS v0.6 on a small test-bed. I was able to create a pool that uses both SCM and emulated NVMe storage. Although this is quite nice for the moment, the configuration files of server, agent and client contain options which are not that clear to me. Here are some questions:
>
> * access_points
>
> The server configuration from utils/config states that "DAOS will need a quorum of access point nodes to be available [in order to operate] [...]". In other words: Is the list of hosts in the "access_points" configuration simply the list of hosts running DAOS server instances that form the DAOS system according to the storage model [1]? If that is the case and I have n nodes on which I run daos_server, do all n hostnames have to be listed in the "access_points" variable? In addition: Should the value of access_points be identical on all nodes and also identical for the server, agent and client configuration?

If you are using orterun to launch daos_server (which is the only currently supported launcher), access_points is unused; it is reserved for future use. Please remove this parameter from the config files.

> * hostlist
>
> In the client and agent configuration there is also a "hostlist" configuration variable. What is the actual meaning of this parameter? Does it have different semantics for the agent and for the client? From my experiments I assume that the daos_shell (a client) connects to the hosts specified in the hostlist from daos.yml. But what is then the meaning of access_points for the client in that case?

hostlist determines the storage server hosts that will be acted upon from the client (as you have said). We are in the process of cleaning up the definitions/distinctions between that and access_points. For the moment it is best to ignore access_points.

> * targets in daos_server.yml
>
> According to the storage model document "a target is the unit of fault" and "the number of target[s] exported by a DAOS server instance is configurable and depends on the underlying hardware (i.e. number of SCM modules)". For a system with six NVDIMMs per socket, is six the correct value for the "targets" setting, since each DIMM may fail?

targets in the server configuration file refers to VOS instances (these can be thought of as service threads). See https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml#L194

> * multiple NVDIMM namespaces/NUMA configuration
>
> The nodes of the test-bed are dual-socket ones where each socket has NVDIMMs attached to it. Two namespaces have been created for each socket with ipmctl, formatted (ext4) and mounted (dax) under /mnt/daos/pmem{0,1}. How do I configure the daos_server for such a system? The comments in the example server configuration mention a "single server instance per config file for now", and the scm_mount variable is a single string (a list does not work). However, scm_list is actually a list of namespaces/device files, but "currently only one per server [is] supported". So for the NUMA case, do I have to create a second server configuration for the second NUMA domain and launch it together with the existing configuration (e.g. using orterun's appfile facility)?

Support for multiple NUMA domains is in development. For now, please just choose one pmem device/SCM mount. Thanks for your patience.

> * Configuration for dmg
>
> This is a minor issue: Is there any configuration file for dmg? It always writes its log file to /tmp/daos.log. The client configuration file appears to be ignored and strace shows no indication of a config file being read. In addition, I had to set the OFI_PORT and OFI_INTERFACE environment variables according to to get it working. It's no real problem, but it would be convenient to have these settings persisted.

The environment variable for the log file is D_LOG_FILE.

> Some clarifications on these points would be quite helpful.
>
> Regards,
> Steffen
>
> [1] https://github.com/daos-stack/daos/blob/v0.6/doc/storage_model.md
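
Until dmg reads a configuration file, the environment variables named in this thread (D_LOG_FILE, OFI_INTERFACE, OFI_PORT) could be kept in a small wrapper script rather than retyped in every session. The following is only a minimal sketch for a POSIX shell: the function name set_dmg_env, the log path, the interface name eth0 and the port number are placeholders chosen for illustration, not values from this thread and not DAOS defaults.

```shell
#!/bin/sh
# Hypothetical helper: persist the environment that dmg needed in this thread.
# Every default value below is a placeholder; adjust it to the local setup.

set_dmg_env() {
    # Log file location; D_LOG_FILE is the variable named in the reply.
    export D_LOG_FILE="${D_LOG_FILE:-$HOME/daos_dmg.log}"
    # Fabric settings the original poster had to set by hand.
    export OFI_INTERFACE="${OFI_INTERFACE:-eth0}"
    export OFI_PORT="${OFI_PORT:-31416}"
}

set_dmg_env
# After this point, run dmg with the same arguments as before.
```

Values already present in the environment take precedence over the placeholders, so a one-off override such as `OFI_PORT=12345 ./wrapper.sh` would still work.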