[External] Re: [daos] failed to create pool: -1023 #chat
Shengyu SY19 Zhang
Hello,
Thank you for your help. Since ompi and pmix are dependencies of DAOS, I didn't touch them; I just followed the quick-start guide on GitHub to build and run it. I entered the path _build.external/ompi and pmix, ran make && make install, and everything finished successfully. As for report-uri, since I'm running the DAOS server in /root/daos, I can see that the URI file was created and can read its content, so I just specify the full path for the client. The connection between client and server seems OK (if I specify the wrong path, or the NIC is not started, I get a different error). If I run daosctl create-pool testpool, I get the same issue, so I'm wondering where the problem is. There is some information in the logs: PMIx_Lookup group daos_server failed, rc: -46, value.type 0.
Best Regards
|
|
Lombardi, Johann
Hi,
First of all, please note that we are in the process of implementing our own wire-up protocol that will eventually allow us to start the DAOS server on each storage node independently. At that point, we won't require ompi/pmix any longer and will be able to start the DAOS servers via systemd, Kubernetes or any parallel launcher (e.g. pdsh, …). This feature will be available this summer.
Meanwhile, it would be great to see the output of the "orterun … daos_server" command. I suspect that the backend storage hasn't been formatted and the data plane hasn't started yet. Could you please file a ticket on https://jira.hpdd.intel.com and attach the daos_server output? Thanks.
Cheers, Johann
|
|
Shengyu SY19 Zhang
Hello Johann,
Since there is no NVDIMM on my system, I'm using tmpfs mounted at /mnt/daos as described in the document. The NVMe device is left unformatted and is used via SPDK. I can't post a file on Jira because I haven't found a portal to register on it. Here is the output from the server side:
2019/06/20 18:18:12 config.go:108: debug: DAOS config read from /usr/local/etc/daos_server.yml
2019/06/20 18:18:12 config.go:144: debug: Active config saved to /usr/local/etc/.daos_server.active.yml (read-only)
2019/06/20 18:18:12 config.go:353: debug: Switching control log level to DEBUG
2019/06/20 18:18:12 config.go:368: debug: daos_server logging to file /tmp/daos_control.log
Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: spdk -c 0x1 --file-prefix=spdk1929786431 --base-virtaddr=0x200000000000 --proc-type=auto ]
EAL: Detected 20 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /var/run/.spdk1929786431_unix
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:03:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
EAL: using IOMMU type 1 (Type 1)
EAL: PCI device 0000:05:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
no NVDIMMs found!
waiting for storage format on server 0
|
|
Liu, Xuezhao
It looks like your daos_server did not start successfully.
You may check the details in the config file /usr/local/etc/daos_server.yml and try changing some settings to see if it works. For example, you can comment out (add "#" at the start of the line) all the options starting with "bdev_". If it still does not work, you may post your daos_server.yml and the DAOS log (the path is configured by the "log_file" option, and you can set "log_mask: DEBUG") to a Jira ticket, or here if Jira does not work for you.
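The "comment out every bdev_ option" step above can be scripted. The following is only a minimal sketch, assuming the options sit one per line in the YAML file; the helper name and the sample log_file path are hypothetical, and multi-line values are not handled.

```python
import re

def comment_out_bdev_options(yaml_text):
    """Prefix with '#' every line whose key starts with 'bdev_'."""
    out = []
    for line in yaml_text.splitlines():
        # Optional indentation, optional YAML list dash, then a "bdev_..." key.
        if re.match(r"\s*(-\s+)?bdev_\w*\s*:", line):
            line = "#" + line
        out.append(line)
    return "\n".join(out)

# Sample content for illustration only; the log_file path is made up.
sample = """\
log_file: /tmp/daos.log
log_mask: DEBUG
bdev_class: nvme
bdev_list: ["0000:81:00.0", "0000:da:00.0"]
"""
print(comment_out_bdev_options(sample))
```

Lines that are already commented are left alone because they no longer start with a bdev_ key.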
|
|
Lombardi, Johann
Right, the storage engine isn’t started since the backend storage hasn’t been formatted:
waiting for storage format on server 0
Even with no NVDIMMs in the system, we still need to wipe out the SSDs so that all blocks are marked as not allocated (i.e. for wear leveling). The following command should allow you to format & start the engine:
$ daos_shell storage format
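If you capture the daos_server output to a file, the condition above can be spotted automatically. This is a small sketch that only looks for the message text quoted in this thread; the function name is hypothetical and other server versions may word the message differently.

```python
FORMAT_WAIT_MARKER = "waiting for storage format"

def is_waiting_for_format(server_output):
    """Return True if any line of the server output contains the marker."""
    return any(FORMAT_WAIT_MARKER in line for line in server_output.splitlines())

# The last two lines of the server output posted earlier in the thread.
sample = "no NVDIMMs found!\nwaiting for storage format on server 0\n"
if is_waiting_for_format(sample):
    print("storage not formatted yet: run 'daos_shell storage format'")
```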
As suggested by Xuezhao, you should create your own daos_server.yml with the list of SSDs you want to use.
To list the SSDs available on the system, you can run the following commands:
$ daos_server storage prep-nvme
2019/06/20 19:06:24 storage_nvme.go:96: debug: spdk setup with _NRHUGE=1024
2019/06/20 19:06:24 storage_nvme.go:100: debug: spdk setup with _TARGET_USER=root
$ daos_server storage scan
[…]
NVMe:
- model: 'INTEL SSDPED1K375GA '
  serial: 'PHKS7335009W375AGN '
  pciaddr: 0000:87:00.0
  fwrev: E2010324
  namespaces:
  - id: 1
    capacity: 375
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5226001C1P6DGN '
  pciaddr: 0000:da:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5506004Z1P6DGN '
  pciaddr: 0000:81:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600
And then populate the yaml file with the devices that you want to use, for instance:
bdev_class: nvme
bdev_list: ["0000:81:00.0", "0000:da:00.0"]
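Going from the scan output to the two YAML lines can be sketched as below. This is an illustration only: it assumes the scan output contains "pciaddr:" lines as shown above, and both helper names are hypothetical. You still choose which devices to keep.

```python
import re

# PCI address in domain:bus:device.function form, e.g. 0000:81:00.0
PCI_RE = r"pciaddr:\s*([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"

def pci_addresses(scan_output):
    """Collect every PCI address listed in the scan output, in order."""
    return re.findall(PCI_RE, scan_output)

def bdev_yaml(addresses):
    """Render the two bdev_ lines for daos_server.yml."""
    quoted = ", ".join('"%s"' % addr for addr in addresses)
    return "bdev_class: nvme\nbdev_list: [%s]" % quoted

scan = """\
NVMe:
- model: 'INTEL SSDPED1K375GA '
  pciaddr: 0000:87:00.0
- model: 'INTEL SSDPEDMD016T4 '
  pciaddr: 0000:da:00.0
- model: 'INTEL SSDPEDMD016T4 '
  pciaddr: 0000:81:00.0
"""
found = pci_addresses(scan)
# Keep only the devices you want to hand to DAOS; here, the two 1600 GB SSDs.
print(bdev_yaml([a for a in found if a != "0000:87:00.0"]))
```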
We are working on automatic storage configuration with CPU affinity detection, but this feature isn’t available yet.
HTH
Cheers, Johann
|
|
Shengyu SY19 Zhang
Hello,
I got the first issue resolved after running:
$ daos_shell storage format
I think you could add this step to the quick start document. Yes, I have already created daos_server.yml from the one at install/etc/daos_server.yml. Now the storage server seems to be formatted, but there is a new issue. When I run dmg create, it fails with:
failed to create pool: -1007
If I execute:
daos_shell pool create -s 1G
it says:
2019/06/21 17:29:19 config.go:122: debug: DAOS Client config read from /usr/local/etc/daos.yml
Active connections: [localhost:10001]
Creating DAOS pool with 1GB SCM and 0B NvMe storage (1.000 ratio)
parsing rank list: element 0: strconv.ParseUint: parsing "": invalid syntax
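The last line of that output has the shape you get when an empty string is split on a separator and each element is parsed as an unsigned integer. Whether this is what daos_shell actually does is an assumption; the sketch below only reproduces the general failure mode in Python, with hypothetical function names.

```python
def parse_rank_list(text):
    """Strict parse: every comma-separated element must be an integer."""
    ranks = []
    for index, element in enumerate(text.split(",")):
        try:
            ranks.append(int(element))
        except ValueError as err:
            raise ValueError("parsing rank list: element %d: %s" % (index, err)) from err
    return ranks

def parse_rank_list_lenient(text):
    """Treat an empty or blank string as 'no ranks given'."""
    if not text.strip():
        return []
    return parse_rank_list(text)

# "".split(",") yields [""], so the strict parser fails on element 0.
try:
    parse_rank_list("")
except ValueError as err:
    print(err)

print(parse_rank_list_lenient(""))
```

The lenient variant shows one way a caller could accept a missing rank list instead of failing on it.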
I’m trying to resolve this on my own; any hints would be appreciated.
Regards, Shengyu
|
|
Lombardi, Johann
Hi Shengyu,
We are about to retire the quick start document in favor of the admin guide that has been integrated into the source code (https://github.com/daos-stack/daos/tree/master/doc/admin). The documentation for format actually landed this morning: https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#basic-workflow
As for the -1007 error, it means that you don’t have enough space available to allocate the pool (https://github.com/daos-stack/daos/blob/master/doc/admin/troubleshooting.md#daos-errors). How much space have you allocated with tmpfs under /mnt/daos?
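A quick way to answer the space question before creating a pool is to compare the free space on the mount point with the requested size. This is a rough sketch using the standard library; the function name is hypothetical, and it ignores any overhead DAOS itself needs, so treat a narrow pass as a likely failure.

```python
import shutil

GIB = 1024 ** 3

def has_room_for_pool(mount_point, requested_bytes):
    """Return True if the filesystem at mount_point has at least requested_bytes free."""
    usage = shutil.disk_usage(mount_point)
    return usage.free >= requested_bytes

# Replace "." with /mnt/daos on the server node.
print(has_room_for_pool(".", 1 * GIB))
```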
Cheers, Johann
|
|
Nabarro, Tom
I’m afraid the patch for this (regarding the handling of the request) has not landed yet; it’s going through a round of reviews: https://github.com/daos-stack/daos/pull/637 .
Please feel free to experiment with the patch, as it should work. Otherwise, please use the "dmg" tool to create pools in the interim.
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
|
|
Shengyu SY19 Zhang
Hello Johann,
Great. For /mnt/daos, the space should be sufficient. Here is the output of mount:
tmpfs on /mnt/daos type tmpfs (rw,nosuid,nodev,noexec,noatime,seclabel,size=6291456k)
However, I saw that 88% of its space was already in use, so I remounted a larger tmpfs (20G), and now I’m able to create a storage pool.
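To put numbers on this, the arithmetic can be worked out as follows. It is only a sketch: it takes the 88% figure at face value (it is probably rounded) and compares against the 1G request tried earlier in the thread.

```python
size_kib = 6291456        # size= value from the mount output, in KiB
used_fraction = 0.88      # "88% used", as reported above

size_gib = size_kib / (1024 * 1024)
free_gib = size_gib * (1 - used_fraction)

print("tmpfs size: %.1f GiB" % size_gib)   # prints "tmpfs size: 6.0 GiB"
print("free space: %.2f GiB" % free_gib)   # prints "free space: 0.72 GiB"
print("room for a 1 GiB pool:", free_gib >= 1.0)
```

With roughly 0.72 GiB free, a 1G pool does not fit, which is consistent with the -1007 error going away once the tmpfs was enlarged.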
Regards, Shengyu.
|
|
Shengyu SY19 Zhang
Hello Tom,
Thank you for the information. I’m now able to create a storage pool via the dmg tool. I’ll try the patch later when I need it.
Regards, Shengyu.
|
|
Shengyu SY19 Zhang
Hello,
I hope to get additional help with making DAOS work in its ecosystem: running it with FUSE or HDFS, or testing IOPS. dmg query and FUSE do not work in my environment, and I noticed that the DAOS admin document does not match the code for the FUSE part. The pool was created OK; however, dmg query and the FUSE mount always return the invalid parameters error code (1003):
orterun … dmg query --pool 06c10125-c3ea-4040-a030-10a9e5f10004 --svc 1
So for now, any guides for testing and running DAOS in its ecosystem would help me learn more about the project. Any information will be appreciated.
Best Regards, Shengyu
|
|
Lombardi, Johann
Hi Shengyu,
I assume that you have followed the instructions to set up /var/run/daos_agent, correct? If not, please check https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#runtime-directory-setup
We also landed support for the daos_agent to v0.5 (David sent an email to the list), but the admin guide hasn’t been updated yet: https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#agent-configuration On the compute nodes, you should just run “daos_agent &”. A systemd script to automate this should be available soon.
David, could you please submit a PR to document in the admin guide how to set up & start the daos_agent? Thanks in advance.
Cheers, Johann
|
|
Nabarro, Tom
Hello Shengyu,
Also, the “svc” parameter to dmg query is a comma-separated list of 0-based indices, so you might want “--svc 0” (refer to the second item returned from the create call).
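As an illustration of why "--svc 1" can be rejected on a single-server setup, here is a small sketch that checks a comma-separated, 0-based index list against the number of servers. The function and the range check are assumptions made for illustration, not the actual dmg validation logic.

```python
def validate_svc(svc, server_count):
    """Parse a comma-separated list of 0-based indices and range-check each one."""
    ranks = [int(item) for item in svc.split(",")]
    invalid = [r for r in ranks if not 0 <= r < server_count]
    if invalid:
        raise ValueError(
            "svc ranks %s out of range for %d server(s)" % (invalid, server_count)
        )
    return ranks

print(validate_svc("0", 1))      # valid: the only server has index 0
try:
    validate_svc("1", 1)         # index 1 does not exist with a single server
except ValueError as err:
    print(err)
```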
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
|
|