
Re: [External] Re: [daos] failed to create pool: -1023 #chat

Liu, Xuezhao
 

It looks like your daos_server did not start successfully.
Check the details in the config file /usr/local/etc/daos_server.yml and try changing some settings to see whether that helps; for example, comment out (add "#" to the start of the line) all options that start with "bdev_".
If it still does not work, please post your daos_server.yml and the DAOS log (its path is set by the "log_file" option; you can also set "log_mask: DEBUG") to a Jira ticket, or here if Jira does not work for you.
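
A minimal sketch of the "comment out the bdev_ options" step, for anyone who prefers to script it. The function name and the sample fragment are illustrative assumptions; a real daos_server.yml contains many more options, and this only handles options that sit on their own line.

```python
def comment_out_bdev_options(yaml_text):
    """Prefix every line whose option name starts with 'bdev_' with '#'."""
    result = []
    for line in yaml_text.splitlines(keepends=True):
        # Lines that are already commented start with '#', so they are skipped.
        if line.lstrip().startswith("bdev_"):
            result.append("#" + line)
        else:
            result.append(line)
    return "".join(result)


# Illustrative fragment only (assumption), not a complete config file.
sample = (
    "  log_mask: DEBUG\n"
    "  bdev_class: nvme\n"
    '  bdev_list: ["0000:81:00.0", "0000:da:00.0"]\n'
)
print(comment_out_bdev_options(sample), end="")
```

Running it on a copy of the file first keeps the original config intact while testing.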


Re: [External] Re: [daos] failed to create pool: -1023 #chat

Lombardi, Johann
 

Right, the storage engine isn’t started since the backend storage hasn’t been formatted:

waiting for storage format on server 0

 

Even with no NVDIMMs in the system, we still need to wipe out the SSDs so that all blocks are marked as not allocated (i.e. for wear leveling). The following command should allow you to format & start the engine:

$ daos_shell storage format
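
A small sketch, assuming the server output has been captured as text, that spots the "waiting for storage format" message quoted above and reports which server index is waiting. The helper name is hypothetical and only the message wording comes from this thread.

```python
import re

# Message wording as quoted in this thread.
WAITING = re.compile(r"waiting for storage format on server (\d+)")


def servers_awaiting_format(log_text):
    """Return the sorted indices of servers reported as waiting for a format."""
    return sorted({int(m.group(1)) for m in WAITING.finditer(log_text)})


log = "waiting for storage format on server 0\n"
for index in servers_awaiting_format(log):
    print(f"server {index} is waiting: run 'daos_shell storage format'")
```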

 

As suggested by Xuezhao, you should create your own daos_server.yml with the list of SSDs you want to use.

 

To list the SSDs available on the system, you can run the following commands:

$ daos_server storage prep-nvme
2019/06/20 19:06:24 storage_nvme.go:96: debug: spdk setup with _NRHUGE=1024
2019/06/20 19:06:24 storage_nvme.go:100: debug: spdk setup with _TARGET_USER=root
$ daos_server storage scan
[…]
NVMe:
- model: 'INTEL SSDPED1K375GA '
  serial: 'PHKS7335009W375AGN  '
  pciaddr: 0000:87:00.0
  fwrev: E2010324
  namespaces:
  - id: 1
    capacity: 375
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5226001C1P6DGN  '
  pciaddr: 0000:da:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5506004Z1P6DGN  '
  pciaddr: 0000:81:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600

 

Then populate the YAML file with the devices that you want to use, for instance:

  bdev_class: nvme
  bdev_list: ["0000:81:00.0", "0000:da:00.0"]
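
The step from scan output to bdev_list can be sketched as follows. This is only an illustration under stated assumptions: the scan text is pasted in by hand, the parser is a simple line scanner rather than a YAML library, and the helper names (parse_devices, bdev_lines) are made up for this example.

```python
SCAN_OUTPUT = """\
NVMe:
- model: 'INTEL SSDPED1K375GA '
  serial: 'PHKS7335009W375AGN  '
  pciaddr: 0000:87:00.0
  fwrev: E2010324
  namespaces:
  - id: 1
    capacity: 375
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5226001C1P6DGN  '
  pciaddr: 0000:da:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600
- model: 'INTEL SSDPEDMD016T4 '
  serial: 'CVFT5506004Z1P6DGN  '
  pciaddr: 0000:81:00.0
  fwrev: 8DV10171
  namespaces:
  - id: 1
    capacity: 1600
"""


def parse_devices(scan_text):
    """Collect the model and PCI address of each NVMe entry in the scan text."""
    devices = []
    for line in scan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- model:"):
            # Drop the quotes and the padding spaces around the model name.
            model = stripped.split(":", 1)[1].strip().strip("'").strip()
            devices.append({"model": model})
        elif stripped.startswith("pciaddr:") and devices:
            devices[-1]["pciaddr"] = stripped.split(":", 1)[1].strip()
    return devices


def bdev_lines(devices, model, indent="  "):
    """Build the two config lines for every device of the given model."""
    addresses = sorted(
        d["pciaddr"] for d in devices if d["model"] == model and "pciaddr" in d
    )
    quoted = ", ".join(f'"{address}"' for address in addresses)
    return f"{indent}bdev_class: nvme\n{indent}bdev_list: [{quoted}]"


print(bdev_lines(parse_devices(SCAN_OUTPUT), "INTEL SSDPEDMD016T4"))
```

Selecting by model is just one way to pick devices; listing the PCI addresses by hand works equally well.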

 

We are working on automatic storage configuration with CPU affinity detection, but this feature isn’t available yet.

 

HTH

 

Cheers,

Johann

 


---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: [External] Re: [daos] failed to create pool: -1023 #chat

Shengyu SY19 Zhang
 

Hello,

 

The first issue was resolved after I ran: $ daos_shell storage format

I think you could add this step to the quick start document.

Yes, I have already created daos_server.yml, based on the one at install/etc/daos_server.yml.

Now the storage server seems to be formatted, but a new issue has come up:

When I run dmg create, it fails with:

failed to create pool: -1007
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
orterun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[27363,1],0]
  Exit code:    1

 

 

If I execute: daos_shell pool create -s 1G

It says:

2019/06/21 17:29:19 config.go:122: debug: DAOS Client config read from /usr/local/etc/daos.yml

Active connections: [localhost:10001]

 

Creating DAOS pool with 1GB SCM and 0B NvMe storage (1.000 ratio)

parsing rank list: element 0: strconv.ParseUint: parsing "": invalid syntax
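
One possible reading of this error, not confirmed anywhere in this thread: if no rank list is supplied, an empty string split on commas still yields one empty element, and that element cannot be parsed as an unsigned integer. The sketch below reproduces that behaviour in Python; the function names are hypothetical and this is not the DAOS implementation.

```python
def parse_rank_list(text):
    """Parse a comma-separated list of non-negative rank numbers."""
    ranks = []
    # Note: "".split(",") returns [""], i.e. one empty element at index 0.
    for index, element in enumerate(text.split(",")):
        if not (element.isascii() and element.isdigit()):
            raise ValueError(
                f"parsing rank list: element {index}: invalid syntax: {element!r}"
            )
        ranks.append(int(element))
    return ranks


def parse_rank_list_lenient(text):
    """Same as above, but treat an empty string as 'no ranks specified'."""
    return [] if text == "" else parse_rank_list(text)


print(parse_rank_list("0,1"))
print(parse_rank_list_lenient(""))
```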

 

I’m trying to resolve this on my own; any hints would be appreciated.

 

Regards,

Shengyu



Re: [External] Re: [daos] failed to create pool: -1023 #chat

Lombardi, Johann
 

Hi Shengyu,

 

We are about to retire the quick start document in favor of the admin guide that has been integrated into the source code (https://github.com/daos-stack/daos/tree/master/doc/admin).

The documentation for format actually landed this morning: https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#basic-workflow

 

As for the -1007 error, it means that you don’t have enough space available to allocate the pool (https://github.com/daos-stack/daos/blob/master/doc/admin/troubleshooting.md#daos-errors).

How much space have you allocated with tmpfs under /mnt/daos?
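
As a rough sketch, the code at the end of a "failed to create pool" message can be pulled out and matched against the meanings given in this thread. Only two codes are listed: 1007 as explained above, and 1003 (invalid parameters) as reported in a later message. Every other code falls through to a pointer to the troubleshooting page rather than a guess. The table and function names are assumptions made for this example.

```python
import re

# Meanings taken from this thread only; other codes are deliberately left out.
KNOWN_ERRORS = {
    1007: "not enough space available to allocate the pool",
    1003: "invalid parameters",
}


def explain(message):
    """Return a short hint for the numeric code at the end of the message."""
    match = re.search(r":\s*-?(\d+)\s*$", message)
    if match is None:
        return "no error code found"
    code = int(match.group(1))
    hint = KNOWN_ERRORS.get(code)
    if hint is None:
        return f"-{code}: not covered here; see the troubleshooting page of the admin guide"
    return f"-{code}: {hint}"


print(explain("failed to create pool: -1007"))
```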

 

Cheers,

Johann

 

 



Re: [External] Re: [daos] failed to create pool: -1023 #chat

Nabarro, Tom
 

I’m afraid the patch for this (the handling of the request) has not landed yet; it is going through a round of reviews: https://github.com/daos-stack/daos/pull/637 .

 

Please feel free to experiment with the patch, as it should work; otherwise, please use the "dmg" tool to create pools in the interim.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 


---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: [External] Re: [daos] failed to create pool: -1023 #chat

Shengyu SY19 Zhang
 

Hello Johann,

 

Great.

The space under /mnt/daos should have been sufficient; here is the output of mount:

tmpfs on /mnt/daos type tmpfs (rw,nosuid,nodev,noexec,noatime,seclabel,size=6291456k)

However, I could see that 88% of its space was already in use, so I remounted a larger one (20G), and now I am able to create a storage pool.
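
The arithmetic behind this can be sketched as follows: size=6291456k is 6 GiB, and with 88% in use only about 0.72 GiB remains, which is less than a 1 GiB pool. The pool size is an assumption (the earlier daos_shell attempt used 1G, read here as 1 GiB), and the helper names are made up for this example. The last helper uses shutil.disk_usage to check a live mount point.

```python
import re
import shutil

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def tmpfs_size_bytes(mount_options):
    """Extract the size= option from a tmpfs mount option string, in bytes."""
    match = re.search(r"size=(\d+)([kKmMgG]?)", mount_options)
    if match is None:
        raise ValueError("no size= option found")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def estimated_free(total_bytes, used_percent):
    """Free bytes left when used_percent of total_bytes is in use."""
    return total_bytes * (100 - used_percent) // 100


def free_bytes(path):
    """Free bytes currently reported for the filesystem holding path."""
    return shutil.disk_usage(path).free


total = tmpfs_size_bytes("rw,nosuid,nodev,noexec,noatime,seclabel,size=6291456k")
free = estimated_free(total, 88)
pool = 1024**3  # assumption: a 1 GiB pool
print(f"total={total} free={free} enough={free >= pool}")
```

On a running system, free_bytes("/mnt/daos") gives the live figure to compare against the intended pool size.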

 

Regards,

Shengyu.

 



Re: [External] Re: [daos] failed to create pool: -1023 #chat

Shengyu SY19 Zhang
 

Hello Tom,

 

Thank you for the information.

I am now able to create a storage pool via the dmg tool; I’ll try the patch later when I need it.

 

Regards,

Shengyu.



Introducing Certificate Support for DAOS

Quigley, David
 

Hello,

 

In the near future, a series of patches will land that introduce secure communications support for all of the gRPC connections in DAOS. These channels are used for communications between the Go components in DAOS (daos_shell, daos_agent, and daos_server). By default, certificates are required; however, it is easy to turn them off. The two ways of turning off certificate support are as follows:

 

1)      In daos.yml, daos_agent.yml, and daos_server.yml you can add the line insecure:true. This tells all of the components not to attempt to load any certificates and keeps all of the channels insecure (plain-text HTTP/2).

2)      When starting daos_agent, daos_server, and daos_shell, pass either -i or --insecure on the command line (this is the approach taken in the various tests in DAOS).

 

Regardless of which method you choose, make sure all three components are running either with certificates or without. Mixing the two modes will cause the system to fail. The error logs should indicate a TLS failure, but it might not always be obvious.
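
A minimal sketch of a pre-flight check for mixed modes, assuming the three components are launched from known command lines. It only looks for the -i and --insecure flags mentioned above and does not read the YAML files; the helper names and the example command lines are hypothetical.

```python
INSECURE_FLAGS = {"-i", "--insecure"}


def runs_insecure(argv):
    """True if the command line carries one of the insecure flags."""
    return any(arg in INSECURE_FLAGS for arg in argv)


def modes_are_mixed(commands):
    """True if some components run insecure while others do not."""
    modes = {runs_insecure(argv) for argv in commands.values()}
    return len(modes) > 1


# Hypothetical command lines; daos_shell is missing the flag here.
commands = {
    "daos_server": ["daos_server", "--insecure"],
    "daos_agent": ["daos_agent", "-i"],
    "daos_shell": ["daos_shell"],
}
if modes_are_mixed(commands):
    print("components disagree on certificate use; expect TLS failures")
```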

 

Once the patches are merged, I will present how to use the certificate support, if desired, for testing. There is already a script for generating a set of certificates, including a Certificate Authority, for the DAOS cluster. For now, though, it is best to either modify your configuration files or pass in the appropriate command-line flags once the patches are merged.

 

Dave Quigley

 

 


Re: [External] Re: [daos] failed to create pool: -1023 #chat

Shengyu SY19 Zhang
 

Hello,

 

I am hoping for additional help with getting DAOS working within its ecosystem: running it with FUSE or HDFS, or testing IOPS.

dmg query and FUSE do not work in my environment, and I noticed that the DAOS admin documentation does not match the code for the FUSE part.

The pool is created OK; however, dmg query and the FUSE mount always return the invalid parameters error code (1003).

orterun … dmg query --pool 06c10125-c3ea-4040-a030-10a9e5f10004 --svc 1

Therefore, for now, any guides for testing or running DAOS in its ecosystem would help me learn more about the project; any information would be appreciated.

 

Best Regards,

Shengyu

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang
Sent: Monday, June 24, 2019 4:52 PM
To: daos@daos.groups.io
Subject: Re: [External] Re: [daos] failed to create pool: -1023 #chat

 

Hello Tom,

 

Thank you for the infor.

Now Im able to create storage pool, via dmg tool, Ill try the patch later time when I need.

 

Regards,

Shengyu.

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Nabarro, Tom
Sent: Friday, June 21, 2019 9:06 PM
To: daos@daos.groups.io
Subject: Re: [External] Re: [daos] failed to create pool: -1023 #chat

 

I’m afraid the patch for this has not landed yet (regarding the handling of the request), it’s going through a round of reviews, https://github.com/daos-stack/daos/pull/637 .

 

Please feel free to experiment with the patch, as it should work, otherwise please use the "dmg" tool to create pools in the interim

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang
Sent: Friday, June 21, 2019 10:35 AM
To: daos@daos.groups.io
Subject: Re: [External] Re: [daos] failed to create pool: -1023 #chat

 

Hello,

 

I got first issue resolved after run: $ daos_shell storage format

I think you could add this step into the quick start document.

Yes I have already created daos_server.yml, from the one at install/etc/daos_server.yml.

Now the storage server seems formatted, but there is new issue happen :

When I run dmg create, it encounter another issue :

failed to create pool: -1007
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
orterun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[27363,1],0]
  Exit code:    1

 

 

If I execute: daos_shell pool create –s 1G

It says:

2019/06/21 17:29:19 config.go:122: debug: DAOS Client config read from /usr/local/etc/daos.yml

Active connections: [localhost:10001]

 

Creating DAOS pool with 1GB SCM and 0B NvMe storage (1.000 ratio)

parsing rank list: element 0: strconv.ParseUint: parsing "": invalid syntax

 

I’m trying to resolve by my ways, any hints will be appreciated.

 

Regards,

Shengyu



Re: [External] Re: [daos] failed to create pool: -1023 #chat

Lombardi, Johann
 

Hi Shengyu,

I assume that you have followed the instructions to set up /var/run/daos_agent, correct?
If not, please check https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#runtime-directory-setup

We also landed support for the daos_agent to v0.5 (David sent an email to the list), but the admin guide hasn’t been updated yet:
https://github.com/daos-stack/daos/blob/master/doc/admin/deployment.md#agent-configuration

On the compute nodes, you should just run “daos_agent &”. A systemd script to automate this should be available soon.
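A minimal sketch of the two steps above, assuming the agent runs as the current user and that the runtime directory is /var/run/daos_agent as named in this message. The `run` helper and the `DRY_RUN` switch are illustrative additions, not part of DAOS, and the ownership and mode values are assumptions to adapt to your site; the linked admin guide remains the reference.

```shell
#!/bin/sh
# Sketch: prepare the agent runtime directory before starting daos_agent.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_agent_dir() {
    dir=${1:-/var/run/daos_agent}   # path taken from the thread
    owner=${2:-$(id -un)}           # assumption: the agent runs as this user
    run sudo mkdir -p "$dir"
    run sudo chown "$owner" "$dir"
    run sudo chmod 0755 "$dir"      # assumption: adjust the mode as needed
}

setup_agent_dir

# Then, on each compute node, start the agent in the background:
#   daos_agent &
```

With the default dry-run setting the script only lists what it would do, so it can be reviewed before being run for real.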

 

David, could you please submit a PR to document in the admin guide how to set up & start the daos_agent? Thanks in advance.

Cheers,
Johann

 

From: <daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday 1 July 2019 at 11:19
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [External] Re: [daos] failed to create pool: -1023 #chat

 

Hello,

I am hoping to get additional help with making DAOS work within its ecosystem: running it with FUSE or HDFS, or testing for IOPS.

dmg query and FUSE do not work in my environment, and I noticed that the DAOS admin document does not match the code for the FUSE part.

The pool was created OK; however, dmg query and the FUSE mount always return the invalid-parameters error code (1003).

orterun … dmg query --pool 06c10125-c3ea-4040-a030-10a9e5f10004 --svc 1

So for now, any guides for testing or running DAOS in its ecosystem would help me learn more about the project. Any information will be appreciated.

Best Regards,
Shengyu



Re: [External] Re: [daos] failed to create pool: -1023 #chat

Nabarro, Tom
 

Hello Shengyu,

Also, the “svc” parameter to dmg query is a comma-separated list of 0-based indices, so you might want “--svc 0” (refer to the second item returned from the create call).
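To illustrate the format described here, a small sketch that checks a --svc value before it is handed to dmg. The function name `valid_svc_list` is hypothetical; the only facts taken from the thread are that the value is a comma-separated list of 0-based indices and that the `dmg query --pool ... --svc ...` form is the one Shengyu ran. The pool UUID is left as a placeholder.

```shell
#!/bin/sh
# Sketch: accept only a comma-separated list of non-negative integers,
# for example "0" or "0,1,2".
valid_svc_list() {
    case "$1" in
        # Reject: empty value, characters other than digits and commas,
        # a leading or trailing comma, or an empty element (",,").
        ''|*[!0-9,]*|,*|*,|*,,*) return 1 ;;
        *) return 0 ;;
    esac
}

svc="0"   # indices are 0-based; 0 is the value suggested in this reply
if valid_svc_list "$svc"; then
    # Only prints the command line; prefix it with your usual orterun invocation.
    echo "dmg query --pool <pool-uuid> --svc $svc"
fi
```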

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



Issues creating a DAOS pool

Rene Salmon <salmonr@...>
 

Hi DAOS list,

I am trying to bring up DAOS using various docs on the GitHub page, but I am running into trouble while trying to create a DAOS pool.

I have three DAOS servers and one client.
daos-1 = client
daos-[2-4] = servers

[user@daos-1 ~]$ orterun -np 1 --ompi-server file:/tmp/urifile.txt dmg create --size=2G
failed to create pool: -1005
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
orterun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[40764,1],0]
  Exit code:    1
--------------------------------------------------------------------------


Any ideas where to look for a hint?
Thanks

Rene


Re: Issues creating a DAOS pool

Chaarawi, Mohamad
 

On this issue, it seems that the URI file that the server generates is not written to a place where the client can read it:

/tmp/urifile.txt

I had an offline chat with Rene, who will retry this after writing the URI file to a shared FS, but I wanted to update the mailing list on the issue.
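One way to catch this early is to check, on the client, that the URI file named in --ompi-server exists and is non-empty before launching orterun. A minimal sketch under assumptions: the shared path /shared/daos/urifile.txt is a made-up example, and `check_urifile` is a hypothetical helper, not a DAOS or Open MPI tool.

```shell
#!/bin/sh
# Sketch: fail fast when the server's URI file is not visible from this node.
check_urifile() {
    f=$1
    if [ ! -r "$f" ]; then
        echo "URI file $f is not readable on $(uname -n)" >&2
        return 1
    fi
    if [ ! -s "$f" ]; then
        echo "URI file $f is empty" >&2
        return 1
    fi
    return 0
}

# Assumed location on a filesystem shared by the servers and the client.
URIFILE=${URIFILE:-/shared/daos/urifile.txt}

if check_urifile "$URIFILE"; then
    # Only prints the command from the original report with the new path.
    echo "orterun -np 1 --ompi-server file:$URIFILE dmg create --size=2G"
fi
```

Running the check on the client node (daos-1 in the report) rather than on a server is what matters, since the file has to be readable where orterun is started.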

 

Thanks,

Mohamad

 



DAOS hardware requirements

BASDEN, ALASTAIR G.
 

Hi,

Having read
https://github.com/daos-stack/daos/blob/master/doc/storage_model.md
is it the case that all storage nodes must have DCPMM (Apache Pass)
memory?

I was hoping to set up a test installation using a metadata server with
DCPMM memory, and storage nodes with only NVMe drives (these are not
Cascade Lake, so no DCPMM). However, I am now unclear whether this will
be possible.

Thanks,
Alastair.


Re: DAOS hardware requirements

Carrier, John
 

DAOS metadata resides in persistent memory (DCPMM) on the same node as the block storage (NVMe drives). DAOS is an object store, not a file system. The metadata in the DCPMM are the data structures that each DAOS server maintains to access the application data stored in DAOS containers and objects on the SSDs.

--jc



Re: DAOS hardware requirements

Chaarawi, Mohamad
 

Note that the DCPMM also captures small I/Os (< 4k) in addition to the DAOS metadata.

That being said, for a quick test setup, you can use DRAM instead of DCPMM (use a tmpfs mount).
Again, this is just for testing (if you reboot, all data and metadata in your tmpfs are of course gone).
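For reference, a sketch of what the tmpfs approach might look like. The mount point /mnt/daos and the 16g size are assumptions, the `run` helper is illustrative, and the real commands need root. How the server is then pointed at this mount in daos_server.yml depends on the DAOS version and is not shown here.

```shell
#!/bin/sh
# Sketch: stand in for DCPMM with a DRAM-backed tmpfs (testing only;
# everything stored there is lost on reboot or unmount).
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only; 0 = execute (requires root)

# Hypothetical helper, not part of DAOS.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

mount_test_scm() {
    mnt=${1:-/mnt/daos}   # assumed mount point
    size=${2:-16g}        # assumed size; it must fit in free RAM
    run mkdir -p "$mnt"
    run mount -t tmpfs -o "size=$size" tmpfs "$mnt"
}

mount_test_scm
```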

Thanks,
Mohamad



Build using Docker

Colin Ngam
 

Hi,

 

I am trying to build using Docker on Mac. I am seeing the following error:

 

$ docker build -t daos -f Dockerfile.centos\:7 github.com/daos-stack/daos#:utils/docker
unable to prepare context: unable to 'git clone' to temporary context directory: error initializing submodules: Submodule 'scons_local' (https://github.com/daos-stack/scons_local.git) registered for path 'scons_local'
Submodule 'raft' (https://github.com/daos-stack/raft.git) registered for path 'src/rdb/raft'
Cloning into '/private/var/folders/b5/b52tjd4j71z15bj53sms7jrw006f5s/T/docker-build-git049590752/scons_local'...
Cloning into '/private/var/folders/b5/b52tjd4j71z15bj53sms7jrw006f5s/T/docker-build-git049590752/src/rdb/raft'...
error: Server does not allow request for unadvertised object 171c9b254d9c40463c77d7e7567fd93633a9d78a
Fetched in submodule path 'scons_local', but it did not contain 171c9b254d9c40463c77d7e7567fd93633a9d78a. Direct fetching of that commit failed.
: exit status 1

 

Thanks.

 

Colin

 


Re: Build using Docker

Olivier, Jeffrey V
 

Hi Colin,

 

I’ve not seen that before, but after cloning DAOS, the build should be doing a git submodule update, which would clone the scons_local and raft submodules using the URLs you see in the messages. The scons_local repository has that git hash, so I’m wondering if something is happening at the initial clone step. Do you perhaps have an HTTPS proxy that needs to be set?

 

-Jeff

 


 


Re: Build using Docker

Colin Ngam
 

Hi,

 

I can do this manually:

 

cngam@cngam daos $ git clone https://github.com/daos-stack/daos.git
Cloning into 'daos'...
remote: Enumerating objects: 229, done.
remote: Counting objects: 100% (229/229), done.
remote: Compressing objects: 100% (201/201), done.
remote: Total 35637 (delta 103), reused 64 (delta 28), pack-reused 35408
Receiving objects: 100% (35637/35637), 24.55 MiB | 6.29 MiB/s, done.
Resolving deltas: 100% (27005/27005), done.
cngam@cngam daos $ cd daos
cngam@cngam daos $ git submodule init
Submodule 'scons_local' (https://github.com/daos-stack/scons_local.git) registered for path 'scons_local'
Submodule 'raft' (https://github.com/daos-stack/raft.git) registered for path 'src/rdb/raft'
cngam@cngam daos $ git submodule update
Cloning into '/Users/cngam/DAOS/daos/scons_local'...
Cloning into '/Users/cngam/DAOS/daos/src/rdb/raft'...
Submodule path 'scons_local': checked out '171c9b254d9c40463c77d7e7567fd93633a9d78a'
Submodule path 'src/rdb/raft': checked out 'e6c4369635d5ca8cbe71173eb4eb15fb3809510f'

 

So, I do not believe it is an HTTPS proxy issue.

 

Thanks.

 

Colin

 

 


 


Re: Build using Docker

Olivier, Jeffrey V
 

Hi Colin,

 

Since you can do it manually, it may be quickest to do the checkout outside of the container and mount the volume in the container using -v, as described here: https://docs.docker.com/storage/volumes/
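A sketch of that workaround, building on the manual clone that worked in the previous message. The image name `daos-dev` and the in-container path /home/daos/daos are placeholders, not values taken from the DAOS Dockerfile, and the `run` helper is illustrative. `git clone --recurse-submodules` is used here in place of the separate `git submodule init` and `git submodule update` steps.

```shell
#!/bin/sh
# Sketch: check out DAOS on the host, then bind-mount the tree into a container.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only; 0 = execute

# Hypothetical helper, not part of DAOS or Docker.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

checkout_and_mount() {
    src=${1:-$PWD/daos}     # host checkout; bind mounts need an absolute path
    image=${2:-daos-dev}    # placeholder image name
    run git clone --recurse-submodules https://github.com/daos-stack/daos.git "$src"
    # /home/daos/daos is an assumed location inside the container.
    run docker run -it -v "$src:/home/daos/daos" "$image" /bin/bash
}

checkout_and_mount
```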

 

-Jeff

 

From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Colin Ngam
Sent: Tuesday, August 13, 2019 1:25 PM
To: daos@daos.groups.io
Subject: Re: [daos] Build using Docker

 

Hi,

 

I can do this manually:

 

cngam@cngam daos $ git clone https://github.com/daos-stack/daos.git

Cloning into 'daos'...

remote: Enumerating objects: 229, done.

remote: Counting objects: 100% (229/229), done.

remote: Compressing objects: 100% (201/201), done.

remote: Total 35637 (delta 103), reused 64 (delta 28), pack-reused 35408

Receiving objects: 100% (35637/35637), 24.55 MiB | 6.29 MiB/s, done.

Resolving deltas: 100% (27005/27005), done.

cngam@cngam daos $ cd daos

cngam@cngam daos $ git submodule init

Submodule 'scons_local' (https://github.com/daos-stack/scons_local.git) registered for path 'scons_local'

Submodule 'raft' (https://github.com/daos-stack/raft.git) registered for path 'src/rdb/raft'

cngam@cngam daos $ git submodule update

Cloning into '/Users/cngam/DAOS/daos/scons_local'...

Cloning into '/Users/cngam/DAOS/daos/src/rdb/raft'...

Submodule path 'scons_local': checked out '171c9b254d9c40463c77d7e7567fd93633a9d78a'

Submodule path 'src/rdb/raft': checked out 'e6c4369635d5ca8cbe71173eb4eb15fb3809510f'

 

So I do not believe it is an HTTPS proxy issue.

 

Thanks.

 

Colin

 

 

From: <daos@daos.groups.io> on behalf of "Olivier, Jeffrey V" <jeffrey.v.olivier@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday, August 13, 2019 at 12:22 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Cc: "Olivier, Jeffrey V" <jeffrey.v.olivier@...>
Subject: Re: [daos] Build using Docker

 

Hi Colin,

 

I’ve not seen that before, but after cloning DAOS, the build should run git submodule update, which clones the scons_local and raft submodules using the URLs you see in the messages. The scons_local repository does contain that git hash, so I’m wondering if something is going wrong at the initial clone step. Do you perhaps have an HTTPS proxy that needs to be set?

 

-Jeff

 
