
DAOS with NVMe-over-Fabrics


anton.brekhov@...
 

Our partners at Intel said that it's possible to connect NVMe disks to DAOS via NVMe-oF. Could you explain how to do this?

Thanks!

 
 
 
 
 


Niu, Yawei
 

So far the DAOS server supports only locally attached PCIe NVMe, but it would not be very difficult to support NVMe-oF in the future: it only requires server configuration changes, and everything else is transparent to DAOS.

If you are talking about clients connecting directly to NVMe disks through NVMe-oF, that would require exposing a block-level protocol to the client; we have no plan for this in the near term.

 

Thanks

-Niu

 



Lombardi, Johann
 

As pointed out by Niu, the main gap is really on the DAOS control plane (i.e. dmg storage scan and support in the yaml config file). SPDK has built-in support for an NVMe-oF initiator/host (see https://spdk.io/doc/nvme.html#nvme_fabrics_host), so we “just” need to generate an SPDK config file (available under $SCM_MOUNTPOINT/daos_nvme.conf) with the right parameters.

 

This is an example from my local set up:

 

$ cat /mnt/daos0/daos_nvme.conf

[Nvme]

    TransportID "trtype:PCIe traddr:0000:81:00.0" Nvme_wolf-118.wolf.hpdd.intel.com_0

    TransportID "trtype:PCIe traddr:0000:da:00.0" Nvme_wolf-118.wolf.hpdd.intel.com_1

    RetryCount 4

    TimeoutUsec 0

    ActionOnTimeout None

    AdminPollRate 100000

    HotplugEnable No

    HotplugPollRate 0

 

For NVMe-oF, the TransportID should be modified to something like this:

    TransportId "trtype:rdma adrfam:IPv4 traddr:10.9.1.118 trsvcid:4420 subnqn:test" Nvme0_wolf-118.wolf.hpdd.intel.com_0

 

Ideally, we would just support a new “nvmeof” bdev_class in the DAOS yaml file that would automatically generate a daos_nvme.conf with NVMe-oF support, e.g.:

 

bdev_class: nvmeof

bdev_list: ["IPv4:10.9.1.118:4420:test"]

 

Any patches to support this in the control plane would be welcomed 😊

 

Another option is to use the Linux kernel NVMe-oF driver, which gives you a block device that you can then use in the DAOS yaml file via “bdev_class: kdev”. That’s definitely not the best path performance-wise, but it is by far the most straightforward way to use NVMe-oF with DAOS until the control plane supports it; see the sketch below.
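
A minimal sketch of that kdev path, assuming nvme-cli and the nvme-rdma kernel module on the server node; the transport address, subsystem NQN and resulting device name below are placeholders:

# attach the remote namespace with the kernel NVMe-oF initiator
modprobe nvme-rdma
nvme connect -t rdma -a 10.9.1.118 -s 4420 -n test
lsblk    # note the new /dev/nvmeXnY block device

# then reference that block device in the storage section of the server yaml
bdev_class: kdev
bdev_list: ["/dev/nvme2n1"]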

 

HTH

 

Cheers,

Johann

 



anton.brekhov@...
 

Johann and Niu, thanks a lot!

I'll try to test it and post the results in this topic!


anton.brekhov@...
 

On Thu, Sep 17, 2020 at 12:56 AM, Lombardi, Johann wrote:
adrfam:IPv4 traddr:10.9.1.118 trsvcid:4420 subnqn:test

I've tried changing daos_nvme.conf both while the daos server was running and before starting it, in order to connect a disk through RDMA. In both cases I cannot see it in the DAOS system, although nvme discover does see the exported disk.

When does SPDK pick this disk up? Or should I write this in other files? My daos_nvme.conf:
[Nvme]

    TransportID "trtype:PCIe traddr:0000:b1:00.0" Nvme_apache512_0

    TransportID "trtype:PCIe traddr:0000:b2:00.0" Nvme_apache512_1

    TransportID "trtype:PCIe traddr:0000:b3:00.0" Nvme_apache512_2

    TransportID "trtype:PCIe traddr:0000:b4:00.0" Nvme_apache512_3

    TransportID "trtype:rdma adrfam:IPv4 traddr:10.0.1.2 trsvcid:4420 subnqn:nvme-subsystem-name" Nvme_apache512_4

    RetryCount 4

    TimeoutUsec 0

    ActionOnTimeout None

    AdminPollRate 100000

    HotplugEnable No

    HotplugPollRate 0

And nvme discover output:

[root@apache512 ~]# nvme discover -t rdma -a 10.0.1.2 -s 4420

 

Discovery Log Number of Records 1, Generation counter 2

=====Discovery Log Entry 0======

trtype:  rdma

adrfam:  ipv4

subtype: nvme subsystem

treq:    not specified, sq flow control disable supported

portid:  1

trsvcid: 4420

subnqn:  nvme-subsystem-name

traddr:  10.0.1.2

rdma_prtype: not specified

rdma_qptype: connected

rdma_cms:    rdma-cm

 

rdma_pkey: 0x0000


Lombardi, Johann
 

Hi Anton,

 

The DAOS control plane does not support NVMe-oF targets yet; that’s why you don’t see them via storage scan. That being said, my hope was that, by changing the daos_nvme.conf manually as you did, you would be able to start the I/O server with NVMe-oF targets. Maybe some glue is still missing.

 

Cheers,

Johann

 



Nabarro, Tom
 

In order to get DAOS up and running with NVMe-oF, I think some internal changes in bio might be needed.

Does the use of NVMe-oF provide specific benefits in a DAOS installation? I would like to understand the incentives behind using NVMe-oF with DAOS.

 

Command/log output would also be useful.

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



anton.brekhov@...
 

Hi Tom!
In my opinion it can be useful for some data-centric workflows, especially in the HPC sector or as a fast cache.
 


anton.brekhov@...
 

Is the SPDK used by DAOS configured with RDMA support?

./configure --with-rdma <other config parameters>

make



Nabarro, Tom
 

Yes, this is how we configure the build, thanks for the use case explanation.

 

commands=['./configure --prefix="$SPDK_PREFIX"' \
          ' --disable-tests --without-vhost --without-crypto' \
          ' --without-pmdk --without-vpp --without-rbd' \
          ' --with-rdma --with-shared' \
          ' --without-iscsi-initiator --without-isal' \
          ' --without-vtune', 'make $JOBS_OPT', 'make install',
          'cp dpdk/build/lib/* "$SPDK_PREFIX/lib"',
          'mkdir -p "$SPDK_PREFIX/share/spdk"',
          'cp -r include scripts "$SPDK_PREFIX/share/spdk"'],

 

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



anton.brekhov@...
 

Tom, thank you! 

So if SPDK is configured with RDMA support, we should be able to connect the disk with the SPDK API.

But I've run into an issue connecting to the SPDK socket:

[root@apache512 ~]# spdk-rpc -v -s /var/run/dpdk/spdk125408/mp_socket get_rpc_methods

INFO: Log level set to 20

Traceback (most recent call last):

  File "/usr/sbin/spdk-rpc", line 1813, in <module>

    args.client = rpc.client.JSONRPCClient(args.server_addr, args.port, args.timeout, log_level=getattr(logging, args.verbose.upper()))

  File "/usr/share/spdk/scripts/rpc/client.py", line 49, in __init__

    "Error details: %s" % (addr, ex))

rpc.client.JSONRPCException


Am I using the right socket?


Nabarro, Tom
 

I’m sorry, I’m unfamiliar with the SPDK RPC client; we manually parse the ini-style config file and directly call the SPDK configuration API, all within src/bio/bio_xstream.c: bio_nvme_init().

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



anton.brekhov@...
 

Tom, thank you! 

We'll try to connect!


anton.brekhov@...
 

Is there any way to prevent SPDK from taking over the local NVMe drives? We want to attach all NVMe disks through kdev, but after the server starts, all of them disappear from the /dev/nvmeX path.

bdev_exclude didn't help.


anton.brekhov@...
 

Is it OK that daos_server storage scan says that both pmem namespaces are on one socket:

[root@apache512 ib0]# daos_server storage scan

Scanning locally-attached storage...

NVMe controllers and namespaces:

PCI:0000:b3:00.0 Model:INTEL SSDPELKX010T8  FW:VCV10300 Socket:1 Capacity:1.0 TB

PCI:0000:b4:00.0 Model:INTEL SSDPELKX010T8  FW:VCV10300 Socket:1 Capacity:1.0 TB

SCM Namespaces:

Device:pmem0 Socket:0 Capacity:981 GB

Device:pmem1 Socket:0 Capacity:981 GB


But ipmctl says that the DCPMMs are on two separate sockets?
[root@apache512 ib0]# ipmctl show -topology -socket 0
 DimmID | MemoryType                  | Capacity    | PhysicalID| DeviceLocator
================================================================================
 0x0001 | Logical Non-Volatile Device | 502.563 GiB | 0x001e    | CPU1_DIMM_A2
 0x0101 | Logical Non-Volatile Device | 502.563 GiB | 0x0024    | CPU1_DIMM_D2
 N/A    | DDR4                        | 16.000 GiB  | 0x001d    | CPU1_DIMM_A1
 N/A    | DDR4                        | 16.000 GiB  | 0x001f    | CPU1_DIMM_B1
 N/A    | DDR4                        | 16.000 GiB  | 0x0021    | CPU1_DIMM_C1
 N/A    | DDR4                        | 16.000 GiB  | 0x0023    | CPU1_DIMM_D1
 N/A    | DDR4                        | 16.000 GiB  | 0x0025    | CPU1_DIMM_E1
 N/A    | DDR4                        | 16.000 GiB  | 0x0027    | CPU1_DIMM_F1
 
[root@apache512 ib0]# ipmctl show -topology -socket 1
 DimmID | MemoryType                  | Capacity    | PhysicalID| DeviceLocator
================================================================================
 0x1001 | Logical Non-Volatile Device | 502.563 GiB | 0x002a    | CPU2_DIMM_A2
 0x1101 | Logical Non-Volatile Device | 502.563 GiB | 0x0030    | CPU2_DIMM_D2
 N/A    | DDR4                        | 16.000 GiB  | 0x0029    | CPU2_DIMM_A1
 N/A    | DDR4                        | 16.000 GiB  | 0x002b    | CPU2_DIMM_B1
 N/A    | DDR4                        | 16.000 GiB  | 0x002d    | CPU2_DIMM_C1
 N/A    | DDR4                        | 16.000 GiB  | 0x002f    | CPU2_DIMM_D1
 N/A    | DDR4                        | 16.000 GiB  | 0x0031    | CPU2_DIMM_E1
 N/A    | DDR4                        | 16.000 GiB  | 0x0033    | CPU2_DIMM_F1
 


Nabarro, Tom
 

Yes, this is a bug with the current CentOS 7 kernel and ndctl; I will find the ticket number.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



Nabarro, Tom
 

The regression (https://github.com/pmem/ndctl/issues/130) is fixed in CentOS 8.

 



anton.brekhov@...
 

Tom, thanks for the explanation!

Do you know anything about this:

Is there any way to prevent SPDK from taking over the local NVMe drives? We want to attach all NVMe disks through kdev, but after the server starts, all of them disappear from the /dev/nvmeX path.

bdev_exclude didn't help.

 


Steffen Christgau
 

On 15/10/2020 13.21, anton.brekhov@... wrote:
Is there any way to prevent SPDK from taking over the local NVMe drives? We want to attach all NVMe disks through kdev, but after the server starts, all of them disappear from the /dev/nvmeX path.

AFAIK a mounted drive is not "taken over" by SPDK/DAOS. You could therefore temporarily mount the drives, start the server, and unmount them afterwards. It's only a workaround rather than a clean solution, but maybe it works for you.
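
A rough sketch of that workaround, assuming the drive already carries a filesystem and that /dev/nvme0n1 and /mnt/keep0 are placeholder names:

mkdir -p /mnt/keep0
mount /dev/nvme0n1 /mnt/keep0   # a mounted drive stays bound to the kernel driver
systemctl start daos_server     # start the server (or launch it however you normally do)
umount /mnt/keep0               # release the drive once the server is up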

Regards, Steffen


Nabarro, Tom
 

bdev_exclude should work in this situation. Could you please share the server config file, and also enable and share the privileged helper log file by setting "helper_log_file: /tmp/daos_admin.log" (or similar) in the global section of the server config file? That should give us more insight.
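
For illustration, a hedged sketch of what those entries might look like in the global section of the server config file; the PCI addresses are placeholders taken from the earlier scan output:

helper_log_file: /tmp/daos_admin.log
bdev_exclude: ["0000:b1:00.0", "0000:b2:00.0"]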

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 
