Re: DAOS with NVMe-over-Fabrics


Nabarro, Tom

Hello Anton,

 

It doesn't look like the bdev_exclude list is being honoured. I'm on leave today, but I'll verify the behaviour on master tomorrow and supply a fix if it's broken.
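
Roughly, the intended behaviour is that any controller whose PCI address appears in bdev_exclude gets dropped from the set handed to SPDK before any rebinding happens. A minimal sketch of that filtering, with illustrative names rather than the actual control-plane code:

package main

import "fmt"

// filterExcluded drops any discovered PCI address that appears in the
// exclude list, mirroring the intended bdev_exclude semantics.
func filterExcluded(discovered, excluded []string) []string {
    skip := make(map[string]bool, len(excluded))
    for _, addr := range excluded {
        skip[addr] = true
    }
    var kept []string
    for _, addr := range discovered {
        if !skip[addr] {
            kept = append(kept, addr)
        }
    }
    return kept
}

func main() {
    discovered := []string{"0000:b1:00.0", "0000:b2:00.0", "0000:b3:00.0", "0000:b4:00.0"}
    excluded := discovered // the config below excludes all four SSDs
    // With that config, nothing should be left for SPDK to rebind.
    fmt.Println(filterExcluded(discovered, excluded)) // prints []
}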

 

Regards,

Tom Nabarro – DCG/ESAD
M: +44 (0)7786 260986
Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of anton.brekhov@...
Sent: Sunday, October 18, 2020 1:28 PM
To: daos@daos.groups.io
Subject: Re: [daos] DAOS with NVMe-over-Fabrics

 

Tom, thanks for the comments!

I've edited my previous message; I'll copy it below.

Here is the active config file content:

name: daos_server
access_points: ['apache512']
#access_points: ['localhost']
port: 10001
#provider: ofi+sockets
provider: ofi+verbs;ofi_rxm
nr_hugepages: 4096
control_log_file: /tmp/daos_control.log
helper_log_file: /tmp/daos_admin.log
bdev_exclude: ["0000:b1:00.0","0000:b2:00.0","0000:b3:00.0","0000:b4:00.0"]

transport_config:
  allow_insecure: true

servers:
-
  targets: 4
  first_core: 0
  nr_xs_helpers: 0
  fabric_iface: ib0
  fabric_iface_port: 31416
  log_mask: ERR
  log_file: /tmp/daos_server.log

  env_vars:
  - DAOS_MD_CAP=1024
  - CRT_CTX_SHARE_ADDR=0
  - CRT_TIMEOUT=30
  - FI_SOCKETS_MAX_CONN_RETRY=1
  - FI_SOCKETS_CONN_TIMEOUT=2000
  #- OFI_INTERFACE=ib0
  #- OFI_DOMAIN=mlx5_0
  #- CRT_PHY_ADDR_STR=ofi+verbs;ofi_rxm

  # Storage definitions

  # When scm_class is set to ram, tmpfs will be used to emulate SCM.
  # The size of ram is specified by scm_size in GB units.
  scm_mount: /mnt/daos  # map to -s /mnt/daos
  #scm_class: ram
  #scm_size: 8
  scm_class: dcpm
  scm_list: [/dev/pmem0]

  bdev_class: nvme
  #bdev_list: ["0000:b1:00.0","0000:b2:00.0"]
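
As I understand it, bdev_exclude is a global (top-level) setting rather than a per-server one, which is how it's placed above. A minimal sketch of how such fields would unmarshal with gopkg.in/yaml.v2 (the struct and field names here are illustrative, not the actual DAOS control-plane types):

package main

import (
    "fmt"

    yaml "gopkg.in/yaml.v2"
)

// serverConfig mirrors just the fields relevant here: a global exclude
// list plus per-server bdev settings.
type serverConfig struct {
    BdevExclude []string `yaml:"bdev_exclude"`
    Servers     []struct {
        BdevClass string   `yaml:"bdev_class"`
        BdevList  []string `yaml:"bdev_list"`
    } `yaml:"servers"`
}

func main() {
    data := []byte(`
bdev_exclude: ["0000:b1:00.0", "0000:b2:00.0"]
servers:
- bdev_class: nvme
`)
    var cfg serverConfig
    if err := yaml.Unmarshal(data, &cfg); err != nil {
        panic(err)
    }
    fmt.Println(cfg.BdevExclude) // [0000:b1:00.0 0000:b2:00.0]
}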

 

Before starting the server:

[root@apache512 ~]# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     BTLJ81460E1M1P0I     INTEL SSDPELKX010T8                      1           1,00  TB /   1,00  TB    512   B +  0 B   VCV10300
/dev/nvme1n1     BTLJ81460E031P0I     INTEL SSDPELKX010T8                      1           1,00  TB /   1,00  TB    512   B +  0 B   VCV10300
/dev/nvme2n1     BTLJ81460E1J1P0I     INTEL SSDPELKX010T8                      1           1,00  TB /   1,00  TB    512   B +  0 B   VCV10300
/dev/nvme3n1     BTLJ81460E341P0I     INTEL SSDPELKX010T8                      1           1,00  TB /   1,00  TB    512   B +  0 B   VCV10300

After starting the server:

[root@apache512 ~]# nvme list
[root@apache512 ~]#
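
(The devices vanish from nvme list because setup.sh has unbound them from the kernel nvme driver.) One way to check which driver now owns each device is to read its sysfs driver symlink; a minimal Go sketch, assuming the standard sysfs layout:

package main

import (
    "fmt"
    "os"
    "path/filepath"
)

func main() {
    // The four SSDs from bdev_exclude; after setup.sh runs they show up
    // bound to uio_pci_generic instead of nvme.
    for _, addr := range []string{"0000:b1:00.0", "0000:b2:00.0", "0000:b3:00.0", "0000:b4:00.0"} {
        link, err := os.Readlink(filepath.Join("/sys/bus/pci/devices", addr, "driver"))
        if err != nil {
            fmt.Printf("%s: no driver bound (%v)\n", addr, err)
            continue
        }
        fmt.Printf("%s -> %s\n", addr, filepath.Base(link))
    }
}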

The helper_log_file content:

calling into script: /usr/share/daos/control/../../spdk/scripts/setup.sh
0000:b1:00.0 (8086 0a54): nvme -> uio_pci_generic
0000:b2:00.0 (8086 0a54): nvme -> uio_pci_generic
0000:b3:00.0 (8086 0a54): nvme -> uio_pci_generic
0000:b4:00.0 (8086 0a54): nvme -> uio_pci_generic
0000:00:04.0 (8086 2021): no driver -> uio_pci_generic
0000:00:04.1 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.2 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.3 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.4 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.5 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.6 (8086 2021): Already using the uio_pci_generic driver
0000:00:04.7 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.0 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.1 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.2 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.3 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.4 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.5 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.6 (8086 2021): Already using the uio_pci_generic driver
0000:80:04.7 (8086 2021): Already using the uio_pci_generic driver
RUN: ls -d /dev/hugepages | xargs -r chown -R root
RUN: ls -d /dev/uio* | xargs -r chown -R root
RUN: ls -d /sys/class/uio/uio*/device/config | xargs -r chown -R root
RUN: ls -d /sys/class/uio/uio*/device/resource* | xargs -r chown -R root
Setting VFIO file permissions for unprivileged access
RUN: chmod /dev/vfio
OK
RUN: chmod /dev/vfio/*
OK
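
If I understand setup.sh correctly, the script itself can be told to leave devices alone via a PCI_BLACKLIST environment variable (renamed PCI_BLOCKED in later SPDK releases), so whether that list is being forwarded from bdev_exclude seems like the thing to check. A hedged sketch for driving the script by hand, treating the variable name as an assumption against this SPDK version:

package main

import (
    "os"
    "os/exec"
)

func main() {
    // Path taken from the helper log above; PCI_BLACKLIST is an assumption
    // about this SPDK version's setup.sh interface, not a confirmed flag.
    cmd := exec.Command("/usr/share/spdk/scripts/setup.sh")
    cmd.Env = append(os.Environ(),
        "PCI_BLACKLIST=0000:b1:00.0 0000:b2:00.0 0000:b3:00.0 0000:b4:00.0")
    cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
    if err := cmd.Run(); err != nil {
        panic(err)
    }
}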

 

DEBUG 17:41:13.340092 nvme.go:176: discovered nvme ssds: [0000:b4:00.0 0000:b3:00.0 0000:b1:00.0 0000:b2:00.0]
DEBUG 17:41:13.340794 nvme.go:133: removed lockfiles: [/tmp/spdk_pci_lock_0000:b4:00.0 /tmp/spdk_pci_lock_0000:b3:00.0 /tmp/spdk_pci_lock_0000:b1:00.0 /tmp/spdk_pci_lock_0000:b2:00.0]
DEBUG 17:41:13.502291 ipmctl.go:104: discovered 4 DCPM modules
DEBUG 17:41:13.517978 ipmctl.go:356: discovered 2 DCPM namespaces
DEBUG 17:41:13.775184 ipmctl.go:133: show region output: ---ISetID=0xfe0ceeb819432444---
   PersistentMemoryType=AppDirect
   FreeCapacity=0.000 GiB
---ISetID=0xe1f4eeb8c7432444---
   PersistentMemoryType=AppDirect
   FreeCapacity=0.000 GiB

DEBUG 17:41:13.973259 ipmctl.go:104: discovered 4 DCPM modules
DEBUG 17:41:13.988400 ipmctl.go:356: discovered 2 DCPM namespaces
DEBUG 17:41:14.234782 ipmctl.go:133: show region output: ---ISetID=0xfe0ceeb819432444---
   PersistentMemoryType=AppDirect
   FreeCapacity=0.000 GiB
---ISetID=0xe1f4eeb8c7432444---
   PersistentMemoryType=AppDirect
   FreeCapacity=0.000 GiB
