Re: issues with NVMe drives from RPM installation
Dahringer, Richard
That worked!
Thanks Tom!
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Nabarro, Tom
Sent: Thursday, July 30, 2020 12:11
To: daos@daos.groups.io
Subject: Re: [daos] issues with NVMe drives from RPM installation
It sounds like the metadata may be out of sync. Can you try removing /mnt/daos0/*, starting the server, and then (on a separate tty) reformatting with "dmg storage format --reformat"?
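The steps suggested above can be sketched as a small shell script. This is only an illustration, not an official DAOS procedure: the `run` helper and the `DRY_RUN` switch are hypothetical, the `rm -rf` flags are an assumption (the message only says to remove /mnt/daos0/*), and the config file name is taken from the original post further down. By default the script prints what it would do instead of deleting anything.

```shell
#!/bin/sh
# Dry-run by default: each step is printed, not executed.
# Set DRY_RUN=0 to really run step 1 (destructive: wipes the SCM mount contents).
DRY_RUN="${DRY_RUN:-1}"
SCM_MOUNT="/mnt/daos0"          # scm_mount value from the posted config
SERVER_CONFIG="daos_local.yml"  # config file name used in the original post

# Hypothetical helper: echo the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: remove the stale contents of the SCM mount.
# ${SCM_MOUNT:?} aborts if the variable is empty, so this can never become "rm -rf /*".
run sh -c "rm -rf ${SCM_MOUNT:?}/*"

# Steps 2 and 3 need two terminals because the server runs in the foreground,
# so they are printed as instructions rather than executed here.
echo "tty 1: daos_server -o ${SERVER_CONFIG} start"
echo "tty 2: dmg storage format --reformat"
```

Running it unchanged only prints the three steps, which makes it easy to check the paths before setting DRY_RUN=0.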
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Dahringer, Richard
Thanks Tom, that led me to this:
07/30-08:21:10.77 elfs13o01 DAOS[74504/74524] bio ERR src/bio/bio_xstream.c:877 init_blobstore_ctxt() Device list & device mapping is inconsistent
07/30-08:21:14.13 elfs13o01 DAOS[74504/74524] server ERR src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1005
When I check for consistency, I see:
[root@elfs13o01 tmp]# daos_server storage scan
Scanning locally-attached storage...
ERROR: /usr/bin/daos_admin EAL: No free hugepages reported in hugepages-1048576kB
NVMe controllers and namespaces:
    PCI:0000:5e:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:0 Capacity:4.0 TB
    PCI:0000:5f:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:0 Capacity:4.0 TB
    PCI:0000:d8:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:1 Capacity:4.0 TB
    PCI:0000:d9:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:1 Capacity:4.0 TB
SCM Namespaces:
    Device:pmem0 Socket:0 Capacity:266 GB
    Device:pmem1 Socket:1 Capacity:266 GB
The first NVMe controller listed is the drive I have in the configuration file (from below):
bdev_class: nvme
Is there another file somewhere that I need to set up? I saw some documentation of ‘daos_nvme.conf’, which is automatically generated. I added the second NVMe device on socket 0 to the configuration to see whether that would change anything, but I get the same results.
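The comparison described above, checking that the configured drive appears in the scan output, can be sketched in shell. This is a minimal illustration under stated assumptions: the sample text is the NVMe portion of the scan output from this message, and `configured_addr` is a hypothetical variable holding the address of the first listed controller, since the configured address itself is not shown in the thread.

```shell
#!/bin/sh
# NVMe lines copied from the "daos_server storage scan" output in this message.
scan_output='PCI:0000:5e:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:0 Capacity:4.0 TB
PCI:0000:5f:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:0 Capacity:4.0 TB
PCI:0000:d8:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:1 Capacity:4.0 TB
PCI:0000:d9:00.0 Model:INTEL SSDPE2KX040T8 FW:VDV10131 Socket:1 Capacity:4.0 TB'

# Assumption: the configured device is the first controller in the list.
configured_addr="0000:5e:00.0"

# Keep only the PCI address from each line that starts with "PCI:".
scanned_addrs=$(printf '%s\n' "$scan_output" | sed -n 's/^PCI:\([0-9a-fA-F:.]*\) .*/\1/p')

# Exact, whole-line match of the configured address against the scanned ones.
if printf '%s\n' "$scanned_addrs" | grep -qxF "$configured_addr"; then
    match="yes"
else
    match="no"
fi
echo "configured $configured_addr present in scan: $match"
```

A match only confirms that the controller is visible to the scan. As this thread shows, the "Device list & device mapping is inconsistent" error can still occur when the address matches.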
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Nabarro, Tom
Hello Richard,
"ERROR: DAOS I/O Server exited with error: /usr/bin/daos_io_server (instance 0) exited: exit status 1"
Regards,
Tom Nabarro – HPC
M: +44 (0)7786 260986
Skype: tom.nabarro
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of richard.dahringer@...
Hi all -
name: daos_server
servers:
env_vars:
# Storage definitions
# When scm_class is set to ram, tmpfs will be used to emulate SCM.
# The size of ram is specified by scm_size in GB units.
scm_mount: /mnt/daos0 # map to -s /mnt/daos
bdev_class: nvme
[root@elfs13o01 ~]# daos_server -o daos_local.yml start
---------------------------------------------------------------------
This e-mail and any attachments may contain confidential material for