
Re: issues with NVMe drives from RPM installation

Farrell, Patrick Arthur
 

Richard,

There's nothing obviously wrong - to me, anyway - with your config, and no useful errors in the output.  You can check the logs in /tmp/daos*.log (there will be multiple files); they should contain more information.  You could also turn on debug logging before you start the server to get more detail, as described in the manual: https://daos-stack.github.io/admin/troubleshooting/

Also, if you have not already, you can check that your drives are visible to DAOS and can be prepared as expected with the daos_server storage commands, scan and prepare, detailed here:
https://daos-stack.github.io/admin/deployment/

That page details how to run them for SCM; look at the command help for how to run them for NVMe devices.  (You'll want to select NVMe only, or it may ask you to reboot to set up your SCM goals, which you've obviously already done.)
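
For example, something like this (scan and prepare are the subcommands that guide describes; the --nvme-only flag is from memory, so check daos_server storage prepare --help on your version):

daos_server storage scan
daos_server storage prepare --nvme-only

The scan output should list all four of your NVMe controllers; if one is missing there, its bdev_list entry won't be usable either.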

Regards,
-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of richard.dahringer@... <richard.dahringer@...>
Sent: Thursday, July 30, 2020 9:27 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] issues with NVMe drives from RPM installation
 

Hi all -
I'm trying to set up a proof-of-concept DAOS cluster, and it is proving to be tricky. The systems have four 128 GB SCM DIMMs and four U.2 NVMe drives installed. I have installed all the RPMs from registrationcenter.intel.com and have been able to set up the SCM devices; 'dmg -i' commands all seem to work.  When I add NVMe drives to the configuration, though, daos_server does not start - it does start when the NVMe drives are not there.

My daos_server.conf file:

name: daos_server
access_points: ['elfs13o01']
# port: 10001
provider: ofi+psm2
nr_hugepages: 4096
control_log_file: /tmp/daos_control.log
transport_config:
   allow_insecure: true

servers:
-
  targets: 1
  first_core: 0
  nr_xs_helpers: 0
  fabric_iface: hib0
  fabric_iface_port: 31416
  log_file: /tmp/daos_server.log

  env_vars:
  - DAOS_MD_CAP=1024
  - CRT_CTX_SHARE_ADDR=0
  - CRT_TIMEOUT=30
  - FI_SOCKETS_MAX_CONN_RETRY=1
  - FI_SOCKETS_CONN_TIMEOUT=2000

  # Storage definitions

  # When scm_class is set to ram, tmpfs will be used to emulate SCM.
  # The size of ram is specified by scm_size in GB units.
  scm_mount: /mnt/daos0  # map to -s /mnt/daos
  scm_class: dcpm
  scm_list: [/dev/pmem0]

  bdev_class: nvme
  bdev_list: ["0000:5e:00.0"]

The startup error:

[root@elfs13o01 ~]# daos_server -o daos_local.yml start
daos_server logging to file /tmp/daos_control.log
ERROR: /usr/bin/daos_admin EAL: No free hugepages reported in hugepages-1048576kB
DAOS Control Server (pid 73257) listening on 0.0.0.0:10001
Waiting for DAOS I/O Server instance storage to be ready...
SCM @ /mnt/daos0: 262 GB Total/247 GB Avail
Starting I/O server instance 0: /usr/bin/daos_io_server
daos_io_server:0 Using legacy core allocation algorithm
daos_io_server:0 Starting SPDK v19.04.1 / DPDK 19.02.0 initialization...
[ DPDK EAL parameters: daos -c 0x1 --pci-whitelist=0000:5e:00.0 --log-level=lib.eal:6 --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk73258 --proc-type=auto ]
ERROR: daos_io_server:0 EAL: No free hugepages reported in hugepages-1048576kB
ERROR: /var/run/daos_server/daos_server.sock: failed to accept connection: accept unixpacket /var/run/daos_server/daos_server.sock: use of closed network connection
ERROR: DAOS I/O Server exited with error: /usr/bin/daos_io_server (instance 0) exited: exit status 1

Can someone provide some pointers to what is going on? 
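
(One note on the log above: the EAL message "No free hugepages reported in hugepages-1048576kB" normally just means that no 1 GiB hugepages are configured, and it is harmless as long as 2 MiB pages are available. A quick, DAOS-independent check of what the node actually has reserved and free:

grep Huge /proc/meminfo
cat /proc/sys/vm/nr_hugepages

The second value should line up with the nr_hugepages: 4096 setting in the config above.)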




DAOS on CentOS 8

Farrell, Patrick Arthur
 

Good afternoon,

Has anyone been running DAOS on CentOS 8?  We tried this a while back and had some significant issues related to 'python' referring to Python 3 rather than Python 2.  Before we try again, I was curious whether anyone has been doing this and if there are any caveats or known problems.

Regards,
-Patrick
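
(A side note for anyone who hits the same python issue: CentOS/RHEL 8 ships no unversioned python command by default, and the alternatives mechanism can point it at Python 2, e.g.

alternatives --set python /usr/bin/python2

That is just stock CentOS 8 behavior; whether it resolves all of the DAOS build issues is a separate question.)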


Re: DAOS ior module has compilation errors

Chaarawi, Mohamad
 

Hi Kevan,

 

Yes, I missed this in the latest fixes updating the DAOS & DFS drivers to the new IOR backend API.

This PR fixes it:

https://github.com/hpc/ior/pull/244

 

Hopefully it will land quickly.

 

Thanks,

Mohamad

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday, July 22, 2020 at 9:55 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] DAOS ior module has compilation errors

 

Apologies if this is old news, but the upstream ior master branch src/aiori-DAOS.c file does not compile cleanly; the signature for .get_file_size has apparently changed, with the MPI_Comm parameter removed.

 

Regards, Kevan

 

  CC       libaiori_a-aiori-DUMMY.o
  CC       libaiori_a-aiori-MPIIO.o
  CC       libaiori_a-aiori-MMAP.o
  CC       libaiori_a-aiori-POSIX.o
  CC       libaiori_a-aiori-DAOS.o
aiori-DAOS.c:109:9: warning: initialization from incompatible pointer type [enabled by default]
         .get_file_size = DAOS_GetFileSize,
         ^
aiori-DAOS.c:109:9: warning: (near initialization for ‘daos_aiori.get_file_size’) [enabled by default]
  CC       libaiori_a-aiori-DFS.o
  AR       libaiori.a
  CC       ior-ior-main.o
  CC       ior-aiori.o
  CC       ior-aiori-DUMMY.o
  CC       ior-aiori-MPIIO.o
  CC       ior-aiori-MMAP.o
  CC       ior-aiori-POSIX.o
  CC       ior-aiori-DAOS.o
aiori-DAOS.c:109:9: warning: initialization from incompatible pointer type [enabled by default]
         .get_file_size = DAOS_GetFileSize,
         ^
aiori-DAOS.c:109:9: warning: (near initialization for ‘daos_aiori.get_file_size’) [enabled by default]
  CC       ior-aiori-DFS.o
  CCLD     ior
  CC       mdtest-mdtest-main.o
  CC       mdtest-aiori.o
  CC       mdtest-aiori-DUMMY.o
  CC       mdtest-aiori-MPIIO.o
  CC       mdtest-aiori-MMAP.o
  CC       mdtest-aiori-POSIX.o
  CC       mdtest-aiori-DAOS.o
aiori-DAOS.c:109:9: warning: initialization from incompatible pointer type [enabled by default]
         .get_file_size = DAOS_GetFileSize,
         ^
aiori-DAOS.c:109:9: warning: (near initialization for ‘daos_aiori.get_file_size’) [enabled by default]
  CC       mdtest-aiori-DFS.o
  CCLD     mdtest
  CC       test/lib.o
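
(Until that PR lands, GitHub's pull/<n>/head refs are a convenient way to build and test the fix; a generic sketch, not taken from the thread, with 244 being the PR number from Mohamad's link above:

git clone https://github.com/hpc/ior.git
cd ior
git fetch origin pull/244/head:pr-244
git checkout pr-244
./bootstrap && ./configure && make    # plus whatever configure flags you normally use for the DAOS backend
)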




Announcement: DAOS 1.0.1 and pre-built RPMs are available

Prantis, Kelsey
 

Hello All,

 

We would like to announce that DAOS 1.0.1 is now generally available.

 

One of the DAOS dependencies, log4j, has issued a security alert, so we have updated DAOS from log4j 2.11.1 to 2.13.3. For more information, Log4j’s changelog may be found at https://logging.apache.org/log4j/2.x/changes-report.html.

 

This release does not contain any changes in DAOS itself.

 

Source code for this release is available at https://github.com/daos-stack/daos/releases/tag/v1.0.1.

 

Starting with DAOS 1.0.1, pre-built RPMs are also available. Instructions on obtaining the RPMs are available here: https://daos-stack.github.io/admin/installation/#distribution-packages.
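
(Once the repository from those instructions is configured, installation is typically a plain package transaction; a sketch, with package names assumed from the DAOS packaging rather than stated in this announcement:

yum install daos-server daos-client

See the linked page for the authoritative package list per distribution.)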

 

As always, please let us know if you have any issues or concerns.

 

Regards,

 

Kelsey Prantis

Senior Software Engineering Manager

Extreme Storage Architecture and Development Division

Intel

 


Re: Client IO Request.

Wang, Di
 

Yes, each target has its own progress ULT listening for requests.

Thanks
Wangdi

On Jul 11, 2020, at 4:27 PM, Colin Ngam <colin.ngam@...> wrote:



Thanks WangDi.

 

So each target on the Server is actually listening for RPC IO Requests.

 

Thanks.

 

Colin

 

From: <daos@daos.groups.io> on behalf of "Wang, Di" <di.wang@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Saturday, July 11, 2020 at 5:38 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Client IO Request.

 

Hello,

 

The endpoint is one target of the server, i.e. the client needs to specify both rank and target index (within that server).

 

Thanks

WangDi

 

On 7/11/20, 3:30 PM, "daos@daos.groups.io on behalf of Colin Ngam" <daos@daos.groups.io on behalf of colin.ngam@...> wrote:

 

Greetings,

 

When a Client makes an IO Request, is the endpoint the IO Server or the Target?

 

Thanks.

 

Colin




Re: dfuse fails mounting container

Ruben Felgenhauer <4felgenh@...>
 

Hi Mohamad,

Okay, I think that clears up everything - thanks very much!

The second question was regarding using DAOS-VOL to write data into DAOS and afterwards mounting the container with DFS to check the container contents on a POSIX level for convenience. But as I said, you already cleared that up perfectly.

I was focusing mainly on DAOS-VOL, so I'll definitely try out the same task as written above with MPICH instead.

Thanks again,
Ruben

On 11.07.20 at 20:43, Chaarawi, Mohamad wrote:

Hi Ruben,

I wouldn't call this requirement "new", as it has been there for a long time now.
But yes, when using a DFS container (MPI-IO, dfuse, HDF5 over POSIX or MPI-IO), the container must be of type POSIX.
Yes, I guess we can make this clearer in the documentation.

The HDF5 DAOS-VOL (which is different from running HDF5 with the native VOL over MPI-IO or POSIX on top of DAOS DFS) is a separate case. The HDF5 VOL plugin manages the container creation, and it uses type HDF5, not POSIX, since underneath it doesn't use the DFS or POSIX API. So I'm not sure what you mean there by DFS not working at all - could you elaborate more, please?

As for MPICH, it should still work fine; I'm not sure what doesn't work. But the container created has to be of type POSIX.
Note that the driver has been integrated into the upstream MPICH master repo, and there is no need to use the fork for that anymore.

Thanks,
Mohamad

On 7/11/20, 1:36 PM, "daos@daos.groups.io on behalf of Ruben Felgenhauer" <daos@daos.groups.io on behalf of 4felgenh@informatik.uni-hamburg.de> wrote:

Hi Mohamad,

thanks for the tip, I had not created the container as POSIX-type. I
remember that this wasn't explicitly necessary at some point in the past;
is this a new behavior? The "POSIX Namespace" page of the user guide
hints in this direction, but the necessity isn't really stated
explicitly, in my opinion.

Does this also mean that DFS will not work at all, if the container was
automatically created by DAOS-VOL / HDF5V? I figure that DAOS-VOL will
not create the container with the posix type?

And I guess, the forked MPICH with the adio driver wouldn't work either?

Kind regards,
Ruben

On 11.07.20 at 18:12, Chaarawi, Mohamad wrote:
>
> Hi Ruben,
>
> I assume you create the container with type POSIX?
>
> daos cont create --pool=$DAOS_POOL --svc=$DAOS_SVCL --type=POSIX
>
> and that succeeds and you pass that cont uuid to the dfuse command?
>
> After you create the container, could you please run a cont query on
> that to verify?
>
> daos cont query --pool=$DAOS_POOL --svc=$DAOS_SVCL --cont=$DAOS_CONT
>
> Thanks,
>
> Mohamad
>
> *From: *<daos@daos.groups.io> on behalf of Ruben Felgenhauer
> <4felgenh@informatik.uni-hamburg.de>
> *Reply-To: *"daos@daos.groups.io" <daos@daos.groups.io>
> *Date: *Saturday, July 11, 2020 at 4:32 AM
> *To: *"daos@daos.groups.io" <daos@daos.groups.io>
> *Subject: *[daos] dfuse fails mounting container
>
> Hi,
>
> I'm still failing to get the DAOS Fuse filesystem running. I'm using
> the daos_server_local.yml config with DAOS v0.9 and have server and
> agent running.
>
> Both the high level and low level Fuse interface are failing on me:
>
> Low level:
> OFI_INTERFACE=eth0 dfuse -S --mountpoint="$DFS_MNT" --svc="$DAOS_SVCL"
> --pool="$DAOS_POOL" --container="$DAOS_CONT" --foreground
>
> This returns immediately without any error message and will log the
> following in daos.log:
>
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] fi INFO
> src/gurt/fault_inject.c:486 d_fault_inject_init() No config file,
> fault injection is OFF.
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] crt INFO
> src/cart/crt_init.c:278 crt_init_opt() libcart version 4.7.0 initializing
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] crt WARN
> src/cart/crt_init.c:170 data_init() FI_UNIVERSE_SIZE was not set;
> setting to 2048
> 07/11-11:16:41.46 abu2 DAOS[62814/62814] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:328 main(0x559201d7b010) dfs_mount
> failed (22)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:362 main(0x559201d7b010) DFP left at the end
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:365 main(0x559201d7af80) DFS left at the end
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:372 main(0x559201d7af80) dfs_umount()
> failed (22)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:378 main(0x559201d7af80)
> daos_cont_close() failed: (-1002)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:390 main(0x559201d7b010)
> daos_pool_disconnect() failed: (-1002)
> 07/11-11:16:41.49 abu2 DAOS[62814/62814] dfuse INFO
> src/client/dfuse/dfuse_main.c:404 main() Exiting with status 0
>
> High Level:
> $ OFI_INTERFACE=eth0 dfuse_hl "$DFS_MNT" -s -f -d -p "$DAOS_POOL" -l
> "$DAOS_SVCL" -c "$DAOS_CONT"
> Pool Connect...
> DFS Pool = 0cce90ce-2f6c-4621-836f-b24476acefd0
> DFS SVCL = 0
> DFS Container: 6312c82d-7daf-34a2-edb7-3b45441cdd6f
> Failed dfs mount (22)
>
> This logs the following:
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] fi INFO
> src/gurt/fault_inject.c:486 d_fault_inject_init() No config file,
> fault injection is OFF.
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] crt INFO
> src/cart/crt_init.c:278 crt_init_opt() libcart version 4.7.0 initializing
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] crt WARN
> src/cart/crt_init.c:170 data_init() FI_UNIVERSE_SIZE was not set;
> setting to 2048
> 07/11-11:21:11.54 abu2 DAOS[62833/62833] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:21:11.56 abu2 DAOS[62833/62833] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
>
> Neither of these are mounting anything in "$DFS_MNT". Interestingly,
> if I leave out the --container at the low level dfuse, it seems to
> work at first, but later fails if I want to do anything with the
> $DFS_MNT directory.
>
> Kind regards,
> Ruben
>
>
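
(Pulling the working sequence out of this thread into one place, using the same v0.9-era options and the $DAOS_POOL / $DAOS_SVCL / $DAOS_CONT / $DFS_MNT variables as above:

daos cont create --pool=$DAOS_POOL --svc=$DAOS_SVCL --type=POSIX
daos cont query --pool=$DAOS_POOL --svc=$DAOS_SVCL --cont=$DAOS_CONT
OFI_INTERFACE=eth0 dfuse -S --mountpoint="$DFS_MNT" --svc="$DAOS_SVCL" --pool="$DAOS_POOL" --container="$DAOS_CONT" --foreground

The dfs_mount failed (22) in the logs is EINVAL, consistent with the container existing but not being of type POSIX, per Mohamad's diagnosis above.)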







Re: Intel Optane SSD mounting problem

Lombardi, Johann
 

Thanks a lot Patrick!

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 2 July 2020 at 18:30
To: "daos@daos.groups.io" <daos@daos.groups.io>
Cc: "Liu, Changpeng" <changpeng.liu@...>, "Harris, James R" <james.r.harris@...>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Johann,

 

Patch worked just fine - able to mount, etc.

 

Thanks much.

 

Regards,

-Patrick

 



Re: Intel Optane SSD mounting problem

Farrell, Patrick Arthur
 

Johann,

Patch worked just fine - able to mount, etc.

Thanks much.
 
Regards,
-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Sent: Wednesday, July 1, 2020 2:47 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Cc: Liu, Changpeng <changpeng.liu@...>; Harris, James R <james.r.harris@...>
Subject: Re: [daos] Intel Optane SSD mounting problem
 

Thanks Patrick!

 

The SPDK team came up with this patch: https://review.spdk.io/gerrit/c/spdk/spdk/+/3156

Would you mind giving it a try please? If this works, we will get the process started to integrate it into SPDK and update DAOS to latest SPDK.

Thanks in advance.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 30 June 2020 at 22:20
To: "daos@daos.groups.io" <daos@daos.groups.io>
Cc: "Liu, Changpeng" <changpeng.liu@...>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

As suspected, the SSDs are now coming online normally.

 

Let me know if there's any further troubleshooting the SPDK team would like.

 

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Sent: Tuesday, June 30, 2020 5:47 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Cc: Liu, Changpeng <changpeng.liu@...>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Hi Patrick,

 

I talked to the SPDK folks and they suggested trying to remove lines 1464-1470 in module/bdev/nvme/bdev_nvme.c:

 

-	if (spdk_nvme_ctrlr_get_flags(nvme_bdev_ctrlr->ctrlr) &
-	    SPDK_NVME_CTRLR_SECURITY_SEND_RECV_SUPPORTED) {
-		nvme_bdev_ctrlr->opal_dev = spdk_opal_dev_construct(nvme_bdev_ctrlr->ctrlr);
-		if (nvme_bdev_ctrlr->opal_dev == NULL) {
-			SPDK_ERRLOG("Failed to initialize Opal\n");
-		}
-	}

 

Could you please give this a spin?

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 25 June 2020 at 19:11
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

Thanks for the further information.  Something I don't quite follow:

"The product specification lists both Pyrite as Opal support, what is less clear in the documentation is that Opal is not enable."

 

Then, can Opal be enabled?  Or is it permanently disabled on this model?

 

A general question for you:
Is SPDK simply unable to manage some current Intel SSDs if they don't have full Opal support?

 

Separately, I'm going to dig a little, and I also plan to try older versions of DAOS/SPDK to see what I can figure out.

 

Thanks,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Thursday, June 25, 2020 12:05 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Patrick, Colin,

 

I verified with the NSG Business Unit that builds the 905P SSD.

The product specification lists both Pyrite and Opal support; what is less clear in the documentation is that Opal is not enabled. This is causing the behavior you are seeing.

The Identify command correctly returns that Security Send/Receive is supported, as this is needed for both Opal and Pyrite.

It is correct that SPDK returns “No Opal support” for the 905P SSD. I verified on my system that you do not see this with the Intel Optane SSD P4800X, as it supports Opal 2.0.

 

What I cannot explain right now is why the drive was working with DAOS in the past.

 

Regards,

 

Gert,

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Thursday, June 25, 2020 5:07 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Recall that we have used this SSD with DAOS in the past.  Whatever is preventing this now is a change, perhaps in SPDK but possibly in how DAOS is invoking it.

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nabarro, Tom <tom.nabarro@...>
Sent: Thursday, June 25, 2020 3:43 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

We probably need to get further information from NSG folks on product specifics regarding Opal support in 905p SSDs. This seems to be a prerequisite for device management through SPDK.

 

Tom

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Wednesday, June 24, 2020 8:42 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

We're seeing this when we run identify on our 905p SSDs:

"

NVMe Controller at 0000:1a:00.0 [8086:2700]

[...]

Serial Number:                         PHM29226005S480BGN
Model Number:                          INTEL SSDPE21D480GA

[...]

Admin Command Set Attributes

============================

Security Send/Receive:                 Supported

"

 

However, when I run nvme_manage, I get an error stating that Opal is not supported (the same error DAOS is spitting out, incidentally):

 

"Please Input PCI Address(domain:bus:dev.func):

0000:1a:00.00

Opal General Usage:

 

        [1: scan device]

        [2: init - take ownership and activate locking]

        [3: revert tper]

        [4: setup locking range]

        [5: list locking ranges]

        [6: enable user]

        [7: set new password]

        [8: add user to locking range]

        [9: lock/unlock range]

        [10: erase locking range]

        [0: quit]

1

[2020-06-24 14:37:34.452086] nvme_opal.c: 828:opal_discovery0_end: *ERROR*: Opal Not Supported.

"

 

Any thoughts?

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Wednesday, June 24, 2020 11:54 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Hi Tom, Colin,

I'm running Ubuntu 20.04 LTS (kernel: 5.4.0-37-generic) and compiled DAOS v1.0.0.
I also compiled the latest master as of yesterday, but it did not make a difference.

Any application that can manage Opal 2.0 can be used to check the status of the drive. I used sedutil-cli, which can be found at https://github.com/Drive-Trust-Alliance/sedutil; pre-built executables can be found at https://github.com/Drive-Trust-Alliance/sedutil/wiki/Executable-Distributions
You can run
# sedutil-cli --scan
and
# sedutil-cli --query <device>
and it will return useful information about the Opal status of the device.
sedutil-cli sends NVMe security commands, so the tool will only work if your SSD is bound to the NVMe driver.

In case the SSD is not bound to the NVMe driver, you can manage the drive with the nvme_manage SPDK utility, which you can find in /daos/_build.external/dev/spdk:
# ./examples/nvme/nvme_manage/nvme_manage
Select 8 to get into the OPAL NVMe Management Options, enter the PCIe address of your SSD from the list you are prompted with, and select 1 to scan the device. These steps give you the same results as running
# sedutil-cli --query <device>
in case the SSD is bound to the NVMe driver.

Regards,

Gert,
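
(If the drive is bound to the kernel nvme driver, nvme-cli offers yet another way to eyeball the same capability; a sketch, assuming nvme-cli is installed and that /dev/nvme0 is your controller:

nvme id-ctrl /dev/nvme0 -H | grep -i security

The -H output decodes the OACS field, which is where the "Security Send/Receive: Supported" line in the identify output above comes from.)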


Re: Intel Optane SSD mounting problem

Lombardi, Johann
 

Thanks Patrick!

 

The SPDK team came up with this patch: https://review.spdk.io/gerrit/c/spdk/spdk/+/3156

Would you mind giving it a try please? If this works, we will get the process started to integrate it into SPDK and update DAOS to latest SPDK.

Thanks in advance.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 30 June 2020 at 22:20
To: "daos@daos.groups.io" <daos@daos.groups.io>
Cc: "Liu, Changpeng" <changpeng.liu@...>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

As suspected, the SSDs are now coming online normally.

 

Let me know if there's any further troubleshooting the SPDK team would like.

 

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Sent: Tuesday, June 30, 2020 5:47 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Cc: Liu, Changpeng <changpeng.liu@...>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Hi Patrick,

 

I talked to the SPDK folks and they suggested to try removing line 1464-1470 in module/bdev/nvme/bdev_nvme.c:

 

-              if (spdk_nvme_ctrlr_get_flags(nvme_bdev_ctrlr->ctrlr) &

-                  SPDK_NVME_CTRLR_SECURITY_SEND_RECV_SUPPORTED) {

-                              nvme_bdev_ctrlr->opal_dev = spdk_opal_dev_construct(nvme_bdev_ctrlr->ctrlr);

-                              if (nvme_bdev_ctrlr->opal_dev == NULL) {

-                                              SPDK_ERRLOG("Failed to initialize Opal\n");

-                              }

-              }

 

Could you please give this a spin?

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 25 June 2020 at 19:11
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

Thanks for the further information.  Something I don't quite follow:

"The product specification lists both Pyrite as Opal support, what is less clear in the documentation is that Opal is not enable."

 

Then, can Opal be enabled?  Or is it permanently disabled on this model?

 

A general question for you:
Is SPDK simply totally unable to manage some current Intel SSDs if they don't have the full Opal support?

 

Separately, I'm going to dig a little, and I also plan to try older versions of DAOS/SPDK to see what I can figure out.

 

Thanks,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Thursday, June 25, 2020 12:05 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Patrick, Colin,

 

I verified with the NSG Business Unit that builds the 905P SSD.

The product specification lists both Pyrite as Opal support, what is less clear in the documentation is that Opal is not enable. This is causing the behavior you are seeing.

The Identify command correctly returns the Security Send/Receive is supported as this is needed for both Opal and Pyrite.

It is correct that SPDK returns “No Opal support” for the 905P SSD. I verified on my system you do not have this with the Intel Optane SSD P4800X as this support Opal 2.0

 

What I cannot explain right now is why the drive was working with DAOS in the past.

 

Regards,

 

Gert,

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Thursday, June 25, 2020 5:07 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Recall that we have used this SSD with DAOS in the past.  Whatever is preventing this now is a change, perhaps in SPDK but possibly in how DAOS is invoking it.

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nabarro, Tom <tom.nabarro@...>
Sent: Thursday, June 25, 2020 3:43 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

We probably need to get further information from NSG folks on product specifics regarding Opal support in 905p SSDs. This seems to be a prerequisite for device management through SPDK.

 

Tom

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Wednesday, June 24, 2020 8:42 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

We're seeing this when we run identify on our 905p SSDs:

"

NVMe Controller at 0000:1a:00.0 [8086:2700]

[...]

Serial Number:                         PHM29226005S480BGN
Model Number:                          INTEL SSDPE21D480GA

[...]

Admin Command Set Attributes

============================

Security Send/Receive:                 Supported

"

 

However, when I run nvme_manage, I get an error stating that Opal is not supported (the same error DAOS is spitting out, incidentally):

 

"Please Input PCI Address(domain:bus:dev.func):

0000:1a:00.00

Opal General Usage:

 

        [1: scan device]

        [2: init - take ownership and activate locking]

        [3: revert tper]

        [4: setup locking range]

        [5: list locking ranges]

        [6: enable user]

        [7: set new password]

        [8: add user to locking range]

        [9: lock/unlock range]

        [10: erase locking range]

        [0: quit]

1

[2020-06-24 14:37:34.452086] nvme_opal.c: 828:opal_discovery0_end: *ERROR*: Opal Not Supported.

"

 

Any thoughts?

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Wednesday, June 24, 2020 11:54 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Hi Tom, Colin,


I'm running Ubuntu 20.04 LTS  (kernel: 5.4.0-37-generic) and compiled DAOS v1.0.0.

I also compiled the latest master as of yesterday, but it did not make a difference.

Any application that can manage Opal 2.0 can be used to check the status of the drive. I used the sedutil-cli, can be found at https://github.com/Drive-Trust-Alliance/sedutil the executable can be found at https://github.com/Drive-Trust-Alliance/sedutil/wiki/Executable-Distributions 
You can run
# sedutil-cli --scan
and
# sedutil-cli --query <device>
it will return useful information about the Opal status of the device.
sedutil-cli will send NVMe security commands and the tool will only work if your SSD is bound the the NVMe driver.

In case the SSD is not bound to the NVMe driver you can manage the drive with the nvme_manage spdk utility you can found in the /daos/_build.external/dev/spdk
#
./examples/nvme/nvme_manage/nvme_manage 
Select 8 to get into the OPAL NVMe Management Options, enter the PCIe address of your SSD for the list you get prompted, select 1 to scan the device. These steps gives you the same results as running the 
# sedutil-cli --query <device> in case the SSD is bound to the NVMe driver.

Regards,

Gert,

_._,_._,_

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: Intel Optane SSD mounting problem

Farrell, Patrick Arthur
 

As suspected, the SSDs are now coming online normally.

Let me know if there's any further troubleshooting the SPDK team would like.

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Sent: Tuesday, June 30, 2020 5:47 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Cc: Liu, Changpeng <changpeng.liu@...>
Subject: Re: [daos] Intel Optane SSD mounting problem
 

Hi Patrick,

 

I talked to the SPDK folks and they suggested to try removing line 1464-1470 in module/bdev/nvme/bdev_nvme.c:

 

-              if (spdk_nvme_ctrlr_get_flags(nvme_bdev_ctrlr->ctrlr) &

-                  SPDK_NVME_CTRLR_SECURITY_SEND_RECV_SUPPORTED) {

-                              nvme_bdev_ctrlr->opal_dev = spdk_opal_dev_construct(nvme_bdev_ctrlr->ctrlr);

-                              if (nvme_bdev_ctrlr->opal_dev == NULL) {

-                                              SPDK_ERRLOG("Failed to initialize Opal\n");

-                              }

-              }

 

Could you please give this a spin?

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "Farrell, Patrick Arthur" <patrick.farrell@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 25 June 2020 at 19:11
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

Thanks for the further information.  Something I don't quite follow:

"The product specification lists both Pyrite as Opal support, what is less clear in the documentation is that Opal is not enable."

 

Then, can Opal be enabled?  Or is it permanently disabled on this model?

 

A general question for you:
Is SPDK simply totally unable to manage some current Intel SSDs if they don't have the full Opal support?

 

Separately, I'm going to dig a little, and I also plan to try older versions of DAOS/SPDK to see what I can figure out.

 

Thanks,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Thursday, June 25, 2020 12:05 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Patrick, Colin,

 

I verified with the NSG Business Unit that builds the 905P SSD.

The product specification lists both Pyrite as Opal support, what is less clear in the documentation is that Opal is not enable. This is causing the behavior you are seeing.

The Identify command correctly returns the Security Send/Receive is supported as this is needed for both Opal and Pyrite.

It is correct that SPDK returns “No Opal support” for the 905P SSD. I verified on my system you do not have this with the Intel Optane SSD P4800X as this support Opal 2.0

 

What I cannot explain right now is why the drive was working with DAOS in the past.

 

Regards,

 

Gert,

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Thursday, June 25, 2020 5:07 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Recall that we have used this SSD with DAOS in the past.  Whatever is preventing this now is a change, perhaps in SPDK but possibly in how DAOS is invoking it.

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nabarro, Tom <tom.nabarro@...>
Sent: Thursday, June 25, 2020 3:43 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

We probably need to get further information from NSG folks on product specifics regarding Opal support in 905p SSDs. This seems to be a prerequisite for device management through SPDK.

 

Tom

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Farrell, Patrick Arthur
Sent: Wednesday, June 24, 2020 8:42 PM
To: daos@daos.groups.io
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Gert,

 

We're seeing this when we run identify on our 905p SSDs:

"

NVMe Controller at 0000:1a:00.0 [8086:2700]

[...]

Serial Number:                         PHM29226005S480BGN
Model Number:                          INTEL SSDPE21D480GA

[...]

Admin Command Set Attributes

============================

Security Send/Receive:                 Supported

"

 

However, when I run nvme_manage, I get an error stating that Opal is not supported (the same error DAOS is spitting out, incidentally):

 

"Please Input PCI Address(domain:bus:dev.func):

0000:1a:00.00

Opal General Usage:

 

        [1: scan device]

        [2: init - take ownership and activate locking]

        [3: revert tper]

        [4: setup locking range]

        [5: list locking ranges]

        [6: enable user]

        [7: set new password]

        [8: add user to locking range]

        [9: lock/unlock range]

        [10: erase locking range]

        [0: quit]

1

[2020-06-24 14:37:34.452086] nvme_opal.c: 828:opal_discovery0_end: *ERROR*: Opal Not Supported.

"

 

Any thoughts?

 

Regards,

-Patrick


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Gert Pauwels (intel) <gert.pauwels@...>
Sent: Wednesday, June 24, 2020 11:54 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Intel Optane SSD mounting problem

 

Hi Tom, Colin,


I'm running Ubuntu 20.04 LTS  (kernel: 5.4.0-37-generic) and compiled DAOS v1.0.0.

I also compiled the latest master as of yesterday, but it did not make a difference.

Any application that can manage Opal 2.0 can be used to check the status of the drive. I used sedutil-cli, which can be found at https://github.com/Drive-Trust-Alliance/sedutil; pre-built executables are at https://github.com/Drive-Trust-Alliance/sedutil/wiki/Executable-Distributions
You can run
# sedutil-cli --scan
and
# sedutil-cli --query <device>
to get useful information about the Opal status of the device.
sedutil-cli sends NVMe security commands, so the tool will only work if your SSD is bound to the kernel NVMe driver.

In case the SSD is not bound to the NVMe driver, you can manage the drive with the nvme_manage SPDK utility, which you can find under /daos/_build.external/dev/spdk:
# ./examples/nvme/nvme_manage/nvme_manage
Select 8 to get into the Opal NVMe management options, enter the PCIe address of your SSD when prompted, and select 1 to scan the device. These steps give you the same results as running
# sedutil-cli --query <device>
when the SSD is bound to the NVMe driver.

Regards,

Gert,
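
Gert's two checks can also be reproduced programmatically with a few SPDK calls. A minimal sketch follows; it is hypothetical (the probe/attach boilerplate and the helper name opal_probe are not from this thread; only spdk_nvme_ctrlr_get_flags and spdk_opal_dev_construct come from the code under discussion), and like nvme_manage it assumes the SSD is unbound from the kernel driver so SPDK can claim it:

	#include <stdio.h>
	#include "spdk/env.h"
	#include "spdk/nvme.h"
	#include "spdk/opal.h"

	/* Hypothetical helper: report whether a controller that advertises
	 * Security Send/Receive actually completes Opal discovery. */
	static void
	opal_probe(struct spdk_nvme_ctrlr *ctrlr)
	{
		struct spdk_opal_dev *opal;

		if (!(spdk_nvme_ctrlr_get_flags(ctrlr) &
		      SPDK_NVME_CTRLR_SECURITY_SEND_RECV_SUPPORTED)) {
			printf("Security Send/Receive not supported\n");
			return;
		}
		/* Same call the bdev module makes; NULL means Opal discovery
		 * failed even though the security flag is set (the Pyrite case). */
		opal = spdk_opal_dev_construct(ctrlr);
		if (opal == NULL) {
			printf("Security Send/Receive set, but no Opal support\n");
		} else {
			printf("Opal supported\n");
			spdk_opal_dev_destruct(opal);
		}
	}

	static bool
	probe_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
		 struct spdk_nvme_ctrlr_opts *opts)
	{
		return true;	/* attach to every NVMe controller found */
	}

	static void
	attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
		  struct spdk_nvme_ctrlr *ctrlr, const struct spdk_nvme_ctrlr_opts *opts)
	{
		printf("%s: ", trid->traddr);
		opal_probe(ctrlr);
		spdk_nvme_detach(ctrlr);
	}

	int
	main(void)
	{
		struct spdk_env_opts opts;

		spdk_env_opts_init(&opts);
		opts.name = "opal_probe";
		if (spdk_env_init(&opts) < 0) {
			fprintf(stderr, "failed to initialize SPDK env\n");
			return 1;
		}
		/* NULL transport id: enumerate local PCIe NVMe devices. */
		return spdk_nvme_probe(NULL, NULL, probe_cb, attach_cb, NULL) != 0;
	}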

 



Re: Intel Optane SSD mounting problem

Farrell, Patrick Arthur
 

Heh, yes!  These are among the lines I was planning to try to remove, so will do.

It is interesting to note of course that SEND_RECV is reported by the tool Gert had me using, but that's not a total shock.

-Patrick


