Help debugging a daos I/O server startup failure?


Kevan Rehm
 

Greetings,

 

I’ve been debugging a daos_io_server startup problem for a couple of days now and have gotten nowhere, so it’s time to call in the experts. Two I/O servers are configured per node; one has 5 NVMe devices and the other has 6. Both report the following errors:

 

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:491 server_init() Network successfully initialized

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:500 server_init() Module vos,rdb,rsvc,security,mgmt,pool,cont,dtx,obj,rebuild successfully loaded

03/12-08:33:32.43 delphi-004 DAOS[141245/141287] bio  INFO src/bio/bio_xstream.c:961 bio_xsctxt_alloc() Initialize NVMe context, tgt_id:0, init_thread:(nil)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] bio  ERR  src/bio/bio_xstream.c:1019 bio_xsctxt_alloc() failed to initialize SPDK env, DER_INVAL(-1003)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] server ERR  src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1003

 

The failing function in bio_xsctxt_alloc() is spdk_env_init(), which just returns -1. Looking in /dev/hugepages, I see 40 hugepage files of 2 MiB each, owned by root. I did not expect that, because daos_server and the two daos_io_server processes are all running as user ‘daos’. I believe I have the file permissions set correctly:

 

[root@delphi-004 tmp]# cd ~daos/daos/install/bin

[root@delphi-004 bin]# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  5751760 Mar 12  2020 daos_admin

-rwxr-sr-x. 1 root daos_grp 16219920 Mar 12  2020 daos_server
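
For readers less familiar with the mode strings in the listing above, here is a minimal sketch of what they encode. The string -rwsr-x--- corresponds to octal mode 04750 (setuid bit set, no access for "other"), and -rwxr-sr-x corresponds to 02755 (setgid bit set). The helper names are illustrative only and are not part of DAOS.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* True when the setuid bit is set (the 's' in the owner triplet of ls -l). */
static bool has_setuid(mode_t mode)
{
    return (mode & S_ISUID) != 0;
}

/* True when the setgid bit is set (the 's' in the group triplet of ls -l). */
static bool has_setgid(mode_t mode)
{
    return (mode & S_ISGID) != 0;
}

/* True when users outside the owner and group have no access at all. */
static bool closed_to_others(mode_t mode)
{
    return (mode & S_IRWXO) == 0;
}
```

On a real system the mode would come from stat(), for example by passing st.st_mode of daos_admin to has_setuid().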

 

The daos_admin process also reported errors and exited, and daos_control.log mentions that it is discarding garbage responses.

 

Am I doing something obviously wrong? Attached are the daos_control.log and daos_io_server.log files, together with the daos .yml files.

 

If you need more info, let me know.

 

Thanks, Kevan


Kevan Rehm
 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message that daos_admin is trying to return, because there are 11 (eventually 12) NVMe devices on this node. I raised MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues remain.

 

Kevan

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 6:36 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Help debugging a daos I/O server startup failure?

 



Kevan Rehm
 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information. The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(); it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available(). That routine sets phys_addrs_available = false and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value; the test that actually fails is inside routine rte_mem_virt2phy():

 

        /*
         * the pfn (page frame number) are bits 0-54 (see
         * pagemap.txt in linux Documentation)
         */
        if ((page & 0x7fffffffffffffULL) == 0) {
                return RTE_BAD_IOVA;
        }

 

The bottom 55 bits of the word that was read are all zeros. The actual value of the word is 0x8180000000000000.
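
To make the value above easier to read, here is a minimal sketch that splits a pagemap entry into its fields, using the bit layout in the Linux pagemap documentation (bits 0-54 hold the PFN when the page is present, bit 55 is soft-dirty, bit 56 is exclusively-mapped, bit 62 is swapped, bit 63 is present). The function names are made up for illustration and are not DPDK APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0-54 of a pagemap entry: the page frame number when the page is present. */
#define PAGEMAP_PFN_MASK 0x7fffffffffffffULL

static uint64_t pagemap_pfn(uint64_t entry)
{
    return entry & PAGEMAP_PFN_MASK;
}

/* Bit 55: PTE is soft-dirty. */
static bool pagemap_soft_dirty(uint64_t entry)
{
    return ((entry >> 55) & 1) != 0;
}

/* Bit 56: page is exclusively mapped. */
static bool pagemap_exclusive(uint64_t entry)
{
    return ((entry >> 56) & 1) != 0;
}

/* Bit 62: page is swapped out. */
static bool pagemap_swapped(uint64_t entry)
{
    return ((entry >> 62) & 1) != 0;
}

/* Bit 63: page is present in RAM. */
static bool pagemap_present(uint64_t entry)
{
    return ((entry >> 63) & 1) != 0;
}
```

Applied to 0x8180000000000000, these helpers report a page that is present, exclusively mapped and soft-dirty, with a PFN field of zero. In other words, the page exists in memory but its frame number is not being disclosed to the reader.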

 

Step 3: Routine rte_service_init() gets called. It eventually calls alloc_seg(), which needs to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails. It tries to allocate segments on both sockets, but both attempts fail because phys_addrs_available applies to both sockets.
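
The interaction between Steps 1 to 3 can be summarized with a deliberately simplified model. This is not DPDK's code: the enum, the BAD_IOVA stand-in and the function name are hypothetical, and the sketch only assumes what the steps describe, namely that in physical-address mode a translation is refused once phys_addrs_available is false, whereas in virtual-address mode the virtual address itself would serve as the I/O address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for DPDK's RTE_BAD_IOVA. */
#define BAD_IOVA UINT64_MAX

/* Hypothetical mirror of the two IOVA modes named in Step 1. */
enum iova_mode { IOVA_PA, IOVA_VA };

/*
 * Simplified model of the address translation that alloc_seg() relies on.
 * phys_addr is whatever a pagemap lookup would have produced.
 */
static uint64_t model_virt2iova(enum iova_mode mode, bool phys_addrs_available,
                                uint64_t vaddr, uint64_t phys_addr)
{
    if (mode == IOVA_VA)
        return vaddr;       /* the IOMMU translates, no physical address needed */
    if (!phys_addrs_available)
        return BAD_IOVA;    /* the failing case described in Step 3 */
    return phys_addr;
}
```

Under this model the failure needs both conditions at once: physical-address mode selected in Step 1 and physical addresses found unavailable in Step 2.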

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed

 

I think the heart of the problem is this: when DPDK reads /proc/self/pagemap, why does it get a value whose bottom 55 bits are all zero?
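
One way to investigate this outside DPDK is a small stand-alone reader for /proc/self/pagemap. The sketch below is an assumption-laden illustration rather than DPDK's implementation; the function names are made up. It relies on the documented layout of one 8-byte entry per virtual page. Note that the Linux pagemap documentation states that, starting with Linux 4.2, the PFN field is zeroed when the reading process lacks CAP_SYS_ADMIN, which may be relevant to the value seen here.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset in the pagemap file of the 8-byte entry that covers vaddr. */
static off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
    return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/*
 * Read the pagemap entry for vaddr in the calling process.
 * Returns 0 on success and -1 on any failure.
 */
static int read_own_pagemap_entry(const void *vaddr, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    ssize_t n;
    int fd;

    if (page_size <= 0)
        return -1;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, entry, sizeof(*entry),
              pagemap_offset((uintptr_t)vaddr, page_size));
    close(fd);

    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Running a program built around read_own_pagemap_entry() once as the unprivileged user and once as root, on the same machine, would show whether the zero PFN depends on privileges rather than on anything DAOS-specific.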

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 



Kevan Rehm
 

(DAOS-4342 will be closed; a more recent master contains the fix. For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails. If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.

 

Am I doing something wrong in my attempts to run the DAOS servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 



Lombardi, Johann
 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike recently spent some time verifying that the DAOS server can run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional
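
As a quick sanity check of the IOMMU requirement, one common heuristic on Linux is that /sys/kernel/iommu_groups contains at least one entry when the IOMMU is active. The sketch below only counts directory entries; the function name is illustrative, and treating a non-empty directory as "IOMMU enabled" is an assumption, not something the DAOS documentation prescribes.

```c
#include <dirent.h>
#include <string.h>

/*
 * Count the entries in a directory, excluding "." and "..".
 * Returns -1 if the directory cannot be opened.
 */
static int count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0)
            count++;
    }
    closedir(dir);
    return count;
}
```

A result greater than zero for count_dir_entries("/sys/kernel/iommu_groups") suggests the kernel has IOMMU groups available for VFIO; zero or -1 suggests VT-d or the kernel IOMMU option is not enabled.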

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 




Kevan Rehm
 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike have spent some time recently to verify that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed, a  more recent master contains the fix.  For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time, the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails.   If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(), it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value, the actual test in the code that fails is inside routine rte_mem_virt2phy():

 

        /*

         * the pfn (page frame number) are bits 0-54 (see

         * pagemap.txt in linux Documentation)

         */

        if ((page & 0x7fffffffffffffULL) == 0) {

                return RTE_BAD_IOVA;

        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.

 

Step 3: Routine rte_service_init() gets called.   It eventually calls alloc_seg() which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails.   It tries to allocate segments on both sockets but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed

 

I think the heart of the problem is, when DPDK reads /proc/self/pagemap, why is it getting a value where the bottom 55 bits are all zero?
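
The lookup can be reproduced outside DAOS. The following is a minimal sketch, not DPDK's code: it computes the offset of the 8-byte entry covering a virtual address and reads that entry from /proc/self/pagemap. The helper names `pagemap_offset` and `read_pagemap_entry` are hypothetical.

```python
import ctypes
import os
import struct


def pagemap_offset(vaddr, page_size):
    """Byte offset of the 8-byte pagemap entry covering a virtual address."""
    return (vaddr // page_size) * 8


def read_pagemap_entry(vaddr, pagemap_path="/proc/self/pagemap"):
    """Read the raw 64-bit pagemap entry for vaddr in the current process."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(pagemap_path, "rb", buffering=0) as f:
        f.seek(pagemap_offset(vaddr, page_size))
        data = f.read(8)
    if len(data) != 8:
        raise OSError("short read from " + pagemap_path)
    return struct.unpack("=Q", data)[0]


if __name__ == "__main__":
    buf = ctypes.create_string_buffer(4096)
    buf[0] = b"x"  # touch the buffer so its page is resident
    try:
        entry = read_pagemap_entry(ctypes.addressof(buf))
        pfn = entry & ((1 << 55) - 1)
        print("euid=%d entry=%#018x pfn=%#x" % (os.geteuid(), entry, pfn))
    except OSError as exc:
        print("pagemap not readable:", exc)
```

Running it once as the unprivileged user and once under sudo should show whether the PFN field depends on privilege on a given kernel.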

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message daos_admin is trying to return, since there are 11 (eventually 12) NVMe devices on this node. I raised MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues remain.

 

Kevan

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 6:36 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I’ve been debugging a daos_io_server startup problem for a couple of days now and have gotten nowhere, so it’s time to call in the experts.   Two IO servers are configured per node, one has 5 NVMe devices and one has 6.  They both give the following errors:

 

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:491 server_init() Network successfully initialized

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:500 server_init() Module vos,rdb,rsvc,security,mgmt,pool,cont,dtx,obj,rebuild successfully loaded

03/12-08:33:32.43 delphi-004 DAOS[141245/141287] bio  INFO src/bio/bio_xstream.c:961 bio_xsctxt_alloc() Initialize NVMe context, tgt_id:0, init_thread:(nil)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] bio  ERR  src/bio/bio_xstream.c:1019 bio_xsctxt_alloc() failed to initialize SPDK env, DER_INVAL(-1003)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] server ERR  src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1003

 

The failing function in bio_xsctxt_alloc()  is spdk_env_init(), which  just returns -1.   Looking in /dev/hugepages, I see 40 2 MiB hugepage files owned by root, which I did not expect, because daos_server and the two daos_io_server processes are all running as user ‘daos’.  I believe I have the file permissions set correctly:

 

[root@delphi-004 tmp]# cd ~daos/daos/install/bin

[root@delphi-004 bin]# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  5751760 Mar 12  2020 daos_admin

-rwxr-sr-x. 1 root daos_grp 16219920 Mar 12  2020 daos_server

 

The daos_admin process had errors also and exited, and daos_control.log mentions it is discarding garbage responses.

 

Am I doing something obviously wrong?     Attached are the daos_control.log and daos_io_server.log files, together with the daos .yml files.

 

If you need more info, let me know.

 

Thanks, Kevan

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Lombardi, Johann
 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike recently spent some time verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix. For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server
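
For readers less used to the `s` characters in that listing, here is a small sketch that names the special bits. The numeric modes are reconstructed from the listing and are an assumption; `special_bits` is a hypothetical helper. The trailing `.` in the listing comes from `ls` and is not part of the mode.

```python
import stat


def special_bits(mode):
    """Name the setuid/setgid bits present in a file mode."""
    bits = []
    if mode & stat.S_ISUID:
        bits.append("setuid")
    if mode & stat.S_ISGID:
        bits.append("setgid")
    return bits


# Modes reconstructed from the listing (both are regular files).
daos_admin_mode = stat.S_IFREG | 0o4750   # -rwsr-x---
daos_server_mode = stat.S_IFREG | 0o2755  # -rwxr-sr-x

for name, mode in (("daos_admin", daos_admin_mode),
                   ("daos_server", daos_server_mode)):
    print(name, stat.filemode(mode), special_bits(mode))
```

Only daos_admin is setuid root. daos_server is setgid daos_grp, so starting it as user ‘daos’ does not by itself give it root privileges.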

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails. If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
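
One way to compare the two launches is to look at the effective capabilities of the running process. This is a sketch under the assumption that CAP_SYS_ADMIN is the capability that matters for pagemap PFNs, as the upstream pagemap documentation states. The helper names are hypothetical.

```python
CAP_SYS_ADMIN = 21  # capability number from linux/capability.h


def has_cap_sys_admin(cap_eff_hex):
    """True if a CapEff hex mask from /proc/<pid>/status has CAP_SYS_ADMIN."""
    return bool(int(cap_eff_hex, 16) & (1 << CAP_SYS_ADMIN))


def read_cap_eff(status_path="/proc/self/status"):
    """Return the CapEff field of a status file, or None if it is absent."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return None


if __name__ == "__main__":
    try:
        cap_eff = read_cap_eff()
    except OSError as exc:
        cap_eff = None
        print("cannot read status file:", exc)
    if cap_eff is not None:
        print("CapEff=%s CAP_SYS_ADMIN=%s" % (cap_eff, has_cap_sys_admin(cap_eff)))
```

Pointing `read_cap_eff` at `/proc/<pid>/status` of a running daos_io_server would show whether the process started with sudo holds the capability while the one started as ‘daos’ does not.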

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 



Kevan Rehm
 

I should probably be more accurate and say that “I think I am using vfio”. 😊 Here are the things I have checked; if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio
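
The manual checks so far (kernel command line, /sys/class/iommu, /dev/vfio) can be gathered into one script. This is only a sketch with hypothetical helper names. Note that `intel_iommu=on` on the command line shows the IOMMU was requested, not that DPDK ends up using it.

```python
import os


def iommu_requested(cmdline):
    """True if the kernel command line contains intel_iommu=on."""
    return "intel_iommu=on" in cmdline.split()


def list_dir(path):
    """Sorted directory entries, or an empty list if the path is unreadable."""
    try:
        return sorted(os.listdir(path))
    except OSError:
        return []


def vfio_groups(vfio_dir="/dev/vfio"):
    """Numeric IOMMU group nodes found under vfio_dir, sorted."""
    return sorted(int(n) for n in list_dir(vfio_dir) if n.isdigit())


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("intel_iommu=on:", iommu_requested(f.read()))
    except OSError as exc:
        print("cannot read /proc/cmdline:", exc)
    print("IOMMU units:", list_dir("/sys/class/iommu"))
    print("VFIO groups:", vfio_groups())
```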

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
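
To tie the configured devices back to the /dev/vfio entries, each PCI address can be mapped to its bound driver and IOMMU group through sysfs. This is a minimal sketch with hypothetical helper names. It relies on the standard `driver` and `iommu_group` symlinks under /sys/bus/pci/devices, and it assumes a device handed to SPDK through VFIO would show the `vfio-pci` driver.

```python
import os


def pci_link_target(pci_addr, link, sysfs_root="/sys/bus/pci/devices"):
    """Basename of a sysfs symlink for a PCI device, or None if missing."""
    path = os.path.join(sysfs_root, pci_addr, link)
    try:
        return os.path.basename(os.readlink(path))
    except OSError:
        return None


def describe_bdevs(bdev_list, sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to its bound driver and IOMMU group number."""
    return {
        addr: {
            "driver": pci_link_target(addr, "driver", sysfs_root),
            "iommu_group": pci_link_target(addr, "iommu_group", sysfs_root),
        }
        for addr in bdev_list
    }


if __name__ == "__main__":
    # bdev_list for server 1, copied from the daos_server.yml excerpt.
    server1 = ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0",
               "0000:3d:00.0", "0000:3e:00.0"]
    for addr, info in describe_bdevs(server1).items():
        print(addr, info)
```

Each reported group number should correspond to a node of the same name under /dev/vfio.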

 

 





Nabarro, Tom
 

Hello Kevan,

 

Sincere apologies for not being able to reply sooner; I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from CentOS 7.6 to 7.7. The fix is to upgrade SPDK to 20.01. Unfortunately, there have been some API breakages between those versions, and we have had to push for another release to properly bump .so versions to reflect the API changes. We are therefore waiting for 20.01.1, which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD


 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.   😊.   Here are the things I have checked, if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
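
One way to cross-check the bdev_list entries against the /dev/vfio listing is to follow the standard sysfs links for each PCI address. This is only a sketch: the helper name `vfio_binding` is hypothetical, and it assumes the usual Linux layout in which `/sys/bus/pci/devices/<addr>/driver` and `.../iommu_group` are symlinks and the group number matches a node under `/dev/vfio`.

```python
import os


def _link_target_name(path):
    """Return the last path component of a symlink target, or None."""
    if os.path.islink(path):
        return os.path.basename(os.readlink(path))
    return None


def vfio_binding(pci_addr, sysfs_root="/sys/bus/pci/devices", vfio_dir="/dev/vfio"):
    """Report the bound driver and IOMMU group for one PCI address."""
    dev = os.path.join(sysfs_root, pci_addr)
    driver = _link_target_name(os.path.join(dev, "driver"))
    group = _link_target_name(os.path.join(dev, "iommu_group"))
    node = os.path.join(vfio_dir, group) if group else None
    return {
        "driver": driver,        # expected to be "vfio-pci" for SPDK over VFIO
        "iommu_group": group,    # should match a node name under /dev/vfio
        "group_node_exists": bool(node and os.path.exists(node)),
        # Read/write access for the *current* user, so run this as the
        # user that starts daos_server.
        "group_node_accessible": bool(node and os.access(node, os.R_OK | os.W_OK)),
    }


if __name__ == "__main__":
    # Addresses copied from the server 1 bdev_list in this message.
    for addr in ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0",
                 "0000:3d:00.0", "0000:3e:00.0"]:
        print(addr, vfio_binding(addr))
```

A device whose driver is not `vfio-pci`, or whose group node is missing or inaccessible to the daos user, would be a candidate cause to rule out before digging into DPDK.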

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike have recently spent some time verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix.  For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails.   If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
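
The root versus non-root difference can be reproduced outside DPDK with a small sketch like the one below, which reads the pagemap entry for one of the process's own pages. Mainline kernels since 4.2 report a zero PFN to processes without CAP_SYS_ADMIN; whether the CentOS 3.10 kernel here carries the same change is an assumption, although it is consistent with the behavior described. The function names are invented for illustration and are not DPDK APIs.

```python
import ctypes
import mmap
import os
import struct

PFN_MASK = (1 << 55) - 1  # bits 0-54 of a pagemap entry hold the PFN


def pagemap_offset(vaddr, page_size):
    """Byte offset of the 64-bit pagemap entry covering vaddr."""
    return (vaddr // page_size) * 8


def read_own_pfn():
    """Return the PFN the kernel reports for a freshly touched page."""
    page_size = mmap.PAGESIZE
    buf = mmap.mmap(-1, page_size)   # anonymous mapping; left open for simplicity
    buf[0] = 1                       # touch the page so it is actually present
    probe = ctypes.c_char.from_buffer(buf)
    vaddr = ctypes.addressof(probe)  # virtual address of the mapping
    del probe
    fd = os.open("/proc/self/pagemap", os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, pagemap_offset(vaddr, page_size))
    finally:
        os.close(fd)
    (entry,) = struct.unpack("=Q", raw)
    return entry & PFN_MASK


if __name__ == "__main__":
    try:
        pfn = read_own_pfn()
        print("PFN:", hex(pfn), "(zero usually means no CAP_SYS_ADMIN)")
    except OSError as exc:
        print("could not read /proc/self/pagemap:", exc)
```

Running it once as the daos user and once under sudo should show the same zero versus non-zero split that rte_mem_virt2phy() sees, if the kernel behaves as assumed.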

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(), and it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value; the actual test in the code that fails is inside routine rte_mem_virt2phy():

 

        /*
         * the pfn (page frame number) are bits 0-54 (see
         * pagemap.txt in linux Documentation)
         */
        if ((page & 0x7fffffffffffffULL) == 0) {
                return RTE_BAD_IOVA;
        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.
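
For reference, here is a small sketch that decodes that word using the bit layout in the mainline kernel's pagemap documentation (bit 63 present, bit 62 swapped, bit 61 file-mapped or shared-anonymous, bit 56 exclusively mapped, bit 55 soft-dirty, bits 0-54 PFN). A vendor-patched 3.10 kernel may assign the flag bits differently, so treat the flag names as an assumption; the function name is made up for illustration.

```python
PFN_MASK = (1 << 55) - 1  # same mask as 0x7fffffffffffffULL in the DPDK check


def decode_pagemap_entry(entry):
    """Split a 64-bit /proc/<pid>/pagemap entry into its documented fields."""
    return {
        "present": bool((entry >> 63) & 1),
        "swapped": bool((entry >> 62) & 1),
        "file_or_shared_anon": bool((entry >> 61) & 1),
        "exclusively_mapped": bool((entry >> 56) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
        "pfn": entry & PFN_MASK,  # only meaningful when present and not swapped
    }


if __name__ == "__main__":
    for name, value in decode_pagemap_entry(0x8180000000000000).items():
        print(f"{name}: {value}")
```

Under that layout the word says the page is present in RAM but the PFN field is zero, which is what an unprivileged reader sees when the kernel hides physical addresses rather than what an unmapped page would look like.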

 

Step 3: Routine rte_service_init() gets called.   It eventually calls alloc_seg() which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails.   It tries to allocate segments on both sockets but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed
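
To spot this failure signature quickly in a long daos_control.log, a simple scan for the three EAL messages quoted in this message may be enough. This is only a sketch: the marker strings are copied from the log excerpts here, and the function name and log path are hypothetical.

```python
# Substrings taken from the EAL log lines quoted in this message.
EAL_FAILURE_MARKERS = (
    "Cannot obtain physical addresses",
    "alloc_seg(): can't get IOVA addr",
    "rte_service_init() failed",
)


def find_eal_failures(lines):
    """Return (line_number, marker) for every line containing a known marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for marker in EAL_FAILURE_MARKERS:
            if marker in line:
                hits.append((number, marker))
                break  # report each line at most once
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_eal.py /path/to/daos_control.log  (path is an example)
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log:
            for number, marker in find_eal_failures(log):
                print(f"line {number}: {marker}")
```

Seeing the "Cannot obtain physical addresses" line before the alloc_seg() failures would match the Step 2 to Step 3 sequence described above.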

 

I think the heart of the problem is, when DPDK reads /proc/self/pagemap, why is it getting a value where the bottom 55 bits are all zero?

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message that daos_admin is trying to return, because there are 11 (eventually 12) NVMe devices on this node.  I raised the value of MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues still remain.

 

Kevan

 

 

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Kevan Rehm
 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks; we are running CentOS 7.7.  I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive, TMI that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to CentOS 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

Sincere apologies for not being able to reply sooner; I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from CentOS 7.6 to 7.7. The fix is to upgrade SPDK to 20.01. Unfortunately, there have been some API breakages between those versions, and we have had to push for another release to properly bump .so versions to reflect the API changes. We are therefore waiting for 20.01.1, which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 



Nabarro, Tom
 

Apologies that this has taken some of your time. Up until now, the only really supported option for running DAOS with NVMe has been to run as root (many months ago you could get away with running UIO+SPDK as non-root, but a security fix in the kernel precluded that). We haven't communicated VFIO+SPDK non-root as a supported configuration yet, as far as I know.

 

Let me know how you get on with the build from that PR.

 

We are not using CentOS 8 yet, so if you did, you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks, we are running centos 7.7.  I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive, TMI that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to centos 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

sincere apologies for not being able to reply sooner, I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from Centos 7.6->7.7 . The fix is to upgrade SPDK to 20.01, unfortunately there have been some API breakages between those versions and we have had to push for another release to properly bump .so versions to reflect the API changes. we are therefore waiting for 20.01.1 which is targeted for March 20th.

 

Think this is the issue you are seeing, this PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.   😊.   Here are the things I have checked, if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
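
The manual checks above (kernel command line, driver binding for each device in bdev_list) can be scripted. The following is a minimal sketch, not part of DAOS or SPDK: the helper names are made up for illustration, and it assumes the standard Linux sysfs layout in which /sys/bus/pci/devices/<address>/driver is a symlink to the bound driver. A device that is ready for non-root SPDK use would be expected to report vfio-pci rather than uio_pci_generic or nvme.

```python
import os
from typing import Optional

# PCI addresses copied from the two bdev_list entries quoted above.
BDEVS = [
    "0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0",
    "0000:3e:00.0", "0000:86:00.0", "0000:87:00.0", "0000:af:00.0",
    "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0",
]


def iommu_enabled_on_cmdline(cmdline: str) -> bool:
    """True if the kernel command line carries intel_iommu=on."""
    return "intel_iommu=on" in cmdline.split()


def bound_driver(pci_addr: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Name of the driver bound to pci_addr, or None if nothing is bound."""
    link = os.path.join(sysfs_root, "bus/pci/devices", pci_addr, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__" and os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as f:
        print("intel_iommu=on:", iommu_enabled_on_cmdline(f.read()))
    for addr in BDEVS:
        print(addr, "->", bound_driver(addr))
```

The sysfs_root parameter exists only so the helper can be pointed at a scratch directory for testing; on a real node the default is what you want.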

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike recently spent some time verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix. For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails. If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
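
The root versus non-root difference can be reproduced outside DPDK. Here is a minimal sketch (the helper names are hypothetical) that maps one anonymous page, touches it, and reads the corresponding 8-byte entry from /proc/self/pagemap, using the same one-entry-per-page indexing that rte_mem_virt2phy() relies on. According to the upstream kernel pagemap documentation, kernels from 4.2 onward zero the PFN field for callers without CAP_SYS_ADMIN; the zero PFN reported above suggests the CentOS 7 kernel behaves the same way, but that is an inference rather than something verified here.

```python
import ctypes
import mmap
import os
import sys

PFN_MASK = (1 << 55) - 1  # bits 0-54 of a pagemap entry hold the page frame number


def pagemap_offset(vaddr: int, page_size: int) -> int:
    """Byte offset of the 8-byte pagemap entry that describes vaddr."""
    return (vaddr // page_size) * 8


def read_pagemap_entry(vaddr: int) -> int:
    """Return the raw 64-bit /proc/self/pagemap entry for vaddr (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    fd = os.open("/proc/self/pagemap", os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, pagemap_offset(vaddr, page_size))
    finally:
        os.close(fd)
    if len(raw) != 8:
        raise OSError("short read from /proc/self/pagemap")
    return int.from_bytes(raw, sys.byteorder)


if __name__ == "__main__" and os.path.exists("/proc/self/pagemap"):
    buf = mmap.mmap(-1, mmap.PAGESIZE)  # one anonymous, writable page
    buf[0] = 1                          # touch it so a physical frame is assigned
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    entry = read_pagemap_entry(addr)
    print(f"euid={os.geteuid()} entry={entry:#018x} pfn={entry & PFN_MASK:#x}")
```

Running it once as the daos user and once under sudo should show whether the PFN field is being hidden from the unprivileged process.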

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(); it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value; the test that actually fails is inside routine rte_mem_virt2phy():

 

        /*
         * the pfn (page frame number) are bits 0-54 (see
         * pagemap.txt in linux Documentation)
         */
        if ((page & 0x7fffffffffffffULL) == 0) {
                return RTE_BAD_IOVA;
        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.

 

Step 3: Routine rte_service_init() gets called.   It eventually calls alloc_seg() which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails.   It tries to allocate segments on both sockets but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed

 

I think the heart of the problem is, when DPDK reads /proc/self/pagemap, why is it getting a value where the bottom 55 bits are all zero?
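
For reference, the value quoted above can be decoded field by field. This is a small sketch that assumes the bit layout described in the upstream kernel pagemap documentation (bits 0-54 PFN, bit 55 soft-dirty, bit 56 exclusively mapped, bit 61 file-page or shared-anon, bit 62 swapped, bit 63 present); the 3.10-based CentOS 7 kernel may not assign bits 55 and 56 in exactly this way. The function name is made up for illustration.

```python
PFN_MASK = (1 << 55) - 1  # same mask as 0x7fffffffffffffULL in the DPDK check

# Flag bits as listed in the upstream kernel pagemap documentation (assumed here).
FLAG_BITS = {
    63: "present",
    62: "swapped",
    61: "file-page or shared-anon",
    56: "exclusively mapped",
    55: "soft-dirty",
}


def decode_pagemap_entry(entry: int) -> dict:
    """Split a raw 64-bit pagemap entry into its PFN and the flags that are set."""
    return {
        "pfn": entry & PFN_MASK,
        "flags": [name for bit, name in FLAG_BITS.items() if (entry >> bit) & 1],
    }


decoded = decode_pagemap_entry(0x8180000000000000)
print(decoded)
```

Under that layout the entry says the page is present in RAM but carries a PFN of zero, which is what one would expect if the kernel is hiding the frame number from an unprivileged reader rather than the page being unmapped.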

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message daos_admin is trying to return, because there are 11 (eventually 12) NVMe devices on this node. I raised MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues remain.

 

Kevan

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 6:36 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I’ve been debugging a daos_io_server startup problem for a couple of days now and have gotten nowhere, so it’s time to call in the experts.   Two IO servers are configured per node, one has 5 NVMe devices and one has 6.  They both give the following errors:

 

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:491 server_init() Network successfully initialized

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:500 server_init() Module vos,rdb,rsvc,security,mgmt,pool,cont,dtx,obj,rebuild successfully loaded

03/12-08:33:32.43 delphi-004 DAOS[141245/141287] bio  INFO src/bio/bio_xstream.c:961 bio_xsctxt_alloc() Initialize NVMe context, tgt_id:0, init_thread:(nil)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] bio  ERR  src/bio/bio_xstream.c:1019 bio_xsctxt_alloc() failed to initialize SPDK env, DER_INVAL(-1003)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] server ERR  src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1003

 

The failing function in bio_xsctxt_alloc()  is spdk_env_init(), which  just returns -1.   Looking in /dev/hugepages, I see 40 2 MiB hugepage files owned by root, which I did not expect, because daos_server and the two daos_io_server processes are all running as user ‘daos’.  I believe I have the file permissions set correctly:

 

[root@delphi-004 tmp]# cd ~daos/daos/install/bin

[root@delphi-004 bin]# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  5751760 Mar 12  2020 daos_admin

-rwxr-sr-x. 1 root daos_grp 16219920 Mar 12  2020 daos_server

 

The daos_admin process had errors also and exited, and daos_control.log mentions it is discarding garbage responses.

 

Am I doing something obviously wrong?     Attached are the daos_control.log and daos_io_server.log files, together with the daos .yml files.

 

If you need more info, let me know.

 

Thanks, Kevan

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Kevan Rehm
 

Well, the daos admin guide describes how to run daos as non-root user, and how to install and configure daos_server and daos_admin for that use case.   I have been following the documentation.  I figured if it is documented, then it is supported.  

 

Originally I followed the email that came out a month or two ago on how to run daos as non-root; it also described how to do the daos_admin config. I took that to mean non-root was supported. Oddly, looking again just now at the admin guide, I see that the documentation has changed again since then: now there are a bunch of symlinks to set up. When did that change occur?

 

I think I am going to punt and run the daemon as root.    And I am definitely not going to touch centos 8…

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 4:50 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Apologies that this has taken some of your time. Up until now, the only really supported option for running DAOS with NVMe has been to run as root (many months ago you could get away with running UIO+SPDK as non-root, but a security fix in the kernel precluded that). As far as I know, we haven't yet communicated VFIO+SPDK non-root as a supported configuration.

 

Let me know how you get on with the build from that PR.

 

We are not using Centos 8 yet so if you did you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks, we are running centos 7.7.  I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive, TMI that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to centos 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

Sincere apologies for not being able to reply sooner; I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK we use in our build (19.04) when we moved from CentOS 7.6 to 7.7. The fix is to upgrade SPDK to 20.01. Unfortunately there have been some API breakages between those versions, and we have had to push for another release to properly bump the .so versions to reflect the API changes. We are therefore waiting for 20.01.1, which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 


Nabarro, Tom
 

Non-root is currently supported for SCM only. Apologies for any inconsistency with that communication.

 

Why don’t you try building with the PR I suggested, that should get landed before too long.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 9:09 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Well, the daos admin guide describes how to run daos as non-root user, and how to install and configure daos_server and daos_admin for that use case.   I have been following the documentation.  I figured if it is documented, then it is supported.  

 

Originally I followed the email that came out about a month or two ago on how to run daos as non-root, it described how to also do the daos_admin config.   I took that to mean non-root was supported.  Oddly, looking again just now at the admin guide, I see that the documentation is different again than it was a while ago, now there are a bunch of symlinks to set up.   When did that change occur?

 

I think I am going to punt and run the daemon as root. And I am definitely not going to touch CentOS 8…

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 4:50 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Apologies that this has taken some of your time. Up until now, the only really supported option for running DAOS with NVMe has been to run as root (many months ago you could get away with running UIO+SPDK as non-root, but a security fix in the kernel precluded that). As far as I know, we haven't yet communicated VFIO+SPDK non-root as a supported configuration.

 

Let me know how you get on with the build from that PR.

 

We are not using CentOS 8 yet, so if you did, you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that; being sick is no fun. Hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks; we are running CentOS 7.7. I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen, so that we don’t lose cycles debugging things that have already been debugged. Maybe a web page of “current issues”? Or email messages here? You probably have better ideas. I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive: too much information that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to CentOS 8. Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

Sincere apologies for not being able to reply sooner; I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from CentOS 7.6 to 7.7. The fix is to upgrade SPDK to 20.01. Unfortunately, there have been some API breakages between those versions, and we have had to push for another release to properly bump the .so versions to reflect the API changes. We are therefore waiting for 20.01.1, which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.  😊  Here are the things I have checked; if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200
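
A minimal sketch of the same check done programmatically, assuming only that the IOMMU is requested with the `intel_iommu=on` parameter shown above. The function names are hypothetical helpers for illustration, not part of DAOS or SPDK.

```python
def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        # A repeated parameter (e.g. rd.lvm.lv) keeps only its last value here.
        params[name] = value if sep else None
    return params


def intel_iommu_requested(cmdline: str) -> bool:
    """True only when the command line carries intel_iommu=on."""
    return kernel_params(cmdline).get("intel_iommu") == "on"
```

Passing the contents of /proc/cmdline to `intel_iommu_requested()` only confirms that the parameter was given at boot; it does not prove that VT-d is enabled in the BIOS.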

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
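
One further thing that could be checked is whether every address in the two bdev_list entries is actually bound to vfio-pci. The sketch below reads the standard sysfs `driver` and `iommu_group` symlinks for a PCI device; the helper names and the `sysfs_root` parameter are assumptions added for illustration (the parameter only exists so the code can be pointed at a test directory).

```python
import os


def pci_binding(addr: str, sysfs_root: str = "/sys") -> dict:
    """Report the bound kernel driver and IOMMU group for one PCI address."""
    dev = os.path.join(sysfs_root, "bus", "pci", "devices", addr)

    def link_name(name):
        path = os.path.join(dev, name)
        # Both entries are symlinks; the last path component is the name we want.
        if os.path.islink(path):
            return os.path.basename(os.readlink(path))
        return None  # device missing, or no driver / group attached

    return {
        "addr": addr,
        "driver": link_name("driver"),
        "iommu_group": link_name("iommu_group"),
    }


def not_on_vfio(bdev_list, sysfs_root: str = "/sys"):
    """Return the addresses whose bound driver is not vfio-pci."""
    return [a for a in bdev_list
            if pci_binding(a, sysfs_root)["driver"] != "vfio-pci"]
```

The `iommu_group` value for each device should correspond to one of the numbered nodes in the /dev/vfio listing above.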

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike recently spent some time verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix. For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions. I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server
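
The two mode strings above correspond to octal 4750 (setuid root, group-executable) and 2755 (setgid). A small sketch of how they could be verified from a script, using only Python's standard `stat` module; the helper names are illustrative and not taken from the DAOS documentation.

```python
import stat


def mode_string(st_mode: int) -> str:
    """Render a numeric st_mode the way `ls -l` prints it."""
    return stat.filemode(st_mode)


def has_setuid(st_mode: int) -> bool:
    """True when the set-user-ID bit is set."""
    return bool(st_mode & stat.S_ISUID)


def has_setgid(st_mode: int) -> bool:
    """True when the set-group-ID bit is set."""
    return bool(st_mode & stat.S_ISGID)
```

For a real file the input would be `os.stat(path).st_mode`. Note that the dot at the end of the `ls -l` mode field (the SELinux context marker) is not part of what `stat.filemode()` returns.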

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails. If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
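
The difference between the two runs is consistent with the upstream kernel rule that page frame numbers in /proc/self/pagemap are only visible to processes holding CAP_SYS_ADMIN; whether this CentOS 7.7 kernel applies exactly that rule is an assumption here. A minimal sketch for checking the capability from the CapEff line of /proc/<pid>/status (helper names are hypothetical):

```python
CAP_SYS_ADMIN = 21  # bit index defined in linux/capability.h


def effective_caps(status_text: str) -> int:
    """Extract the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_admin(status_text: str) -> bool:
    """True when the CAP_SYS_ADMIN bit is set in the effective set."""
    return bool((effective_caps(status_text) >> CAP_SYS_ADMIN) & 1)
```

Feeding it the status file of a running daos_io_server process would show whether that process could expect non-zero page frame numbers.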

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(); it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value; the actual test in the code that fails is inside routine rte_mem_virt2phy():

 

        /*
         * the pfn (page frame number) are bits 0-54 (see
         * pagemap.txt in linux Documentation)
         */
        if ((page & 0x7fffffffffffffULL) == 0) {
                return RTE_BAD_IOVA;
        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.
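
To see what else that word says, here is a small decoding sketch. The PFN mask is the one from the DPDK excerpt; the names of the upper flag bits follow the upstream kernel pagemap documentation, and since older kernels assigned some of those bits differently, the flag names should be treated as an assumption for this CentOS 7.7 kernel.

```python
PFN_MASK = (1 << 55) - 1  # bits 0-54, the same value as 0x7fffffffffffff


def decode_pagemap_entry(entry: int) -> dict:
    """Split one 64-bit /proc/<pid>/pagemap word into its documented fields."""
    return {
        "pfn": entry & PFN_MASK,                      # bits 0-54 (if present)
        "soft_dirty": bool((entry >> 55) & 1),        # bit 55
        "exclusive": bool((entry >> 56) & 1),         # bit 56
        "file_or_shared": bool((entry >> 61) & 1),    # bit 61
        "swapped": bool((entry >> 62) & 1),           # bit 62
        "present": bool((entry >> 63) & 1),           # bit 63
    }
```

Under that layout, 0x8180000000000000 decodes as a page that is present in RAM with the soft-dirty and exclusively-mapped flags set and a PFN field of zero. In other words the page is mapped, and only the frame number is being withheld.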

 

Step 3: Routine rte_service_init() gets called. It eventually calls alloc_seg(), which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails. It tries to allocate segments on both sockets, but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed

 

I think the heart of the problem is this: when DPDK reads /proc/self/pagemap, why is it getting a value where the bottom 55 bits are all zero?

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the discarded garbage messages occur because MaxMessageSize is set to 4096, which is smaller than the message daos_admin is trying to return, because there are 11 (eventually 12) NVMe devices on this node. I raised MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues remain.

 

Kevan

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 6:36 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I’ve been debugging a daos_io_server startup problem for a couple of days now and have gotten nowhere, so it’s time to call in the experts.   Two IO servers are configured per node, one has 5 NVMe devices and one has 6.  They both give the following errors:

 

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:491 server_init() Network successfully initialized

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:500 server_init() Module vos,rdb,rsvc,security,mgmt,pool,cont,dtx,obj,rebuild successfully loaded

03/12-08:33:32.43 delphi-004 DAOS[141245/141287] bio  INFO src/bio/bio_xstream.c:961 bio_xsctxt_alloc() Initialize NVMe context, tgt_id:0, init_thread:(nil)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] bio  ERR  src/bio/bio_xstream.c:1019 bio_xsctxt_alloc() failed to initialize SPDK env, DER_INVAL(-1003)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] server ERR  src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1003

 

The failing function in bio_xsctxt_alloc() is spdk_env_init(), which just returns -1. Looking in /dev/hugepages, I see 40 2 MiB hugepage files owned by root, which I did not expect, because daos_server and the two daos_io_server processes are all running as user ‘daos’. I believe I have the file permissions set correctly:

 

[root@delphi-004 tmp]# cd ~daos/daos/install/bin

[root@delphi-004 bin]# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  5751760 Mar 12  2020 daos_admin

-rwxr-sr-x. 1 root daos_grp 16219920 Mar 12  2020 daos_server
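
For the hugepage ownership observation above, a short sketch that counts the files under a hugetlbfs mount by owning UID. The default path is the /dev/hugepages directory mentioned in the message; the function name is a hypothetical helper, not an existing DAOS or SPDK tool.

```python
import os
from collections import Counter


def hugepage_owners(mount_point: str = "/dev/hugepages") -> Counter:
    """Count regular files under a hugetlbfs mount, grouped by owning UID."""
    owners = Counter()
    with os.scandir(mount_point) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                owners[entry.stat(follow_symlinks=False).st_uid] += 1
    return owners
```

A result where every file maps to UID 0 would match the "owned by root" observation, which suggests the files were created by a privileged process rather than by the daos_io_server processes running as ‘daos’.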

 

The daos_admin process had errors also and exited, and daos_control.log mentions it is discarding garbage responses.

 

Am I doing something obviously wrong?     Attached are the daos_control.log and daos_io_server.log files, together with the daos .yml files.

 

If you need more info, let me know.

 

Thanks, Kevan

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Kevan Rehm
 

Right, we are using SCM, thanks.

 

I’ll give the PR a try.

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday, March 17, 2020 at 5:45 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Non-root supported for SCM only currently. Apologies for any inconsistency with that communication.

 

Why don’t you try building with the PR I suggested, that should get landed before too long.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 9:09 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Well, the daos admin guide describes how to run daos as non-root user, and how to install and configure daos_server and daos_admin for that use case.   I have been following the documentation.  I figured if it is documented, then it is supported.  

 

Originally I followed the email that came out about a month or two ago on how to run daos as non-root, it described how to also do the daos_admin config.   I took that to mean non-root was supported.  Oddly, looking again just now at the admin guide, I see that the documentation is different again than it was a while ago, now there are a bunch of symlinks to set up.   When did that change occur?

 

I think I am going to punt and run the daemon as root.    And I am definitely not going to touch centos 8…

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 4:50 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Apologies that this has taken some of your time, up until now the only real supported option for running DAOS with NVMe is to run as root (many months ago you could get away with running UIO+SPDK as non-root but a security fix in kernel precluded that). We haven't communicated VFIO+SPDK non-root as a supported configuration yet as far as I know.

 

Let me know how you get on with the build from that PR.

 

We are not using Centos 8 yet so if you did you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks, we are running centos 7.7.  I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive, TMI that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to centos 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

sincere apologies for not being able to reply sooner, I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from Centos 7.6->7.7 . The fix is to upgrade SPDK to 20.01, unfortunately there have been some API breakages between those versions and we have had to push for another release to properly bump .so versions to reflect the API changes. we are therefore waiting for 20.01.1 which is targeted for March 20th.

 

Think this is the issue you are seeing, this PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.   😊.   Here are the things I have checked, if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike have spent some time recently to verify that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed, a  more recent master contains the fix.  For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time, the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails.   If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(), it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value, the actual test in the code that fails is inside routine rte_mem_virt2phy():

 

        /*

         * the pfn (page frame number) are bits 0-54 (see

         * pagemap.txt in linux Documentation)

         */

        if ((page & 0x7fffffffffffffULL) == 0) {

                return RTE_BAD_IOVA;

        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.

 

Step 3: Routine rte_service_init() gets called.   It eventually calls alloc_seg() which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails.   It tries to allocate segments on both sockets but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated
EAL: Restoring previous memory policy: 0
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x201000a00000 (size = 0x200000)
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0
delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed
Failed to initialize DPDK
delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array
EAL: rte_service_init() failed
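When scanning a long daos_control.log for this pattern, a short script can pull out the per-socket allocation failures. The following is only a sketch: the two regular expressions are derived from the message text in the log excerpt above, and the function name failed_allocations is hypothetical.

```python
import re

# Patterns taken from the EAL message text shown in the log excerpt.
ALLOC_RE = re.compile(
    r"attempted to allocate (\d+) segments, but only (\d+) were allocated")
SOCKET_RE = re.compile(r"Setting policy MPOL_PREFERRED for socket (\d+)")


def failed_allocations(log_lines):
    """Return (socket, requested, allocated) for every short allocation."""
    failures = []
    socket = None
    for line in log_lines:
        m = SOCKET_RE.search(line)
        if m:
            socket = int(m.group(1))      # remember the socket being tried
            continue
        m = ALLOC_RE.search(line)
        if m and int(m.group(2)) < int(m.group(1)):
            failures.append((socket, int(m.group(1)), int(m.group(2))))
    return failures


sample = [
    "EAL: Setting policy MPOL_PREFERRED for socket 0",
    "EAL: alloc_seg(): can't get IOVA addr",
    "EAL: attempted to allocate 1 segments, but only 0 were allocated",
    "EAL: Setting policy MPOL_PREFERRED for socket 1",
    "EAL: alloc_seg(): can't get IOVA addr",
    "EAL: attempted to allocate 1 segments, but only 0 were allocated",
]
print(failed_allocations(sample))
```

The socket number is carried forward from the most recent "Setting policy" line, which matches the ordering seen in this log but is an assumption about EAL output in general.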

 

I think the heart of the problem is this: when DPDK reads /proc/self/pagemap, why does it get a value whose bottom 55 bits are all zero?
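One way to investigate this outside of DPDK is to read a pagemap entry directly, once as a regular user and once under sudo. The sketch below is an illustration, not what DPDK does internally; the function names are hypothetical. It relies on the documented pagemap layout of one 64-bit word per virtual page. Upstream kernels since 4.2 zero the PFN field for readers without CAP_SYS_ADMIN; whether the CentOS 7.7 kernel carries that behaviour is an assumption that this script could help confirm.

```python
import ctypes
import os
import struct


def pagemap_offset(vaddr, page_size):
    """Byte offset of the 8-byte pagemap entry covering virtual address vaddr."""
    return (vaddr // page_size) * 8


def read_pagemap_entry(vaddr, pagemap_path="/proc/self/pagemap"):
    """Read the raw 64-bit pagemap word for vaddr (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(pagemap_path, "rb", buffering=0) as f:
        f.seek(pagemap_offset(vaddr, page_size))
        data = f.read(8)
    if len(data) != 8:
        raise OSError("short read from " + pagemap_path)
    return struct.unpack("=Q", data)[0]   # native byte order, 64 bits


if __name__ == "__main__":
    buf = ctypes.create_string_buffer(4096)
    buf[0] = b"x"                          # touch the page so it is resident
    try:
        entry = read_pagemap_entry(ctypes.addressof(buf))
        pfn = entry & ((1 << 55) - 1)
        print("euid=%d entry=%#018x pfn=%#x" % (os.geteuid(), entry, pfn))
    except OSError as exc:
        print("could not read pagemap:", exc)
```

If the PFN prints as zero for user 'daos' and non-zero for root, the zero value is coming from the kernel rather than from anything DPDK or DAOS is doing.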

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message daos_admin is trying to return, since there are 11 (eventually 12) NVMe devices on this node.  I raised MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the other issues remain.

 

Kevan

 

 


---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.



Colin Ngam
 

Hi Tom,

 

I followed the instructions and got the following:

 

Checking for C library spdk... no
MissingTargets: spdk has missing targets after build.  See config.log for details:
  File "/home/users/daos/daos/SConstruct", line 404:
    scons()
  File "/home/users/daos/daos/SConstruct", line 349:
    preload_prereqs(prereqs)
  File "/home/users/daos/daos/SConstruct", line 132:
    prereqs.load_definitions(prebuild=reqs)
  File "/home/users/daos/daos/scons_local/prereq_tools/base.py", line 1057:
    self.require(env, comp)
  File "/home/users/daos/daos/scons_local/prereq_tools/base.py", line 1131:
    raise error

 

Keep in mind that I ran the build twice, so I do not know whether config.log contains twice as much output.

 

Thanks.

 

Colin

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday, March 17, 2020 at 4:44 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Non-root is currently supported for SCM only. Apologies for any inconsistency in how that was communicated.

Why don’t you try building with the PR I suggested? It should land before too long.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 9:09 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Well, the DAOS admin guide describes how to run DAOS as a non-root user, and how to install and configure daos_server and daos_admin for that use case.   I have been following the documentation.  I figured that if it is documented, then it is supported.

Originally I followed the email that came out a month or two ago on how to run DAOS as non-root; it also described how to do the daos_admin config.   I took that to mean non-root was supported.  Oddly, looking again just now at the admin guide, I see that the documentation has changed again since then; now there are a bunch of symlinks to set up.   When did that change occur?

I think I am going to punt and run the daemon as root.    And I am definitely not going to touch CentOS 8…

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 4:50 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Apologies that this has taken some of your time. Up until now the only really supported option for running DAOS with NVMe has been to run as root (many months ago you could get away with running UIO+SPDK as non-root, but a security fix in the kernel precluded that). We haven't communicated VFIO+SPDK non-root as a supported configuration yet, as far as I know.

 

Let me know how you get on with the build from that PR.

 

We are not using CentOS 8 yet, so if you did you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks; we are running CentOS 7.7.  I learned more about DPDK internals than I planned to.  😊

 

I'm not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen, so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive: too much information that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to CentOS 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

Sincere apologies for not being able to reply sooner; I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from CentOS 7.6 to 7.7. The fix is to upgrade SPDK to 20.01. Unfortunately there have been some API breakages between those versions, and we have had to push for another release to properly bump .so versions to reflect the API changes. We are therefore waiting for 20.01.1, which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.   😊   Here are the things I have checked; if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l
total 0
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6
lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7
[root@delphi-004 iommu]# pwd
/sys/class/iommu
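The two manual checks above (kernel command line and /sys/class/iommu) can be scripted so they are easy to repeat on every node. This is a minimal sketch with hypothetical helper names; the sample command line is abridged from the output above.

```python
import os


def kernel_cmdline_has(option, cmdline):
    """True if the exact token `option` appears in a /proc/cmdline string."""
    return option in cmdline.split()


def iommu_units(sysfs_class_dir="/sys/class/iommu"):
    """Names under /sys/class/iommu (e.g. dmar0), or [] if none are present."""
    try:
        return sorted(os.listdir(sysfs_class_dir))
    except OSError:
        return []


# Abridged copy of the command line quoted in this message.
cmdline = ("BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 "
           "root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline "
           "rhgb quiet intel_iommu=on console=ttyS1,115200")
print(kernel_cmdline_has("intel_iommu=on", cmdline))
print(iommu_units())
```

On a live system the cmdline string would come from reading /proc/cmdline; an empty list from iommu_units() would suggest the IOMMU is not active even if the boot option is set.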

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio
delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...
delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported
EAL:   IOMMU type 7 (sPAPR) is not supported
delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported
EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio
[root@delphi-004 vfio]# ls -l
total 0
crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1
crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29
crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41
crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42
crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43
crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44
crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55
crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71
crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72
crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84
crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85
crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86
crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87
crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
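To tie the bdev_list addresses to the group nodes under /dev/vfio, one can follow each device's iommu_group symlink in sysfs and test whether the current user can open the matching group node. The sketch below uses hypothetical function names and only standard sysfs paths; it does not reflect how DAOS or SPDK perform this check.

```python
import os


def iommu_group(pci_addr, sysfs_root="/sys"):
    """Return the IOMMU group number of a PCI device, or None if it has none."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices",
                        pci_addr, "iommu_group")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        return None


def vfio_group_status(pci_addrs, sysfs_root="/sys", vfio_dir="/dev/vfio"):
    """Map each PCI address to (group, accessible-by-current-user)."""
    status = {}
    for addr in pci_addrs:
        group = iommu_group(addr, sysfs_root)
        node = os.path.join(vfio_dir, group) if group else None
        usable = bool(node) and os.access(node, os.R_OK | os.W_OK)
        status[addr] = (group, usable)
    return status


if __name__ == "__main__":
    # First few addresses from the bdev_list shown above.
    bdevs = ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0"]
    for addr, (group, usable) in vfio_group_status(bdevs).items():
        print(addr, "group", group, "usable" if usable else "not usable")
```

Run as user 'daos', this would show whether every configured device has a group and whether that group's node is readable and writable without root.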

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike have spent some time recently verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix.  For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails.   If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
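Since the only difference between the two runs is privilege, it may help to record the effective capabilities of the process in each case. The sketch below parses the CapEff line of /proc/self/status and tests the CAP_SYS_ADMIN bit (capability number 21 in the kernel headers). The helper names are hypothetical, and the idea that this capability is what gates non-zero PFNs on this particular kernel is an assumption, not something established in this thread.

```python
CAP_SYS_ADMIN = 21  # capability number from linux/capability.h


def effective_caps(status_text):
    """Parse the CapEff bitmask out of the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_admin(status_text):
    """True if the CAP_SYS_ADMIN bit is set in the effective capability set."""
    return bool(effective_caps(status_text) & (1 << CAP_SYS_ADMIN))


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as f:
            print("CAP_SYS_ADMIN:", has_cap_sys_admin(f.read()))
    except (OSError, ValueError) as exc:
        print("could not determine capabilities:", exc)
```

Comparing the output with and without sudo gives a quick confirmation of what each daos_server invocation is actually allowed to do.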

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(); it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 



Nabarro, Tom
 

Did you do a “git submodule update” to pull in the scons_local changes? I may have left that out of the instructions; there was a change to the configure script check on the spdk libs.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Colin Ngam
Sent: Wednesday, March 18, 2020 8:56 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Tom,

 

I followed the instructions and got the following:

 

Checking for C library spdk... no

MissingTargets: spdk has missing targets after build.  See config.log for details:

  File "/home/users/daos/daos/SConstruct", line 404:

    scons()

  File "/home/users/daos/daos/SConstruct", line 349:

    preload_prereqs(prereqs)

  File "/home/users/daos/daos/SConstruct", line 132:

    prereqs.load_definitions(prebuild=reqs)

  File "/home/users/daos/daos/scons_local/prereq_tools/base.py", line 1057:

    self.require(env, comp)

  File "/home/users/daos/daos/scons_local/prereq_tools/base.py", line 1131:

    raise error

 

Keep in mind that I ran the build twice. So, I do not know if the config.log is twice as much.

 

Thanks.

 

Colin

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday, March 17, 2020 at 4:44 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Non-root supported for SCM only currently. Apologies for any inconsistency with that communication.

 

Why don’t you try building with the PR I suggested, that should get landed before too long.

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 9:09 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Well, the daos admin guide describes how to run daos as non-root user, and how to install and configure daos_server and daos_admin for that use case.   I have been following the documentation.  I figured if it is documented, then it is supported.  

 

Originally I followed the email that came out about a month or two ago on how to run daos as non-root, it described how to also do the daos_admin config.   I took that to mean non-root was supported.  Oddly, looking again just now at the admin guide, I see that the documentation is different again than it was a while ago, now there are a bunch of symlinks to set up.   When did that change occur?

 

I think I am going to punt and run the daemon as root.    And I am definitely not going to touch centos 8…

 

Kevan

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 4:50 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Apologies that this has taken some of your time, up until now the only real supported option for running DAOS with NVMe is to run as root (many months ago you could get away with running UIO+SPDK as non-root but a security fix in kernel precluded that). We haven't communicated VFIO+SPDK non-root as a supported configuration yet as far as I know.

 

Let me know how you get on with the build from that PR.

 

We are not using Centos 8 yet so if you did you would be doing some useful pathfinding :-)

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Monday, March 16, 2020 4:21 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Tom,

 

Sorry to hear that, being sick is no fun, hope you’re over the hump.

 

Yes, this is the problem I’ve been chasing for a couple of weeks, we are running centos 7.7.  I learned more about DPDK internals than I planned to.  😊

 

Not sure how to make this happen, but it would be nice if breakages like this got communicated to the outside world when they happen so that we don’t lose cycles debugging things that have already been debugged.  Maybe a web page of “current issues”?  Or email messages here?    You probably have better ideas.    I did try reading through the Jira log daily for a while to watch for new issues, but that didn’t seem very productive, TMI that didn’t pertain.

 

I got some consolation from the fact that at least this time it wasn’t pilot error.  😊

 

Thanks, Kevan

 

P.S. We have talked from time to time about upgrading to centos 8.   Would that be a bad idea?

 

 

From: <daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, March 16, 2020 at 11:37 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hello Kevan,

 

sincere apologies for not being able to reply sooner, I haven't been well and Mike has been on leave.

 

SPDK through VFIO (when running as non-root) broke with the version of SPDK (19.04) we use in our build when we moved from Centos 7.6->7.7 . The fix is to upgrade SPDK to 20.01, unfortunately there have been some API breakages between those versions and we have had to push for another release to properly bump .so versions to reflect the API changes. we are therefore waiting for 20.01.1 which is targeted for March 20th.

 

I think this is the issue you are seeing. This PR should enable you to run as non-root (I've been using it for a while): https://github.com/daos-stack/daos/pull/1902

Rebuild instructions: https://github.com/daos-stack/daos/pull/1902#issuecomment-595702110

 

https://jira.hpdd.intel.com/browse/DAOS-4164

 

Hope that helps

 

Regards,

Tom Nabarro – DCG/ESAD

M: +44 (0)7786 260986

Skype: tom.nabarro

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Sunday, March 15, 2020 8:05 PM
To: daos@daos.groups.io
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I should probably be more accurate and say that “I think I am using vfio”.  😊  Here are the things I have checked; if there are other things to check, let me know.

 

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-1062.12.1.el7.x86_64 root=/dev/mapper/cl_delphi--004-root ro spectre_v2=retpoline rd.lvm.lv=cl_delphi-004/root rd.lvm.lv=cl_delphi-004/swap rhgb quiet intel_iommu=on console=ttyS1,115200

 

# ls -l

total 0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar0 -> ../../devices/virtual/iommu/dmar0

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar1 -> ../../devices/virtual/iommu/dmar1

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar2 -> ../../devices/virtual/iommu/dmar2

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar3 -> ../../devices/virtual/iommu/dmar3

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar4 -> ../../devices/virtual/iommu/dmar4

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar5 -> ../../devices/virtual/iommu/dmar5

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar6 -> ../../devices/virtual/iommu/dmar6

lrwxrwxrwx. 1 root root 0 Mar 15 01:38 dmar7 -> ../../devices/virtual/iommu/dmar7

[root@delphi-004 iommu]# pwd

/sys/class/iommu

 

From daos_control.log:

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Initializing vfio

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL: Probing VFIO support...

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 1 (Type 1) is supported

EAL:   IOMMU type 7 (sPAPR) is not supported

delphi-004.us.cray.com INFO 2020/03/15 02:31:29 daos_io_server:1 EAL:   IOMMU type 8 (No-IOMMU) is not supported

EAL: VFIO support initialized

 

[root@delphi-004 vfio]# cd /dev/vfio

[root@delphi-004 vfio]# ls -l

total 0

crw-rw-rw-. 1 root root 235,  11 Mar 15 02:34 1

crw-rw-rw-. 1 root root 235,   0 Mar 15 02:33 29

crw-rw-rw-. 1 root root 235,   1 Mar 15 02:33 41

crw-rw-rw-. 1 root root 235,   2 Mar 15 02:33 42

crw-rw-rw-. 1 root root 235,   3 Mar 15 02:33 43

crw-rw-rw-. 1 root root 235,   4 Mar 15 02:33 44

crw-rw-rw-. 1 root root 235,  12 Mar 15 02:34 55

crw-rw-rw-. 1 root root 235,   5 Mar 15 02:33 71

crw-rw-rw-. 1 root root 235,   6 Mar 15 02:33 72

crw-rw-rw-. 1 root root 235,   7 Mar 15 02:33 84

crw-rw-rw-. 1 root root 235,   8 Mar 15 02:33 85

crw-rw-rw-. 1 root root 235,   9 Mar 15 02:33 86

crw-rw-rw-. 1 root root 235,  10 Mar 15 02:34 87

crw-rw-rw-. 1 root root  10, 196 Mar 15 02:18 vfio

 

From daos_server.yml, server 1 has:

 

   bdev_class: nvme

   bdev_list: ["0000:1a:00.0", "0000:3b:00.0", "0000:3c:00.0", "0000:3d:00.0", "0000:3e:00.0"]

and server 2 has:

   bdev_class: nvme

   bdev_list: ["0000:86:00.0", "0000:87:00.0", "0000:af:00.0", "0000:b0:00.0", "0000:b1:00.0", "0000:b2:00.0"]
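
One more check that could remove the "I think" from the VFIO question is to ask sysfs which kernel driver each bdev_list address is bound to, and which IOMMU group it belongs to. The sketch below is a hypothetical helper, not part of DAOS or SPDK; it assumes the standard Linux sysfs layout, where /sys/bus/pci/devices/<address>/driver and /sys/bus/pci/devices/<address>/iommu_group are symlinks whose last path component is the driver name and the group number.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose readlink() under strict -std modes */
#endif

#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve /sys/bus/pci/devices/<bdf>/<link> and copy
 * the final path component of the symlink target into out.
 * link is expected to be "driver" or "iommu_group".
 * Returns 0 on success, -1 if the link is missing or the name does not fit.
 */
static int pci_sysfs_link_name(const char *bdf, const char *link,
                               char *out, size_t outlen)
{
    char path[PATH_MAX];
    char target[PATH_MAX];
    const char *base;
    ssize_t n;
    int len;

    len = snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/%s", bdf, link);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    n = readlink(path, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';

    /* Keep only the last component, e.g. ".../drivers/vfio-pci" -> "vfio-pci". */
    base = strrchr(target, '/');
    base = base ? base + 1 : target;
    if (strlen(base) >= outlen)
        return -1;
    strcpy(out, base);
    return 0;
}
```

Calling pci_sysfs_link_name("0000:1a:00.0", "driver", buf, sizeof(buf)) for each address in bdev_list should yield "vfio-pci" if the device is bound to VFIO, and something like "uio_pci_generic" if it is still on UIO. The "iommu_group" link should yield a number that matches one of the entries under /dev/vfio.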

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:44 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hm, then I don’t understand why it is trying to read /proc/self/pagemap (only accessible to root in recent kernels). Maybe Tom and Mike can comment.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 18:06
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

I am using vfio, and my IOMMU is enabled.

 

 

 

From: <daos@daos.groups.io> on behalf of "Lombardi, Johann" <johann.lombardi@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 1:05 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Hi Kevan,

 

To run SPDK as a non-root user, you need to switch from UIO to VFIO. Tom and Mike have recently spent some time verifying that the DAOS server can be run as a regular user (except for the setuid root on the daos_admin utility) when VFIO is enabled.

It requires VT-d to be enabled in the BIOS. Please check: https://daos-stack.github.io/admin/deployment/#enable-iommu-optional

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 15 March 2020 at 16:49
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

(DAOS-4342 will be closed; a more recent master contains the fix.  For the DPDK problem here, I am running yesterday’s master.)

 

My DPDK problem is related to permissions.  I am running daos_server as user ‘daos’, group ‘daos_grp’, with daos_admin and daos_server permissions set as documented:

 

# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  6188984 Mar 14 16:12 daos_admin

-rwxr-sr-x. 1 root daos_grp 16345032 Mar 14 16:13 daos_server

 

If I start the daemon manually like this:

                ~/daos/install/bin/daos_server start

 

it fails every time: the page frame number read by rte_mem_virt2phy() is always zero, and alloc_seg() always fails.  If I instead start the daemon with:

                sudo ~/daos/install/bin/daos_server start

 

then the page frame numbers read by rte_mem_virt2phy() are correctly non-zero, and the daos_io_server daemons start up.
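
The root versus non-root difference can be reproduced outside DPDK with a few lines of C. This is a minimal sketch of the same kind of lookup, not DPDK's own code: the function names are made up for illustration, and it assumes only the documented /proc/self/pagemap format of one 64-bit entry per virtual page.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose pread() under strict -std modes */
#endif

#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset in the pagemap file of the entry that covers vaddr. */
static off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
    return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/*
 * Read the raw pagemap entry covering addr in the calling process.
 * Returns 0 and stores the entry on success, -1 on any failure.
 */
static int read_pagemap_entry(const void *addr, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    ssize_t n;
    int fd;

    if (page_size <= 0)
        return -1;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, entry, sizeof(*entry),
              pagemap_offset((uintptr_t)addr, page_size));
    close(fd);

    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Calling read_pagemap_entry() on the address of a variable that has already been written, once as user 'daos' and once under sudo, and printing the entry in hex should show whether the low 55 bits differ between the two runs in the same way as described above.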

 

Am I doing something incorrectly in my attempts to run the daos servers as non-root?

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday, March 15, 2020 at 7:40 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I have opened Jira DAOS-4342 for the problem where the value of MaxMessageSize is too small.

 

I still have no solution for the DPDK failure I am seeing in daos_server startup, but I have more background information on the problem.   The underlying problem is that DPDK cannot convert virtual addresses to physical addresses.

 

Step 1: Early in startup, rte_pci_get_iommu_class() is called from rte_eal_init(); it sets iova_mode to RTE_IOVA_PA (DMA using physical addresses).

 

Step 2: rte_eal_hugepage_init() gets called, which eventually calls test_phys_addrs_available().  That routine sets phys_addrs_available = false, and reports:

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:1 EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.

 

The “Permission denied” is a stale errno value; the actual test in the code that fails is inside routine rte_mem_virt2phy():

 

        /*

         * the pfn (page frame number) are bits 0-54 (see

         * pagemap.txt in linux Documentation)

         */

        if ((page & 0x7fffffffffffffULL) == 0) {

                return RTE_BAD_IOVA;

        }

 

The bottom 55 bits of the word that was read are all zeros.   The actual value of the word is 0x8180000000000000.
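
Decoding that word field by field shows what the kernel was reporting. The sketch below uses the bit layout from the upstream Linux pagemap documentation. The struct and function names are illustrative only, and whether the CentOS 7.7 kernel uses exactly this layout is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout per the upstream Linux pagemap documentation. */
#define PM_PFN_MASK     0x007fffffffffffffULL  /* bits 0-54: page frame number */
#define PM_SOFT_DIRTY   (1ULL << 55)           /* PTE is soft-dirty */
#define PM_EXCLUSIVE    (1ULL << 56)           /* page exclusively mapped */
#define PM_FILE_SHARED  (1ULL << 61)           /* file page or shared anon */
#define PM_SWAPPED      (1ULL << 62)           /* page is swapped out */
#define PM_PRESENT      (1ULL << 63)           /* page is present in RAM */

/* Illustrative container for the decoded fields. */
struct pagemap_fields {
    uint64_t pfn;
    bool soft_dirty;
    bool exclusive;
    bool file_or_shared;
    bool swapped;
    bool present;
};

static struct pagemap_fields pagemap_decode(uint64_t entry)
{
    struct pagemap_fields f;

    f.pfn            = entry & PM_PFN_MASK;
    f.soft_dirty     = (entry & PM_SOFT_DIRTY) != 0;
    f.exclusive      = (entry & PM_EXCLUSIVE) != 0;
    f.file_or_shared = (entry & PM_FILE_SHARED) != 0;
    f.swapped        = (entry & PM_SWAPPED) != 0;
    f.present        = (entry & PM_PRESENT) != 0;
    return f;
}
```

For 0x8180000000000000 this gives present, exclusively mapped, and soft-dirty all set, with a PFN of zero. So the page is resident, but its frame number is not being reported. The upstream documentation states that since Linux 4.2 the PFN field is zeroed for readers without CAP_SYS_ADMIN. If the CentOS 7.7 kernel carries an equivalent change, which is not confirmed here, that would be consistent with the all-zero PFN seen when running as user 'daos' and the non-zero PFN seen under sudo.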

 

Step 3: Routine rte_service_init() gets called.   It eventually calls alloc_seg() which wants to convert a virtual address to a physical address, but phys_addrs_available is false, so it fails.   It tries to allocate segments on both sockets but both attempts fail, as phys_addrs_available applies to both sockets.

 

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

EAL: Setting policy MPOL_PREFERRED for socket 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Virtual area found at 0x200000200000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Trying to obtain current memory policy.

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Setting policy MPOL_PREFERRED for socket 1

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: alloc_seg(): can't get IOVA addr

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Ask a virtual area of 0x200000 bytes

EAL: Virtual area found at 0x201000a00000 (size = 0x200000)

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: attempted to allocate 1 segments, but only 0 were allocated

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 EAL: Restoring previous memory policy: 0

delphi-004.us.cray.com ERROR 2020/03/14 08:37:09 daos_io_server:0 EAL: FATAL: rte_service_init() failed

Failed to initialize DPDK

delphi-004.us.cray.com INFO 2020/03/14 08:37:09 daos_io_server:0 error allocating rte services array

EAL: rte_service_init() failed

 

I think the heart of the problem is, when DPDK reads /proc/self/pagemap, why is it getting a value where the bottom 55 bits are all zero?

 

Kevan

 

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 9:51 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Help debugging a daos I/O server startup failure?

 

One update: the “discarded garbage” messages occur because MaxMessageSize is set to 4096, which is smaller than the message daos_admin is trying to return, because there are 11 (eventually 12) NVMe devices on this node.  I raised the value of MaxMessageSize to 8192 and the “discarded garbage message” problem went away, but the rest of the issues still remain.

 

Kevan

 

 

From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday, March 12, 2020 at 6:36 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Help debugging a daos I/O server startup failure?

 

Greetings,

 

I’ve been debugging a daos_io_server startup problem for a couple of days now and have gotten nowhere, so it’s time to call in the experts.   Two IO servers are configured per node, one has 5 NVMe devices and one has 6.  They both give the following errors:

 

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:491 server_init() Network successfully initialized

03/12-08:33:32.33 delphi-004 DAOS[141245/141245] server INFO src/iosrv/init.c:500 server_init() Module vos,rdb,rsvc,security,mgmt,pool,cont,dtx,obj,rebuild successfully loaded

03/12-08:33:32.43 delphi-004 DAOS[141245/141287] bio  INFO src/bio/bio_xstream.c:961 bio_xsctxt_alloc() Initialize NVMe context, tgt_id:0, init_thread:(nil)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] bio  ERR  src/bio/bio_xstream.c:1019 bio_xsctxt_alloc() failed to initialize SPDK env, DER_INVAL(-1003)

03/12-08:33:32.71 delphi-004 DAOS[141245/141287] server ERR  src/iosrv/srv.c:452 dss_srv_handler() failed to init spdk context for xstream(2) rc:-1003

 

The failing function in bio_xsctxt_alloc()  is spdk_env_init(), which  just returns -1.   Looking in /dev/hugepages, I see 40 2 MiB hugepage files owned by root, which I did not expect, because daos_server and the two daos_io_server processes are all running as user ‘daos’.  I believe I have the file permissions set correctly:

 

[root@delphi-004 tmp]# cd ~daos/daos/install/bin

[root@delphi-004 bin]# ls -l daos_admin daos_server

-rwsr-x---. 1 root daos_grp  5751760 Mar 12  2020 daos_admin

-rwxr-sr-x. 1 root daos_grp 16219920 Mar 12  2020 daos_server

 

The daos_admin process had errors also and exited, and daos_control.log mentions it is discarding garbage responses.

 

Am I doing something obviously wrong?     Attached are the daos_control.log and daos_io_server.log files, together with the daos .yml files.

 

If you need more info, let me know.

 

Thanks, Kevan

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
