Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)


Huang, Lei
 

ERROR: /usr/bin/daos_admin SIGILL: illegal instruction

 

Your CPU does not support certain instructions used by the daos_admin process. Could you please attach the output of “lscpu” from your machine? Thank you!
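
As a quick check alongside the lscpu output, the flags line of /proc/cpuinfo can be searched for individual instruction-set extensions. The snippet below is only a sketch: the helper name has_cpu_flag and the list of flags it loops over (avx, avx2, bmi1, bmi2) are assumptions chosen for illustration, not a statement of what daos_admin requires. It also assumes an x86 Linux host, where /proc/cpuinfo has a "flags" line.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Succeeds (exit 0) if FLAG appears as a whole word on the first "flags"
# line of CPUINFO_FILE (default: /proc/cpuinfo). Hypothetical helper.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -m1 stops after the first matching line; -w avoids "avx" matching "avx2".
    grep -m1 '^flags' "$file" | grep -qw -- "$flag"
}

# Illustrative list only; adjust to the extensions you want to check.
for f in avx avx2 bmi1 bmi2; do
    if has_cpu_flag "$f"; then
        echo "$f: present"
    else
        echo "$f: MISSING"
    fi
done
```

Any flag reported as MISSING is a candidate for the instruction that raised SIGILL, though the lscpu output is still the more complete picture.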

 

-lei

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of lnsyyj@...
Sent: Wednesday, November 2, 2022 10:26 PM
To: daos@daos.groups.io
Subject: [daos] DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

 

Hello everyone,

When I start the DAOS server, the following output appears. How can I resolve it?

 

[root@Rocky-1 ~]# /usr/share/spdk/scripts/setup.sh

0000:44:00.0 (1d78 1512): nvme -> vfio-pci

[root@Rocky-1 ~]# cat /etc/daos/daos_server.yml 

name: daos_server

access_points: ['Rocky-1']

port: 10001

transport_config:

  allow_insecure: false

  client_cert_dir: /etc/daos/certs/clients

  ca_cert: /etc/daos/certs/daosCA.crt

  cert: /etc/daos/certs/server.crt

  key: /etc/daos/certs/server.key

provider: ofi+sockets

socket_dir: /var/run/

nr_hugepages: 4096

control_log_mask: DEBUG

control_log_file: /var/log/daos_server.log

helper_log_file: /var/log/daos_admin.log

 

engines:

-

  targets: 8

  nr_xs_helpers: 0

  fabric_iface: enp3s0f1

  fabric_iface_port: 31316

  log_mask: INFO

  log_file: /var/log/daos_engine_0.log

  env_vars:

      - CRT_TIMEOUT=30

  storage:

  -

    class: ram

    scm_mount: /mnt/daos0

    scm_size: 2 #gb to allocate for tmpfs to emulate SCM

  -

    class: nvme

    bdev_list: ["0000:44:00.0"]

 


[root@Rocky-1 ~]# /usr/bin/daos_server start

DAOS Server config loaded from /etc/daos/daos_server.yml

/usr/bin/daos_server logging to file /var/log/daos_server.log

DEBUG 11:17:31.720878 start.go:90: Switching control log level to DEBUG

DEBUG 11:17:31.721131 defaults.go:92: failed to load library: unable to open a handle to the library

ERROR: unable to open a handle to the library

DEBUG 11:17:31.721209 fabric.go:875: waiting for fabric interfaces to become ready...

DEBUG 11:17:31.721299 fabric.go:892: fabric interface "enp3s0f1" is ready

DEBUG 11:17:31.721372 provider.go:87: getting topology with hwloc version 0x20100

DEBUG 11:17:31.769773 provider.go:145: adding device found at "/sys/class/net/eno1" (type network interface, NUMA node 0)

DEBUG 11:17:31.769933 provider.go:145: adding device found at "/sys/class/net/eno2" (type network interface, NUMA node 0)

DEBUG 11:17:31.770081 provider.go:145: adding device found at "/sys/class/net/eno3" (type network interface, NUMA node 0)

DEBUG 11:17:31.770212 provider.go:145: adding device found at "/sys/class/net/eno4" (type network interface, NUMA node 0)

DEBUG 11:17:31.770357 provider.go:145: adding device found at "/sys/class/net/enp3s0f0" (type network interface, NUMA node 0)

DEBUG 11:17:31.770485 provider.go:145: adding device found at "/sys/class/net/enp3s0f1" (type network interface, NUMA node 0)

DEBUG 11:17:31.770537 provider.go:125: failed to read net device: open /sys/class/net/lo/device/net: no such file or directory

DEBUG 11:17:31.770749 provider.go:264: adding virtual device at "/sys/devices/virtual/net/lo"

DEBUG 11:17:31.886150 provider.go:83: found fabric interfaces:

enp3s0f1 (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)

lo (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)

shm (providers: shm)

DEBUG 11:17:31.886239 provider.go:292: no cxi subsystem in sysfs

DEBUG 11:17:31.886338 fabric.go:441: unable to open a handle to the library

DEBUG 11:17:31.886419 fabric.go:511: ignoring fabric interface "shm" (shm) not found in topology

DEBUG 11:17:31.886534 fabric.go:793: discovered 2 fabric interfaces:

enp3s0f1 (interface: enp3s0f1) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)

lo (interface: lo) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)

DEBUG 11:17:31.886645 server.go:750: detected NUMA affinity 0 for engine 0

DEBUG 11:17:31.886675 server.go:757: enabling single-engine legacy core allocation algorithm

DEBUG 11:17:31.886703 server.go:420: validating config file read from "/etc/daos/daos_server.yml"

DEBUG 11:17:31.886742 server.go:443: vfio=true hotplug=false vmd=true requested in config

WARNING: Configuration includes only one access point. This provides no redundancy in the event of an access point failure.

DEBUG 11:17:31.886841 server.go:549: engine 0 fabric numa 0, storage numa 0

DEBUG 11:17:31.887914 server_utils.go:148: setting OFI_DOMAIN=enp3s0f1 for enp3s0f1

DEBUG 11:17:31.889170 server.go:377: active config saved to /var/run/.daos_server.active.yml (read-only)

DEBUG 11:17:31.889251 server.go:525: fault domain: /rocky-1

DEBUG 11:17:31.889862 server.go:236: setting core dump filter to 0x13

DEBUG 11:17:31.890615 database.go:280: set db replica addr: 192.168.1.215:10001

DEBUG 11:17:31.891076 server.go:164: time to init network: 242.45µs

DEBUG 11:17:31.891195 server_utils.go:260: allocating 4098 hugepages on each of these numa nodes: [0]

DEBUG 11:17:31.891267 ctl_storage.go:53: calling bdev provider prepare: {ForwardableRequest:{Forwarded:false} HugePageCount:4098 HugeNodes:0 CleanHugePagesOnly:false PCIAllowList: PCIBlockList: TargetUser:root Reset_:false DisableVFIO:false EnableVMD:true}

DEBUG 11:17:32.224164 server.go:164: time to prepare bdev storage: 332.967644ms

DEBUG 11:17:32.224261 ctl_storage.go:59: calling bdev provider scan: {ForwardableRequest:{Forwarded:false} DeviceList:0000:44:00.0 VMDEnabled:false BypassCache:true}

ERROR: /usr/bin/daos_admin SIGILL: illegal instruction

PC=0x7fbc78755c0e m=0 sigcode=2

signal arrived during cgo execution

instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe

 

goroutine 1 [syscall]:

runtime.cgocall(0x92049d, 0xc0001d1c20)

/usr/src/runtime/cgocall.go:158 +0x5c fp=0xc0001d1bf8 sp=0xc0001d1bc0 pc=0x408b1c

github.com/daos-stack/daos/src/control/lib/spdk._Cfunc_nvme_discover()

_cgo_gotypes.go:321 +0x49 fp=0xc0001d1c20 sp=0xc0001d1bf8 pc=0x904fc9

github.com/daos-stack/daos/src/control/lib/spdk.(*NvmeImpl).Discover(0xc0000bcd00?, {0xb267f8, 0xc000184300})

/builddir/build/BUILD/daos-2.2.0/src/control/lib/spdk/nvme.go:127 +0x54 fp=0xc0001d1cd8

ERROR: /usr/bin/daos_admin  sp=0xc0001d1c20 pc=0x9059b4

github.com/daos-stack/daos/src/control/server/storage/bdev.(*spdkBackend).Scan(0xc0000bcce0, {{0x56?}, 0xc0001d5030?, 0x20?, 0x1?})

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/backend.go:341 +0x1b7 fp=0xc0001d1da8 sp=0xc0001d1cd8 pc=0x909f37

github.com/daos-stack/daos/src/control/server/storage/bdev.(*Provider).Scan(...)

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/provider.go:54

main.(*bdevScanHandler).Handle(0xc000014788, {0xb267f8?, 0xc000184300}, 0xc0002e4240)

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/handler.go:175 +0x27a fp=0xc0001d1e08 sp=0xc0001d1da8 pc=0x91d1fa

github.com/daos-stack/daos/src/control/pbin.(*App).handleRequest(0xc0000caae0, 0xc0002e4240)

/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:214 +0x62 fp=0xc0001d1e58 sp=0xc0001d1e08 pc=0x5949c2

github.com/daos-stack/daos/src/control/pbin.(*App).Run(0xc0000caae0)

/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:155 +0x2ed fp=0xc0001d1f50 sp=0xc0001d1e58 pc=0x59448d

main.main()

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/main.go:25 +0xaf fp=0xc0001d1f80 sp=0xc0001d1f50 pc=0x91de6f

runtime.main()

/usr/src/runtime/proc.go:250 +0x212 fp=0xc0001d1fe0 sp=0xc0001d1f80 pc=0x43dd32

runtime.goexit

ERROR: /usr/bin/daos_admin ()

/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0001d1fe8 sp=0xc0001d1fe0 pc=0x46b9c1

 

goroutine 2 [force gc (idle)]:

runtime.gopark(0x0?, 0x0?, 0x0?

ERROR: /usr/bin/daos_admin , 0x0?, 0x0?)

/usr/src/runtime/proc.go:363 +0xd6 fp=0xc00009efb0 sp=0xc00009ef90 pc=0x43e0f6

ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)

ERROR: /usr/bin/daos_admin  /usr/src/runtime/proc.go:369

runtime.forcegchelper()

 

ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:302 +0xad fp=0xc00009efe0 sp=0xc00009efb0 pc=0x43df8d

runtime.goexit()

/usr/src/runtime/asm_amd64.s

ERROR: /usr/bin/daos_admin :1594 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x46b9c1

created by 

ERROR: /usr/bin/daos_admin runtime.init.6

/usr/src/runtime/proc.go:290 +0x25

ERROR: /usr/bin/daos_admin 

goroutine 3 [GC sweep wait]:

runtime.gopark(0x0

ERROR: /usr/bin/daos_admin ?, 0x0?, 0x0?, 0x0?

ERROR: /usr/bin/daos_admin , 0x0?)

/usr/src/runtime/proc.go:363 +

ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009f790 sp=0xc00009f770 pc=0x43e0f6

ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)

/usr/src/runtime/proc.go:369

runtime.bgsweep(0x0?)

 

ERROR: /usr/bin/daos_admin /usr/src/runtime/mgcsweep.go:278 +0x8e fp=0xc00009f7c8 sp=0xc00009f790 pc=0x429c2e

runtime.gcenable.func1()

/usr/src/runtime/mgc.go:178 +

ERROR: /usr/bin/daos_admin 0x26 fp=0xc00009f7e0 sp=0xc00009f7c8 pc=0x41e8c6

runtime.goexit()

/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00009f7e8 sp=0xc00009f7e0 pc=0x46b9c1

created by runtime.gcenable

/usr/src/runtime/mgc.go:

ERROR: /usr/bin/daos_admin 178 +0x6b

 

goroutine 4 [GC scavenge wait]:

runtime.gopark(0xc0000c6000?, 0xb1e1e8?

ERROR: /usr/bin/daos_admin , 0x1?, 0x0?, 0x0?)

/usr/src/runtime/proc.go:363 +

ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009ff70 sp=0xc00009ff50 pc=0x43e0f6

runtime.goparkunlock(...)

ERROR: /usr/bin/daos_admin  /usr/src/runtime/proc.go:369

runtime.(*scavengerState).park(0x10a3a20)

/usr/src/runtime/mgcscavenge.go:389 +0x53 fp=

ERROR: /usr/bin/daos_admin 0xc00009ffa0 sp=0xc00009ff70 pc=0x427cd3

runtime.bgscavenge(0x0?)

/usr/src/runtime/mgcscavenge.go:

ERROR: /usr/bin/daos_admin 617 +0x45 fp=0xc00009ffc8 sp=0xc00009ffa0 pc=0x4282a5

runtime.gcenable.func2

ERROR: /usr/bin/daos_admin ()

/usr/src/runtime/mgc.go:179 +0x26 fp=0xc00009ffe0 sp=0xc00009ffc8 pc=0x41e866

ERROR: /usr/bin/daos_admin 

runtime.goexit()

/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=

ERROR: /usr/bin/daos_admin 0xc00009ffe8 sp=0xc00009ffe0 pc=0x46b9c1

created by runtime.gcenable

/usr/src/runtime/mgc.go:179

ERROR: /usr/bin/daos_admin  +0xaa

 

goroutine 5 [finalizer wait]:

runtime.gopark(0x10a4520?, 

ERROR: /usr/bin/daos_admin 0xc000007860?, 0x0?, 0x0?, 0xc00009e770?)

/usr/src/runtime/proc.go:363

ERROR: /usr/bin/daos_admin  +0xd6 fp=0xc00009e628 sp=0xc00009e608 pc=0x43e0f6

runtime.goparkunlock(...)

 

ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369

runtime.runfinq()

/usr/src/runtime/mfinal.go:

ERROR: /usr/bin/daos_admin 180 +0x10f fp=0xc00009e7e0 sp=0xc00009e628 pc=0x41d9cf

runtime.goexit()

/usr/src/runtime/asm_amd64.s:

ERROR: /usr/bin/daos_admin 1594 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x46b9c1

created by runtime.createfing

/usr/src/runtime/mfinal.go:157 +0x45

ERROR: /usr/bin/daos_admin 

rax    0x1

rbx    0x2492240

rcx    0x7fbc78da4e60

rdx    0x0

rdi    0x2492240

rsi    

ERROR: /usr/bin/daos_admin 0x7fbc78da1af0

rbp    0x2000003e7240

rsp    0x7ffc5180b120

r8     0x7fbc78da2460

r9     0x0

r10    0x70000000004

r11    0x0

ERROR: /usr/bin/daos_admin 

r12    0x202001000000

r13    0x2000003e7240

r14    0x7ffc5180b150

r15    0x0

rip    0x7fbc78755c0e

rflags 

ERROR: /usr/bin/daos_admin 0x13246

cs     0x33

fs     0x0

gs     0x0

DEBUG 11:17:32.627353 exec.go:188: discarding garbage response ""

DEBUG 11:17:32.627423 exec.go:188: discarding garbage response ""

DEBUG 11:17:32.627466 exec.go:188: discarding garbage response ""

DEBUG 11:17:32.627498 exec.go:188: discarding garbage response ""

DEBUG 11:17:32.627541 exec.go:188: discarding garbage response ""

ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts

DEBUG 11:17:32.627658 server.go:164: time to scan bdev storage: 403.426657ms

DEBUG 11:17:32.627726 pubsub.go:259: stopping event loop

DEBUG 11:17:32.627853 main.go:69: Unable to decode response after 5 attempts

github.com/daos-stack/daos/src/control/pbin.ExecReq

/builddir/build/BUILD/daos-2.2.0/src/control/pbin/exec.go:197

github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq

/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:100

github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579

github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586

github.com/daos-stack/daos/src/control/server/storage.scanBdevs

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483

github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493

github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan

/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60

github.com/daos-stack/daos/src/control/server.scanBdevStorage

/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297

github.com/daos-stack/daos/src/control/server.(*server).addEngines

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306

github.com/daos-stack/daos/src/control/server.Start

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549

main.(*startCmd).Execute

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147

main.parseOpts.func1

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126

github.com/jessevdk/go-flags.(*Parser).ParseArgs

/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314

main.parseOpts

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134

main.main

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151

runtime.main

/usr/src/runtime/proc.go:250

runtime.goexit

/usr/src/runtime/asm_amd64.s:1594

privileged binary execution failed

github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq

/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:105

github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579

github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586

github.com/daos-stack/daos/src/control/server/storage.scanBdevs

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483

github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs

/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493

github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan

/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60

github.com/daos-stack/daos/src/control/server.scanBdevStorage

/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297

github.com/daos-stack/daos/src/control/server.(*server).addEngines

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306

github.com/daos-stack/daos/src/control/server.Start

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549

main.(*startCmd).Execute

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147

main.parseOpts.func1

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126

github.com/jessevdk/go-flags.(*Parser).ParseArgs

/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314

main.parseOpts

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134

main.main

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151

runtime.main

/usr/src/runtime/proc.go:250

runtime.goexit

/usr/src/runtime/asm_amd64.s:1594

NVMe Scan Failed

github.com/daos-stack/daos/src/control/server.scanBdevStorage

/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:302

github.com/daos-stack/daos/src/control/server.(*server).addEngines

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306

github.com/daos-stack/daos/src/control/server.Start

/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549

main.(*startCmd).Execute

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147

main.parseOpts.func1

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126

github.com/jessevdk/go-flags.(*Parser).ParseArgs

/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314

main.parseOpts

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134

main.main

/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151

runtime.main

/usr/src/runtime/proc.go:250

runtime.goexit

/usr/src/runtime/asm_amd64.s:1594

ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
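
For reference, the "instruction bytes:" line in the trace above starts with 0xc4, which on x86-64 introduces a three-byte VEX prefix. The sketch below splits the first bytes into the VEX fields using plain shell arithmetic. The function name decode_vex3 is hypothetical, and the field layout is taken from the Intel SDM description of the VEX prefix as I understand it.

```shell
#!/bin/sh
# decode_vex3 B0 B1 B2 OPCODE
# Prints the fields of a three-byte VEX prefix (B0 must be 0xc4).
# Hypothetical helper for illustration only.
decode_vex3() {
    [ $(( $1 )) -eq $(( 0xc4 )) ] || { echo "not a 3-byte VEX prefix"; return 1; }
    p1=$(( $2 ))
    p2=$(( $3 ))
    map=$(( p1 & 0x1f ))             # opcode map: 1=0F, 2=0F38, 3=0F3A
    w=$(( (p2 >> 7) & 1 ))           # VEX.W
    vvvv=$(( (~(p2 >> 3)) & 0xf ))   # extra source register, stored inverted
    l=$(( (p2 >> 2) & 1 ))           # vector length bit
    pp=$(( p2 & 3 ))                 # implied prefix: 0=none, 1=66, 2=F3, 3=F2
    printf 'map=%d W=%d vvvv=%d L=%d pp=%d opcode=0x%02x\n' \
        "$map" "$w" "$vvvv" "$l" "$pp" "$(( $4 ))"
}

# First four bytes from the "instruction bytes:" line of the trace.
decode_vex3 0xc4 0xe2 0x69 0xf7
```

This prints map=2 W=0 vvvv=2 L=0 pp=1 opcode=0xf7, that is, opcode F7 in the 0F38 map with an implied 66 prefix. If I read the opcode tables correctly, that encoding is SHLX from the BMI2 extension, so a CPU without bmi2 in its flags would be consistent with the SIGILL. Treat this as a hint to confirm against the lscpu output rather than a definitive diagnosis.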

 
