DAOS server start failed (NVMe Scan Failed: privileged binary execution failed)


JiangYu
 

Hello everyone,
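When I start the DAOS server (daos-2.2.0 on Rocky Linux), the NVMe scan fails: daos_admin crashes with "SIGILL: illegal instruction" and the server exits with the output below. How can I resolve this? The SPDK setup output, my daos_server.yml, and the full startup log follow.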


[root@Rocky-1 ~]# /usr/share/spdk/scripts/setup.sh
0000:44:00.0 (1d78 1512): nvme -> vfio-pci

[root@Rocky-1 ~]# cat /etc/daos/daos_server.yml 
name: daos_server
access_points: ['Rocky-1']
port: 10001
transport_config:
  allow_insecure: false
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
provider: ofi+sockets
socket_dir: /var/run/
nr_hugepages: 4096
control_log_mask: DEBUG
control_log_file: /var/log/daos_server.log
helper_log_file: /var/log/daos_admin.log
 
engines:
-
  targets: 8
  nr_xs_helpers: 0
  fabric_iface: enp3s0f1
  fabric_iface_port: 31316
  log_mask: INFO
  log_file: /var/log/daos_engine_0.log
  env_vars:
      - CRT_TIMEOUT=30
  storage:
  -
    class: ram
    scm_mount: /mnt/daos0
    scm_size: 2 #gb to allocate for tmpfs to emulate SCM
  -
    class: nvme
    bdev_list: ["0000:44:00.0"]
 

[root@Rocky-1 ~]# /usr/bin/daos_server start
DAOS Server config loaded from /etc/daos/daos_server.yml
/usr/bin/daos_server logging to file /var/log/daos_server.log
DEBUG 11:17:31.720878 start.go:90: Switching control log level to DEBUG
DEBUG 11:17:31.721131 defaults.go:92: failed to load library: unable to open a handle to the library
ERROR: unable to open a handle to the library
DEBUG 11:17:31.721209 fabric.go:875: waiting for fabric interfaces to become ready...
DEBUG 11:17:31.721299 fabric.go:892: fabric interface "enp3s0f1" is ready
DEBUG 11:17:31.721372 provider.go:87: getting topology with hwloc version 0x20100
DEBUG 11:17:31.769773 provider.go:145: adding device found at "/sys/class/net/eno1" (type network interface, NUMA node 0)
DEBUG 11:17:31.769933 provider.go:145: adding device found at "/sys/class/net/eno2" (type network interface, NUMA node 0)
DEBUG 11:17:31.770081 provider.go:145: adding device found at "/sys/class/net/eno3" (type network interface, NUMA node 0)
DEBUG 11:17:31.770212 provider.go:145: adding device found at "/sys/class/net/eno4" (type network interface, NUMA node 0)
DEBUG 11:17:31.770357 provider.go:145: adding device found at "/sys/class/net/enp3s0f0" (type network interface, NUMA node 0)
DEBUG 11:17:31.770485 provider.go:145: adding device found at "/sys/class/net/enp3s0f1" (type network interface, NUMA node 0)
DEBUG 11:17:31.770537 provider.go:125: failed to read net device: open /sys/class/net/lo/device/net: no such file or directory
DEBUG 11:17:31.770749 provider.go:264: adding virtual device at "/sys/devices/virtual/net/lo"
DEBUG 11:17:31.886150 provider.go:83: found fabric interfaces:
enp3s0f1 (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
shm (providers: shm)
DEBUG 11:17:31.886239 provider.go:292: no cxi subsystem in sysfs
DEBUG 11:17:31.886338 fabric.go:441: unable to open a handle to the library
DEBUG 11:17:31.886419 fabric.go:511: ignoring fabric interface "shm" (shm) not found in topology
DEBUG 11:17:31.886534 fabric.go:793: discovered 2 fabric interfaces:
enp3s0f1 (interface: enp3s0f1) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (interface: lo) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
DEBUG 11:17:31.886645 server.go:750: detected NUMA affinity 0 for engine 0
DEBUG 11:17:31.886675 server.go:757: enabling single-engine legacy core allocation algorithm
DEBUG 11:17:31.886703 server.go:420: validating config file read from "/etc/daos/daos_server.yml"
DEBUG 11:17:31.886742 server.go:443: vfio=true hotplug=false vmd=true requested in config
WARNING: Configuration includes only one access point. This provides no redundancy in the event of an access point failure.
DEBUG 11:17:31.886841 server.go:549: engine 0 fabric numa 0, storage numa 0
DEBUG 11:17:31.887914 server_utils.go:148: setting OFI_DOMAIN=enp3s0f1 for enp3s0f1
DEBUG 11:17:31.889170 server.go:377: active config saved to /var/run/.daos_server.active.yml (read-only)
DEBUG 11:17:31.889251 server.go:525: fault domain: /rocky-1
DEBUG 11:17:31.889862 server.go:236: setting core dump filter to 0x13
DEBUG 11:17:31.890615 database.go:280: set db replica addr: 192.168.1.215:10001
DEBUG 11:17:31.891076 server.go:164: time to init network: 242.45µs
DEBUG 11:17:31.891195 server_utils.go:260: allocating 4098 hugepages on each of these numa nodes: [0]
DEBUG 11:17:31.891267 ctl_storage.go:53: calling bdev provider prepare: {ForwardableRequest:{Forwarded:false} HugePageCount:4098 HugeNodes:0 CleanHugePagesOnly:false PCIAllowList: PCIBlockList: TargetUser:root Reset_:false DisableVFIO:false EnableVMD:true}
DEBUG 11:17:32.224164 server.go:164: time to prepare bdev storage: 332.967644ms
DEBUG 11:17:32.224261 ctl_storage.go:59: calling bdev provider scan: {ForwardableRequest:{Forwarded:false} DeviceList:0000:44:00.0 VMDEnabled:false BypassCache:true}
ERROR: /usr/bin/daos_admin SIGILL: illegal instruction
PC=0x7fbc78755c0e m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe
 
goroutine 1 [syscall]:
runtime.cgocall(0x92049d, 0xc0001d1c20)
/usr/src/runtime/cgocall.go:158 +0x5c fp=0xc0001d1bf8 sp=0xc0001d1bc0 pc=0x408b1c
github.com/daos-stack/daos/src/control/lib/spdk._Cfunc_nvme_discover()
_cgo_gotypes.go:321 +0x49 fp=0xc0001d1c20 sp=0xc0001d1bf8 pc=0x904fc9
github.com/daos-stack/daos/src/control/lib/spdk.(*NvmeImpl).Discover(0xc0000bcd00?, {0xb267f8, 0xc000184300})
/builddir/build/BUILD/daos-2.2.0/src/control/lib/spdk/nvme.go:127 +0x54 fp=0xc0001d1cd8
ERROR: /usr/bin/daos_admin  sp=0xc0001d1c20 pc=0x9059b4
github.com/daos-stack/daos/src/control/server/storage/bdev.(*spdkBackend).Scan(0xc0000bcce0, {{0x56?}, 0xc0001d5030?, 0x20?, 0x1?})
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/backend.go:341 +0x1b7 fp=0xc0001d1da8 sp=0xc0001d1cd8 pc=0x909f37
github.com/daos-stack/daos/src/control/server/storage/bdev.(*Provider).Scan(...)
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/provider.go:54
main.(*bdevScanHandler).Handle(0xc000014788, {0xb267f8?, 0xc000184300}, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/handler.go:175 +0x27a fp=0xc0001d1e08 sp=0xc0001d1da8 pc=0x91d1fa
github.com/daos-stack/daos/src/control/pbin.(*App).handleRequest(0xc0000caae0, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:214 +0x62 fp=0xc0001d1e58 sp=0xc0001d1e08 pc=0x5949c2
github.com/daos-stack/daos/src/control/pbin.(*App).Run(0xc0000caae0)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:155 +0x2ed fp=0xc0001d1f50 sp=0xc0001d1e58 pc=0x59448d
main.main()
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/main.go:25 +0xaf fp=0xc0001d1f80 sp=0xc0001d1f50 pc=0x91de6f
runtime.main()
/usr/src/runtime/proc.go:250 +0x212 fp=0xc0001d1fe0 sp=0xc0001d1f80 pc=0x43dd32
runtime.goexit
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0001d1fe8 sp=0xc0001d1fe0 pc=0x46b9c1
 
goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +0xd6 fp=0xc00009efb0 sp=0xc00009ef90 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.forcegchelper()
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:302 +0xad fp=0xc00009efe0 sp=0xc00009efb0 pc=0x43df8d
runtime.goexit()
/usr/src/runtime/asm_amd64.s
ERROR: /usr/bin/daos_admin :1594 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x46b9c1
created by 
ERROR: /usr/bin/daos_admin runtime.init.6
/usr/src/runtime/proc.go:290 +0x25
ERROR: /usr/bin/daos_admin 
goroutine 3 [GC sweep wait]:
runtime.gopark(0x0
ERROR: /usr/bin/daos_admin ?, 0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009f790 sp=0xc00009f770 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
/usr/src/runtime/proc.go:369
runtime.bgsweep(0x0?)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/mgcsweep.go:278 +0x8e fp=0xc00009f7c8 sp=0xc00009f790 pc=0x429c2e
runtime.gcenable.func1()
/usr/src/runtime/mgc.go:178 +
ERROR: /usr/bin/daos_admin 0x26 fp=0xc00009f7e0 sp=0xc00009f7c8 pc=0x41e8c6
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00009f7e8 sp=0xc00009f7e0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:
ERROR: /usr/bin/daos_admin 178 +0x6b
 
goroutine 4 [GC scavenge wait]:
runtime.gopark(0xc0000c6000?, 0xb1e1e8?
ERROR: /usr/bin/daos_admin , 0x1?, 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009ff70 sp=0xc00009ff50 pc=0x43e0f6
runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.(*scavengerState).park(0x10a3a20)
/usr/src/runtime/mgcscavenge.go:389 +0x53 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffa0 sp=0xc00009ff70 pc=0x427cd3
runtime.bgscavenge(0x0?)
/usr/src/runtime/mgcscavenge.go:
ERROR: /usr/bin/daos_admin 617 +0x45 fp=0xc00009ffc8 sp=0xc00009ffa0 pc=0x4282a5
runtime.gcenable.func2
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/mgc.go:179 +0x26 fp=0xc00009ffe0 sp=0xc00009ffc8 pc=0x41e866
ERROR: /usr/bin/daos_admin 
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffe8 sp=0xc00009ffe0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:179
ERROR: /usr/bin/daos_admin  +0xaa
 
goroutine 5 [finalizer wait]:
runtime.gopark(0x10a4520?, 
ERROR: /usr/bin/daos_admin 0xc000007860?, 0x0?, 0x0?, 0xc00009e770?)
/usr/src/runtime/proc.go:363
ERROR: /usr/bin/daos_admin  +0xd6 fp=0xc00009e628 sp=0xc00009e608 pc=0x43e0f6
runtime.goparkunlock(...)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.runfinq()
/usr/src/runtime/mfinal.go:
ERROR: /usr/bin/daos_admin 180 +0x10f fp=0xc00009e7e0 sp=0xc00009e628 pc=0x41d9cf
runtime.goexit()
/usr/src/runtime/asm_amd64.s:
ERROR: /usr/bin/daos_admin 1594 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x46b9c1
created by runtime.createfing
/usr/src/runtime/mfinal.go:157 +0x45
ERROR: /usr/bin/daos_admin 
rax    0x1
rbx    0x2492240
rcx    0x7fbc78da4e60
rdx    0x0
rdi    0x2492240
rsi    
ERROR: /usr/bin/daos_admin 0x7fbc78da1af0
rbp    0x2000003e7240
rsp    0x7ffc5180b120
r8     0x7fbc78da2460
r9     0x0
r10    0x70000000004
r11    0x0
ERROR: /usr/bin/daos_admin 
r12    0x202001000000
r13    0x2000003e7240
r14    0x7ffc5180b150
r15    0x0
rip    0x7fbc78755c0e
rflags 
ERROR: /usr/bin/daos_admin 0x13246
cs     0x33
fs     0x0
gs     0x0
DEBUG 11:17:32.627353 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627423 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627466 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627498 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627541 exec.go:188: discarding garbage response ""
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
DEBUG 11:17:32.627658 server.go:164: time to scan bdev storage: 403.426657ms
DEBUG 11:17:32.627726 pubsub.go:259: stopping event loop
DEBUG 11:17:32.627853 main.go:69: Unable to decode response after 5 attempts
github.com/daos-stack/daos/src/control/pbin.ExecReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/exec.go:197
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:100
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
privileged binary execution failed
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:105
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
NVMe Scan Failed
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:302
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
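
One possible lead, offered as an assumption rather than a confirmed diagnosis: the "instruction bytes" line in the SIGILL report starts with 0xc4 0xe2 0x69 0xf7 0xc0, which looks like a 3-byte VEX-encoded instruction from the 0F38 opcode map. If that decoding is right, it is SHLX, a BMI2 instruction, so the crash may simply mean this CPU does not support BMI2 while the prebuilt library called from nvme_discover was compiled to use it. The minimal Python sketch below decodes the prefix and checks /proc/cpuinfo for the corresponding flag. The table and helper names are made up for illustration and cover only the VEX 0F38 F7 group.

```python
# Sketch: decode the 3-byte VEX prefix from the SIGILL "instruction bytes"
# line and check whether the CPU advertises the matching feature flag.
# Helper names and the lookup table are illustrative, not part of DAOS/SPDK.

# (opcode map, pp, opcode) -> (mnemonic, /proc/cpuinfo flag name)
VEX_0F38_F7 = {
    (2, 0, 0xF7): ("BEXTR", "bmi1"),
    (2, 1, 0xF7): ("SHLX", "bmi2"),
    (2, 2, 0xF7): ("SARX", "bmi2"),
    (2, 3, 0xF7): ("SHRX", "bmi2"),
}


def decode_vex3(code):
    """Return (opcode_map, pp, opcode) for a 3-byte VEX (0xC4) instruction."""
    if len(code) < 4 or code[0] != 0xC4:
        raise ValueError("not a 3-byte VEX-encoded instruction")
    opcode_map = code[1] & 0x1F  # mmmmm field: 1=0F, 2=0F38, 3=0F3A
    pp = code[2] & 0x03          # implied prefix: 0=none, 1=66, 2=F3, 3=F2
    return opcode_map, pp, code[3]


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


if __name__ == "__main__":
    # First five bytes copied from the SIGILL report in the log above.
    sigill_bytes = bytes([0xC4, 0xE2, 0x69, 0xF7, 0xC0])
    mnemonic, flag = VEX_0F38_F7[decode_vex3(sigill_bytes)]
    with open("/proc/cpuinfo") as f:
        flags = cpu_flags(f.read())
    print(f"{mnemonic} needs '{flag}'; present on this CPU: {flag in flags}")
```

If the script reports that the flag is absent, that would be consistent with the SIGILL, and the CPU model (for example from lscpu) would be worth including in any follow-up.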
 
