DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)


JiangYu
 

Hello everyone,
When I start the DAOS server, the following error appears. How can I resolve it?


[root@Rocky-1 ~]# /usr/share/spdk/scripts/setup.sh
0000:44:00.0 (1d78 1512): nvme -> vfio-pci

[root@Rocky-1 ~]# cat /etc/daos/daos_server.yml 
name: daos_server
access_points: ['Rocky-1']
port: 10001
transport_config:
  allow_insecure: false
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
provider: ofi+sockets
socket_dir: /var/run/
nr_hugepages: 4096
control_log_mask: DEBUG
control_log_file: /var/log/daos_server.log
helper_log_file: /var/log/daos_admin.log
 
engines:
-
  targets: 8
  nr_xs_helpers: 0
  fabric_iface: enp3s0f1
  fabric_iface_port: 31316
  log_mask: INFO
  log_file: /var/log/daos_engine_0.log
  env_vars:
      - CRT_TIMEOUT=30
  storage:
  -
    class: ram
    scm_mount: /mnt/daos0
    scm_size: 2 #gb to allocate for tmpfs to emulate SCM
  -
    class: nvme
    bdev_list: ["0000:44:00.0"]
 

[root@Rocky-1 ~]# /usr/bin/daos_server start
DAOS Server config loaded from /etc/daos/daos_server.yml
/usr/bin/daos_server logging to file /var/log/daos_server.log
DEBUG 11:17:31.720878 start.go:90: Switching control log level to DEBUG
DEBUG 11:17:31.721131 defaults.go:92: failed to load library: unable to open a handle to the library
ERROR: unable to open a handle to the library
DEBUG 11:17:31.721209 fabric.go:875: waiting for fabric interfaces to become ready...
DEBUG 11:17:31.721299 fabric.go:892: fabric interface "enp3s0f1" is ready
DEBUG 11:17:31.721372 provider.go:87: getting topology with hwloc version 0x20100
DEBUG 11:17:31.769773 provider.go:145: adding device found at "/sys/class/net/eno1" (type network interface, NUMA node 0)
DEBUG 11:17:31.769933 provider.go:145: adding device found at "/sys/class/net/eno2" (type network interface, NUMA node 0)
DEBUG 11:17:31.770081 provider.go:145: adding device found at "/sys/class/net/eno3" (type network interface, NUMA node 0)
DEBUG 11:17:31.770212 provider.go:145: adding device found at "/sys/class/net/eno4" (type network interface, NUMA node 0)
DEBUG 11:17:31.770357 provider.go:145: adding device found at "/sys/class/net/enp3s0f0" (type network interface, NUMA node 0)
DEBUG 11:17:31.770485 provider.go:145: adding device found at "/sys/class/net/enp3s0f1" (type network interface, NUMA node 0)
DEBUG 11:17:31.770537 provider.go:125: failed to read net device: open /sys/class/net/lo/device/net: no such file or directory
DEBUG 11:17:31.770749 provider.go:264: adding virtual device at "/sys/devices/virtual/net/lo"
DEBUG 11:17:31.886150 provider.go:83: found fabric interfaces:
enp3s0f1 (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
shm (providers: shm)
DEBUG 11:17:31.886239 provider.go:292: no cxi subsystem in sysfs
DEBUG 11:17:31.886338 fabric.go:441: unable to open a handle to the library
DEBUG 11:17:31.886419 fabric.go:511: ignoring fabric interface "shm" (shm) not found in topology
DEBUG 11:17:31.886534 fabric.go:793: discovered 2 fabric interfaces:
enp3s0f1 (interface: enp3s0f1) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (interface: lo) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
DEBUG 11:17:31.886645 server.go:750: detected NUMA affinity 0 for engine 0
DEBUG 11:17:31.886675 server.go:757: enabling single-engine legacy core allocation algorithm
DEBUG 11:17:31.886703 server.go:420: validating config file read from "/etc/daos/daos_server.yml"
DEBUG 11:17:31.886742 server.go:443: vfio=true hotplug=false vmd=true requested in config
WARNING: Configuration includes only one access point. This provides no redundancy in the event of an access point failure.
DEBUG 11:17:31.886841 server.go:549: engine 0 fabric numa 0, storage numa 0
DEBUG 11:17:31.887914 server_utils.go:148: setting OFI_DOMAIN=enp3s0f1 for enp3s0f1
DEBUG 11:17:31.889170 server.go:377: active config saved to /var/run/.daos_server.active.yml (read-only)
DEBUG 11:17:31.889251 server.go:525: fault domain: /rocky-1
DEBUG 11:17:31.889862 server.go:236: setting core dump filter to 0x13
DEBUG 11:17:31.890615 database.go:280: set db replica addr: 192.168.1.215:10001
DEBUG 11:17:31.891076 server.go:164: time to init network: 242.45µs
DEBUG 11:17:31.891195 server_utils.go:260: allocating 4098 hugepages on each of these numa nodes: [0]
DEBUG 11:17:31.891267 ctl_storage.go:53: calling bdev provider prepare: {ForwardableRequest:{Forwarded:false} HugePageCount:4098 HugeNodes:0 CleanHugePagesOnly:false PCIAllowList: PCIBlockList: TargetUser:root Reset_:false DisableVFIO:false EnableVMD:true}
DEBUG 11:17:32.224164 server.go:164: time to prepare bdev storage: 332.967644ms
DEBUG 11:17:32.224261 ctl_storage.go:59: calling bdev provider scan: {ForwardableRequest:{Forwarded:false} DeviceList:0000:44:00.0 VMDEnabled:false BypassCache:true}
ERROR: /usr/bin/daos_admin SIGILL: illegal instruction
PC=0x7fbc78755c0e m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe
 
goroutine 1 [syscall]:
runtime.cgocall(0x92049d, 0xc0001d1c20)
/usr/src/runtime/cgocall.go:158 +0x5c fp=0xc0001d1bf8 sp=0xc0001d1bc0 pc=0x408b1c
github.com/daos-stack/daos/src/control/lib/spdk._Cfunc_nvme_discover()
_cgo_gotypes.go:321 +0x49 fp=0xc0001d1c20 sp=0xc0001d1bf8 pc=0x904fc9
github.com/daos-stack/daos/src/control/lib/spdk.(*NvmeImpl).Discover(0xc0000bcd00?, {0xb267f8, 0xc000184300})
/builddir/build/BUILD/daos-2.2.0/src/control/lib/spdk/nvme.go:127 +0x54 fp=0xc0001d1cd8
ERROR: /usr/bin/daos_admin  sp=0xc0001d1c20 pc=0x9059b4
github.com/daos-stack/daos/src/control/server/storage/bdev.(*spdkBackend).Scan(0xc0000bcce0, {{0x56?}, 0xc0001d5030?, 0x20?, 0x1?})
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/backend.go:341 +0x1b7 fp=0xc0001d1da8 sp=0xc0001d1cd8 pc=0x909f37
github.com/daos-stack/daos/src/control/server/storage/bdev.(*Provider).Scan(...)
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/provider.go:54
main.(*bdevScanHandler).Handle(0xc000014788, {0xb267f8?, 0xc000184300}, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/handler.go:175 +0x27a fp=0xc0001d1e08 sp=0xc0001d1da8 pc=0x91d1fa
github.com/daos-stack/daos/src/control/pbin.(*App).handleRequest(0xc0000caae0, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:214 +0x62 fp=0xc0001d1e58 sp=0xc0001d1e08 pc=0x5949c2
github.com/daos-stack/daos/src/control/pbin.(*App).Run(0xc0000caae0)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:155 +0x2ed fp=0xc0001d1f50 sp=0xc0001d1e58 pc=0x59448d
main.main()
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/main.go:25 +0xaf fp=0xc0001d1f80 sp=0xc0001d1f50 pc=0x91de6f
runtime.main()
/usr/src/runtime/proc.go:250 +0x212 fp=0xc0001d1fe0 sp=0xc0001d1f80 pc=0x43dd32
runtime.goexit
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0001d1fe8 sp=0xc0001d1fe0 pc=0x46b9c1
 
goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +0xd6 fp=0xc00009efb0 sp=0xc00009ef90 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.forcegchelper()
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:302 +0xad fp=0xc00009efe0 sp=0xc00009efb0 pc=0x43df8d
runtime.goexit()
/usr/src/runtime/asm_amd64.s
ERROR: /usr/bin/daos_admin :1594 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x46b9c1
created by 
ERROR: /usr/bin/daos_admin runtime.init.6
/usr/src/runtime/proc.go:290 +0x25
ERROR: /usr/bin/daos_admin 
goroutine 3 [GC sweep wait]:
runtime.gopark(0x0
ERROR: /usr/bin/daos_admin ?, 0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009f790 sp=0xc00009f770 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
/usr/src/runtime/proc.go:369
runtime.bgsweep(0x0?)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/mgcsweep.go:278 +0x8e fp=0xc00009f7c8 sp=0xc00009f790 pc=0x429c2e
runtime.gcenable.func1()
/usr/src/runtime/mgc.go:178 +
ERROR: /usr/bin/daos_admin 0x26 fp=0xc00009f7e0 sp=0xc00009f7c8 pc=0x41e8c6
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00009f7e8 sp=0xc00009f7e0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:
ERROR: /usr/bin/daos_admin 178 +0x6b
 
goroutine 4 [GC scavenge wait]:
runtime.gopark(0xc0000c6000?, 0xb1e1e8?
ERROR: /usr/bin/daos_admin , 0x1?, 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009ff70 sp=0xc00009ff50 pc=0x43e0f6
runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.(*scavengerState).park(0x10a3a20)
/usr/src/runtime/mgcscavenge.go:389 +0x53 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffa0 sp=0xc00009ff70 pc=0x427cd3
runtime.bgscavenge(0x0?)
/usr/src/runtime/mgcscavenge.go:
ERROR: /usr/bin/daos_admin 617 +0x45 fp=0xc00009ffc8 sp=0xc00009ffa0 pc=0x4282a5
runtime.gcenable.func2
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/mgc.go:179 +0x26 fp=0xc00009ffe0 sp=0xc00009ffc8 pc=0x41e866
ERROR: /usr/bin/daos_admin 
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffe8 sp=0xc00009ffe0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:179
ERROR: /usr/bin/daos_admin  +0xaa
 
goroutine 5 [finalizer wait]:
runtime.gopark(0x10a4520?, 
ERROR: /usr/bin/daos_admin 0xc000007860?, 0x0?, 0x0?, 0xc00009e770?)
/usr/src/runtime/proc.go:363
ERROR: /usr/bin/daos_admin  +0xd6 fp=0xc00009e628 sp=0xc00009e608 pc=0x43e0f6
runtime.goparkunlock(...)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.runfinq()
/usr/src/runtime/mfinal.go:
ERROR: /usr/bin/daos_admin 180 +0x10f fp=0xc00009e7e0 sp=0xc00009e628 pc=0x41d9cf
runtime.goexit()
/usr/src/runtime/asm_amd64.s:
ERROR: /usr/bin/daos_admin 1594 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x46b9c1
created by runtime.createfing
/usr/src/runtime/mfinal.go:157 +0x45
ERROR: /usr/bin/daos_admin 
rax    0x1
rbx    0x2492240
rcx    0x7fbc78da4e60
rdx    0x0
rdi    0x2492240
rsi    
ERROR: /usr/bin/daos_admin 0x7fbc78da1af0
rbp    0x2000003e7240
rsp    0x7ffc5180b120
r8     0x7fbc78da2460
r9     0x0
r10    0x70000000004
r11    0x0
ERROR: /usr/bin/daos_admin 
r12    0x202001000000
r13    0x2000003e7240
r14    0x7ffc5180b150
r15    0x0
rip    0x7fbc78755c0e
rflags 
ERROR: /usr/bin/daos_admin 0x13246
cs     0x33
fs     0x0
gs     0x0
DEBUG 11:17:32.627353 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627423 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627466 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627498 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627541 exec.go:188: discarding garbage response ""
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
DEBUG 11:17:32.627658 server.go:164: time to scan bdev storage: 403.426657ms
DEBUG 11:17:32.627726 pubsub.go:259: stopping event loop
DEBUG 11:17:32.627853 main.go:69: Unable to decode response after 5 attempts
github.com/daos-stack/daos/src/control/pbin.ExecReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/exec.go:197
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:100
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
privileged binary execution failed
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:105
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
NVMe Scan Failed
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:302
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
 


JiangYu
 

Does "illegal instruction" refer to NVMe instructions or CPU instructions? Is my device not supported?


Huang, Lei
 

ERROR: /usr/bin/daos_admin SIGILL: illegal instruction

 

Your CPU does not support certain instructions used by the daos_admin process. Could you please attach the output of "lscpu" from your machine? Thank you!
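To make the SIGILL more concrete, here is a minimal sketch that decodes the first five values of the "instruction bytes" line in the log above. It assumes those bytes start exactly at the faulting instruction and that it uses a 3-byte VEX prefix (first byte 0xc4). The function names `decode_vex3` and `identify` are made up for illustration, and the lookup table covers only the VEX.LZ 0F38 F7 opcode group. If the decoding is right, the faulting instruction would be SHLX, a BMI2 instruction, which would point to a CPU instruction rather than an NVMe command.

```python
def decode_vex3(code: bytes) -> dict:
    """Decode the fields of a 3-byte VEX prefix plus the opcode byte."""
    if len(code) < 5 or code[0] != 0xC4:
        raise ValueError("not a 3-byte VEX-encoded instruction")
    b1, b2, opcode = code[1], code[2], code[3]
    return {
        # Low 5 bits of the second byte select the opcode map.
        "map": {1: "0F", 2: "0F38", 3: "0F3A"}.get(b1 & 0x1F, "reserved"),
        "w": b2 >> 7,
        # vvvv is stored inverted (one's complement).
        "vvvv": (~(b2 >> 3)) & 0x0F,
        "l": (b2 >> 2) & 1,
        # Low 2 bits encode the implied legacy prefix.
        "prefix": ["none", "66", "F3", "F2"][b2 & 0x03],
        "opcode": opcode,
    }


# In map 0F38 with L=0, opcode F7 is shared by four instructions that
# differ only in the implied prefix.
VEX_0F38_F7 = {
    "none": "BEXTR (BMI1)",
    "66": "SHLX (BMI2)",
    "F3": "SARX (BMI2)",
    "F2": "SHRX (BMI2)",
}


def identify(code: bytes) -> str:
    """Name the instruction if it belongs to the VEX.LZ 0F38 F7 group."""
    fields = decode_vex3(code)
    if fields["map"] == "0F38" and fields["opcode"] == 0xF7 and fields["l"] == 0:
        return VEX_0F38_F7[fields["prefix"]]
    return "unknown"


if __name__ == "__main__":
    # First five bytes from the "instruction bytes" line of the crash log.
    print(identify(bytes([0xC4, 0xE2, 0x69, 0xF7, 0xC0])))
```

A disassembler is the more reliable way to confirm this; the sketch only shows how the prefix bytes map to an instruction-set extension.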

 

-lei

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of lnsyyj@...
Sent: Wednesday, November 2, 2022 10:26 PM
To: daos@daos.groups.io
Subject: [daos] DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

 


 


JiangYu
 

Thanks. The CPU information is as follows. How can I choose the CPU?

[root@Rocky-1 ~]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              40
On-line CPU(s) list: 0-39
Thread(s) per core:  2
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               62
Model name:          Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
BIOS Model name:           Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
Stepping:            4
CPU MHz:             3600.000
CPU max MHz:         3600.0000
CPU min MHz:         1200.0000
BogoMIPS:            5599.96
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            25600K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d


Huang, Lei
 

E5-2680 v2 is Ivy Bridge.

I guess some packages (maybe spdk) were compiled to run on Haswell or newer architectures. You can run “daos_admin” under gdb; gdb will stop when the CPU hits an unsupported instruction. With “bt” and “info proc mappings” in gdb, you can find out which module/library contains the unsupported instruction. You then need to compile that library from source, targeting your own CPU, and install it.

 

Or you can compile daos and its dependent libraries from scratch.

https://github.com/daos-stack/daos/blob/release/2.2/site_scons/components/__init__.py#L348-L355

You may need to replace this part with

spdk_arch = 'nehalem'

 

I am not sure whether there are other places where the compiler flags for CPU optimizations need to be fixed.
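
Before rebuilding anything, the "compiled for Haswell or newer" guess can be checked against the lscpu output posted earlier. The following is only a minimal sketch: the flag list is a subset of what GCC's -march=haswell enables beyond -march=ivybridge, written the way Linux spells them, and the function name is made up for illustration.

```python
# A subset of the extensions that GCC's -march=haswell enables beyond
# -march=ivybridge, spelled the way Linux reports them in lscpu "Flags:".
HASWELL_ONLY_FLAGS = ("avx2", "bmi1", "bmi2", "fma", "movbe")


def missing_haswell_flags(flags_line):
    """Return the Haswell-era flags that are absent from an lscpu Flags line."""
    # Drop the "Flags:" label if the whole lscpu line was passed in.
    _, _, rest = flags_line.rpartition(":")
    present = set(rest.split())
    return [flag for flag in HASWELL_ONLY_FLAGS if flag not in present]


if __name__ == "__main__":
    # Abbreviated copy of the Flags line posted in this thread.
    posted = "Flags: fpu sse4_1 sse4_2 popcnt aes xsave avx f16c rdrand fsgsbase smep erms"
    print(missing_haswell_flags(posted))
```

Run against the full Flags line from this thread, all five flags come back as missing, which is consistent with the CPU being Ivy Bridge. A binary that uses any of these instructions would raise SIGILL on this machine.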

 

-lei

 



Murrell, Brian
 

On Wed, 2022-11-02 at 20:25 -0700, lnsyyj@... wrote:
ERROR: /usr/bin/daos_admin SIGILL: illegal instruction
PC=0x7fbc78755c0e m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe

Are you using the RPMs that the DAOS project team builds and
distributes or did you build your own from source?

If the latter, you might have built with compiler optimizations set to
aggressively target the CPU of the build system and then tried to use
the binaries produced there on a different system with a lesser-capable
CPU. You will have to reduce the optimization level of your build if
you wish to make more portable binaries.

Cheers,
b.
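
The instruction bytes in the quoted SIGILL report can be decoded by hand to see which CPU extension the faulting instruction belongs to. The sketch below assumes the first byte (0xc4) starts a three-byte VEX prefix and covers only the one opcode needed here (F7 in map 0F38, per the Intel SDM instruction reference); the function and table names are made up for illustration, and this is not a general disassembler.

```python
# Mandatory-prefix field (pp, low 2 bits of the third VEX byte).
VEX_PP = {0b00: "none", 0b01: "66", 0b10: "F3", 0b11: "F2"}
# Opcode-map field (mmmmm, low 5 bits of the second VEX byte).
VEX_MAP = {0b00001: "0F", 0b00010: "0F38", 0b00011: "0F3A"}
# Instructions encoded at opcode F7 in map 0F38, selected by pp.
OPCODE_0F38_F7 = {
    "none": "BEXTR (BMI1)",
    "66": "SHLX (BMI2)",
    "F3": "SARX (BMI2)",
    "F2": "SHRX (BMI2)",
}


def decode_vex3(code):
    """Decode the map, prefix and opcode of a three-byte-VEX instruction."""
    if len(code) < 4 or code[0] != 0xC4:
        raise ValueError("not a three-byte VEX prefix")
    opmap = VEX_MAP.get(code[1] & 0x1F, "reserved")
    pp = VEX_PP[code[2] & 0x03]
    opcode = code[3]
    name = "unknown"
    if opmap == "0F38" and opcode == 0xF7:
        name = OPCODE_0F38_F7[pp]
    return {"map": opmap, "pp": pp, "opcode": opcode, "instruction": name}


if __name__ == "__main__":
    # First bytes of the "instruction bytes" line in the SIGILL report.
    print(decode_vex3(bytes([0xC4, 0xE2, 0x69, 0xF7, 0xC0])))
```

For the bytes in the report this yields map 0F38, prefix 66, opcode F7, which is SHLX, a BMI2 instruction. BMI2 first appeared on Intel CPUs with Haswell, and bmi2 is absent from the lscpu flags posted in this thread.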


JiangYu
 

I'm using the RPMs built and distributed by the DAOS project team.
So do I need to adjust the compiler optimizations and compile the DAOS binaries myself in order to run them?
Can you tell me where I need to modify the compiler optimizations, and how to compile it?
Thank you very much!

[root@Rocky-1 ~]# rpm -qa | grep daos
daos-client-2.2.0-4.el8.x86_64
daos-2.2.0-4.el8.x86_64
daos-server-2.2.0-4.el8.x86_64
daos-admin-2.2.0-4.el8.x86_64

[root@Rocky-1 ~]# rpm -qa | grep spdk
spdk-tools-22.01.1-2.el8.noarch
spdk-22.01.1-2.el8.x86_64

[root@Rocky-1 ~]# rpm -qa | grep dpdk
dpdk-21.11.1-1.el8.x86_64

[root@Rocky-1 ~]# cat /etc/yum.repos.d/daos-packages.repo 
[daos-packages]
name=DAOS v2.2.0 Packages Packages
baseurl=https://packages.daos.io/private/v2.2.0/EL8/packages/x86_64
enabled=1
gpgcheck=1
protect=1
gpgkey="https://packages.daos.io/RPM-GPG-KEY"

[root@Rocky-1 ~]# cat /etc/os-release 
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8"


JiangYu
 

Thank you lei,
I tried compiling, but the problem still exists. I will try gdb as you said.
Maybe the problem will not occur if I use a newer CPU architecture.
 
[root@Rocky-1 yum.repos.d]# vim /etc/yum.repos.d/Rocky-PowerTools.repo
enabled=1
 
[root@Rocky-3 ~]# yum install -y python2 gcc gcc-c++ libunwind-devel epel-release
 
[root@Rocky-1 daos]# yum makecache
 
[root@Rocky-1 daos]# dnf --enablerepo=powertools install python3-scons
 
[root@Rocky-1 daos]# pip3 install distro
 
[root@Rocky-1 daos]# git checkout -b local-v2.2.0 v2.2.0
[root@Rocky-1 daos]# ./utils/scripts/install-el8.sh
[root@Rocky-1 daos]# scons-3 --build-deps=yes


Huang, Lei
 

Did you revise spdk_arch in site_scons/components/__init__.py before compiling?

Right. Using a newer CPU would avoid this problem.

 

-lei



JiangYu
 

Thanks Lei, we didn't try this. We are considering upgrading our hardware.