Re: [DUG'22] DAOS User Group virtual meeting details

Mora Acosta, Josue
 

Is there an abstract for each of the presentations, so we can be selective about which ones to attend?

 

Thanks,

Joshua

 


 


[DUG'22] DAOS User Group virtual meeting details

Kudryavtsev, Andrey O
 

Dear DAOS community,

The final agenda is posted and we added virtual meeting details for those who are not able to join us in person.

Those who are travelling, have a safe trip. We can’t wait to see you in person in Dallas.

 

Join with Google Meet

Meeting link: meet.google.com/jmw-azpx-ftr

Join by phone: (US) +1 253-289-6833, PIN: 39095605190

https://daosio.atlassian.net/wiki/spaces/DC/pages/11248861216/DUG22

Date: November 14th, 9:00am-1:00pm CT

Event Location: Venetian Room, Fairmont Hotel (1717 N Akard St, Dallas, TX 75201) 

 

Time (CT)   Presentation
8:45am      Gathering
9:00am      Welcome & DAOS Update
            Kelsey Prantis, Director of DAOS Engineering, Intel AXG
9:20am      DAOS at Exascale
            Kevin Harms, Performance Engineering Team Lead, ALCF
9:50am      HPE DAOS program update
            Lance Evans, HPC CTO Chief Storage Architect, HPE
10:20am     Implementing SPDK DAOS bdev module
            Denis Nuja, Technical Pre-sales Engineer, Croit
10:50am     Break
11:00am     DAOS 2.4 and beyond
            Johann Lombardi, DAOS Chief Architect, Intel AXG
11:30am     DAOS on Google Cloud Platform update
            Dean Hildebrand, Technical Director, Office of the CTO, Google
11:50am     Latest Seagate DAOS development on S3 interface
            John Bent, Senior Director, Seagate Technology
12:15pm     Re-architecting Storage Systems for Modern Hardware
            Luke Logan, PhD Student, Illinois Tech
12:30pm     OPX transport update and DAOS support
            Paul Stasurak, Program Manager, Cornelis Networks
12:50pm     Closing remarks
            Kelsey Prantis, Director of DAOS Engineering, Intel AXG
1:00pm      End of conference

When: Monday Nov 14, 2022 7am – 11am (Pacific Time - Los Angeles)

Location: Venetian Room, Fairmont Hotel (1717 N Akard St, Dallas, TX 75201)

 

 

 

 

-- 

Andrey Kudryavtsev, 

DAOS Product Manager

Intel Corp. 

 

 

 


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

Thanks, Lei. We didn't try this; we are considering upgrading our hardware.


[DUG'22] Announcement and Agenda

Kudryavtsev, Andrey O <andrey.o.kudryavtsev@...>
 

Dear DAOS community,

Following the email I sent earlier, the DAOS team is proud to announce the DUG'22 details and agenda.

 

Date: November 14th, 9:00am-1:00pm CT

Event Location: Venetian Room, Fairmont Hotel (1717 N Akard St, Dallas, TX 75201) 

 

Based on the feedback you provided, we are going to make it hybrid.

We will be glad to see many of you in person, and we welcome those who can’t make SC this year to join remotely.

The details for remote connection will be shared later.

 

 

Agenda:

Time (CT)   Presentation
8:45am      Gathering
9:00am      Welcome & DAOS Update
            Kelsey Prantis, DAOS team manager, Intel AXG
9:20am      Reserved for the special guest
9:50am      HPE DAOS program update
            Lance Evans, HPC CTO Chief Storage Architect, HPE
10:20am     Implementing SPDK DAOS bdev module
            Denis Nuja, Technical Pre-sales Engineer, Croit
10:50am     Break
11:00am     DAOS 2.4 and beyond
            Johann Lombardi, DAOS Chief Architect, Intel AXG
11:30am     DAOS on Google Cloud Platform update
            Dean Hildebrand, Technical Director, Office of the CTO, Google
12:00pm     Latest Seagate DAOS development on S3 interface
            John Bent, Senior Director, Seagate Technology
12:30pm     OPX transport update and DAOS support
            Paul Stasurak, Program Manager, Cornelis Networks
12:50pm     Closing remarks
            Kelsey Prantis, DAOS team manager, Intel AXG
1:00pm      End of conference

* The agenda may change later.

 

Please also take a moment to stop by the other DAOS sessions during the workshops, tutorials, and official SC program.

We have a large presence this year in many of those, including with our partners.

 

SC22 Tutorial, Workshop and Program:

 

·         SC'22 Tutorial (Nov 14 at 8:30am CST)

Emerging Storage Interfaces: DAOS and PMDK
https://sc22.supercomputing.org/presentation/?id=tut132&sess=sess213
Adrian Jackson (EPCC)
Mohamad Chaarawi (Intel)
Johann Lombardi (Intel) 

·         SuperCheck-SC'22 (Nov 14 at 8:35am CST)

DAOS: Nextgen Storage Stack for HPC and AI
https://supercheck.lbl.gov/schedule
Johann Lombardi (Intel)

·         PDSW (Nov 14 at 3:30pm CST)

Performance Comparison of DAOS and Lustre for Object Data Storage Approaches

http://www.pdsw.org/index.shtml

Nicolau Manubens Gil (ECMWF)

Simon Smart (ECMWF)

Tiago Quintino (ECMWF)

Adrian Jackson (EPCC)

·         SC'22 BoF (Nov 16 at 5:15pm CST)

DAOS Storage Community BoF
https://sc22.supercomputing.org/presentation/?id=bof147&sess=sess357
Kevin Harms (ANL)
Johann Lombardi (Intel)
Dean Hildebrand (Google)
Panagiotis Adamidis (DKRZ)

·         SC'22 BoF (Nov 17 at 12:15pm CST)

The Storage Tower of Babel? ... Not! Actually, maybe?
https://sc22.supercomputing.org/presentation/?id=bof150&sess=sess378
Philippe Deniel (CEA)
John Bent (Seagate)
Tiago Quintino (ECMWF)
Johann Lombardi (Intel)

 

SC22 Exhibition:

 

·         Intel booth presentations at SC22 Showcase

Panel: Google Cloud Platform (GCP) and Intel DAOS Further HPC Data Pipeline Optimization (Nov 15 at 3:30pm CST)
Margaret Lawson (Google)

Dean Hildebrand (Google)

Kelsey Prantis (Intel)

Tech Talk: Aurora DAOS Storage Innovation at Argonne National Laboratory (Nov 16 at 3:30pm CST)

Kevin Harms (ANL)

Tech Talk: Cambridge University is Tackling Tomorrow’s Problems Today (Nov 16 at 11:00am CST)

Paul Calleja (UoC)

·         Google booth at SC22 Showcase (all time)

DAOS Storage live demo in GCP environment 

Margaret Lawson (Google)

·         Croit booth at SC22 Showcase (all time)

DAOS live demo running Block and NFS interface with Croit.io

Denis Nuja (Croit)

 

All the best, DAOS team.

 

 

-- 

Andrey Kudryavtsev, 

DAOS Product Manager

Intel Corp. 

 

 

From: Andrey O Kudryavtsev <andrey.o.kudryavtsev@...>
Date: Thursday, October 27, 2022 at 4:47 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [DUG'22] Save the date & call for presentations!

 

Greetings DAOS Community!

 

SC22 is around the corner, which means we have a special event coming up again. The Intel DAOS team invites you to join us for the 6th annual DAOS User Group (DUG22). This will be the first in-person user group since the pandemic.

 

The agenda is not yet finalized, and we’re inviting community members to submit presentation proposals. Please send brief submissions to daos-info@daos.groups.io and keep me copied.

If you have any feedback to share (what you most want to see, types of presentations, areas to cover, or speakers you would like to hear), don’t hesitate to contact me directly. This is the event we make for you!

 

The event will take place on November 14th from 9am until 1pm. We did our best to avoid overlaps with other activities, which is why Monday was selected. We hope it fits your plans and agenda and doesn’t overlap with other workshops and tutorials that day.

 

Event Location: Venetian Room, Fairmont Hotel (1717 N Akard St, Dallas, TX 75201), which is within one mile of the Kay Bailey Hutchison Convention Center.

 

Additional details will be shared once the agenda is finalized. We hope to see you all in person.

 

Best Regards,

Andrey, Kelsey, Johann and the rest of the DAOS team. 

 

-- 

Andrey Kudryavtsev, 

DAOS Product Manager

Intel Corp. 

 


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

Huang, Lei
 

Did you revise spdk_arch in site_scons/components/__init__.py before compiling?

Right. Using a newer CPU could avoid this problem.

 

-lei



Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

Thank you, Lei.
I tried compiling, but the problem still exists. I will try gdb as you suggested.
Maybe this problem won't occur if I use a newer CPU architecture.
 
[root@Rocky-1 yum.repos.d]# vim /etc/yum.repos.d/Rocky-PowerTools.repo
enabled=1
 
[root@Rocky-3 ~]# yum install -y python2 gcc gcc-c++ libunwind-devel epel-release
 
[root@Rocky-1 daos]# yum makecache
 
[root@Rocky-1 daos]# dnf --enablerepo=powertools install python3-scons
 
[root@Rocky-1 daos]# pip3 install distro
 
[root@Rocky-1 daos]# git checkout -b local-v2.2.0 v2.2.0
[root@Rocky-1 daos]# ./utils/scripts/install-el8.sh
[root@Rocky-1 daos]# scons-3 --build-deps=yes


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

I'm using the RPMs built and distributed by the DAOS project team.
So do I need to adjust the compiler optimizations and build the DAOS binaries myself to run it?
Can you tell me where I need to modify the compiler optimizations, and how to compile it?
Thank you very much!

[root@Rocky-1 ~]# rpm -qa | grep daos
daos-client-2.2.0-4.el8.x86_64
daos-2.2.0-4.el8.x86_64
daos-server-2.2.0-4.el8.x86_64
daos-admin-2.2.0-4.el8.x86_64

[root@Rocky-1 ~]# rpm -qa | grep spdk
spdk-tools-22.01.1-2.el8.noarch
spdk-22.01.1-2.el8.x86_64

[root@Rocky-1 ~]# rpm -qa | grep dpdk
dpdk-21.11.1-1.el8.x86_64

[root@Rocky-1 ~]# cat /etc/yum.repos.d/daos-packages.repo 
[daos-packages]
name=DAOS v2.2.0 Packages Packages
baseurl=https://packages.daos.io/private/v2.2.0/EL8/packages/x86_64
enabled=1
gpgcheck=1
protect=1
gpgkey="https://packages.daos.io/RPM-GPG-KEY"

[root@Rocky-1 ~]# cat /etc/os-release 
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8"


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

Murrell, Brian
 

On Wed, 2022-11-02 at 20:25 -0700, lnsyyj@... wrote:
> ERROR: /usr/bin/daos_admin SIGILL: illegal instruction
> PC=0x7fbc78755c0e m=0 sigcode=2
> signal arrived during cgo execution
> instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe

Are you using the RPMs that the DAOS project team builds and
distributes or did you build your own from source?

If the latter, you might have built with compiler optimizations set to
aggressively target the CPU of the build system and then tried to use
the binaries produced there on a different system with a lesser-capable
CPU. You will have to reduce the optimization level of your build if
you wish to make more portable binaries.

Cheers,
b.
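
The instruction bytes in the SIGILL report quoted above can be decoded by hand to see which CPU extension is missing. The following is a minimal sketch, not a full disassembler: `decode_vex3` and the `F7_0F38` table are hypothetical names introduced here, and the table covers only opcode 0xF7 in the 0F38 map as I read the x86 encoding tables. It is worth confirming the result with gdb or objdump.

```python
# VEX-encoded opcode 0xF7 in the 0F38 map, keyed by the two-bit "pp" prefix field.
# Assumption for illustration: only this one opcode is covered.
F7_0F38 = {0: "BEXTR (BMI1)", 1: "SHLX (BMI2)", 2: "SARX (BMI2)", 3: "SHRX (BMI2)"}


def decode_vex3(code):
    """Decode the fields of a three-byte VEX prefix and name the F7 opcode group."""
    if code[0] != 0xC4:
        raise ValueError("not a three-byte VEX prefix")
    opcode_map = code[1] & 0x1F  # value 2 selects the 0F38 opcode map
    pp = code[2] & 0x03          # 0: no prefix, 1: 66, 2: F3, 3: F2
    opcode = code[3]
    name = F7_0F38.get(pp) if (opcode_map, opcode) == (2, 0xF7) else None
    return {"map": opcode_map, "pp": pp, "opcode": opcode, "mnemonic": name}


# First five bytes from the SIGILL report.
result = decode_vex3(bytes([0xC4, 0xE2, 0x69, 0xF7, 0xC0]))
print(result["mnemonic"])
```

Under these assumptions the faulting instruction decodes as SHLX, which belongs to the BMI2 extension. That is consistent with a binary built for a newer CPU than the one it is running on.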


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

Huang, Lei
 

E5-2680 v2 is Ivy Bridge.

I guess some packages (maybe spdk) were compiled to run on Haswell or a newer architecture. You can run “daos_admin” under gdb; gdb will stop when the CPU hits an unsupported instruction. With “bt” and “info proc mappings” in gdb, you can find out which module or library contains the unsupported instruction. You then need to compile that library from source, targeting your own CPU, and install it.

 

Or you can compile daos and its dependent libraries from scratch.

https://github.com/daos-stack/daos/blob/release/2.2/site_scons/components/__init__.py#L348-L355

You may need to replace this part with

spdk_arch = 'nehalem'

 

I'm not sure whether there are other places where the compiler flags for CPU optimizations need to be fixed.
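
One way to look for other places that set CPU-specific compiler flags is to scan the source tree for options such as `-march=`. The sketch below is only an illustration: `find_cpu_flags`, the suffix list, and the build-file names are assumptions made here, not part of the DAOS or SPDK build system, and a plain text search will miss flags that are assembled at build time.

```python
import os
import re

# Matches GCC/Clang-style CPU targeting options such as -march=haswell.
CPU_FLAG = re.compile(r"-m(?:arch|tune|cpu)=[\w.+-]+")

# Hypothetical selection of files worth scanning; adjust for the tree at hand.
SUFFIXES = (".py", ".mk", ".sh", ".cmake", ".txt")
BUILD_FILES = ("Makefile", "SConstruct", "SConscript")


def find_cpu_flags(root):
    """Yield (path, line_number, flag) for each CPU-targeting flag found under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(SUFFIXES) and name not in BUILD_FILES:
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    for flag in CPU_FLAG.findall(line):
                        yield path, lineno, flag
```

Calling `for hit in find_cpu_flags("."): print(*hit)` from the top of a checkout lists each match with its file and line number, which gives a starting list of places to review.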

 

-lei

 



Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

Thanks, the CPU information is as follows, how can I choose the CPU?

[root@Rocky-1 ~]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              40
On-line CPU(s) list: 0-39
Thread(s) per core:  2
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               62
Model name:          Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
BIOS Model name:           Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
Stepping:            4
CPU MHz:             3600.000
CPU max MHz:         3600.0000
CPU min MHz:         1200.0000
BogoMIPS:            5599.96
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            25600K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
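
The `Flags` line above can be checked against the features that Haswell introduced, since the reply in this thread suspects a library built for Haswell or newer. This is a rough sketch: `missing_haswell_flags` and the `HASWELL_FLAGS` list are assumptions chosen for illustration (flag names as they appear in /proc/cpuinfo), not something taken from the DAOS or SPDK build.

```python
# Feature flags that first appeared with Haswell, as named in /proc/cpuinfo.
# Assumption: this short list is illustrative and not exhaustive.
HASWELL_FLAGS = ("avx2", "bmi1", "bmi2", "fma", "movbe")


def missing_haswell_flags(flags_line):
    """Return the Haswell-era flags absent from an lscpu 'Flags:' line."""
    text = flags_line.split(":", 1)[-1]  # drop the leading "Flags:" label if present
    present = set(text.split())
    return [flag for flag in HASWELL_FLAGS if flag not in present]


# Abbreviated copy of the Flags line posted for the E5-2680 v2.
flags = "Flags: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt aes xsave avx f16c rdrand fsgsbase"
print(missing_haswell_flags(flags))
```

For the posted CPU every flag in the list is reported missing, so any library compiled with a Haswell target could hit an illegal instruction on this machine.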


Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

Huang, Lei
 

ERROR: /usr/bin/daos_admin SIGILL: illegal instruction

 

Your CPU does not support certain instructions used by the daos_admin process. Could you please attach the “lscpu” output from your computer? Thank you!

 

-lei

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of lnsyyj@...
Sent: Wednesday, November 2, 2022 10:26 PM
To: daos@daos.groups.io
Subject: [daos] DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

 

Hello everyone,

When I start the DAOS server, the following output appears. How should I solve it?

 

[root@Rocky-1 ~]# /usr/share/spdk/scripts/setup.sh
0000:44:00.0 (1d78 1512): nvme -> vfio-pci

[root@Rocky-1 ~]# cat /etc/daos/daos_server.yml
name: daos_server
access_points: ['Rocky-1']
port: 10001
transport_config:
  allow_insecure: false
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
provider: ofi+sockets
socket_dir: /var/run/
nr_hugepages: 4096
control_log_mask: DEBUG
control_log_file: /var/log/daos_server.log
helper_log_file: /var/log/daos_admin.log

engines:
-
  targets: 8
  nr_xs_helpers: 0
  fabric_iface: enp3s0f1
  fabric_iface_port: 31316
  log_mask: INFO
  log_file: /var/log/daos_engine_0.log
  env_vars:
      - CRT_TIMEOUT=30
  storage:
  -
    class: ram
    scm_mount: /mnt/daos0
    scm_size: 2 #gb to allocate for tmpfs to emulate SCM
  -
    class: nvme
    bdev_list: ["0000:44:00.0"]

 




Re: DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

Does this refer to NVMe instructions or CPU instructions? Is my device not supported?


DAOS server start failed(NVMe Scan Failed: privileged binary execution failed)

JiangYu
 

Hello everyone,
When I start the DAOS server, the following output appears. How can I resolve it?


[root@Rocky-1 ~]# /usr/share/spdk/scripts/setup.sh
0000:44:00.0 (1d78 1512): nvme -> vfio-pci

[root@Rocky-1 ~]# cat /etc/daos/daos_server.yml 
name: daos_server
access_points: ['Rocky-1']
port: 10001
transport_config:
  allow_insecure: false
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
provider: ofi+sockets
socket_dir: /var/run/
nr_hugepages: 4096
control_log_mask: DEBUG
control_log_file: /var/log/daos_server.log
helper_log_file: /var/log/daos_admin.log
 
engines:
-
  targets: 8
  nr_xs_helpers: 0
  fabric_iface: enp3s0f1
  fabric_iface_port: 31316
  log_mask: INFO
  log_file: /var/log/daos_engine_0.log
  env_vars:
      - CRT_TIMEOUT=30
  storage:
  -
    class: ram
    scm_mount: /mnt/daos0
    scm_size: 2 #gb to allocate for tmpfs to emulate SCM
  -
    class: nvme
    bdev_list: ["0000:44:00.0"]
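
The `setup.sh` output and the `bdev_list` entry above can be cross-checked mechanically. The following is a minimal sketch, assuming the `address (vendor device): old -> new` line shape shown in this post (other SPDK versions may print a different format). The helper name `unbound_bdevs` is hypothetical and not part of DAOS or SPDK.

```python
import re

# Matches lines shaped like: 0000:44:00.0 (1d78 1512): nvme -> vfio-pci
# (line shape taken from the post above; treat it as an assumption elsewhere)
REBIND_LINE = re.compile(
    r"^(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"\((?P<vendor>[0-9a-fA-F]{4}) (?P<device>[0-9a-fA-F]{4})\): "
    r"(?P<old>\S+) -> (?P<new>\S+)$"
)


def unbound_bdevs(setup_output, bdev_list, wanted_driver="vfio-pci"):
    """Return the bdev_list addresses that setup_output does not show
    as rebound to wanted_driver."""
    bound = set()
    for line in setup_output.splitlines():
        match = REBIND_LINE.match(line.strip())
        if match and match.group("new") == wanted_driver:
            bound.add(match.group("addr").lower())
    return [addr for addr in bdev_list if addr.lower() not in bound]
```

For the values in this post, `unbound_bdevs("0000:44:00.0 (1d78 1512): nvme -> vfio-pci", ["0000:44:00.0"])` returns an empty list, which suggests the configured device was handed to `vfio-pci` as expected.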
 

[root@Rocky-1 ~]# /usr/bin/daos_server start
DAOS Server config loaded from /etc/daos/daos_server.yml
/usr/bin/daos_server logging to file /var/log/daos_server.log
DEBUG 11:17:31.720878 start.go:90: Switching control log level to DEBUG
DEBUG 11:17:31.721131 defaults.go:92: failed to load library: unable to open a handle to the library
ERROR: unable to open a handle to the library
DEBUG 11:17:31.721209 fabric.go:875: waiting for fabric interfaces to become ready...
DEBUG 11:17:31.721299 fabric.go:892: fabric interface "enp3s0f1" is ready
DEBUG 11:17:31.721372 provider.go:87: getting topology with hwloc version 0x20100
DEBUG 11:17:31.769773 provider.go:145: adding device found at "/sys/class/net/eno1" (type network interface, NUMA node 0)
DEBUG 11:17:31.769933 provider.go:145: adding device found at "/sys/class/net/eno2" (type network interface, NUMA node 0)
DEBUG 11:17:31.770081 provider.go:145: adding device found at "/sys/class/net/eno3" (type network interface, NUMA node 0)
DEBUG 11:17:31.770212 provider.go:145: adding device found at "/sys/class/net/eno4" (type network interface, NUMA node 0)
DEBUG 11:17:31.770357 provider.go:145: adding device found at "/sys/class/net/enp3s0f0" (type network interface, NUMA node 0)
DEBUG 11:17:31.770485 provider.go:145: adding device found at "/sys/class/net/enp3s0f1" (type network interface, NUMA node 0)
DEBUG 11:17:31.770537 provider.go:125: failed to read net device: open /sys/class/net/lo/device/net: no such file or directory
DEBUG 11:17:31.770749 provider.go:264: adding virtual device at "/sys/devices/virtual/net/lo"
DEBUG 11:17:31.886150 provider.go:83: found fabric interfaces:
enp3s0f1 (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
shm (providers: shm)
DEBUG 11:17:31.886239 provider.go:292: no cxi subsystem in sysfs
DEBUG 11:17:31.886338 fabric.go:441: unable to open a handle to the library
DEBUG 11:17:31.886419 fabric.go:511: ignoring fabric interface "shm" (shm) not found in topology
DEBUG 11:17:31.886534 fabric.go:793: discovered 2 fabric interfaces:
enp3s0f1 (interface: enp3s0f1) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
lo (interface: lo) (providers: ofi+sockets, ofi+tcp, ofi+tcp;ofi_rxm, udp, udp;ofi_rxd)
DEBUG 11:17:31.886645 server.go:750: detected NUMA affinity 0 for engine 0
DEBUG 11:17:31.886675 server.go:757: enabling single-engine legacy core allocation algorithm
DEBUG 11:17:31.886703 server.go:420: validating config file read from "/etc/daos/daos_server.yml"
DEBUG 11:17:31.886742 server.go:443: vfio=true hotplug=false vmd=true requested in config
WARNING: Configuration includes only one access point. This provides no redundancy in the event of an access point failure.
DEBUG 11:17:31.886841 server.go:549: engine 0 fabric numa 0, storage numa 0
DEBUG 11:17:31.887914 server_utils.go:148: setting OFI_DOMAIN=enp3s0f1 for enp3s0f1
DEBUG 11:17:31.889170 server.go:377: active config saved to /var/run/.daos_server.active.yml (read-only)
DEBUG 11:17:31.889251 server.go:525: fault domain: /rocky-1
DEBUG 11:17:31.889862 server.go:236: setting core dump filter to 0x13
DEBUG 11:17:31.890615 database.go:280: set db replica addr: 192.168.1.215:10001
DEBUG 11:17:31.891076 server.go:164: time to init network: 242.45µs
DEBUG 11:17:31.891195 server_utils.go:260: allocating 4098 hugepages on each of these numa nodes: [0]
DEBUG 11:17:31.891267 ctl_storage.go:53: calling bdev provider prepare: {ForwardableRequest:{Forwarded:false} HugePageCount:4098 HugeNodes:0 CleanHugePagesOnly:false PCIAllowList: PCIBlockList: TargetUser:root Reset_:false DisableVFIO:false EnableVMD:true}
DEBUG 11:17:32.224164 server.go:164: time to prepare bdev storage: 332.967644ms
DEBUG 11:17:32.224261 ctl_storage.go:59: calling bdev provider scan: {ForwardableRequest:{Forwarded:false} DeviceList:0000:44:00.0 VMDEnabled:false BypassCache:true}
ERROR: /usr/bin/daos_admin SIGILL: illegal instruction
PC=0x7fbc78755c0e m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc4 0xe2 0x69 0xf7 0xc0 0x41 0x89 0x85 0xe0 0x19 0x0 0x0 0xe8 0xb1 0x98 0xfe
 
goroutine 1 [syscall]:
runtime.cgocall(0x92049d, 0xc0001d1c20)
/usr/src/runtime/cgocall.go:158 +0x5c fp=0xc0001d1bf8 sp=0xc0001d1bc0 pc=0x408b1c
github.com/daos-stack/daos/src/control/lib/spdk._Cfunc_nvme_discover()
_cgo_gotypes.go:321 +0x49 fp=0xc0001d1c20 sp=0xc0001d1bf8 pc=0x904fc9
github.com/daos-stack/daos/src/control/lib/spdk.(*NvmeImpl).Discover(0xc0000bcd00?, {0xb267f8, 0xc000184300})
/builddir/build/BUILD/daos-2.2.0/src/control/lib/spdk/nvme.go:127 +0x54 fp=0xc0001d1cd8
ERROR: /usr/bin/daos_admin  sp=0xc0001d1c20 pc=0x9059b4
github.com/daos-stack/daos/src/control/server/storage/bdev.(*spdkBackend).Scan(0xc0000bcce0, {{0x56?}, 0xc0001d5030?, 0x20?, 0x1?})
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/backend.go:341 +0x1b7 fp=0xc0001d1da8 sp=0xc0001d1cd8 pc=0x909f37
github.com/daos-stack/daos/src/control/server/storage/bdev.(*Provider).Scan(...)
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev/provider.go:54
main.(*bdevScanHandler).Handle(0xc000014788, {0xb267f8?, 0xc000184300}, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/handler.go:175 +0x27a fp=0xc0001d1e08 sp=0xc0001d1da8 pc=0x91d1fa
github.com/daos-stack/daos/src/control/pbin.(*App).handleRequest(0xc0000caae0, 0xc0002e4240)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:214 +0x62 fp=0xc0001d1e58 sp=0xc0001d1e08 pc=0x5949c2
github.com/daos-stack/daos/src/control/pbin.(*App).Run(0xc0000caae0)
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/app.go:155 +0x2ed fp=0xc0001d1f50 sp=0xc0001d1e58 pc=0x59448d
main.main()
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_admin/main.go:25 +0xaf fp=0xc0001d1f80 sp=0xc0001d1f50 pc=0x91de6f
runtime.main()
/usr/src/runtime/proc.go:250 +0x212 fp=0xc0001d1fe0 sp=0xc0001d1f80 pc=0x43dd32
runtime.goexit
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc0001d1fe8 sp=0xc0001d1fe0 pc=0x46b9c1
 
goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +0xd6 fp=0xc00009efb0 sp=0xc00009ef90 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.forcegchelper()
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:302 +0xad fp=0xc00009efe0 sp=0xc00009efb0 pc=0x43df8d
runtime.goexit()
/usr/src/runtime/asm_amd64.s
ERROR: /usr/bin/daos_admin :1594 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x46b9c1
created by 
ERROR: /usr/bin/daos_admin runtime.init.6
/usr/src/runtime/proc.go:290 +0x25
ERROR: /usr/bin/daos_admin 
goroutine 3 [GC sweep wait]:
runtime.gopark(0x0
ERROR: /usr/bin/daos_admin ?, 0x0?, 0x0?, 0x0?
ERROR: /usr/bin/daos_admin , 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009f790 sp=0xc00009f770 pc=0x43e0f6
ERROR: /usr/bin/daos_admin runtime.goparkunlock(...)
/usr/src/runtime/proc.go:369
runtime.bgsweep(0x0?)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/mgcsweep.go:278 +0x8e fp=0xc00009f7c8 sp=0xc00009f790 pc=0x429c2e
runtime.gcenable.func1()
/usr/src/runtime/mgc.go:178 +
ERROR: /usr/bin/daos_admin 0x26 fp=0xc00009f7e0 sp=0xc00009f7c8 pc=0x41e8c6
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00009f7e8 sp=0xc00009f7e0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:
ERROR: /usr/bin/daos_admin 178 +0x6b
 
goroutine 4 [GC scavenge wait]:
runtime.gopark(0xc0000c6000?, 0xb1e1e8?
ERROR: /usr/bin/daos_admin , 0x1?, 0x0?, 0x0?)
/usr/src/runtime/proc.go:363 +
ERROR: /usr/bin/daos_admin 0xd6 fp=0xc00009ff70 sp=0xc00009ff50 pc=0x43e0f6
runtime.goparkunlock(...)
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.(*scavengerState).park(0x10a3a20)
/usr/src/runtime/mgcscavenge.go:389 +0x53 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffa0 sp=0xc00009ff70 pc=0x427cd3
runtime.bgscavenge(0x0?)
/usr/src/runtime/mgcscavenge.go:
ERROR: /usr/bin/daos_admin 617 +0x45 fp=0xc00009ffc8 sp=0xc00009ffa0 pc=0x4282a5
runtime.gcenable.func2
ERROR: /usr/bin/daos_admin ()
/usr/src/runtime/mgc.go:179 +0x26 fp=0xc00009ffe0 sp=0xc00009ffc8 pc=0x41e866
ERROR: /usr/bin/daos_admin 
runtime.goexit()
/usr/src/runtime/asm_amd64.s:1594 +0x1 fp=
ERROR: /usr/bin/daos_admin 0xc00009ffe8 sp=0xc00009ffe0 pc=0x46b9c1
created by runtime.gcenable
/usr/src/runtime/mgc.go:179
ERROR: /usr/bin/daos_admin  +0xaa
 
goroutine 5 [finalizer wait]:
runtime.gopark(0x10a4520?, 
ERROR: /usr/bin/daos_admin 0xc000007860?, 0x0?, 0x0?, 0xc00009e770?)
/usr/src/runtime/proc.go:363
ERROR: /usr/bin/daos_admin  +0xd6 fp=0xc00009e628 sp=0xc00009e608 pc=0x43e0f6
runtime.goparkunlock(...)
 
ERROR: /usr/bin/daos_admin /usr/src/runtime/proc.go:369
runtime.runfinq()
/usr/src/runtime/mfinal.go:
ERROR: /usr/bin/daos_admin 180 +0x10f fp=0xc00009e7e0 sp=0xc00009e628 pc=0x41d9cf
runtime.goexit()
/usr/src/runtime/asm_amd64.s:
ERROR: /usr/bin/daos_admin 1594 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x46b9c1
created by runtime.createfing
/usr/src/runtime/mfinal.go:157 +0x45
ERROR: /usr/bin/daos_admin 
rax    0x1
rbx    0x2492240
rcx    0x7fbc78da4e60
rdx    0x0
rdi    0x2492240
rsi    
ERROR: /usr/bin/daos_admin 0x7fbc78da1af0
rbp    0x2000003e7240
rsp    0x7ffc5180b120
r8     0x7fbc78da2460
r9     0x0
r10    0x70000000004
r11    0x0
ERROR: /usr/bin/daos_admin 
r12    0x202001000000
r13    0x2000003e7240
r14    0x7ffc5180b150
r15    0x0
rip    0x7fbc78755c0e
rflags 
ERROR: /usr/bin/daos_admin 0x13246
cs     0x33
fs     0x0
gs     0x0
DEBUG 11:17:32.627353 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627423 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627466 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627498 exec.go:188: discarding garbage response ""
DEBUG 11:17:32.627541 exec.go:188: discarding garbage response ""
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
DEBUG 11:17:32.627658 server.go:164: time to scan bdev storage: 403.426657ms
DEBUG 11:17:32.627726 pubsub.go:259: stopping event loop
DEBUG 11:17:32.627853 main.go:69: Unable to decode response after 5 attempts
github.com/daos-stack/daos/src/control/pbin.ExecReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/exec.go:197
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:100
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
privileged binary execution failed
github.com/daos-stack/daos/src/control/pbin.(*Forwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/pbin/forwarding.go:105
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).SendReq
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:579
github.com/daos-stack/daos/src/control/server/storage.(*BdevAdminForwarder).Scan
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/bdev.go:586
github.com/daos-stack/daos/src/control/server/storage.scanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:483
github.com/daos-stack/daos/src/control/server/storage.(*Provider).ScanBdevs
/builddir/build/BUILD/daos-2.2.0/src/control/server/storage/provider.go:493
github.com/daos-stack/daos/src/control/server.(*StorageControlService).NvmeScan
/builddir/build/BUILD/daos-2.2.0/src/control/server/ctl_storage.go:60
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:297
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
NVMe Scan Failed
github.com/daos-stack/daos/src/control/server.scanBdevStorage
/builddir/build/BUILD/daos-2.2.0/src/control/server/server_utils.go:302
github.com/daos-stack/daos/src/control/server.(*server).addEngines
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:306
github.com/daos-stack/daos/src/control/server.Start
/builddir/build/BUILD/daos-2.2.0/src/control/server/server.go:549
main.(*startCmd).Execute
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/start.go:147
main.parseOpts.func1
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:126
github.com/jessevdk/go-flags.(*Parser).ParseArgs
/builddir/build/BUILD/daos-2.2.0/src/control/vendor/github.com/jessevdk/go-flags/parser.go:314
main.parseOpts
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:134
main.main
/builddir/build/BUILD/daos-2.2.0/src/control/cmd/daos_server/main.go:151
runtime.main
/usr/src/runtime/proc.go:250
runtime.goexit
/usr/src/runtime/asm_amd64.s:1594
ERROR: NVMe Scan Failed: privileged binary execution failed: Unable to decode response after 5 attempts
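
On the question of whether the fault concerns CPU or NVMe instructions: SIGILL is raised when the CPU is asked to execute an opcode it does not implement, and the `instruction bytes` line in the trace above is the x86 machine code at the faulting PC. The following is a minimal sketch, assuming the standard 3-byte VEX prefix layout from the Intel SDM, that decodes those bytes. The function names are illustrative and not part of DAOS or SPDK. This is one reading of the trace, not a diagnosis confirmed in the thread.

```python
# Sketch: identify a VEX-encoded BMI instruction from the
# "instruction bytes" line of a Go SIGILL trace. Names are illustrative.

# VEX.pp field -> (mnemonic, /proc/cpuinfo flag) for opcode F7 in the 0F38 map.
VEX_0F38_F7 = {
    0b00: ("bextr", "bmi1"),
    0b01: ("shlx", "bmi2"),
    0b10: ("sarx", "bmi2"),
    0b11: ("shrx", "bmi2"),
}


def parse_instruction_bytes(line):
    """Turn 'instruction bytes: 0xc4 0xe2 ...' into a list of ints."""
    _, _, tail = line.partition(":")
    return [int(token, 16) for token in tail.split()]


def decode_vex_f7(code):
    """Return (mnemonic, cpu_flag) if code starts with a 3-byte VEX prefix
    selecting opcode F7 in the 0F38 map, else None."""
    if len(code) < 4 or code[0] != 0xC4:
        return None
    opcode_map = code[1] & 0x1F  # low five bits: 0b00010 selects map 0F38
    pp = code[2] & 0x03          # implied legacy prefix: none, 66, F3, F2
    if opcode_map != 0x02 or code[3] != 0xF7:
        return None
    return VEX_0F38_F7[pp]


def cpu_has_flag(flag, cpuinfo_text):
    """Check a feature flag against the text of /proc/cpuinfo (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return flag in value.split()
    return False
```

For the bytes in this log (`0xc4 0xe2 0x69 0xf7 ...`), `decode_vex_f7` returns `("shlx", "bmi2")`, a general-purpose CPU instruction from the BMI2 extension rather than anything NVMe-specific. It may therefore be worth passing the contents of `/proc/cpuinfo` on the affected host to `cpu_has_flag("bmi2", ...)` to see whether the processor advertises that extension.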
 


[DUG'22] Save the date & call for presentations!

Kudryavtsev, Andrey O <andrey.o.kudryavtsev@...>
 

Greetings DAOS Community!

 

SC22 is around the corner, which means our special event is coming up again. The Intel DAOS team invites you to join us for the 6th annual DAOS User Group (DUG22). This will be the first in-person user group since the pandemic.

 

The agenda is not yet finalized, and we’re inviting community members to submit presentation proposals. Please send brief submissions to daos-info@daos.groups.io and keep me copied.

If you have any feedback to share (what you most want to see, types of presentations, areas to cover, or speakers you would like to hear), don’t hesitate to contact me directly. This is an event we organize for you!

 

The event will take place on November 14th from 9am until 1pm. We did our best to avoid overlaps with other activities, which is why Monday was selected. We hope it fits your plans and doesn’t overlap with other workshops and tutorials that day.

 

Event Location: Venetian Room, Fairmont Hotel (1717 N Akard St, Dallas, TX 75201), which is within one mile of the Kay Bailey Hutchison Convention Center.

 

Additional details will be shared once the agenda is finalized. We hope to see you all in person.

 

Best Regards,

Andrey, Kelsey, Johann and the rest of the DAOS team. 

 

-- 

Andrey Kudryavtsev, 

DAOS Product Manager

Intel Corp. 

 


Community Roadmap Update

Lombardi, Johann
 

Hi there,

 

Please note that the DAOS community roadmap has been updated on the wiki. These changes were required to accelerate support for the “Non-PMem phase 1 and phase 2” I/O path (labelled md_on_ssd in Jira, see here for more info) and to better align with our upcoming deployments/projects. Please let us know if you have any questions or comments.

 

Best regards,

Johann

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 5 208 026.16 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


DAOS Community Update / Oct'22

Lombardi, Johann
 

Hi there,

Please find below the DAOS community newsletter for October 2022. A copy of this newsletter is also available on the wiki.

Past Events

Upcoming Events

Release

  • Current stable release is 2.2.0 released on Oct 21. See https://docs.daos.io/v2.2/ and https://packages.daos.io/v2.2/ for more information. Please see the release notes for more details.
  • With the release of 2.2.0, 2.0.x releases are declared end-of-life.
  • Branches:
    • release/2.2 is the release branch for the stable 2.2 release. Latest bug fix release is 2.2.0 (v2.2.0 tag).
    • Master is the development branch for the future 2.4 release. Latest test build is 2.3.101 (v2.3.101-tb tag) including the EC rotation feature.
  • Major recent changes on release/2.2 (future 2.2 release):
    • Fix VMD domain parsing
    • Fix PS replica leaks
    • Fix 2.0/2.2 interoperability issue with pool RF
    • Fix assertion failure in dc_cont_free()
    • Fix race condition in cart
    • Address memory corruption during key_query
    • Several fixes for EC migration
    • Check and reset NONEXIST in iter_next and probe
    • Bump protobuf-java from 3.16.1 to 3.16.3
  • Major recent changes on master (future 2.4 release):
    • All patches listed in the 2.2 section above.
    • Fix a bug in key enumeration associated with ads[0].kd_key_len
    • Add support for rf_lvl to cont create api on pydaos
    • Enable EC parity rotation by default
    • Add missing void in dfs_init/fini declaration
    • Remove RPC post increment restriction preventing extra RPC handles from being posted upon exhaustion
    • Re-enable custom RPC timeout in RDB
    • Remove ability to build w/o stdatomic.h
    • Add bulk and vos latency to metrics
    • Skip reclaim job during merge
    • Fix some DTX visibility issues
    • Allow daos_server network scan to run w/o config
    • Update DAOS to use UCX 1.13 and disable UCX multi-rail support
    • Don't hold lock for d_hhash_link_get/putref
    • Add dmg system exclude
    • Fix auto object class selection for RP hints for arrays
    • Don't set pool destroy state if service is not up
    • Improve PS reconfigurations
    • Add IOPS info to daos pool autotest
    • Fix swim paranoia
    • Reject invalid number of pool create ranks
    • Add config option to agent to ignore interfaces
    • Several fixes to EC parity rotation
    • Add support for pull request template
    • Fix a number of python flake issues
    • Add ability to run server under valgrind
    • Add NUMA affinity to tmpfs mount options
    • Add pool svc list to property query
    • Bypass checks in pool evict rdb tx update
    • Several IV fixes
    • Remove CentOS7 leftovers
    • Add DFS readdirplus API
    • Several checksum scrubbing upgrade fixes
    • Rename privileged helper from daos_admin to daos_server_helper
    • Rename rf and rf_level properties to rd_fac and rd_lvl
    • Add rebuild version to pool query
    • Bump garbage collection ULT stack size
  • What is coming:
    • 2.2.1 bug fix release
    • 2.4.0 feature freeze

R&D

  • Major features under development:
    • VOS on SPDK blob
      • Detailed design documented here: Metadata on SSDs, including the WAL layout (Meta blob and WAL blob layout)
      • All development and testing tasks are tracked under DAOS-11040 for phase 1.
      • Changes to the yaml file implemented. WAL infrastructure and metadata blob creation landed.
      • PMDK-based allocator extracted and integrated into DAOS. Early performance evaluation in progress.
      • Branch: feature/vos-on-blob
      • Target release: 2.4 (phase 1 preview)
    • Multi-user dfuse
    • More aggressive caching in dfuse for AI APPs
      • FUSE version updated on EL8 for readdir caching support; not needed on Leap, which already has a recent enough FUSE version.
      • FUSE kernel readdir is enabled; dfuse readdir is still in progress.
      • PR: https://github.com/daos-stack/daos/pull/6776
      • Target release: 2.4
    • Catastrophic recovery
      • Aka distributed fsck or checker
      • Tests for ddb (low level debugger utility similar to debugfs for ext4) landed
      • Testing for the dmg checker landed.
      • Testing for pass 3 and 4 under development.
      • Pass 4 for container recovery completed.
      • Branch: feature/cat_recovery
      • Target release: 2.6
    • Multi-homed network support
      • Aka multi-provider support
      • This feature aims at supporting multiple network providers in the engine
      • Branch is feature complete now and testing is underway
      • Branch: feature/multiprovider
      • Target release: 2.6
    • Client-side metrics
    • Performance domain
      • Extend placement algorithm to be aware of fabric topology
      • Fix to avoid putting shards on the same domain landed
      • Branch: feature/perf_dom
      • Target release: 2.8
  • Pathfinding:
    • DAOS Pipeline API for active storage
    • Leveraging the Intel Data Streaming Accelerator (DSA) to accelerate DAOS
      • Prototype leveraging DSA for VOS aggregation delivered
      • Initial results shared at IXPUG conference.
    • OPX provider support in collaboration with Cornelis Networks
      • OPX provider merged upstream in libfabric
      • Provider supported in latest mercury version
      • Changes to DAOS to enable OPX as part of the build in progress
    • GPU data path optimizations
  • I/O Middleware / Framework Support:

News

  • Congratulations to the Seagate team for the integration of the DAOS backend to the Rados Gateway (RGW)!
  • Updated DAOS roadmap including changes for the md_on_ssd phase 1 and phase 2 project to be available soon.

 



Announcement: DAOS 2.2 is generally available

Poddubnyy, Ivan
 

The DAOS team would like to announce the release of DAOS Version 2.2.

 

It is a major release containing the following new features and improvements:

 

  • Rocky Linux 8 and Alma Linux 8 support has been added
  • CentOS Linux 8 support is removed
  • Support for the libfabric/tcp provider is added. It replaces libfabric/sockets
  • UCX support has been added (Technology Preview)
  • Interoperability of DAOS 2.2 with DAOS 2.0
  • Intel VMD devices are now supported in the control plane
  • POSIX containers (DFS) now support file modification time (mtime)

 

The release also contains a number of bug fixes and stability improvements.

 

With the release of DAOS 2.2, the previous version – DAOS 2.0.3 – is now declared End-Of-Life.

 

The complete list of changes can be found here: https://docs.daos.io/v2.2/release/release_notes/

 

There are several resources available for the release:

 

RPM Repositories: https://packages.daos.io/v2.2/

Admin Guide: https://docs.daos.io/v2.2/admin/hardware/

User Guide: https://docs.daos.io/v2.2/user/workflow/

Architecture Overview: https://docs.daos.io/v2.2/overview/architecture/

Source Code: https://github.com/daos-stack/daos/releases/

 

As always, feel free to report any issues you find with the release on this mailing list, in our JIRA bug tracking system at https://daosio.atlassian.net/jira, or on our Slack channel at https://daos-stack.slack.com.

 

 

Thank you,

 

Ivan Poddubnyy

DAOS Customer Enablement and Support Manager

Super Compute Storage Architecture and Development Division

Intel

 


Re: How to install DAOS on ARM64 platform

Groot
 

Yes, /root/huzj/daos/install/lib64/daos_srv/librdb.so exists.
The environment variables are set as described in https://docs.daos.io/v2.0/QSG/build_from_scratch/#environment-setup:
export daospath=/root/huzj/daos
export CPATH=${daospath}/install/include/:$CPATH
export PATH=${daospath}/install/bin/:${daospath}/install/sbin:$PATH
The server config file is:
#For a single-server system
 
name: daos_server
access_points: ['master']
port: 10001
 
 
provider: ofi+sockets
control_log_file: /tmp/daos_server.log
transport_config:
  allow_insecure: false
  client_cert_dir: /etc/daos/certs/clients
  ca_cert: /etc/daos/certs/daosCA.crt
  cert: /etc/daos/certs/server.crt
  key: /etc/daos/certs/server.key
 
telemetry_port: 9191
 
engines:
  -
    rank: 1
    pinned_numa_node: 0
    targets: 2
    nr_xs_helpers: 4
    fabric_iface: enp3s0
    fabric_iface_port: 31416
    log_file: /tmp/daos_engine.0.log
 
    env_vars:
      - FI_SOCKETS_MAX_CONN_RETRY=1
      - FI_SOCKETS_CONN_TIMEOUT=2000
    # Storage definitions (one per tier)
    storage:
      -
        # When scm_class is set to ram, tmpfs will be used to emulate SCM.
        # The size of ram is specified by scm_size in GB units.
        class: ram
        scm_size: 2
        scm_mount: /mnt/daos
 
Thanks.
Groot
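
Since librdb.so is present, the "undefined symbol: ds_obj_enum_pack" error quoted in this thread suggests a symbol-resolution problem rather than a missing file. One way to investigate is to look for an installed library that actually defines the symbol. The following is only a rough sketch: it assumes the binutils `nm` tool is installed, and the helper names (`defines_symbol`, `find_providers`) are hypothetical, not part of DAOS.

```python
import subprocess
from pathlib import Path
from typing import List


def defines_symbol(nm_output: str, symbol: str) -> bool:
    """Return True if `nm -D --defined-only` output lists `symbol`."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Lines look like "<address> <type> <name>"; the name is the last field.
        # Versioned symbols appear as "name@@VERSION", so strip the suffix.
        if fields and fields[-1].split("@")[0] == symbol:
            return True
    return False


def find_providers(lib_dir: str, symbol: str) -> List[str]:
    """List shared objects directly under `lib_dir` that define `symbol`."""
    providers = []
    for lib in sorted(Path(lib_dir).glob("*.so*")):
        # Assumption: binutils `nm` is on PATH; -D reads the dynamic symbol table.
        result = subprocess.run(
            ["nm", "-D", "--defined-only", str(lib)],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and defines_symbol(result.stdout, symbol):
            providers.append(str(lib))
    return providers
```

For example, `find_providers("/root/huzj/daos/install/lib64/daos_srv", "ds_obj_enum_pack")` would list the modules in that directory exporting the symbol. An empty list would hint that the module expected to provide it was not built or installed correctly, though this is a guess and not a confirmed diagnosis.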


Re: How to install DAOS on ARM64 platform

Faccini, Bruno
 

Can you check whether /root/huzj/daos/install/lib64/daos_srv/librdb.so exists?

If not, is a log for this build available?

Also, what environment variables are set in the session you use to start the server/engine?

And last, can you attach your server/engine config file?

Thanks in advance for your help,

Bruno.

 

From: <daos@daos.groups.io> on behalf of Groot <kukougu@...>
Reply to: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Friday 30 September 2022 at 10:48
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] How to install DAOS on ARM64 platform

 

I built DAOS from source on an ARM64 platform and used daos_server start to start the DAOS server. It first failed with the error below; after I created the /var/run/daos_server directory, daos_server started successfully.
$ daos_server start 
ERROR: dRPC server setup: missing socket directory /var/run/daos_server: stat /var/run/daos_server: no such file or directory

However, daos_engine.0 reports the following errors in /tmp/daos_engine.0.log after formatting the storage:

09/30-16:15:24.84 slave1 DAOS[1213401/-1/0] server ERR  src/engine/module.c:90 dss_module_load() cannot load librdb.so: /root/huzj/daos/install/bin/../lib64/daos_srv/librdb.so: undefined symbol: ds_obj_enum_pack

09/30-16:15:24.84 slave1 DAOS[1213401/-1/0] server ERR  src/engine/init.c:231 modules_load() Failed to load module rdb: -1003

Thanks a lot.
Groot



Re: How to install DAOS on ARM64 platform

Groot
 

I built DAOS from source on an ARM64 platform and used daos_server start to start the DAOS server. It first failed with the error below; after I created the /var/run/daos_server directory, daos_server started successfully.
$ daos_server start 
ERROR: dRPC server setup: missing socket directory /var/run/daos_server: stat /var/run/daos_server: no such file or directory

However, daos_engine.0 reports the following errors in /tmp/daos_engine.0.log after formatting the storage:
09/30-16:15:24.84 slave1 DAOS[1213401/-1/0] server ERR  src/engine/module.c:90 dss_module_load() cannot load librdb.so: /root/huzj/daos/install/bin/../lib64/daos_srv/librdb.so: undefined symbol: ds_obj_enum_pack
09/30-16:15:24.84 slave1 DAOS[1213401/-1/0] server ERR  src/engine/init.c:231 modules_load() Failed to load module rdb: -1003
Thanks a lot.
Groot
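
The first error in this message (missing socket directory /var/run/daos_server) was worked around by creating the directory by hand. A minimal sketch of the same step as a pre-start check is shown below. The helper name `ensure_socket_dir` and the 0o755 mode are assumptions for illustration; the path comes from the error message above. On many Linux systems /var/run is a tmpfs, so the directory may need to be recreated after a reboot.

```python
import os


def ensure_socket_dir(path: str, mode: int = 0o755) -> bool:
    """Create `path` if it is missing; return True only if it was created."""
    if os.path.isdir(path):
        return False
    # exist_ok guards against another process creating it in the meantime.
    os.makedirs(path, mode=mode, exist_ok=True)
    return True
```

Calling `ensure_socket_dir("/var/run/daos_server")` with sufficient privileges before running daos_server start should have the same effect as the manual mkdir described above.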
