Re: Unable to build libmfu properly

netsurfed
 

Yes, the same errors.

daos_agent@sw2:~/git/mpifileutils/build$ export LD_LIBRARY_PATH=${MY_DAOS_INSTALL_PATH}/lib64:$LD_LIBRARY_PATH
daos_agent@sw2:~/git/mpifileutils/build$ echo $LD_LIBRARY_PATH
/home/daos_agent/git/daos/build/lib64:
daos_agent@sw2:~/git/mpifileutils/build$ make
[ 32%] Built target mfu_o
[ 34%] Built target mfu-static
[ 36%] Built target mfu
[ 37%] Linking C executable dbcast
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_insert'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_lookup'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_unlinked'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_stat'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_readdir'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_table_create'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_obj_anchor_split'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_release'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_find'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_obj_anchor_set'
collect2: error: ld returned 1 exit status
make[2]: *** [src/dbcast/CMakeFiles/dbcast.dir/build.make:90: src/dbcast/dbcast] Error 1
make[1]: *** [CMakeFiles/Makefile2:539: src/dbcast/CMakeFiles/dbcast.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
 


Re: Unable to build libmfu properly

Bohning, Dalton
 

Could you try adding ${MY_DAOS_INSTALL_PATH}/lib64 to your LD_LIBRARY_PATH and re-running the “make” command?

export LD_LIBRARY_PATH=${MY_DAOS_INSTALL_PATH}/lib64:$LD_LIBRARY_PATH
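A side note on the export above, offered as a sketch rather than as a fix for the link error: when LD_LIBRARY_PATH starts out empty, the command leaves a trailing colon (visible in the echo output shown in this thread), and the dynamic loader treats an empty entry as the current directory. LD_LIBRARY_PATH is also mainly a run-time setting; the -l flags at link time are resolved through the -L directories. The helper name prepend_path and the fallback path are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# prepend_path DIR CURRENT: print DIR prepended to the colon-separated list
# CURRENT, without leaving an empty entry when CURRENT is empty.
prepend_path() {
    local dir="$1" current="$2"
    printf '%s\n' "${dir}${current:+:${current}}"
}

# Placeholder fallback, only used when MY_DAOS_INSTALL_PATH is not set.
daos_prefix="${MY_DAOS_INSTALL_PATH:-/path/to/daos/install}"

LD_LIBRARY_PATH="$(prepend_path "${daos_prefix}/lib64" "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH
echo "${LD_LIBRARY_PATH}"
```

The `${current:+:${current}}` expansion emits the colon and the old value only when the old value is non-empty.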

 

~Dalton Bohning

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of netsurfed
Sent: Wednesday, December 29, 2021 6:04 PM
To: daos@daos.groups.io
Subject: Re: [daos] Unable to build libmfu properly

 



Re: Unable to build libmfu properly

netsurfed
 

Hi, 
$ cat ${MY_MFU_BUILD_PATH}/src/common/CMakeFiles/mfu.dir/link.txt
/usr/bin/cc -fPIC -I/home/daos_agent/git/daos/build/include -L/home/daos_agent/git/daos/build/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread -shared -Wl,-soname,libmfu.so -o libmfu.so CMakeFiles/mfu_o.dir/mfu_bz2.c.o CMakeFiles/mfu_o.dir/mfu_bz2_static.c.o CMakeFiles/mfu_o.dir/mfu_compress_bz2_libcircle.c.o CMakeFiles/mfu_o.dir/mfu_decompress_bz2_libcircle.c.o CMakeFiles/mfu_o.dir/mfu_flist.c.o CMakeFiles/mfu_o.dir/mfu_flist_chunk.c.o CMakeFiles/mfu_o.dir/mfu_flist_copy.c.o CMakeFiles/mfu_o.dir/mfu_flist_io.c.o CMakeFiles/mfu_o.dir/mfu_flist_chmod.c.o CMakeFiles/mfu_o.dir/mfu_flist_create.c.o CMakeFiles/mfu_o.dir/mfu_flist_remove.c.o CMakeFiles/mfu_o.dir/mfu_flist_sort.c.o CMakeFiles/mfu_o.dir/mfu_flist_usrgrp.c.o CMakeFiles/mfu_o.dir/mfu_flist_walk.c.o CMakeFiles/mfu_o.dir/mfu_io.c.o CMakeFiles/mfu_o.dir/mfu_param_path.c.o CMakeFiles/mfu_o.dir/mfu_path.c.o CMakeFiles/mfu_o.dir/mfu_pred.c.o CMakeFiles/mfu_o.dir/mfu_util.c.o CMakeFiles/mfu_o.dir/strmap.c.o  -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib:/home/daos_agent/install/lib: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so /home/daos_agent/install/lib/libdtcmp.so -larchive /home/daos_agent/install/lib/libcircle.so -lbz2
 
And the build command for libmfu is:
git clone https://github.com/mchaarawi/mpifileutils -b pfind_integration "${MY_MFU_SOURCE_PATH}" &&
mkdir -p "${MY_MFU_BUILD_PATH}" &&
cd "${MY_MFU_BUILD_PATH}" &&
CFLAGS="-I${MY_DAOS_INSTALL_PATH}/include" \
LDFLAGS="-L${MY_DAOS_INSTALL_PATH}/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread" \
cmake "${MY_MFU_SOURCE_PATH}" \
-DENABLE_XATTRS=OFF \
-DWITH_DTCMP_PREFIX=${MY_MFU_INSTALL_PATH} \
-DWITH_LibCircle_PREFIX=${MY_MFU_INSTALL_PATH} \
-DCMAKE_INSTALL_PREFIX=${MY_MFU_INSTALL_PATH} &&
make -j8 install
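
One thing that may be worth checking in the link.txt line above, although nothing in this thread confirms it as the cause: the DAOS -l flags taken from LDFLAGS appear before the object files. When the linker runs with --as-needed (some distributions, Ubuntu for example, configure the compiler to pass it by default), a shared library listed before the objects that reference it is not recorded as a dependency. The sketch below lists the -l flags that precede the first object file; the function name early_libs and the shortened sample line are hypothetical.

```shell
#!/usr/bin/env bash
# early_libs LINK_LINE: print every -lNAME flag that appears before the first
# object file (*.o) on the given link command line, one per line.
early_libs() {
    local -a words
    local word
    read -r -a words <<< "$1"
    for word in "${words[@]}"; do
        case "${word}" in
            *.o)  return 0 ;;                    # first object file: stop here
            -l?*) printf '%s\n' "${word}" ;;     # library named before any object
        esac
    done
}

# Shortened, made-up link line in the same shape as the one in this message.
sample='/usr/bin/cc -fPIC -L/opt/daos/lib64/ -ldfs -lgurt -shared -o libmfu.so mfu_io.c.o -lbz2'
early_libs "${sample}"
```

Against the real build tree this could be run as `early_libs "$(cat "${MY_MFU_BUILD_PATH}/src/common/CMakeFiles/mfu.dir/link.txt")"`. If the DAOS libraries show up in the output, one option to try is adding `-Wl,--no-as-needed` in front of them in LDFLAGS.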


Re: Unable to build libmfu properly

Bohning, Dalton
 

Interesting. As you said, the libraries are in fact there. And I think cmake is finding the libraries, or else it would print something like “/usr/bin/ld: cannot find -ldfs” before getting to the “make” command. Could you please share the link command used to build libmfu?

$ cat ${MY_MFU_BUILD_PATH}/src/common/CMakeFiles/mfu.dir/link.txt

 

~Dalton Bohning

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of netsurfed
Sent: Tuesday, December 28, 2021 9:40 PM
To: daos@daos.groups.io
Subject: Re: [daos] Unable to build libmfu properly

 



Re: Unable to build libmfu properly

netsurfed
 

Hi, yes, "build" is where daos is installed.
$ ls -l /home/daos_agent/git/daos/build/lib64/
total 22080
drwxrwxr-x 3 daos_agent daos_agent    4096 Dec 13 08:06 daos
drwxrwxr-x 2 daos_agent daos_agent    4096 Dec 13 10:18 daos_srv
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libcart.so -> libcart.so.4.9.0
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libcart.so.4 -> libcart.so.4.9.0
-rwxrwxr-x 1 daos_agent daos_agent 4948104 Dec 13 10:10 libcart.so.4.9.0
-rwxrwxr-x 1 daos_agent daos_agent  710680 Dec 13 10:10 libdaos_cmd_hdlrs.so
-rwxrwxr-x 1 daos_agent daos_agent 2955992 Dec 13 10:10 libdaos_common_pmem.so
-rwxrwxr-x 1 daos_agent daos_agent 2905240 Dec 13 10:10 libdaos_common.so
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libdaos.so -> libdaos.so.2.0.0
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libdaos.so.2 -> libdaos.so.2.0.0
-rwxrwxr-x 1 daos_agent daos_agent 6814552 Dec 13 10:04 libdaos.so.2.0.0
-rwxrwxr-x 1 daos_agent daos_agent  274776 Dec 13 10:10 libdaos_tests.so
-rwxrwxr-x 1 daos_agent daos_agent   27160 Dec 13 10:18 libdfs_internal.so
-rwxrwxr-x 1 daos_agent daos_agent  816728 Dec 13 10:10 libdfs.so
-rw-rw-r-- 1 daos_agent daos_agent  206304 Dec 13 08:07 libdfuse.a
-rwxrwxr-x 1 daos_agent daos_agent  137440 Dec 13 10:18 libdfuse.so
-rwxrwxr-x 1 daos_agent daos_agent  116120 Dec 13 10:10 libdts.so
-rwxrwxr-x 1 daos_agent daos_agent  158712 Dec 13 10:10 libduns.so
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libgurt.so -> libgurt.so.4.9.0
lrwxrwxrwx 1 daos_agent daos_agent      16 Dec 13 10:18 libgurt.so.4 -> libgurt.so.4.9.0
-rwxrwxr-x 1 daos_agent daos_agent 1132488 Dec 13 10:03 libgurt.so.4.9.0
-rw-rw-r-- 1 daos_agent daos_agent  777432 Dec 13 08:06 libioil.a
-rwxrwxr-x 1 daos_agent daos_agent  411776 Dec 13 10:18 libioil.so
-rw-rw-r-- 1 daos_agent daos_agent  169190 Dec 13 08:03 libnvme_control.a
drwxrwxr-x 3 daos_agent daos_agent    4096 Dec 13 08:07 python3.8

And "${MY_DAOS_INSTALL_PATH}/include" is the installed include directory.
$ ls -l ${MY_DAOS_INSTALL_PATH}/include
total 332
drwxrwxr-x 2 daos_agent daos_agent  4096 Dec 17 08:11 cart
drwxrwxr-x 2 daos_agent daos_agent  4096 Dec 13 08:06 daos
-rw-r--r-- 1 daos_agent daos_agent  6274 Dec 13 06:39 daos_api.h
-rw-r--r-- 1 daos_agent daos_agent 14770 Dec 13 06:39 daos_array.h
-rw-r--r-- 1 daos_agent daos_agent 27543 Dec 13 06:39 daos_cont.h
-rw-r--r-- 1 daos_agent daos_agent 13947 Dec 13 06:39 daos_errno.h
-rw-r--r-- 1 daos_agent daos_agent  8573 Dec 13 06:39 daos_event.h
-rw-r--r-- 1 daos_agent daos_agent 32290 Dec 13 06:39 daos_fs.h
-rw-r--r-- 1 daos_agent daos_agent 16091 Dec 13 06:39 daos_fs_sys.h
-rw-r--r-- 1 daos_agent daos_agent   974 Dec 13 06:39 daos.h
-rw-r--r-- 1 daos_agent daos_agent  6895 Dec 13 06:39 daos_kv.h
-rw-r--r-- 1 daos_agent daos_agent  2990 Dec 13 06:39 daos_mgmt.h
-rw-r--r-- 1 daos_agent daos_agent 17727 Dec 13 06:39 daos_obj_class.h
-rw-r--r-- 1 daos_agent daos_agent 39305 Dec 13 06:39 daos_obj.h
-rw-r--r-- 1 daos_agent daos_agent 14316 Dec 13 06:39 daos_pool.h
-rw-r--r-- 1 daos_agent daos_agent 16599 Dec 13 06:39 daos_prop.h
-rw-r--r-- 1 daos_agent daos_agent 17364 Dec 13 06:39 daos_security.h
drwxrwxr-x 2 daos_agent daos_agent  4096 Dec 13 08:06 daos_srv
-rw-r--r-- 1 daos_agent daos_agent 27072 Dec 13 06:39 daos_task.h
-rw-r--r-- 1 daos_agent daos_agent  4872 Dec 13 06:39 daos_types.h
-rw-r--r-- 1 daos_agent daos_agent  8607 Dec 13 06:39 daos_uns.h
-rw-rw-r-- 1 daos_agent daos_agent   486 Dec 13 08:02 daos_version.h
drwxrwxr-x 2 daos_agent daos_agent  4096 Dec 20 02:12 gurt
drwxrwxr-x 2 daos_agent daos_agent  4096 Dec 13 08:06 spdk

My command to compile daos is "scons PREFIX=./build install --build-deps=yes --config=force".


Re: Unable to build libmfu properly

Bohning, Dalton
 

Is this directory correct?

/home/daos_agent/git/daos/build/lib64/

Libraries are usually in the “install” directory, not “build”. Is “build” in fact where daos is installed?

 

Similarly, is ${MY_DAOS_INSTALL_PATH}/include the installed include directory?

 

~Dalton Bohning

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of netsurfed
Sent: Monday, December 27, 2021 9:38 PM
To: daos@daos.groups.io
Subject: Re: [daos] Unable to build libmfu properly

 


 


Re: Unable to build libmfu properly

netsurfed
 

On Tue, Dec 28, 2021 at 03:13 AM, Bohning, Dalton wrote:
grep -B 2 "CMAKE_SHARED_LINKER_FLAGS:" ${MY_MFU_BUILD_PATH}/CMakeCache.txt
Hi, I ran your command and it prints the following:
$ grep -B 2 "CMAKE_SHARED_LINKER_FLAGS:" ${MY_MFU_BUILD_PATH}/CMakeCache.txt
//Flags used by the linker during the creation of shared libraries
// during all build types.
CMAKE_SHARED_LINKER_FLAGS:STRING=-L/home/daos_agent/git/daos/build/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread
 


Re: Unable to build libmfu properly

Bohning, Dalton
 

Apologies, my previous command snippet was muddled. Correction:

$ grep -B 2 "CMAKE_SHARED_LINKER_FLAGS:" ${MY_MFU_BUILD_PATH}/CMakeCache.txt

//Flags used by the linker during the creation of shared libraries

// during all build types.

CMAKE_SHARED_LINKER_FLAGS:STRING=-L/home/dbohning/daos/install/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread

 

~Dalton Bohning

 

 

From: Bohning, Dalton
Sent: Monday, December 27, 2021 8:50 AM
To: daos@daos.groups.io
Subject: RE: [daos] Unable to build libmfu properly

 



Re: Unable to build libmfu properly

Bohning, Dalton
 

Hello,

 

It looks like most of the libraries specified by “-luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread” are not linked. Something to keep in mind is that “CFLAGS=… LDFLAGS=… cmake…” should be executed as a single command so that CFLAGS and LDFLAGS are propagated to the cmake environment. You can check this by running:

$ grep -B 1 "CMAKE_EXE_LINKER_FLAGS:" ${MY_MFU_BUILD_PATH}/CMakeCache.txt

grep -B 2 "CMAKE_SHARED_LINKER_FLAGS:" CMakeCache.txt

//Flags used by the linker during the creation of shared libraries

// during all build types.

CMAKE_SHARED_LINKER_FLAGS:STRING=-L/home/dbohning/daos/install/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread

 

Where CMAKE_SHARED_LINKER_FLAGS should be set to the LDFLAGS from the build command.
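
The point about running everything as a single command can be seen with a small experiment. In the sketch below, DEMO_LDFLAGS is a hypothetical stand-in for LDFLAGS and `sh -c` stands in for cmake; it only shows which form of assignment reaches a child process.

```shell
#!/usr/bin/env bash
# Show which ways of setting a variable make it visible to a child process.
# DEMO_LDFLAGS is a stand-in for LDFLAGS; `sh -c` is a stand-in for cmake.
unset DEMO_LDFLAGS

# 1. Assignment on its own line: a plain shell variable, not exported,
#    so the child process does not see it.
DEMO_LDFLAGS="-L/opt/daos/lib64 -ldfs"
plain="$(sh -c 'printf "%s" "${DEMO_LDFLAGS:-<unset>}"')"
unset DEMO_LDFLAGS

# 2. Assignment prefixed to the command: placed in that command's environment.
prefixed="$(DEMO_LDFLAGS="-L/opt/daos/lib64 -ldfs" sh -c 'printf "%s" "${DEMO_LDFLAGS:-<unset>}"')"

echo "separate line: ${plain}"
echo "same command:  ${prefixed}"
```

The backslash continuations in the build command keep the CFLAGS=, LDFLAGS= and cmake parts in the second form; splitting them onto separate lines without `export` would give the first.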

 

~Dalton Bohning

 

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of netsurfed
Sent: Thursday, December 23, 2021 7:11 PM
To: daos@daos.groups.io
Subject: [daos] Unable to build libmfu properly

 



Re: Questions about ULT Schedule

Niu, Yawei
 

Hi,

 

The design intent is that IO requests from different pools are processed in FIFO order and that space pressure in one pool doesn’t interfere with request processing for other pools. However, as you pointed out, the implementation of policy_fifo_process() has a defect: later requests in the FIFO queue can be held up by an earlier request for a different pool that is under space pressure.
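
The effect described above can be pictured with a toy model. This is only an illustration, not the DAOS scheduler code and not the planned fix; the function names and the "pool:request" entries are hypothetical.

```shell
#!/usr/bin/env bash
# Toy model only: queue entries are "pool:request" strings.

# strict_fifo BLOCKED_POOL ENTRY...: process entries in order and stop at the
# first entry whose pool is blocked, so everything queued behind it waits too.
strict_fifo() {
    local blocked="$1" entry
    shift
    for entry in "$@"; do
        [ "${entry%%:*}" = "${blocked}" ] && break
        printf '%s\n' "${entry}"
    done
}

# skip_blocked BLOCKED_POOL ENTRY...: process entries in order but hold back
# only the entries that belong to the blocked pool.
skip_blocked() {
    local blocked="$1" entry
    shift
    for entry in "$@"; do
        [ "${entry%%:*}" = "${blocked}" ] && continue
        printf '%s\n' "${entry}"
    done
}

queue=(A:update1 B:fetch1 B:update2 A:update3)
echo "strict FIFO with pool A blocked:"
strict_fifo A "${queue[@]}"
echo "skipping only pool A:"
skip_blocked A "${queue[@]}"
```

In the first variant, pool B makes no progress even though only pool A is under pressure; in the second, pool B's requests are still served in their original order.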

 

I’ll cook up a quick fix soon. Thanks a lot for spotting this!

 

Thanks

-Niu  

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of 段世博 <duanshibo.d@...>
Date: Sunday, December 26, 2021 at 10:01 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] Questions about ULT Schedule



Re: failed to create daos cont #chat

hmu102
 

Solved on 23 Dec.

In daos_agent.yml, I changed the interface NUMA node from "1" to "0". There is no port under NUMA 0, but it works now.


Questions about ULT Schedule

段世博
 

Hello, Everyone!

Currently, the update/fetch requests of all pools are queued in sched_info->si_fifo_list. If the requests of one pool exceed req_kick_limit, will that also block requests for other pools?

thanks!


Unable to build libmfu properly

netsurfed
 

Hi,
I built MFU according to https://daosio.atlassian.net/wiki/spaces/DC/pages/4874571083/IO-500+ISC21 but encountered some "undefined reference" errors.
[ 39%] Linking C executable dbcast
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_insert'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_lookup'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_unlinked'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_stat'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_readdir'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_table_create'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_obj_anchor_split'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_release'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `d_hash_rec_find'
/usr/bin/ld: ../common/libmfu.so: undefined reference to `dfs_obj_anchor_set'

The build command is:
git clone https://github.com/mchaarawi/mpifileutils -b pfind_integration "${MY_MFU_SOURCE_PATH}" &&
mkdir -p "${MY_MFU_BUILD_PATH}" &&
cd "${MY_MFU_BUILD_PATH}" &&
CFLAGS="-I${MY_DAOS_INSTALL_PATH}/include" \
LDFLAGS="-L${MY_DAOS_INSTALL_PATH}/lib64/ -luuid -ldaos -ldfs -ldaos_common -lgurt -lpthread" \
cmake "${MY_MFU_SOURCE_PATH}" \
-DENABLE_XATTRS=OFF \
-DWITH_DTCMP_PREFIX=${MY_MFU_INSTALL_PATH} \
-DWITH_LibCircle_PREFIX=${MY_MFU_INSTALL_PATH} \
-DCMAKE_INSTALL_PREFIX=${MY_MFU_INSTALL_PATH} &&
make -j8 install

And there is libgurt.so in ${MY_DAOS_INSTALL_PATH}/lib64/.
$ nm -D ${MY_DAOS_INSTALL_PATH}/lib64/libgurt.so | grep d_hash_rec_insert
000000000000fb40 T d_hash_rec_insert
0000000000010100 T d_hash_rec_insert_anonym

But looks like libgurt.so is not linked by libmfu.so.
$ ldd src/common/libmfu.so
linux-vdso.so.1 (0x00007ffda13f3000)
libmpi.so.40 => /lib/x86_64-linux-gnu/libmpi.so.40 (0x00007fd75c830000)
libdtcmp.so => /home/daos_agent/install/lib/libdtcmp.so (0x00007fd75c814000)
libcircle.so.2 => /home/daos_agent/install/lib/libcircle.so.2 (0x00007fd75c806000)
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007fd75c7f3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd75c601000)
libopen-rte.so.40 => /lib/x86_64-linux-gnu/libopen-rte.so.40 (0x00007fd75c545000)
libopen-pal.so.40 => /lib/x86_64-linux-gnu/libopen-pal.so.40 (0x00007fd75c497000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd75c348000)
libhwloc.so.15 => /lib/x86_64-linux-gnu/libhwloc.so.15 (0x00007fd75c2f7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd75c2d4000)
liblwgrp.so => /home/daos_agent/install/lib/liblwgrp.so (0x00007fd75c2c5000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd75c9b2000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd75c2a7000)
libevent-2.1.so.7 => /lib/x86_64-linux-gnu/libevent-2.1.so.7 (0x00007fd75c251000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd75c24b000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fd75c246000)
libevent_pthreads-2.1.so.7 => /lib/x86_64-linux-gnu/libevent_pthreads-2.1.so.7 (0x00007fd75c241000)
libudev.so.1 => /lib/x86_64-linux-gnu/libudev.so.1 (0x00007fd75c214000)
libltdl.so.7 => /lib/x86_64-linux-gnu/libltdl.so.7 (0x00007fd75c207000)
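
The manual comparison of the ldd output against the -l flags can be scripted. The sketch below reads ldd output on stdin and prints the library names that have no matching entry; the function name missing_libs is hypothetical, and the two sample lines are copied from the listing above.

```shell
#!/usr/bin/env bash
# missing_libs NAME...: read `ldd` output on stdin and print each NAME
# (e.g. gurt, dfs) for which no "libNAME.so" entry is listed.
missing_libs() {
    local listing name
    listing="$(cat)"
    for name in "$@"; do
        if ! grep -Eq "(^|[[:space:]])lib${name}\.so" <<< "${listing}"; then
            printf '%s\n' "${name}"
        fi
    done
}

# Two lines from the ldd output in this message, used as sample input.
missing_libs gurt pthread dfs <<'EOF'
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007fd75c7f3000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd75c2d4000)
EOF
```

Against the real library it could be run as `ldd src/common/libmfu.so | missing_libs uuid daos dfs daos_common gurt pthread`. For the listing above, only pthread is present, so the other five names would be reported.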


Announcement: DAOS 2.0 is generally available

Prantis, Kelsey
 

All,

 

We are pleased to announce that the DAOS 2.0 release is now generally available. This release brings support for the following features:

  • Erasure code
  • Telemetry and monitoring
  • Pool and container labels
  • Improved usability and management capabilities
  • Increased flexibility in object layout
  • mpifileutils integration
  • Improved stability in the network stack

There are a number of resources available for the release:

As always, feel free to use this mailing list for any issues you may find with the release, or our JIRA bug tracking system, available at https://daosio.atlassian.net/jira, or on our Slack channel, available at https://daos-stack.slack.com.

 

Regards,

 

Kelsey Prantis

Senior Software Engineering Manager

Extreme Storage Architecture and Development Division

Intel

 


failed to create daos cont #chat

hmu102
 

Hi all,

I am using DAOS with a Mellanox card and trying to test RoCE.

My system runs CentOS with DAOS 2.1.100 and libdaos 1.6.0.

Because of many failures, I put the server and client on the same machine. Creating the pool succeeded with the command "dmg pool create -t 30 -z 1TB --label tank",

but creating the container failed with "daos cont create tank --path /tmp/mycontainer --type=POSIX --oclass=SX".

The agent side shows:
"ERROR: failed to fetch fabric interface of type ETHER: no suitable fabric interface found of type "ETHER"
ERROR: HandleCall for Management:Management:206 failed: no suitable fabric interface found of type "ETHER" "

I have already changed daos_agent.yml. Is there something I missed?


DAOS Community Update / Dec'21

Lombardi, Johann
 

Hi there,

 

Please find below the DAOS community newsletter for December 2021.

 

Past Events (November)

  • SC’21 Tutorial (Nov 15)
    Practical Persistent Memory Programming: PMDK and DAOS
    Adrian Jackson (University of Edinburgh)
    Mohamad Chaarawi (Intel)
    Johann Lombardi (Intel)

    https://sc21.supercomputing.org/presentation/?id=tut134&sess=sess210
  • SC’21 BoF (Nov 16)
    Object-stores for HPC - a Devonian Explosion or an Extinction Event?
    Philippe Deniel, CEA
    John Bent, Seagate
    Tiago Quinto, ECMWF
    Johann Lombardi, Intel

    https://sc21.supercomputing.org/presentation/?id=bof122&sess=sess368
  • DAOS User Group (DUG’21)
    Slide decks and recordings available at http://dug.daos.io
  • Intel SC21 Booth
    Dev led talk:
    DAOS Unleashes the Power in HPC Applications: QCT(Quanta) DevCloud Experience Sharing
    A Compact, Scalable, Efficient Data Collection Solution for Edge/IoT to Cloud by Zettar
    Fireside chat:
    Cambridge Service for Data Driven Discovery Enables Real-time Hospital Decision Support Systems to Improve Outcomes
    https://hpcevents.intel.com/devhub

 

Upcoming Events

  • None this month. 

 

Release

  • 2.0.0 RC2 was tagged (v2.0.0-rc2) earlier this week.
  • The only difference with 2.0.0 RC1 is the log4j CVE update in the Hadoop connector (not included in the RPMs).
  • Master is the development branch for the future 2.2 release.
  • Major recent 2.0 changes:
    • Integrate libfabric fix for fi_cancel()
    • Bump log4j-core version to 2.16.0 in the Hadoop connector
    • Fix a race between EC aggregation and reintegrate operation exposed by soak testing
    • Many documentation updates that will be on http://docs.daos.io soon
    • Add new recipes and scripts to install RPMs and run DAOS in docker containers
    • Fix a memory registration issue with verbs & rebuild
    • Quiet group version mismatch errors at pool creation
    • Fix a SWIM regression introduced recently
    • Fix a bulk leak on error path
    • Refactor DAOS object ID to encode the number of groups/shards and improve performance
    • Add interception for openat
    • Lock memory only for VOS pools instead of using mlockall
  • Major recent master changes:
    • Add new logic for punch propagation to accelerate empty directory removal
    • Restructure usage of hwloc in the control plane
    • Add CXI support to netdetect
    • Several VMD-related fixes
    • Rework how we use GitHub actions
    • Add new interface to create (and track) an ABT ULT (sched_create_ult())
    • Several DTX enhancements
  • What is coming:
    • rc2 validation

 

R&D

  • Major features under development:
  • Pathfinding:
    • MariaDB DAOS engine with predicate pushdown to the DAOS storage nodes
    • Leveraging the Intel Data Streaming Accelerator (DSA) to accelerate DAOS
      • Prototype leveraging DSA for VOS aggregation delivered

 

News

 

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: dmg pool operation stuck

Niu, Yawei
 

Hi, Allen

 

The log shows that the engine was stuck creating the blobstore. It looks like your device isn’t well supported by SPDK. Could you collect some device information with the SPDK ‘identify’ tool? SPDK also provides a ‘blob_cli’ tool; could you use it to check whether you can create a blobstore on the device specified in your server yaml?

 

Thanks

-Niu

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Allen <allen.zhuo@...>
Date: Monday, December 6, 2021 at 10:13 AM
To: daos <daos@daos.groups.io>
Subject: Re: [daos] dmg pool operation stuck

Hi Tom,

 

I reduced the variables according to your suggestion, but there is still the same error.

 


Regards,

Allen

 

From: Nabarro, Tom

Date: 2021-12-04 01:33

Subject: Re: [daos] dmg pool operation stuck

The failure seems to be happening when creating the blobstore but just to reduce variables, can you try with the following env_vars:

  env_vars:
  - CRT_TIMEOUT=300
  - CRT_CREDIT_EP_CTX=0

 

And simplify by running with

provider: ofi+sockets

 

Regards,

Tom

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of allen.zhuo@...
Sent: Friday, December 3, 2021 1:17 PM
To: daos@daos.groups.io
Subject: Re: [daos] dmg pool operation stuck

 

Hi Tom,
I noticed an error in the engine log.
DAOS[11610/11614] bio  DBUG src/bio/bio_xstream.c:662 load_blobstore() load blobstore failed -1025
Is it because of this? And what does "-1025" mean?

Some parameters of calling spdk_bs_load are as follows:
bs_dev->blocklen = 512
bs_dev->blockcnt = 7814037168
bs_opts.max_md_ops = 32
bs_opts.max_channel_ops = 4096 
bs_opts.cluster_sz = 1073741824
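
For reference, the device size implied by the parameters above can be worked out directly. The sketch below only does arithmetic on the three values listed (blocklen, blockcnt, cluster_sz); the function names are made up for this example, and the count of whole clusters ignores any space SPDK reserves for blobstore metadata.

```python
# Values copied from the spdk_bs_load parameters listed above.
BLOCKLEN = 512              # bytes per block
BLOCKCNT = 7814037168       # number of blocks on the device
CLUSTER_SZ = 1073741824     # bytes per blobstore cluster (1 GiB)


def device_bytes(blocklen, blockcnt):
    """Total device capacity in bytes."""
    return blocklen * blockcnt


def full_clusters(blocklen, blockcnt, cluster_sz):
    """Number of whole clusters that fit on the device (metadata ignored)."""
    return device_bytes(blocklen, blockcnt) // cluster_sz


if __name__ == "__main__":
    size = device_bytes(BLOCKLEN, BLOCKCNT)
    print("device size:", size, "bytes")
    print("device size: %.2f TiB" % (size / 2**40))
    print("whole 1 GiB clusters:", full_clusters(BLOCKLEN, BLOCKCNT, CLUSTER_SZ))
```

This gives 4,000,787,030,016 bytes (a nominal 4 TB device) and 3726 whole clusters of 1 GiB.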

And the memory information of the server is as follows:

daos_debug@sw2:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:          503Gi        11Gi       490Gi       131Mi       1.7Gi       489Gi
Swap:         8.0Gi          0B       8.0Gi

daos_debug@sw2:~$ numastat -mc | egrep "Node|Huge"
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
Token Node not in hash table.
                 Node 0 Node 1  Total
AnonHugePages         0      0      0
HugePages_Total    4096   4096   8192
HugePages_Free     3766   4096   7862
HugePages_Surp        0      0      0
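
To read the hugepage numbers above at a glance, the in-use amount per NUMA node is simply HugePages_Total minus HugePages_Free. The sketch below parses a copy of the numastat table and computes that difference. The helper names are made up for this example, and note that numastat normally reports these values in megabytes rather than in page counts, so the results should be read in the same unit as the table.

```python
# Copy of the numastat table shown above, including one of the warning lines.
SAMPLE = """\
Token Node not in hash table.
                 Node 0 Node 1  Total
AnonHugePages         0      0      0
HugePages_Total    4096   4096   8192
HugePages_Free     3766   4096   7862
HugePages_Surp        0      0      0
"""


def parse_numastat(text):
    """Return {row_label: [values...]} for rows made of a label plus numbers."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            # Header line or warning text: not a data row.
            continue
        rows[parts[0]] = values
    return rows


def hugepages_in_use(rows):
    """Per-column difference HugePages_Total - HugePages_Free."""
    total = rows["HugePages_Total"]
    free = rows["HugePages_Free"]
    return [t - f for t, f in zip(total, free)]


if __name__ == "__main__":
    print(hugepages_in_use(parse_numastat(SAMPLE)))
```

For the table above, the difference is 330 on Node 0 and 0 on Node 1, so all hugepage memory currently in use comes from Node 0.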


Re: dmg pool operation stuck

Allen
 

Hi Tom,

I reduced the variables according to your suggestion, but there is still the same error.


Regards,
Allen

 


Re: dmg pool operation stuck

Nabarro, Tom
 

The failure seems to be happening when creating the blobstore but just to reduce variables, can you try with the following env_vars:

  env_vars:
  - CRT_TIMEOUT=300
  - CRT_CREDIT_EP_CTX=0

 

And simplify by running with

provider: ofi+sockets

 

Regards,

Tom

 



Reply: Re: [daos] dmg pool operation stuck

Allen
 


Hi Tom,
Yes, please check the previous reply.

-------------- Original Message --------------
From: "Nabarro, Tom" <tom.nabarro@...>
Sent: Friday, December 3, 2021, 9:24 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] dmg pool operation stuck
-----------------------------------

Did you manage to get the engine log with DD_MASK=all? That will give us more information about why the engine is not completing start-up (and why "dmg system query" reports no joined ranks).

The "load blobstore failed" message is expected. It just means the blobstores still need to be created.

 

Also can you please confirm that you have tried with the adjusted settings as per previous e-mail:

"- set engines->targets to 4 and engines->nr_xs_helpers to 0"

 

Regards,

Tom

 
