Failed load module rdb


Colin Ngam
 

Hi,

 

03/30-10:20:29.65 delphi-006 DAOS[249525/249525] server ERR  src/iosrv/module.c:105 dss_module_load() cannot load librdb.so: librdb.so: cannot open shared object file: No such file or directory

03/30-10:20:29.65 delphi-006 DAOS[249525/249525] server ERR  src/iosrv/init.c:195 modules_load() Failed to load module rdb: -1003

 

We worked around this by pointing LD_LIBRARY_PATH at ${daospath}/install/lib64/daos_srv/, or by copying librdb.so into ${daospath}/install/lib64, etc.
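For later readers, a minimal sketch of the LD_LIBRARY_PATH workaround described above. The helper name prepend_ld_path and the fallback value of daospath are placeholders for illustration, not part of DAOS; only the two lib64 directories come from this thread.

```shell
# Placeholder install prefix (assumption); set daospath to your own checkout.
daospath="${daospath:-/path/to/daos}"

# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
prepend_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# The two directories mentioned in the thread.
prepend_ld_path "${daospath}/install/lib64"
prepend_ld_path "${daospath}/install/lib64/daos_srv"
```

Putting these lines in .bashrc makes the setting persistent, at the cost of affecting every program started from that shell.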

 

What’s the game plan on this issue?

 

Maybe I missed something in the docs. Is there any info on the required LD_LIBRARY_PATH?

 

Thanks.

 

Colin


Alex Barcelo
 

I had a similar issue (I think), and I solved it in the same fashion: the error vanished as soon as I added those two paths to LD_LIBRARY_PATH in my .bashrc.

In my scenario, the specific error was on daos_agent execution, which was failing with a libgurt.so.4 error, because libgurt.so could not be found.

I am putting this info here just in case some future time traveler has the same error and tries to search for that library name (as I did).

I could not find anything saying that LD_LIBRARY_PATH is needed; maybe it should be added here? https://daos-stack.github.io/admin/installation/

Or maybe that's not the proper way of solving those dynamic linking library issues?


Kevan Rehm
 

Alex,

 

Try running daos_agent without your LD_LIBRARY_PATH settings, after first setting “export LD_DEBUG=all”, and capture the output. Look for libgurt in the output and see where ld searched for it. In our case, ld didn’t use the rpath/runpath value from the binary (see the “readelf -d daos_agent” output); it used a value from a previously loaded dynamic library (libfabric) that had referenced dlopen (libdl.so), and that value didn’t contain the path where librdb resides. Sorry to say, we never solved the problem, but we found a way to work around it.

 

It will be interesting to see if this is a different instance of the same problem.  Post the above LD_DEBUG output, and also the output from “readelf -d daos_agent”. 
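As a rough aid for reading the readelf output requested above, the following sketch prints each RPATH/RUNPATH directory of a binary on its own line, so it is easy to check whether the directory holding libgurt or librdb is present. The function names parse_rpath and rpath_entries are made up for this example, and it assumes the usual GNU readelf layout "(RPATH) Library rpath: [dir1:dir2]".

```shell
# Read `readelf -d` output on stdin and print one RPATH/RUNPATH
# directory per line (assumes the GNU readelf "[...]" layout).
parse_rpath() {
    awk '/\((RPATH|RUNPATH)\)/ {
        if (match($0, /\[.*\]/)) {
            s = substr($0, RSTART + 1, RLENGTH - 2)   # strip the brackets
            n = split(s, dirs, ":")
            for (i = 1; i <= n; i++) print dirs[i]
        }
    }'
}

# Convenience wrapper: list the embedded search directories of a binary.
rpath_entries() {
    readelf -d "$1" 2>/dev/null | parse_rpath
}

# Example, using the binary from this thread:
# rpath_entries ./daos_agent
```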

 

Thanks, Kevan

 

From: <daos@daos.groups.io> on behalf of "Alex Barcelo via groups.io" <alex@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, May 4, 2020 at 9:03 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Failed load module rdb

 



Olivier, Jeffrey V
 

Hi Alex,

 

Does it still happen in the latest master? The workaround was checked into master a couple of weeks ago. We changed the order of the dlopen calls, as on some systems the dynamic loader does not appear to follow its documented behavior exactly.

 

One can always set LD_LIBRARY_PATH as well but it should not be needed with a developer build in latest master.

 

-Jeff

 

From: <daos@daos.groups.io> on behalf of "Alex Barcelo via groups.io" <alex@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, May 4, 2020 at 8:03 AM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Failed load module rdb

 



Alex Barcelo
 

@Kevan sorry, I didn't explain myself properly: I "successfully" fixed my problem, but I was trying to understand whether my fix was proper or whether I had somehow messed up something in my installation. I am definitely lost on some steps of your diagnostic instructions; what I did was run ldd ./daos_agent, which yielded:

    libgurt.so.4 => not found

That ofc is fixed when LD_LIBRARY_PATH is tweaked.
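The ldd check above can be wrapped in a small helper that prints only the unresolved library names, which makes it easy to compare the result before and after changing LD_LIBRARY_PATH. This is only a sketch; filter_not_found and missing_libs are names invented here.

```shell
# Keep only the library names that ldd reports as "not found".
filter_not_found() {
    awk '/=> not found/ { print $1 }'
}

# List the unresolved shared libraries of a binary.
missing_libs() {
    ldd "$1" 2>/dev/null | filter_not_found
}

# Example, using the binary from this thread:
# missing_libs ./daos_agent
```

An empty result means that ldd resolved every dependency in the current environment.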

@Olivier I am on commit b27788cf667a36d7628b06cb374495f7e2aff382 (April 7th). Sorry, the machine I am working on has certain network limitations and updating things is... unpleasant, to say the least. If master has been fixed in the last month, then all my hiccups are old news.


Kevan Rehm
 

Alex,

 

Well, I didn’t explain myself properly either. 😊 You should not need to set LD_LIBRARY_PATH to get things working; DAOS has code in it to find all the dynamic libraries it needs, so what you are seeing is a bug. Execute this:

LD_DEBUG=all ./daos_agent 2> /tmp/debug.txt

readelf -d ./daos_agent > /tmp/readelf.txt

 

and then post those two files. If you are curious, search for libgurt in /tmp/debug.txt. When things are working properly you should see something like the following. In my case, libgurt is in /home/users/daos/daos/install/lib64, and that path appears in the RPATH search path of daos_agent, so it works for me.

 

     18109:     file=libgurt.so.4 [0];  needed by ./daos_agent [0]

     18109:     find library=libgurt.so.4 [0]; searching

     18109:      search path=/delphi/common/daos/build/dev/gcc/src/cart/src/cart/tls/x86_64:/delphi/common/daos/build/dev/gcc/src/cart/src/cart/tls:/delphi/common/daos/build/dev/gcc/src/cart/src/cart/x86_64:/delphi/common/daos/build/dev/gcc/src/cart/src/cart:/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/tls/x86_64:/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/tls:/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/x86_64:/delphi/common/daos/build/dev/gcc/src/cart/src/gurt:/home/users/daos/daos/install/lib/tls/x86_64:/home/users/daos/daos/install/lib/tls:/home/users/daos/daos/install/lib/x86_64:/home/users/daos/daos/install/lib:/home/users/daos/daos/install/lib64/tls/x86_64:/home/users/daos/daos/install/lib64/tls:/home/users/daos/daos/install/lib64/x86_64:/home/users/daos/daos/install/lib64:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib          (RPATH from file ./daos_agent)

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/cart/tls/x86_64/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/cart/tls/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/cart/x86_64/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/cart/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/tls/x86_64/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/tls/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/x86_64/libgurt.so.4

     18109:       trying file=/delphi/common/daos/build/dev/gcc/src/cart/src/gurt/libgurt.so.4

     18109:

     18109:     file=libgurt.so.4 [0];  generating link map

     18109:       dynamic: 0x00007f4d63509d68  base: 0x00007f4d632e8000   size: 0x0000000000222c98

     18109:         entry: 0x00007f4d632ec2a0  phdr: 0x00007f4d632e8040  phnum:      
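To pull the relevant lines out of a capture like /tmp/debug.txt, something along these lines may help. It prints every path the loader tried for a given library, in order. The function name tried_paths is hypothetical; the "trying file=" marker is taken from the output above. Setting LD_DEBUG=libs instead of LD_DEBUG=all limits the trace to library search information, which keeps the file much smaller.

```shell
# Print every candidate path the loader tried for one library.
# $1 = library name (e.g. libgurt.so.4), $2 = LD_DEBUG capture file
tried_paths() {
    awk -v lib="$1" 'index($0, "trying file=") && index($0, "/" lib) {
        sub(/.*trying file=/, "")   # drop the PID prefix and the marker
        print
    }' "$2"
}

# Example, using the file names from this thread:
# tried_paths libgurt.so.4 /tmp/debug.txt
```

If the directory that actually contains the library never shows up in this list, the search path the loader used is the place to look next.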

 

 

 

From: <daos@daos.groups.io> on behalf of "Alex Barcelo via groups.io" <alex@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Monday, May 4, 2020 at 5:10 PM
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Failed load module rdb

 
