daos_agent starts running in directory ./daos/build/dev/gcc/src/cart/src/gurt

Kevan Rehm
 

Greetings,

 

Checking to see if this is a known issue. We build DAOS in an NFS filesystem so that we can log into any node in the cluster and use gdb with the source tree to debug a binary. The binaries get installed in ~daos/daos/install, which is a local filesystem on each node.

 

When we start daos_agent on the node where we compiled DAOS, it ends up cd’d into directory daos/build/dev/gcc/src/cart/src/gurt in the NFS filesystem. If we then want to delete the daos source tree and start over, we can’t, because there is a .nfs0000XXX file in that directory which prevents an “rm -rf daos” from succeeding. We have to kill daos_agent in order to delete the daos directory, then start daos_agent again.
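For anyone wanting to confirm the same symptom, here is a minimal sketch (Linux only, since it relies on /proc) that reports a process's working directory and checks whether it lies inside a given tree. The helper names `process_cwd` and `is_inside` are made up for illustration and are not part of DAOS.

```python
import os


def process_cwd(pid):
    """Return the current working directory of process `pid` (Linux only)."""
    # /proc/<pid>/cwd is a symlink to the process's working directory.
    return os.readlink(f"/proc/{pid}/cwd")


def is_inside(path, tree):
    """Return True if `path` is `tree` itself or lies beneath it."""
    path = os.path.realpath(path)
    tree = os.path.realpath(tree)
    return os.path.commonpath([path, tree]) == tree
```

Calling `is_inside(process_cwd(pid), "/path/to/daos")` with the agent's pid and the location of the source tree would show whether the agent is holding a directory in the tree you are trying to remove.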

 

I suspect this is not intentional. 😊 Known problem, or should I open a ticket?

 

Thanks, Kevan

Macdonald, Mjmac
 

Hi Kevan.

 

We haven’t built anything into daos_agent to daemonize itself; the expectation is that in production that sort of thing will be handled by systemd. Maybe you could use a wrapper to cd to / and then exec daos_agent?
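A rough sketch of such a wrapper, written in Python purely for illustration (a two-line shell script would do the same job). The function name `exec_from_root` is hypothetical, and nothing here is specific to daos_agent: it changes directory and then replaces itself with whatever command it is given.

```python
import os


def exec_from_root(argv, directory="/"):
    """Change to `directory`, then replace this process with `argv`."""
    os.chdir(directory)
    # Keep $PWD consistent with the new working directory for the child.
    os.environ["PWD"] = directory
    # execvp searches PATH for argv[0]; it returns only by raising on failure.
    os.execvp(argv[0], argv)
```

Saved as a hypothetical agent_wrapper.py with `import sys` at the top and a final line `exec_from_root(sys.argv[1:])`, it would be run as `python3 agent_wrapper.py daos_agent` followed by whatever arguments you normally pass. Because the wrapper execs rather than spawning a child, the agent keeps the wrapper's pid and no extra process is left behind.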

 

mjmac

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Wednesday, 27 May, 2020 17:54
To: daos@daos.groups.io
Subject: [daos] daos_agent starts running in directory ./daos/build/dev/gcc/src/cart/src/gurt

 
