Re: dfs_lookup behavior for non-existent files?


Tuffli, Chuck
 

Mohamad

Thank you for the sanity check regarding dfs_lookup. A little sleuthing showed that the application was (evidently) modifying the effective UID/GID around the time of that lookup, and it was this *ID change that made networking fail. With those calls changed, DFS is now doing what I expected/thought/hoped 🙂
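
For anyone chasing a similar problem, a check like the following can confirm whether the effective IDs changed between two points in the program. It is only a minimal sketch: the id_snapshot type and both helper names are made up for illustration, and it uses nothing beyond the standard geteuid()/getegid() calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: the effective IDs at one point in time. */
struct id_snapshot {
    uid_t euid;
    gid_t egid;
};

/* Record the current effective UID/GID. */
static struct id_snapshot
id_snapshot_take(void)
{
    struct id_snapshot s = { .euid = geteuid(), .egid = getegid() };
    return s;
}

/* Return true if the effective UID or GID differs from the snapshot. */
static bool
id_snapshot_changed(const struct id_snapshot *then)
{
    assert(then != NULL);
    return geteuid() != then->euid || getegid() != then->egid;
}
```

Taking a snapshot right after DFS is mounted and calling id_snapshot_changed() just before the lookup would show whether some other code path switched IDs in between.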

--chuck


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Chaarawi, Mohamad <mohamad.chaarawi@...>
Sent: Tuesday, April 5, 2022 5:21 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] dfs_lookup behavior for non-existent files?
 

Hi Chuck,

 

Neither dfs_lookup nor dfs_stat sets st_ino in the stat buf.

The reason is that files are uniquely identified by the DAOS object ID, which is 128 bits (64 hi, 64 lo).

You can retrieve that using dfs_obj2id():

https://github.com/daos-stack/daos/blob/master/src/include/daos_fs.h#L316

 

Now for the other error: that seems weird. The errors are coming from the network layer. At that point, are there any servers that are down or were killed (specifically the engine with rank 1)? This would explain the errors.

When I try this myself, I get ENOENT for lookup on “//.Trash” as expected.
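
Since the wrapper in question mimics lstat(), note that the DFS calls return 0 on success and a positive errno value such as ENOENT on failure, as documented in daos_fs.h, rather than returning -1 and setting errno. A minimal sketch of the conversion, where dfs_rc_to_posix() is a hypothetical helper name and not part of DFS:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: translate a DFS-style return code (0 or a
 * positive errno value) into the lstat() convention (0, or -1 with
 * errno set). */
static int
dfs_rc_to_posix(int rc)
{
    assert(rc >= 0);    /* assumes rc follows the 0/positive-errno convention */
    if (rc == 0)
        return 0;
    errno = rc;
    return -1;
}
```

A wrapper like d_lstat() could end with "return dfs_rc_to_posix(rc);" after releasing the object handle on the success path.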

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Tuffli, Chuck <chuck.tuffli@...>
Date: Tuesday, April 5, 2022 at 12:58 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] dfs_lookup behavior for non-existent files?

I'm porting an existing application to use DFS (DAOS v2.0.2) instead of POSIX and need help understanding the error messages printed to the console.

 

The code is using dfs_lookup() to retrieve the struct stat of a file. Note that the implementation cannot use dfs_stat() because the application requires valid values for fields such as st_ino, which dfs_stat() does not provide. The code in question is:

 

int
d_lstat(const char * restrict path, struct stat * restrict sb)
{
    int rc;
    dfs_obj_t *obj = NULL;

    rc = dfs_lookup(dfs, path, O_RDONLY, &obj, NULL, sb);
    ...

 

If the path exists (e.g. "/"), this works. But if the path doesn't exist (e.g. "//.Trash"), the call to dfs_lookup() does not return. Instead, the console endlessly prints messages like:

 

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329315] mercury->msg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/na/na_ofi.c:2972
 # na_ofi_msg_send(): fi_tsend() failed, rc: -13 (Permission denied)
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329374] mercury->hg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/mercury_core.c:2727
 # hg_core_forward_na(): Could not post send for input buffer (NA_ACCESS)
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] hg   ERR  src/cart/crt_hg.c:1104 crt_hg_req_send_cb(0x1d0cd40) [opc=0x4070001 (DAOS) rpcid=0x63f8133700000008 rank:tag=1:2] RPC failed; rc: DER_HG(-1020): 'Transport layer mercury error'
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] object ERR  src/object/cli_shard.c:889 dc_rw_cb() RPC 1 failed, DER_HG(-1020): 'Transport layer mercury error'

 

Am I misusing dfs_lookup()?

 

--chuck
