Re: dfuse fails mounting container


Chaarawi, Mohamad
 

Hi Ruben,

I wouldn't call this requirement "new"; it has been in place for a long time now.
But yes, when using a DFS container (MPI-IO, dfuse, HDF5 over POSIX or MPI-IO), the container must be of type POSIX.
I agree that we can make this clearer in the documentation.
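
For reference, the create / verify / mount sequence discussed in this thread could be wrapped roughly as follows. This is only a sketch: it reuses the commands and flags quoted further down in the thread (DAOS v0.9 syntax), the function names are made up for illustration, and it assumes DAOS_POOL, DAOS_SVCL, DAOS_CONT and DFS_MNT are set in the environment.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; the commands and flags
# are the ones quoted in this thread (DAOS v0.9).

# Create a container of type POSIX in the pool.
create_posix_cont() {
    daos cont create --pool="$DAOS_POOL" --svc="$DAOS_SVCL" --type=POSIX
}

# Query the container so its properties can be checked before mounting.
query_cont() {
    daos cont query --pool="$DAOS_POOL" --svc="$DAOS_SVCL" --cont="$DAOS_CONT"
}

# Mount the container with dfuse, staying in the foreground.
mount_cont() {
    dfuse -S --mountpoint="$DFS_MNT" --svc="$DAOS_SVCL" \
        --pool="$DAOS_POOL" --container="$DAOS_CONT" --foreground
}
```

The idea would be to run create_posix_cont, set DAOS_CONT to the container UUID it reports, and then run query_cont followed by mount_cont.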

The HDF5 DAOS-VOL is a different thing (it is not the same as running HDF5 with the native VOL over MPI-IO or POSIX on top of DFS). The HDF5 VOL plugin manages container creation itself and uses type HDF5, not POSIX, since underneath it does not use the DFS or POSIX API. So I'm not sure what you mean by "DFS will not work at all". Could you please elaborate?

As for MPICH, it should still work fine; I'm not sure what is failing for you. But the container has to be created with type POSIX.
Note that the driver has been integrated into the upstream MPICH master repo, so there is no need to use the fork anymore.

Thanks,
Mohamad

On 7/11/20, 1:36 PM, "daos@daos.groups.io on behalf of Ruben Felgenhauer" <daos@daos.groups.io on behalf of 4felgenh@...> wrote:

Hi Mohamad,

Thanks for the tip, I had not created the container as POSIX type. I
remember that this wasn't explicitly necessary at some point in the past.
Is this new behavior? The "POSIX Namespace" page of the user guide
hints in this direction, but in my opinion the requirement isn't really
stated explicitly.

Does this also mean that DFS will not work at all if the container was
automatically created by DAOS-VOL / HDF5V? I figure that DAOS-VOL will
not create the container with the POSIX type?

And I guess the forked MPICH with the ADIO driver wouldn't work either?

Kind regards,
Ruben

On 11.07.20 at 18:12, Chaarawi, Mohamad wrote:

> Hi Ruben,
>
> I assume you create the container with type POSIX?
>
> daos cont create --pool=$DAOS_POOL --svc=$DAOS_SVCL --type=POSIX
>
> and that succeeds and you pass that cont uuid to the dfuse command?
>
> After you create the container, could you please run a cont query on
> that to verify?
>
> daos cont query --pool=$DAOS_POOL --svc=$DAOS_SVCL --cont=$DAOS_CONT
>
> Thanks,
>
> Mohamad
>
> *From: *<daos@daos.groups.io> on behalf of Ruben Felgenhauer
> <4felgenh@...>
> *Reply-To: *"daos@daos.groups.io" <daos@daos.groups.io>
> *Date: *Saturday, July 11, 2020 at 4:32 AM
> *To: *"daos@daos.groups.io" <daos@daos.groups.io>
> *Subject: *[daos] dfuse fails mounting container
>
> Hi,
>
> I'm still failing to get the DAOS Fuse filesystem running. I'm using
> the daos_server_local.yml config with DAOS v0.9 and have server and
> agent running.
>
> Both the high-level and low-level FUSE interfaces are failing for me:
>
> Low level:
> OFI_INTERFACE=eth0 dfuse -S --mountpoint="$DFS_MNT" --svc="$DAOS_SVCL"
> --pool="$DAOS_POOL" --container="$DAOS_CONT" --foreground
>
> This returns immediately without any error message and will log the
> following in daos.log:
>
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] fi INFO
> src/gurt/fault_inject.c:486 d_fault_inject_init() No config file,
> fault injection is OFF.
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] crt INFO
> src/cart/crt_init.c:278 crt_init_opt() libcart version 4.7.0 initializing
> 07/11-11:16:41.21 abu2 DAOS[62814/62814] crt WARN
> src/cart/crt_init.c:170 data_init() FI_UNIVERSE_SIZE was not set;
> setting to 2048
> 07/11-11:16:41.46 abu2 DAOS[62814/62814] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:328 main(0x559201d7b010) dfs_mount
> failed (22)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:362 main(0x559201d7b010) DFP left at the end
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:365 main(0x559201d7af80) DFS left at the end
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:372 main(0x559201d7af80) dfs_umount()
> failed (22)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:378 main(0x559201d7af80)
> daos_cont_close() failed: (-1002)
> 07/11-11:16:41.48 abu2 DAOS[62814/62814] dfuse ERR
> src/client/dfuse/dfuse_main.c:390 main(0x559201d7b010)
> daos_pool_disconnect() failed: (-1002)
> 07/11-11:16:41.49 abu2 DAOS[62814/62814] dfuse INFO
> src/client/dfuse/dfuse_main.c:404 main() Exiting with status 0
>
> High Level:
> $ OFI_INTERFACE=eth0 dfuse_hl "$DFS_MNT" -s -f -d -p "$DAOS_POOL" -l
> "$DAOS_SVCL" -c "$DAOS_CONT"
> Pool Connect...
> DFS Pool = 0cce90ce-2f6c-4621-836f-b24476acefd0
> DFS SVCL = 0
> DFS Container: 6312c82d-7daf-34a2-edb7-3b45441cdd6f
> Failed dfs mount (22)
>
> This logs the following:
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] fi INFO
> src/gurt/fault_inject.c:486 d_fault_inject_init() No config file,
> fault injection is OFF.
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] crt INFO
> src/cart/crt_init.c:278 crt_init_opt() libcart version 4.7.0 initializing
> 07/11-11:21:11.29 abu2 DAOS[62833/62833] crt WARN
> src/cart/crt_init.c:170 data_init() FI_UNIVERSE_SIZE was not set;
> setting to 2048
> 07/11-11:21:11.54 abu2 DAOS[62833/62833] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
> 07/11-11:21:11.56 abu2 DAOS[62833/62833] daos INFO
> src/common/drpc.c:664 drpc_close() Closing dRPC socket fd=21
>
> Neither of these mounts anything in "$DFS_MNT". Interestingly,
> if I leave out --container with the low-level dfuse, it seems to
> work at first, but it fails later when I try to do anything with the
> $DFS_MNT directory.
>
> Kind regards,
> Ruben
>
>
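
Since the low-level dfuse in the original report exits with status 0 and prints nothing, the only trace of the failure is the "dfs_mount failed (22)" line in daos.log. A minimal sketch of a check for that line is below. The helper name check_dfuse_log is hypothetical and the log path is passed in by the caller. Error 22 is EINVAL on Linux, and the hint about the container type reflects the diagnosis in this thread, not an exhaustive list of causes.

```shell
#!/bin/sh
# Sketch only: check_dfuse_log is a hypothetical helper name.
# Returns 1 if the given log contains the dfs_mount failure reported
# in this thread, 0 otherwise.
check_dfuse_log() {
    # In a basic regular expression the parentheses match literally.
    if grep -q 'dfs_mount failed (22)' "$1"; then
        echo "dfs_mount failed with 22 (EINVAL): check that the container type is POSIX"
        return 1
    fi
    echo "no dfs_mount failure found"
    return 0
}
```

It could be called as check_dfuse_log followed by the path to daos.log, right after dfuse returns.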
