Re: no dRPC client set on Ubuntu 20.04.1
Same problem here. I used the master branch of DAOS on Ubuntu 20.04.1.
I also hit this error when starting daos_server:
ERROR: removing socket file: removing instance 0 socket file: no dRPC client set (data plane not started?)
The same message was printed in /tmp/daos_server.log:
bio ERR src/bio/bio_xstream.c:367 bio_spdk_env_init() Failed to init SPDK thread lib, DER_INVAL(-1003)
The identify error also happened in my case; however, it is now solved thanks to @Tom. The results of identify are attached, as well as the server logs and the configuration I used.
|
|
Re: DAOS Distributed Transaction
Changwoo Min
Hi Johann, Thank you for your reply. It helps a lot. Regards, Changwoo Min
On Tue, Sep 8, 2020 at 1:12 PM Lombardi, Johann <johann.lombardi@...> wrote:
|
|
Re: no dRPC client set on Ubuntu 20.04.1
Hello Gert,
The ./identify error can be fixed by prefixing the command with "LD_LIBRARY_PATH=/path/to/spdk/libs", i.e. the directory that contains libspdk_sock_posix.so.2.0, which in your case is probably /root/daos/install/prereq/dev/spdk/lib.
Please post the results of identify and we can go from there.
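A minimal sketch of the same fix from a script, assuming the library directory mentioned above. The helper names env_with_lib_dir and run_identify are hypothetical and not part of DAOS or SPDK. The point it illustrates is that LD_LIBRARY_PATH takes a colon-separated list of directories, not the path of the .so file itself.

```python
import os
import subprocess


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # The dynamic loader expects directories separated by ':',
    # so we prepend the directory that holds the shared libraries.
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")
    return env


def run_identify(identify_path, lib_dir):
    """Run the SPDK identify example with lib_dir on the library search path."""
    return subprocess.run(
        [identify_path],
        env=env_with_lib_dir(lib_dir),
        capture_output=True,
        text=True,
        check=False,
    )


# Assumed directory, taken from the path quoted in this thread.
example_env = env_with_lib_dir("/root/daos/install/prereq/dev/spdk/lib", base_env={})
print(example_env["LD_LIBRARY_PATH"])
```

Calling run_identify("./identify", "/root/daos/install/prereq/dev/spdk/lib") from the identify example directory would then be the scripted equivalent of prefixing the command on the shell.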
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Gert Pauwels (intel)
Today on my Ubuntu 20.04.1 system I had the following error when running:
$ daos_server start
ERROR: removing socket file: removing instance 0 socket file: no dRPC client set (data plane not started?)
root@intel-S2600WFD:~/daos/build/external/dev/spdk/examples/nvme/identify# ./identify
./identify: error while loading shared libraries: libspdk_sock_posix.so.2.0: cannot open shared object file: No such file or directory
--------------------------------------------------------------------- This e-mail and any attachments may contain confidential material for
|
|
no dRPC client set on Ubuntu 20.04.1
Today on my Ubuntu 20.04.1 system I had the following error when running:
$ daos_server start
ERROR: removing socket file: removing instance 0 socket file: no dRPC client set (data plane not started?)
I tried to 'solve' the problem by reinstalling Ubuntu 20.04.1 from scratch and compiling the master branch of DAOS. The problem was still there. The first error in /tmp/daos_server.log points towards SPDK:
bio ERR src/bio/bio_xstream.c:367 bio_spdk_env_init() Failed to init SPDK thread lib, DER_INVAL(-1003)
I then tried to check whether SPDK runs fine without DAOS by calling the identify application in the SPDK examples directory:
root@intel-S2600WFD:~/daos/build/external/dev/spdk/examples/nvme/identify# ./identify
./identify: error while loading shared libraries: libspdk_sock_posix.so.2.0: cannot open shared object file: No such file or directory
Any suggestions on how to find out what is happening? Thanks in advance, Gert
|
|
Re: DAOS Distributed Transaction
Lombardi, Johann
Hi there!
First of all, thanks for joining the community. I am *very* excited about the support of distributed transactions in DAOS. We have been working on this for a long time and completely changed the design in 2017 to properly support serializability and external consistency.
The first use case is to maintain the internal consistency of our POSIX layer, called DAOS File System (DFS). The integration with distributed transactions improves the POSIX compliance of the DFS library by guaranteeing atomicity of metadata operations like rename(2). No orphans or dangling entries are left behind if a client crashes in the middle of a rename operation.
Since POSIX is just yet another middleware library for us, the same applies to all I/O middleware libraries built natively over DAOS. With the HDF5 DAOS VOL, HDF5 datasets can be updated safely in place without any risk of corrupting the internal HDF5 data structures when the application crashes or quits unexpectedly. One HDF5 operation typically requires multiple KV fetches/updates in the DAOS case. By bundling all those low-level operations under a single DAOS transaction, we can guarantee that the high-level HDF5 operation is atomic. Since the HDF5 DAOS VOL also supports independent operations (e.g. concurrent non-collective HDF5 group creations), the use of DAOS transactions also preserves internal consistency when processing concurrent uncoordinated HDF5 operations. Transactions are used in a similar way in several domain-specific data models that are in the process of being ported to the native DAOS API, bypassing POSIX entirely.
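The bundling idea can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions: ToyTx and rename are hypothetical names, the "store" is a plain Python dict, and none of this is the DAOS transaction API. It shows a rename built from two low-level updates that become visible together at commit, or not at all.

```python
class ToyTx:
    """Stages updates and applies them all at commit, or not at all (toy model)."""

    def __init__(self, store):
        self._store = store
        self._staged = []  # (key, value) pairs; value None means "delete key"

    def put(self, key, value):
        self._staged.append((key, value))

    def delete(self, key):
        self._staged.append((key, None))

    def commit(self):
        # Build the new state on a copy, so an error part-way through
        # (e.g. deleting a missing key) leaves the real store untouched.
        new_state = dict(self._store)
        for key, value in self._staged:
            if value is None:
                del new_state[key]
            else:
                new_state[key] = value
        self._store.clear()
        self._store.update(new_state)
        self._staged.clear()


def rename(store, old, new):
    """Rename an entry as one transaction: insert the new name, drop the old one."""
    tx = ToyTx(store)
    tx.put(new, store[old])
    tx.delete(old)
    tx.commit()


directory = {"a.txt": "inode-17"}
rename(directory, "a.txt", "b.txt")
print(directory)
```

If the client "crashes" after staging but before commit, the dict still holds only the old entry, which mirrors the no-orphans, no-dangling-entries guarantee described for rename(2).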
Last but not least, distributed transactions make it possible to support database semantics directly over DAOS. We are actually in the process of porting a SQL engine to DAOS as a proof of concept. I do believe that this capability, combined with computational storage (i.e. running simple data-intensive tasks directly on DAOS storage nodes without moving the data over the fabric), will open the door to many interesting opportunities (e.g. query/indexing, …) in areas like data analytics and AI.
Cheers, Johann
|
|
Re: DAOS/OFI & MOFED Support
Farrell, Patrick Arthur <patrick.farrell@...>
We had no specific need for 5.1, so we rolled back for now, since 5.0.x is still supported by Mellanox. The little digging I did suggested the issue is in OFA rather than DAOS, so I expect that the Open Fabrics people will fix the incompatibility.
-Patrick
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Sent: Monday, September 7, 2020 12:52 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] DAOS/OFI & MOFED Support
Hi Patrick,
We are using MOFED 5.0.2 on Frontera and I don’t think we have ever tested with 5.1. Were you able to figure it out?
Johann
|
|
Re: DAOS/OFI & MOFED Support
Lombardi, Johann
Hi Patrick,
We are using MOFED 5.0.2 on Frontera and I don’t think we have ever tested with 5.1. Were you able to figure it out? Johann
|
|
DAOS Distributed Transaction
Changwoo Min
Hi DAOS community!
I'm Changwoo Min, a professor at Virginia Tech. My group does research on persistent memory and storage systems. I found that DAOS is an exciting and cool project! In particular, it is interesting to me that DAOS supports distributed transactions. I am wondering what the typical/intended use cases and applications of the distributed transaction are. Especially considering, as far as I know, DAOS will be deployed to HPC systems. I wonder if transactions can benefit any HPC/AI/ML/analytics applications. Any comments will be helpful. Regards, Changwoo Min
|
|
[DUG'20] Save the date & call for presentations!
Lombardi, Johann
Hi there,
As every year since 2017, we would like to hold the annual DAOS User Group (DUG), now in its 4th edition, around SC’20. Due to the pandemic situation, the DUG will obviously be virtual this year, with live presentations on Nov 19. We purposely picked a date after the SC Tutorials/Workshops/BoFs to minimize conflicts, but please don’t hesitate to let us know (on this mailing list, on the slack channel or privately) if you are aware of any major conflict(s) that day. The time hasn’t been finalized yet, but we are shooting for a 3h-ish slot in the morning for America, late afternoon for EMEA and late evening for APAC (sorry about that) to maximize participation. Details are yet to be finalized and will be shared on the mailing list once ready.
As in previous years, we would like to invite community members to submit presentation proposals (i.e. title + short summary) to daos-info@daos.groups.io. We encourage any feedback and would like to hear from you about your experience with DAOS, future plans, what you have contributed or intend to contribute, what worked … and what did not work so well. We are looking forward to your submissions!
Take care. Johann – on behalf of the Intel DAOS Team
|
|
DAOS/OFI & MOFED Support
Farrell, Patrick Arthur <patrick.farrell@...>
Good afternoon,
I am curious if anyone has tried DAOS with MLNX_OFED_LINUX-5.1-0.6.6.0 - The current latest version of MOFED 5.1.
I did, and I'm getting mercury errors related to CQs...
So, before dumping the errors:
Should this work? Is it supported to run DAOS with MOFED 5.1?
Thanks - error dump follows:
For example, on rank0 when trying to create a pool on ranks 0 and 1:
08/28-16:29:42.684123 delphi-002 DAOS[279252/279298] hg ERR # NA -- Error -- /delphi/common/daos/build/external/dev/mercury/src/na/na_ofi.c:2555
# na_ofi_cq_read(): Operation ID was not canceled
08/28-16:29:42.684134 delphi-002 DAOS[279252/279298] hg ERR # NA -- Error -- /delphi/common/daos/build/external/dev/mercury/src/na/na_ofi.c:4585
# na_ofi_progress(): Could not read events from context CQ
08/28-16:29:42.684154 delphi-002 DAOS[279252/279298] hg ERR # HG -- Error -- /delphi/common/daos/build/external/dev/mercury/src/mercury_core.c:2758
# hg_core_progress_na(): Could not make progress on NA (NA_FAULT)
08/28-16:29:42.684161 delphi-002 DAOS[279252/279298] hg ERR # HG -- Error -- /delphi/common/daos/build/external/dev/mercury/src/mercury_core.c:2926
# hg_core_progress(): hg_core_progress_na() failed
08/28-16:29:42.684168 delphi-002 DAOS[279252/279298] hg ERR # HG -- Error -- /delphi/common/daos/build/external/dev/mercury/src/mercury_core.c:4317
# HG_Core_progress(): Could not make progress
08/28-16:29:42.684178 delphi-002 DAOS[279252/279298] hg ERR # HG -- Error -- /delphi/common/daos/build/external/dev/mercury/src/mercury.c:1996
# HG_Progress(): Could not make progress on context (HG_FAULT)
08/28-16:29:42.684185 delphi-002 DAOS[279252/279298] hg ERR src/cart/crt_hg.c:1234 crt_hg_progress() HG_Progress failed, hg_ret: 7.
08/28-16:29:42.684194 delphi-002 DAOS[279252/279298] rpc ERR src/cart/crt_context.c:1316 crt_progress() crt_hg_progress failed, rc: -1020.
08/28-16:29:42.684201 delphi-002 DAOS[279252/279298] server ERR src/iosrv/srv.c:565 dss_srv_handler() failed to progress CART context: -1020
08/28-16:30:42.684033 delphi-002 DAOS[279252/279298] rpc ERR src/cart/crt_context.c:790 crt_context_timeout_check(0x7fcd8d7a3870) [opc=0x1010007 rpcid=0x6608781e00000134 rank:tag=1:0] ctx_id 0, (status: 0x38) timed out, tgt rank 1, tag 0
And on rank 1:
08/28-16:07:42.807417 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:790 crt_context_timeout_check(0x7fc6f820aba0) [opc=0xfe000000 rpcid=0x642008fe00000128 rank:tag=0:0] ctx_id 0, (status: 0x38) timed out, tgt rank 0, tag 0
08/28-16:07:42.807443 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:748 crt_req_timeout_hdlr(0x7fc6f820aba0) [opc=0xfe000000 rpcid=0x642008fe00000128 rank:tag=0:0] aborting to group daos_server, rank 0, tgt_uri (null)
08/28-16:07:45.208410 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:790 crt_context_timeout_check(0x7fc6f820b3f0) [opc=0xfe000000 rpcid=0x642008fe00000129 rank:tag=0:0] ctx_id 0, (status: 0x38) timed out, tgt rank 0, tag 0
08/28-16:07:45.208419 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:748 crt_req_timeout_hdlr(0x7fc6f820b3f0) [opc=0xfe000000 rpcid=0x642008fe00000129 rank:tag=0:0] aborting to group daos_server, rank 0, tgt_uri (null)
08/28-16:07:47.609419 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:790 crt_context_timeout_check(0x7fc6f820be90) [opc=0xfe000000 rpcid=0x642008fe0000012a rank:tag=0:0] ctx_id 0, (status: 0x38) timed out, tgt rank 0, tag 0
08/28-16:07:47.609428 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:748 crt_req_timeout_hdlr(0x7fc6f820be90) [opc=0xfe000000 rpcid=0x642008fe0000012a rank:tag=0:0] aborting to group daos_server, rank 0, tgt_uri (null)
08/28-16:07:49.811412 delphi-002 DAOS[279251/279299] swim ERR src/cart/swim/swim.c:802 swim_progress() SWIM shutdown
08/28-16:07:50.10411 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:790 crt_context_timeout_check(0x7fc6f820c930) [opc=0xfe000000 rpcid=0x642008fe0000012b rank:tag=0:0] ctx_id 0, (status: 0x38) timed out, tgt rank 0, tag 0
08/28-16:07:50.10419 delphi-002 DAOS[279251/279299] rpc ERR src/cart/crt_context.c:748 crt_req_timeout_hdlr(0x7fc6f820c930) [opc=0xfe000000 rpcid=0x642008fe0000012b rank:tag=0:0] aborting to group daos_server, rank 0, tgt_uri (null)
08/28-16:08:14.96837 delphi-002 DAOS[279251/279299] hg WARN # NA -- Warning -- /delphi/common/daos/build/external/dev/mercury/src/na/na_ofi.c:2575
# na_ofi_cq_read(): fi_cq_readerr() got err: 5 (Input/output error), prov_errno: 12 (transport retry counter exceeded)
08/28-16:08:14.96853 delphi-002 DAOS[279251/279299] hg ERR src/cart/crt_hg.c:1031 crt_hg_req_send_cb(0x7fc6f820b3f0) [opc=0xfe000000 rpcid=0x642008fe00000129 rank:tag=0:0] RPC failed; rc: -1011
08/28-16:08:14.96867 delphi-002 DAOS[279251/279299] hg ERR src/cart/crt_hg.c:1031 crt_hg_req_send_cb(0x7fc6f820be90) [opc=0xfe000000 rpcid=0x642008fe0000012a rank:tag=0:0] RPC failed; rc: -1011
08/28-16:08:14.96874 delphi-002 DAOS[279251/279299] hg ERR src/cart/crt_hg.c:1031 crt_hg_req_send_cb(0x7fc6f820c930) [opc=0xfe000000 rpcid=0x642008fe0000012b rank:tag=0:0] RPC failed; rc: -1011
-Patrick
|
|
Re: DAOS in Docker
Lombardi, Johann
Hi,
Just to confirm, you are running docker on Linux, right? Could you please try to run the SPDK init script manually and send me the output? Johann
|
|
Re: Behavior of daos_kv_get for non-existent Keys
Steffen Christgau
On 8/25/20 3:19 PM, Chaarawi, Mohamad wrote:
> We have recently added conditional operations to the DAOS object and KV api to allow for such conditional operations:
Thanks for pointing that out, Mohamad.
> However for the KV API, I actually see an issue where these flags are not properly set.
Great. Looking forward to a notification. Just to be sure: assuming the API works correctly, are these conditional operations passed via the flags parameter, which is marked as "currently ignored"?!
Regards, Steffen
|
|
Re: Behavior of daos_kv_get for non-existent Keys
Chaarawi, Mohamad
Hi Steffen,
We have recently added conditional operations to the DAOS object and KV API to allow for such conditional operations:
DAOS_COND_KEY_INSERT/UPDATE/FETCH/PUNCH (for the daos_kv_* API)
which would give you what you need. However, for the KV API, I actually see an issue where these flags are not properly set. I will push a patch to fix this soon and let you know.
Thanks, Mohamad
On 8/24/20, 7:59 AM, "daos@daos.groups.io on behalf of Steffen Christgau" <daos@daos.groups.io on behalf of christgau@...> wrote:
Hi everybody,
I'm experimenting with the (low level) DAOS Key Value API, i.e. daos_kv_get and friends. For the get function, I observed that passing a non-existent key returns both 0, indicating success, as well as an "actual size of the value" of again 0. However, it is also valid to put a key with a zero length value into the KV store. That key is subsequently found when enumerating the names inside the object (daos_kv_list).
Is this behavior of the get operation, i.e. returning success and an empty (value), intended? If so, how can I check if a queried key really existed other than by enumerating the (whole) object?
Regards, Steffen
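The ambiguity raised in this thread, and the way a conditional fetch removes it, can be shown with a small in-memory model. This is a sketch only: ToyKV, KeyNotFound and the must_exist parameter are hypothetical stand-ins and do not reproduce the daos_kv_* signatures or the DAOS_COND_* flags.

```python
class KeyNotFound(Exception):
    """Raised when a conditional get is issued for a key that is not stored."""


class ToyKV:
    """In-memory stand-in for a KV object (toy model, not the DAOS API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = bytes(value)

    def get(self, key, must_exist=False):
        """Return (rc, value); a missing key reads as an empty value unless must_exist is set."""
        if key not in self._data:
            if must_exist:
                raise KeyNotFound(key)
            return 0, b""
        return 0, self._data[key]


kv = ToyKV()
kv.put("empty", b"")  # a zero-length value is a valid entry

# Unconditional get: a stored empty value and a missing key look identical.
print(kv.get("empty") == kv.get("missing"))

# Conditional get: the missing key is reported as an error instead.
try:
    kv.get("missing", must_exist=True)
except KeyNotFound as exc:
    print("not found:", exc)
```

With the conditional variant, the caller no longer needs to enumerate the whole object to tell an empty value from an absent key.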
|
|
Re: DAOS & HDF5
Steffen Christgau
Hi Patrick, hi everybody,
On 8/25/20 2:39 PM, Farrell, Patrick Arthur wrote:
> I'm aware there's an HDF5 plugin for DAOS, but I am not certain about the current status of the plugin,
I'm interested in that information as well. And moreover: what about support for netCDF?
> or where to find it.
https://bitbucket.hdfgroup.org/projects/HDF5VOL/repos/daos-vol/browse
I'm currently working on that, but I'm struggling to get the tests to compile successfully. Just as a side note: the Bitbucket HEAD does not work with DAOS 1.0.1 due to some API changes in DAOS, but commit 34f3d46 appears to. At least it compiles (without tests).
Steffen
|
|
DAOS & HDF5
Farrell, Patrick Arthur <patrick.farrell@...>
Good morning,
I'm aware there's an HDF5 plugin for DAOS, but I am not certain about the current status of the plugin, or where to find it.
Is there current info on this or can someone provide a pointer?
Thanks much.
-Patrick
|
|
Behavior of daos_kv_get for non-existent Keys
Steffen Christgau
Hi everybody,
I'm experimenting with the (low level) DAOS Key Value API, i.e. daos_kv_get and friends. For the get function, I observed that passing a non-existent key returns both 0, indicating success, as well as an "actual size of the value" of again 0. However, it is also valid to put a key with a zero length value into the KV store. That key is subsequently found when enumerating the names inside the object (daos_kv_list).
Is this behavior of the get operation, i.e. returning success and an empty (value), intended? If so, how can I check if a queried key really existed other than by enumerating the (whole) object?
Regards, Steffen
|
|
Re: DAOS in Docker
helloworld@...
Johann, thank you for replying.
Of course, I had already loaded the uio_pci_generic kernel module. For now I'm using only SCM based on RAM emulation, without NVMe SSD emulation, and that works. However, I'd like to use NVMe SSD emulation based on RAM as well. How can I fix it?
|
|
Slack community channel
Lombardi, Johann
Hi there,
I got several requests recently to migrate the community chat from Gitter to Slack. I have thus created a daos-stack workspace on Slack and also enabled the integration with groups.io. Any subscriber to the DAOS community mailing list should thus automatically receive an invite to join the Slack channel. Let me know if you have any problems/concerns.
Cheers, Johann
|
|
Re: DAOS in Docker
Lombardi, Johann
Hi there,
Did you load the uio_pci_generic module in the kernel as specified in the note?
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of "helloworld@..." <helloworld@...>
I'm configuring the DAOS in Docker with only-RAM emulation and
scm_mount: /mnt/daos
scm_class: ram
scm_size: 4
bdev_class: file
bdev_size: 16
bdev_list: [/tmp/daos-bdev]
|
|
Avocado's upcoming LTS release
Cleber Rosa
Hi DAOS community,
Given that some of the DAOS testing[1] uses the Avocado testing framework, I'd like to bring to your attention that we have an upcoming 82.0 LTS release scheduled for Sept 7th[2]. For that release, we'd like to keep as much compatibility as possible and, where that is not possible, allow for a smoother migration. 69.x LTS will be maintained for another 6 months after the 82.0 LTS release, but the sooner any issue is addressed, the better.
For that, we have an epic issue[3] in which we could use your help, with:
* running the existing tests you have, with the most recent Avocado version possible
* opening any issues[4] you encounter
This will feed into either bug fixes or documentation on how to migrate from 69.x LTS to 82.0 LTS. In addition to this, feel free to engage with us about how the new Avocado features (and there are a lot of them) may be beneficial to the DAOS project.
Thanks!
- Cleber
--
[1] - https://github.com/daos-stack/daos/blob/master/src/tests/ftest/launch.py#L749
[2] - https://github.com/avocado-framework/avocado/milestone/8
[3] - https://github.com/avocado-framework/avocado/issues/4103
[4] - https://github.com/avocado-framework/avocado/issues/new/choose
|
|