Re: Error in installing via Docker and help needed for integrating with REST API

Lombardi, Johann
 

Hi,

 

Sorry for the confusion. The structure of the Docker files was modified some time ago and we updated the documentation (in the source tree too) accordingly, but the online github.io pages have not been refreshed since. This explains the discrepancy. Please run this instead:

$ docker build https://github.com/daos-stack/daos.git#master -f utils/docker/Dockerfile.centos.7 -t daos

I have just tested it and it works.
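
For readers comparing the commands in this thread: Docker accepts a Git URL as the build context, with an optional `#ref:subdir` fragment. With `#master` the context is the repository root and the Dockerfile is addressed as `utils/docker/Dockerfile.centos.7`; with `#master:utils/docker` only that subdirectory is the context, which may be why files in the repository root such as SConstruct are not found. The small Python sketch below only splits such a URL to make the difference visible; `parse_git_context` and `GitContext` are hypothetical names used for illustration, not part of Docker or DAOS.

```python
from typing import NamedTuple


class GitContext(NamedTuple):
    repo: str    # repository URL without the fragment
    ref: str     # branch, tag, or commit after '#'; "" if absent
    subdir: str  # directory after ':'; "" means the repository root


def parse_git_context(url: str) -> GitContext:
    """Split a Docker Git build-context URL of the form repo#ref:subdir."""
    repo, _, fragment = url.partition("#")
    ref, _, subdir = fragment.partition(":")
    return GitContext(repo, ref, subdir)


if __name__ == "__main__":
    for url in (
        "https://github.com/daos-stack/daos.git#master",
        "https://github.com/daos-stack/daos.git#master:utils/docker",
    ):
        ctx = parse_git_context(url)
        # Show which ref is built and which directory becomes the context.
        print(ctx.ref, ctx.subdir or "<repository root>")
```

The `-f` path is then interpreted relative to whichever directory the fragment selects as the context.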

 

@Durfey, Craig, could you please refresh the online documentation?

 

Regarding the management API, it is written in Go. For some unknown reason, pkg.go.dev no longer seems to generate the documentation for our API. I will get back to you ASAP on this.
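
Until the Go API documentation is available again, one possible stopgap for a Python REST service is to shell out to the `dmg` administration tool and parse its output. The sketch below is only an illustration under several assumptions that this thread does not confirm: that `dmg` is installed and configured on the host running the REST service, that the installed version accepts the `-j` flag for JSON output, and that subcommands such as `pool list` are available. The helper names `build_dmg_command` and `run_dmg` are hypothetical.

```python
import json
import subprocess
from typing import Callable, List, Optional, Sequence


def build_dmg_command(args: Sequence[str], dmg_path: str = "dmg") -> List[str]:
    """Build the argument vector; '-j' (JSON output) is an assumption."""
    return [dmg_path, "-j", *args]


def _subprocess_runner(cmd: List[str]) -> str:
    # Run the command and return its stdout; raises on a non-zero exit code.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def run_dmg(args: Sequence[str],
            runner: Optional[Callable[[List[str]], str]] = None) -> dict:
    """Run a dmg subcommand and decode its JSON output.

    The runner is injectable so that a REST handler can be tested
    without a DAOS installation.
    """
    runner = runner or _subprocess_runner
    return json.loads(runner(build_dmg_command(args)))


if __name__ == "__main__":
    # Requires a working dmg installation; the output shape is not assumed here.
    print(run_dmg(["pool", "list"]))
```

A REST handler could call `run_dmg(["pool", "list"])` and return the decoded dictionary as its response body.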

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "asharma@..." <asharma@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 4 February 2021 at 05:21
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Error in installing via Docker and help needed for integrating with REST API

 

Hi Johann,
I ran the exact same command:
docker build https://github.com/daos-stack/daos.git#master:utils/docker \
        -f Dockerfile.centos.7 -t daos

My main reason for connecting my Python API to the DAOS storage is to access the "management API". I know that with pydaos we can read and write KV pairs whose keys and values are both strings. But is there any way to perform management tasks from my REST API by interacting with the DAOS system?

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: Error in installing via Docker and help needed for integrating with REST API

asharma@...
 

Hi Ashley,
I ran the following 3 commands: 
git clone --recurse-submodules -b v1.0.1 https://github.com/daos-stack/daos.git
cd daos
docker build https://github.com/daos-stack/daos.git#v1.0.1:utils/docker \
        -f Dockerfile.centos.7 -t daos

This means I checked out the v1.0.1 branch and built with the v1.0.1 Dockerfile. I was running v1.0.1 because I thought I should use the last stable version, as I am using this in a project. Do you suggest I try with the master clone and the master Dockerfile?


Re: Error in installing via Docker and help needed for integrating with REST API

Pittman, Ashley M
 

 

It looks like you’re trying to use 1.0.1?  I’ve just checked that out locally and there’s no reference to patchelf at all in the source tree at that point; we now use it to check that builds are relocatable.

 

Is there any chance you’re using a 1.0.1 dockerfile to build from the master branch?  The line numbers would support this, in particular SConstruct didn’t have 463 lines in v1.0.1, but in master it does match the trace you posted below.
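
The line-number argument above can be checked mechanically. The Python sketch below is an illustration, not a DAOS tool: it extracts the `File "...", line N` frames from a build trace and reports those whose line number is larger than the length of the same file in a given checkout. The function names and the line count used in the example (400) are hypothetical; in practice the counts would come from the checked-out tree.

```python
import re
from typing import Dict, List, Tuple

# Matches frames printed as:   File "/path/to/file", line 123:
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+)')


def parse_frames(trace: str) -> List[Tuple[str, int]]:
    """Return (path, line) pairs in the order they appear in the trace."""
    return [(m.group("path"), int(m.group("line")))
            for m in FRAME_RE.finditer(trace)]


def frames_beyond_eof(frames: List[Tuple[str, int]],
                      line_counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Frames whose line number exceeds the file length in the checkout."""
    return [(path, line) for path, line in frames
            if path in line_counts and line > line_counts[path]]


if __name__ == "__main__":
    trace = '''
  File "/home/daos/daos/SConstruct", line 463:
    scons()
  File "/home/daos/daos/SConstruct", line 386:
    preload_prereqs(prereqs)
'''
    # Hypothetical length of SConstruct in the checkout being compared.
    counts = {"/home/daos/daos/SConstruct": 400}
    print(frames_beyond_eof(parse_frames(trace), counts))
```

Any frame reported this way cannot have come from the checkout whose line counts were supplied, which points to a mismatch between the sources and the Dockerfile.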

 

Ashley.

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of asharma@... <asharma@...>
Date: Thursday, 4 February 2021 at 06:05
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Error in installing via Docker and help needed for integrating with REST API

Hi Ashley,

Thank you so much for the response. It seems I was in the wrong directory. However, when I try now, the build stops with the following error:

[...]

at include/opa_user.h    /tmp/tmp.8clvxfJtXv/opa_user_gen1.h  > /tmp/tmp.8clvxfJtXv/opa_user.h
cat include/opa_service.h /tmp/tmp.8clvxfJtXv/opa_service_gen1.h > /tmp/tmp.8clvxfJtXv/opa_service.h
install -m 0644 -D /tmp/tmp.8clvxfJtXv/opa_user.h    /usr/prereq/release/psm2/include/hfi1diag/opa_user.h
install -m 0644 -D /tmp/tmp.8clvxfJtXv/opa_service.h /usr/prereq/release/psm2/include/hfi1diag/opa_service.h
install -m 0644 -D /tmp/tmp.8clvxfJtXv/opa_common_gen1.h /usr/prereq/release/psm2/include/hfi1diag/opa_common.h
install -m 0644 -D include/opa_byteorder.h /usr/prereq/release/psm2/include/hfi1diag/opa_byteorder.h
install -m 0644 -D include/psm2_mock_testing.h /usr/prereq/release/psm2/include/hfi1diag/psm2_mock_testing.h
install -m 0644 -D include/opa_revision.h /usr/prereq/release/psm2/include/hfi1diag/opa_revision.h
install -m 0644 -D psmi_wrappers.h /usr/prereq/release/psm2/include/hfi1diag/psmi_wrappers.h
install -m 0644 -D psm_hal_gen1/hfi1_deprecated_gen1.h /usr/prereq/release/psm2/include/hfi1diag/hfi1_deprecated.h
rm -fr /tmp/tmp.8clvxfJtXv
Checking for C header file psm2.h... yes
Checking for C library psm2... yes
Checking whether patchelf program exists...no
MissingSystemLibs: ofi has unmet dependencies required for build:
  File "/home/daos/daos/SConstruct", line 463:
    scons()
  File "/home/daos/daos/SConstruct", line 386:
    preload_prereqs(prereqs)
  File "/home/daos/daos/SConstruct", line 175:
    prereqs.load_definitions(prebuild=reqs)
  File "/home/daos/daos/utils/sl/prereq_tools/base.py", line 992:
    self.require(env, comp)
  File "/home/daos/daos/utils/sl/prereq_tools/base.py", line 1061:
    raise error

---------------------------------------------------------------------

I believe it says that it could not find patchelf installed on the CentOS image. It should have been installed on the image along with the other dependencies, so I cannot figure out why it says so. My local machine has patchelf installed, but that is of no use for this image build.

I am using version 1.0.1 as I felt it was the last stable version, so I used the following commands:

git clone --recurse-submodules -b v1.0.1 https://github.com/daos-stack/daos.git
cd daos
docker build https://github.com/daos-stack/daos.git#v1.0.1:utils/docker \
        -f Dockerfile.centos.7 -t daos

---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Re: Client application single value KV Put high latency using multiple threads (pthread)

ping.wong@...
 

For testing purposes, I run the servers and agents in the foreground.  I press ctrl-c to stop the servers and agents.  Then I start the servers one after the other, restart all agents on the servers, and start the client agent last.

Ping


Re: Client application single value KV Put high latency using multiple threads (pthread)

ping.wong@...
 

Hi Mohamad,

On the client node, I stopped the old agent and restarted the agent.

Ping 


Re: Client application single value KV Put high latency using multiple threads (pthread)

Chaarawi, Mohamad
 

Hi Ping,

 

Did you restart the agent on the client side or did you have an older agent running?

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of ping.wong via groups.io <ping.wong@...>
Date: Wednesday, February 3, 2021 at 10:46 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Client application single value KV Put high latency using multiple threads (pthread)

Hi Johann,

Since I ran both servers as root, I did umount /mnt/root and rm -rf /mnt/root before restarting both servers and agents.  I am using Then I did the storage format.  After format, I restart both servers again to make sure the configuration persisted.  Both servers rejoin the domain and seem to restart ok. 

There are no configuration changes needed on the client side, correct?  The errors shown come from the client log.  The client did not detect the provider change and continues to use ofi+sockets, as you observed.

Ping


Re: Client application single value KV Put high latency using multiple threads (pthread)

Lombardi, Johann
 

Hey Ping,

 

I looked into your logs and noticed messages like “Could not lookup ofi+sockets://11.11.200.48:31416”, which mean that sockets URIs (instead of tcp) are still registered and that the storage nodes haven’t registered the new tcp-based URIs yet. Please make sure to stop the servers and umount /mnt/daos* (and run wipefs -a /dev/pmem* if you use pmem) before restarting the servers.
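
The check described above (looking for URIs that still carry the old provider) can be sketched in Python. This is only an illustration: it scans log text for URIs of the form `ofi+<provider>://<ip>:<port>`, as they appear in the logs in this thread, and lists those whose provider prefix differs from the one configured. The function name `stale_uris` and the regular expression are assumptions made for this example, not part of DAOS.

```python
import re
from typing import List

# URIs as they appear in the logs, e.g. ofi+tcp;ofi_rxm://11.11.200.46:31416
URI_RE = re.compile(r"(ofi\+[A-Za-z0-9_;]+)://(\d+\.\d+\.\d+\.\d+):(\d+)")


def stale_uris(log_text: str, expected_provider: str) -> List[str]:
    """Return the distinct URIs whose provider is not the expected one."""
    found = {m.group(0) for m in URI_RE.finditer(log_text)
             if m.group(1) != expected_provider}
    return sorted(found)


if __name__ == "__main__":
    log = (
        "updated system member: rank 0, uri ofi+tcp;ofi_rxm://11.11.200.46:31416\n"
        "Could not lookup address ofi+sockets://11.11.200.48:31416 (NA_INVALID_ARG)\n"
    )
    print(stale_uris(log, "ofi+tcp;ofi_rxm"))
```

An empty result after a reformat would suggest that no URIs with the old provider remain in the scanned log.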

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "ping.wong via groups.io" <ping.wong@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday 3 February 2021 at 17:05
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Client application single value KV Put high latency using multiple threads (pthread)

 

Hi Johann,

I did reformat and restart all agents on the client node and the two servers.  Both servers using the ofi+tcp;ofi_rxm provider start fine; however, the client application failed.  Please refer to the errors in my previous email (marked with ****).  For now, I can only get the ofi+sockets provider to work reliably.  Are there any additional parameter settings in any of the YAML files (daos_server.yml, daos_control.yml, daos_agent.yml, etc.) that I need to change besides switching from ofi+sockets to ofi+tcp;ofi_rxm?  Any other environment variables to set?

Ping


Setup DAOS

asharma@...
 

I was trying to set up DAOS from scratch following the documentation. I am doing it from scratch because I have an Ubuntu machine and cannot use the RPM builds. Also, I want to understand the project structure.
When I try to build from the GitHub link, I get the following error. I understand it occurs while the setup is trying to install the SPDK dependency. I even researched it online and found that some users had the same issue with SPDK, but the ticket says the issue is now fixed on the SPDK side.

Error:
[...]
github.com/daos-stack/daos/src/control/vendor/gopkg.in/yaml.v2
github.com/daos-stack/daos/src/control/vendor/github.com/dustin/go-humanize
github.com/daos-stack/daos/src/control/lib/ipmctl
github.com/daos-stack/daos/src/control/provider/system
log/syslog
github.com/daos-stack/daos/src/control/logging
github.com/daos-stack/daos/src/control/common
github.com/daos-stack/daos/src/control/lib/spdk
github.com/daos-stack/daos/src/control/server/storage
github.com/daos-stack/daos/src/control/pbin
github.com/daos-stack/daos/src/control/server/storage/scm
# github.com/daos-stack/daos/src/control/lib/spdk
/usr/bin/ld: /home/ayush/daos/daos/install/lib/libspdk_util.so: undefined reference to `crc16_t10dif'
/usr/bin/ld: /home/ayush/daos/daos/install/lib/libspdk_util.so: undefined reference to `crc16_t10dif_copy'
/usr/bin/ld: /home/ayush/daos/daos/install/lib/libspdk_util.so: undefined reference to `crc32_iscsi'
collect2: error: ld returned 1 exit status
scons: *** Error 2
scons: *** [build/src/control/bin/daos_admin] Error 2
scons: building terminated because of errors.
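
A note for readers triaging this kind of failure: the first step is usually to list which symbols are missing and which library needs them. The Python sketch below does only that, grouping the `undefined reference` lines from linker output by library; the function name is hypothetical. As an observation rather than a confirmed diagnosis, the three names in this log match CRC routines declared by Intel ISA-L, so the ISA-L library that the SPDK build links against may be worth checking.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches: ld: <library>: undefined reference to `<symbol>'
UNDEF_RE = re.compile(
    r"ld: (?P<lib>\S+): undefined reference to `(?P<sym>[^']+)'"
)


def missing_symbols(linker_output: str) -> Dict[str, List[str]]:
    """Group undefined symbols by the library that references them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for match in UNDEF_RE.finditer(linker_output):
        grouped[match.group("lib")].append(match.group("sym"))
    return dict(grouped)


if __name__ == "__main__":
    output = (
        "/usr/bin/ld: /home/ayush/daos/daos/install/lib/libspdk_util.so: "
        "undefined reference to `crc16_t10dif'\n"
        "/usr/bin/ld: /home/ayush/daos/daos/install/lib/libspdk_util.so: "
        "undefined reference to `crc32_iscsi'\n"
    )
    for lib, symbols in missing_symbols(output).items():
        print(lib, symbols)
```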

I am new to this and understand this could be something really small. Any help will be deeply appreciated. 
I also attempted running straight with the docker command which gives me this error: 


I know the SConstruct file is in the daos repo's root directory but for some reason, it is not found by the installer. 


Re: Error in installing via Docker and help needed for integrating with REST API

Pittman, Ashley M
 

 

This will build a Docker image using the Dockerfile from GitHub but the sources from the current directory, so you need to run it from a checked-out copy of the source tree.

 

Ashley.

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Lombardi, Johann <johann.lombardi@...>
Date: Wednesday, 3 February 2021 at 14:49
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Error in installing via Docker and help needed for integrating with REST API

Hi,

 

Could you please advise what docker command you ran?

“docker build https://github.com/daos-stack/daos.git#master:utils/docker -f Dockerfile.centos.7” should work fine.

 

Regarding the REST API, it would be great to know the type of information you would like to expose via your REST interface.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "asharma@..." <asharma@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday 3 February 2021 at 03:09
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Error in installing via Docker and help needed for integrating with REST API

 

I am trying to set up the DAOS system on my local machine. When I try to build from the Dockerfile, it gives me the following error:



Also, I want to connect my REST API to the DAOS server, and I know there is something called "client APIs" for DAOS. But I need to know how to integrate my REST API with DAOS.
It would be great if anyone could help me with this.


Re: Client application single value KV Put high latency using multiple threads (pthread)

Lombardi, Johann
 

Hi Ping,

 

Sorry, I should have provided more details in my previous email. After switching to ofi+tcp;ofi_rxm in the config file, you will have to reformat and restart the agent since we don’t support live provider change yet. It would be great if you could provide me with the output of “daos pool autotest” with both ofi+sockets and ofi+tcp;ofi_rxm so that I can compare it with results that I have on my side with 40Gbps.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "ping.wong via groups.io" <ping.wong@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday 3 February 2021 at 08:13
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Client application single value KV Put high latency using multiple threads (pthread)

 

[Edited Message Follows]

Hi Johann,

I have the control interface on 10Gbps Ethernet and the data plane interface is on 100Gbps Ethernet.
 
Per your recommendation, I tried ofi+tcp;ofi_rxm; however, the client application failed (marked with ******).  

 

Server1 - connected ofi+tcp;ofi_rxm

DEBUG 01:02:25.452378 mgmt_system.go:183: processing 1 join requests
DEBUG 01:02:25.458189 mgmt_system.go:255: updated system member: rank 0, uri ofi+tcp;ofi_rxm://11.11.200.46:31416, Joined->Joined
daos_io_server:0 DAOS I/O server (v1.1.2.1) process 215563 started on rank 0 with 4 target, 2 helper XS, firstcore 0, host test46.autocache.com.

Server2 - connected ofi+tcp;ofi_rxm

DEBUG 01:02:09.275423 raft.go:204: no known peers, aborting election:
DEBUG 01:02:09.911677 instance_drpc.go:66: DAOS I/O Server instance 0 drpc ready: uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 drpcListenerSock:"/tmp/daos_sockets/daos_io_server_28178.sock" ntgts:4
DEBUG 01:02:09.914435 system.go:155: DAOS system join request: sys:"daos_server" uuid:"e32fcef5-c6c4-491f-a25b-f21ae4d3a75f" rank:1 uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 addr:"0.0.0.0:10001" srvFaultDomain:"/test48.sdmsl.net"
DEBUG 01:02:09.915330 rpc.go:213: request hosts: [test46:10001 test48:10001 test62:10001]
daos_io_server:0 DAOS I/O server (v1.1.2.1) process 28178 started on rank 1 with 4 target, 2 helper XS, firstcore 1, host test48.sdmsl.net.

 

Client Failed
=================
DAOS Flat KV test..
=================
[==========] Running 1 test(s).
setup: creating pool, SCM size=4 GB, NVMe size=16 GB
setup: created pool a9177073-f014-477b-9ad1-5fe36d334f07
setup: connecting to pool
daos_pool_connect failed, rc: -1020                                *******************************
[  FAILED  ] GROUP SETUP
[  ERROR   ] DAOS KV API tests
state not set, likely due to group-setup issue
[==========] 0 test(s) run.
[  PASSED  ] 0 test(s).
daos_fini() failed with -1001

 

This is part of the client log (with errors):

 

02/03-01:04:24.76 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:192 fill_sys_info() GetAttachInfo Provider: ofi+tcp;ofi_rxm, Interface: enp24s0f0, Domain: enp24s0f0,CRT_CTX_SHARE_ADDR: 0, CRT_TIMEOUT: 0
                                                                 ...

02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # NA -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/na/na_ofi.c:3431
 # na_ofi_addr_lookup(): Unrecognized provider type found from: sockets://11.11.200.48:31416
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:1220
 # hg_core_addr_lookup(): Could not lookup address ofi+sockets://11.11.200.48:31416 (NA_INVALID_ARG)
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:3850
 # HG_Core_addr_lookup2(): Could not lookup address
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury.c:1490
 # HG_Addr_lookup2(): Could not lookup ofi+sockets://11.11.200.48:31416 (HG_INVALID_ARG) ************************************************************************************************
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1038 crt_req_hg_addr_lookup() HG_Addr_lookup2() failed. uri=ofi+sockets://11.11.200.48:31416, hg_ret=11 **********************************
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1133 crt_req_send_internal() crt_req_hg_addr_lookup() failed, rc -1020, opc: 0x1010003.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1234 crt_req_send(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] crt_req_send_internal() failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1580 timeout_bp_node_exit(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] exiting the timeout binheap.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:629 crt_req_timeout_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 4.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:1017 crt_context_req_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 3.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_context.c:309 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:316 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] Invoking RPC callback (rank 1 tag 0) rc: DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:321 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 2.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1260 crt_req_send(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 1.
02/03-01:04:32.78 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:808 dc_mgmt_get_pool_svc_ranks() a9177073: daos_rpc_send_wait() failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:537 crt_req_decref(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 0.
02/03-01:04:32.78 test62 DAOS[29842/29842] hg   DBUG src/cart/crt_hg.c:971 crt_hg_req_destroy(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] destroying

02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:537 crt_finalize() cannot finalize, current ctx_num(1).    ***********************************
02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:596 crt_finalize() crt_finalize failed, rc: -1001.
02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/event.c:147 daos_eq_lib_fini() failed to shutdown crt: DER_NO_PERM(-1001): 'Operation not permitted' ******************
02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/init.c:267 daos_fini() failed to finalize eq: DER_NO_PERM(-1001): 'Operation not permitted' ******************
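
The numeric return codes in the test output (-1020, -1001) are spelled out in the log itself as `DER_NAME(code): 'message'`. A small Python sketch, with a hypothetical function name and a regular expression that assumes this exact log format, can collect those pairs into a lookup table so that a bare `rc: -1020` can be read back as a name and message:

```python
import re
from typing import Dict, Tuple

# Matches entries such as: DER_HG(-1020): 'Transport layer mercury error'
DER_RE = re.compile(r"(DER_[A-Z_]+)\((-?\d+)\): '([^']*)'")


def collect_der_codes(log_text: str) -> Dict[int, Tuple[str, str]]:
    """Map each numeric code found in the log to its (name, message)."""
    codes: Dict[int, Tuple[str, str]] = {}
    for name, code, message in DER_RE.findall(log_text):
        codes[int(code)] = (name, message)
    return codes


if __name__ == "__main__":
    log = (
        "crt_rpc_complete() failed, DER_HG(-1020): 'Transport layer mercury error'\n"
        "daos_fini() failed to finalize eq: DER_NO_PERM(-1001): 'Operation not permitted'\n"
    )
    for code, (name, message) in sorted(collect_der_codes(log).items()):
        print(code, name, message)
```

Only codes that actually appear in the scanned log are reported; the table is not a complete list of DAOS error codes.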


I cannot find any documentation in the Deployment Guide about ofi+tcp;ofi_rxm settings on the server side and on the client side.   Perhaps, I missed some settings in some .yml file.


Thanks
Ping


Re: Client application single value KV Put high latency using multiple threads (pthread)

ping.wong@...
 
Edited

Hi Johann,

I have the control interface on 10Gbps Ethernet and the data plane interface is on 100Gbps Ethernet.
 
Per your recommendation, I tried ofi+tcp;ofi_rxm; however, the client application failed (marked with ******).  
 
Server1 - connected ofi+tcp;ofi_rxm
 
DEBUG 01:02:25.452378 mgmt_system.go:183: processing 1 join requests
DEBUG 01:02:25.458189 mgmt_system.go:255: updated system member: rank 0, uri ofi+tcp;ofi_rxm://11.11.200.46:31416, Joined->Joined
daos_io_server:0 DAOS I/O server (v1.1.2.1) process 215563 started on rank 0 with 4 target, 2 helper XS, firstcore 0, host test46.autocache.com.
 
Server2 - connected ofi+tcp;ofi_rxm
 
DEBUG 01:02:09.275423 raft.go:204: no known peers, aborting election:
DEBUG 01:02:09.911677 instance_drpc.go:66: DAOS I/O Server instance 0 drpc ready: uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 drpcListenerSock:"/tmp/daos_sockets/daos_io_server_28178.sock" ntgts:4
DEBUG 01:02:09.914435 system.go:155: DAOS system join request: sys:"daos_server" uuid:"e32fcef5-c6c4-491f-a25b-f21ae4d3a75f" rank:1 uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7 addr:"0.0.0.0:10001" srvFaultDomain:"/test48.sdmsl.net"
DEBUG 01:02:09.915330 rpc.go:213: request hosts: [test46:10001 test48:10001 test62:10001]
daos_io_server:0 DAOS I/O server (v1.1.2.1) process 28178 started on rank 1 with 4 target, 2 helper XS, firstcore 1, host test48.sdmsl.net.
 
Client Failed
=================
DAOS Flat KV test..
=================
[==========] Running 1 test(s).
setup: creating pool, SCM size=4 GB, NVMe size=16 GB
setup: created pool a9177073-f014-477b-9ad1-5fe36d334f07
setup: connecting to pool
daos_pool_connect failed, rc: -1020                                *******************************
[  FAILED  ] GROUP SETUP
[  ERROR   ] DAOS KV API tests
state not set, likely due to group-setup issue
[==========] 0 test(s) run.
[  PASSED  ] 0 test(s).
daos_fini() failed with -1001               
 
This is part of the client log (with errors):
 
02/03-01:04:24.76 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:192 fill_sys_info() GetAttachInfo Provider: ofi+tcp;ofi_rxm, Interface: enp24s0f0, Domain: enp24s0f0,CRT_CTX_SHARE_ADDR: 0, CRT_TIMEOUT: 0
                                                                 ...
 
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # NA -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/na/na_ofi.c:3431
 # na_ofi_addr_lookup(): Unrecognized provider type found from: sockets://11.11.200.48:31416
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:1220
 # hg_core_addr_lookup(): Could not lookup address ofi+sockets://11.11.200.48:31416 (NA_INVALID_ARG)
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury_core.c:3850
 # HG_Core_addr_lookup2(): Could not lookup address
02/03-01:04:32.78 test62 DAOS[29842/29842] external ERR  # HG -- Error -- /home/ssgroot/git/daos/build/external/dev/mercury/src/mercury.c:1490
 # HG_Addr_lookup2(): Could not lookup ofi+sockets://11.11.200.48:31416 (HG_INVALID_ARG) ************************************************************************************************
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1038 crt_req_hg_addr_lookup() HG_Addr_lookup2() failed. uri=ofi+sockets://11.11.200.48:31416, hg_ret=11 **********************************
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1133 crt_req_send_internal() crt_req_hg_addr_lookup() failed, rc -1020, opc: 0x1010003.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_rpc.c:1234 crt_req_send(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] crt_req_send_internal() failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1580 timeout_bp_node_exit(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] exiting the timeout binheap.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:629 crt_req_timeout_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 4.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:1017 crt_context_req_untrack(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 3.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  ERR  src/cart/crt_context.c:309 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 (DAOS) rpcid=0x636fb8e100000000 rank:tag=1:0] failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:316 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] Invoking RPC callback (rank 1 tag 0) rc: DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_context.c:321 crt_rpc_complete(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 2.
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:1260 crt_req_send(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 1.
02/03-01:04:32.78 test62 DAOS[29842/29842] mgmt DBUG src/mgmt/cli_mgmt.c:808 dc_mgmt_get_pool_svc_ranks() a9177073: daos_rpc_send_wait() failed, DER_HG(-1020): 'Transport layer mercury error'
02/03-01:04:32.78 test62 DAOS[29842/29842] rpc  DBUG src/cart/crt_rpc.c:537 crt_req_decref(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] decref to 0.
02/03-01:04:32.78 test62 DAOS[29842/29842] hg   DBUG src/cart/crt_hg.c:971 crt_hg_req_destroy(0x1f7ea90) [opc=0x1010003 rpcid=0x636fb8e100000000 rank:tag=1:0] destroying
 
02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:537 crt_finalize() cannot finalize, current ctx_num(1).    ***********************************
02/03-01:04:32.78 test62 DAOS[29842/29842] crt  ERR  src/cart/crt_init.c:596 crt_finalize() crt_finalize failed, rc: -1001.
02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/event.c:147 daos_eq_lib_fini() failed to shutdown crt: DER_NO_PERM(-1001): 'Operation not permitted'
02/03-01:04:32.78 test62 DAOS[29842/29842] client ERR  src/client/api/init.c:267 daos_fini() failed to finalize eq: DER_NO_PERM(-1001): 'Operation not permitted' ******************
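
In the excerpts above, the server registers its URI with the ofi+tcp;ofi_rxm prefix, while the failing client lookup uses ofi+sockets for the same address. The following is a minimal sketch, not a DAOS tool, for spotting that kind of disagreement in collected logs. The function names and the regular expression are hypothetical, and the two sample lines are abbreviated from the logs above.

```python
import re
from collections import defaultdict

# Matches URIs such as ofi+tcp;ofi_rxm://11.11.200.48:31416
URI_RE = re.compile(r"(ofi\+[\w;+]+)://(\d+\.\d+\.\d+\.\d+:\d+)")


def providers_by_address(lines):
    """Map each address found in the log lines to the set of provider prefixes seen with it."""
    seen = defaultdict(set)
    for line in lines:
        for provider, address in URI_RE.findall(line):
            seen[address].add(provider)
    return seen


def mismatches(lines):
    """Return only the addresses that appear with more than one provider prefix."""
    return {
        address: sorted(providers)
        for address, providers in providers_by_address(lines).items()
        if len(providers) > 1
    }


# Abbreviated lines from the server and client logs above.
sample = [
    'instance_drpc.go:66: DAOS I/O Server instance 0 drpc ready: uri:"ofi+tcp;ofi_rxm://11.11.200.48:31416" nctxs:7',
    "crt_req_hg_addr_lookup() HG_Addr_lookup2() failed. uri=ofi+sockets://11.11.200.48:31416, hg_ret=11",
]
print(mismatches(sample))
```

Running it over the full server and client logs together would list every address that shows up under more than one provider string. It only reports the disagreement and says nothing about which configuration file or cached setting it comes from.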

I cannot find any documentation in the Deployment Guide about the ofi+tcp;ofi_rxm settings required on the server side and on the client side. Perhaps I missed a setting in one of the .yml files.


Thanks
Ping


 


Re: Client application single value KV Put high latency using multiple threads (pthread)

Lombardi, Johann
 

One roundtrip from client to leader and then one from leader to each other replica (i.e. one roundtrip for 2-way replication since the leader is a replica). Please check Liang’s paper (i.e. https://link.springer.com/chapter/10.1007/978-3-030-63393-6_22) for more information.
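
As a rough illustration of the count above, the tally can be written as a small function. This is only a sketch: the function name is hypothetical, and it does nothing more than restate the description (one client-to-leader roundtrip, plus one roundtrip from the leader to each other replica). It does not model whether the leader sends those RPCs in parallel.

```python
def replication_roundtrips(replicas: int) -> dict:
    """Count RPC roundtrips for one update, following the description above.

    `replicas` is the replication factor and includes the leader itself.
    """
    if replicas < 1:
        raise ValueError("need at least one replica (the leader itself)")
    return {
        "client_to_leader": 1,
        # The leader is itself a replica, so it only contacts the others.
        "leader_to_other_replicas": replicas - 1,
    }


print(replication_roundtrips(2))
# → {'client_to_leader': 1, 'leader_to_other_replicas': 1}
```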

I am still interested in the type of network that you use. Is it 40Gbps Ethernet with the ofi+sockets provider? If so, you may want to try ofi+tcp;ofi_rxm too.

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "ping.wong via groups.io" <ping.wong@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Tuesday 2 February 2021 at 16:30
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: Re: [daos] Client application single value KV Put high latency using multiple threads (pthread)

 

Hi Johann,

Thanks for the tips.  Could you tell me how many roundtrip RPCs are involved from the client's perspective?  Also, how many RPCs are involved between the leader and the replica?

Thanks
Ping



Re: Client application single value KV Put high latency using multiple threads (pthread)

ping.wong@...
 

Hi Johann,

Thanks for the tips.  Could you tell me how many roundtrip RPCs are involved from the client's perspective?  Also, how many RPCs are involved between the leader and the replica?

Thanks
Ping


Re: Client application single value KV Put high latency using multiple threads (pthread)

Lombardi, Johann
 

Hi Ping,

 

Those latency numbers are indeed way higher than what we expect/see. Could you please advise what type of network and network provider you are using?
If you are on a recent master, could you please create a pool and try running “daos pool autotest --pool $PUUID”?

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "ping.wong via groups.io" <ping.wong@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Sunday 31 January 2021 at 22:16
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] Client application single value KV Put high latency using multiple threads (pthread)

 

Hi all,

To evaluate replication performance, I wrote a client application with multiple pthreads (scheduled to run on different cores where possible) that uses daos_kv_put with the async event API against a 2-server cluster.

One server has 44 cores and the other server has 88 cores.
The client runs on a separate node with 48 cores. The leader server replicates to the replica server, and I notice that the leader role switches between the two servers.

To find out why the client has high latency, I added some timing counters to track the duration of KV Puts in the io servers (please refer to the third column of the table below).

The client test application calls daos_kv_put(oh, DAOS_TX_NONE, 0, key, buf_size, buf, &ev) with an async event,
then calls daos_event_test(&ev, DAOS_EQ_WAIT, &ev_flag) to wait for I/O completion.

Each pthread writes 1000 values (4K each), each under a different key of the same object; hence 1000 calls to daos_kv_put.
As I increase the number of threads in the client application, I observe higher latency from the client's perspective (see table below).

In the 100-thread test case, the I/O server shows higher latency as well.

Am I missing something critical?
Is the overhead caused by the daos library?
Is there a way to improve the application latency overhead?

Thanks
Ping


Number of client threads | Number of daos_kv_put | daos_io_server average put duration | Client average put duration
-------------------------|-----------------------|-------------------------------------|----------------------------
5                        | 1,000                 | 0.28 ms                             | 1.05 ms
10                       | 1,000                 | 0.27 ms                             | 1.65 ms
15                       | 1,000                 | 0.41 ms                             | 2.20 ms
20                       | 1,000                 | 0.48 ms                             | 2.86 ms
100                      | 1,000                 | 7.02 ms                             | 11.45 ms
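
To make the table easier to compare row by row, here is a small sketch that derives two figures from it: the part of the client-observed latency not accounted for by the measured I/O server put duration, and the aggregate put rate implied by the client latency. The rate rests on an assumption: each thread keeps exactly one put in flight, as in the put-then-wait loop described above. The `summarize` helper is hypothetical.

```python
# Rows from the table above: (client threads, server avg put ms, client avg put ms)
rows = [
    (5, 0.28, 1.05),
    (10, 0.27, 1.65),
    (15, 0.41, 2.20),
    (20, 0.48, 2.86),
    (100, 7.02, 11.45),
]


def summarize(threads, server_ms, client_ms):
    """Return (ms not spent in the measured server put path, implied puts per second)."""
    outside_ms = client_ms - server_ms
    # Assumption: one outstanding put per thread, so rate = threads / latency.
    puts_per_sec = threads / (client_ms / 1000.0)
    return outside_ms, puts_per_sec


for threads, server_ms, client_ms in rows:
    outside_ms, puts_per_sec = summarize(threads, server_ms, client_ms)
    print(f"{threads:>3} threads: {outside_ms:5.2f} ms outside the I/O server, "
          f"~{puts_per_sec:6.0f} puts/s")
```

Under that assumption, the implied aggregate rate goes from about 4,800 puts/s at 5 threads to about 8,700 puts/s at 100 threads, so the rate less than doubles while the thread count grows 20x. The time outside the measured server path covers the network, the client library, and any queuing. The table alone does not say which of these dominates.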



 

 

