i/O Timeout on dmg storage query


Lombardi, Johann
 

Hi,

 

Is a firewall running on the VM? You could also try running the containers with --network=host so that they use the host network directly.

Please note that you will also have to open up ports 31416 and 31417 for the engine/io_server to process incoming requests.
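
To tell a firewall problem apart from a server that is simply not listening, it can help to probe the port from the client side first. The following is a minimal sketch, not part of DAOS: probe_port is a hypothetical helper name, and it assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available. A "timeout" result usually suggests that packets are being dropped somewhere on the path (for example by a firewall or cloud security rule), whereas an immediate failure usually means the host answered but nothing accepted the connection.

```shell
#!/usr/bin/env bash
# probe_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper (not a DAOS tool): attempts a plain TCP connection.
# Prints one of: open, timeout, closed-or-unreachable.
# Returns 0 only when the connection succeeds.
probe_port() {
    local host=$1 port=$2 secs=${3:-5} rc=0
    # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT;
    # timeout(1) exits with status 124 if the attempt takes too long.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null || rc=$?
    case $rc in
        0)   echo "open" ;;
        124) echo "timeout" ;;
        *)   echo "closed-or-unreachable" ;;
    esac
    return "$rc"
}
```

For example, running "probe_port letldaos.eastus.cloudapp.azure.com 10001" from inside the client container checks the same address and port that dmg failed to dial.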

 

On my side, when I run multiple Docker containers on the same node, I create a daosnet network (i.e. docker network create -d bridge daosnet) and then add “--network daosnet” to each “docker run” command line. That said, I haven’t tried running the containers on different nodes/VMs yet.
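
The single-node setup described above could be scripted roughly as follows. This is only a sketch: the function names, the container names, and the image name "daos-image" are placeholders rather than anything taken from the DAOS documentation, and the DOCKER variable is an added convenience for printing the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper functions around the docker commands mentioned above.
# Set DOCKER=echo to print the commands instead of executing them.
DOCKER=${DOCKER:-docker}

# Create the user-defined bridge network shared by the containers.
create_daosnet() {
    "$DOCKER" network create -d bridge daosnet
}

# Start a detached container attached to daosnet.
# $1 = container name, $2 = image name (both placeholders chosen by the caller).
run_on_daosnet() {
    local name=$1 image=$2
    "$DOCKER" run -d --name "$name" --network daosnet "$image"
}
```

A typical sequence would be create_daosnet, then run_on_daosnet daos-server daos-image and run_on_daosnet daos-client daos-image. Note that a Docker bridge network is local to one host, so this does not connect containers on two different VMs; for that case the --network=host suggestion is the simpler thing to try.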

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "asharma@..." <asharma@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Wednesday 10 February 2021 at 19:45
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] i/O Timeout on dmg storage query

 

[Edited Message Follows]

Hi Team,
I have a simple setup of two VMs in the cloud, with DAOS running in a Docker container on each of them. One container runs the server, which I started with a daos_server_local.yml file in which I only changed access_point to the domain name of the VM hosting the server.
In the other container, I have the DAOS agent installed with the sample daos_agent.yml config. I have also set the host list in daos_control.yml to the server VM's domain name. My architecture looks something like the picture below.



I run a simple query, dmg storage query usage, from inside the container on the client VM.

I get the following error:

Errors:

  Hosts                              Error
  -----                              -----
  letldaos.eastus.cloudapp.azure.com rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp 138.91.118.108:10001: i/o timeout"


Note: The same query runs fine if I run a DAOS server inside the client container with access_point set to 'localhost'.
I suspect I am missing something small. At first I thought the problem was that port 10001 was not exposed from the container to the host, so I tried running the container with -p 10001:10001, but I still get the same error. I am out of ideas here. Can someone please suggest a possible solution?


I am attaching my config files for reference. I have set allow_insecure: true in all 3 files

 
