Re: Startup Errors


Nabarro, Tom
 

Hello Neale,

 

First of all, is there any chance you can try a more recent version, 1.1.3 for example?

 

How are you launching daos_server? Directly with the start command from the command line, through systemd, or some other way?

 

Is there any firewall blocking the traffic on port 10001?
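
One quick way to check this from the host where you run dmg is to attempt a plain TCP connection to the control port. The snippet below is only a rough sketch: port_open is a hypothetical helper name, it assumes bash and the coreutils timeout command are available, and it only tells you whether something accepts connections on that port, not whether the DAOS control plane is healthy.

```shell
#!/usr/bin/env bash
# port_open is a hypothetical helper (not part of DAOS).
# Returns 0 if a TCP connection to host:port succeeds within 3 seconds,
# non-zero if the connection is refused, filtered, or times out.
port_open() {
    local host="$1" port="$2"
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the
    # wait in case a firewall silently drops the packets.
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, "port_open host01 10001 && echo reachable || echo blocked-or-not-listening", with host01 replaced by your real server name. If it fails from the head node but succeeds when run on the server itself against localhost, a firewall between the two is the likely culprit.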

 

Note that you can format storage across all your hosts in parallel by populating the "hostlist" parameter in the "daos_control.yml" dmg config file (see https://daos-stack.github.io/admin/deployment/#daos-server-remote-access ). Once the issue above is sorted out, you can then simply run dmg -i storage format.

 

To start with, maybe run the server and the dmg commands on the same (single) host. Also, please paste your server config file.

 

Regards,

Tom

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Petrillo, Neale A. (Contractor) via groups.io
Sent: Wednesday, February 24, 2021 9:00 PM
To: daos@daos.groups.io
Subject: [daos] Startup Errors

 

Hello Group! 

 

I'm having some trouble getting my new DAOS cluster working. I've installed 6 servers, all with the 1.0.1 RPMs. When I run 'dmg storage format' from my test host, I get the following output:

 

[root@head ~]# dmg -i -l <host01>:10001 storage format

ERROR: <host01>:10001: socket connection is not active (TRANSIENT_FAILURE)

ERROR: dmg: no active connections

[root@head ~]# dmg -i -l <host01> system query

ERROR: <host01>:10001: socket connection is not active (TRANSIENT_FAILURE)

ERROR: dmg: no active connections

 

I'm also seeing these errors in the log files:

 

INFO 2021/02/18 10:40:15 DAOS I/O Server instance 0 storage not ready: context canceled

INFO 2021/02/18 10:40:19 SCM format required on instance 1

INFO 2021/02/18 10:40:19 DAOS I/O Server instance 1 storage not ready: context canceled

INFO 2021/02/18 10:40:19 DAOS Control Server (pid 9993) shutting down

ERROR 2021/02/18 10:40:54 /usr/bin/daos_admin EAL: No free hugepages reported in hugepages-1048576kB

INFO 2021/02/18 10:41:00 DAOS Control Server (pid 11507) listening on 0.0.0.0:10001

INFO 2021/02/18 10:41:00 Waiting for DAOS I/O Server instance storage to be ready...

INFO 2021/02/18 10:41:04 SCM format required on instance 0

 

Configuration files are attached. Any help would be appreciated! 

Neale

 
