Re: Install problem
Hi Bill,
I’m able to reproduce the issue. Now it’s just a matter of figuring out why it is happening. I will file a ticket on it.
Thanks, Jeff
From: daos@daos.groups.io [mailto:daos@daos.groups.io]
On Behalf Of Bill Katz
Sent: Monday, February 3, 2020 12:37 PM
To: daos@daos.groups.io
Subject: Re: [daos] Install problem
Thanks for the reply Jeff. I do not have libfabric-devel installed. I am installing from master. The command I ran is:
docker build -t daos -f Dockerfile.centos.7 github.com/daos-stack/daos#:utils/docker
Re: Pool Service List
Cain, Kenneth C
Hi Colin,
In the future the client will connect to the management service that (already today) maintains a key-value store mapping a pool UUID key to the essential pool information (of type struct pool_rec) such as the number of pool service replicas and their ranks. When the management service responds to the client with this information, the client can then proceed to find (among this modest sized list) the current pool service leader. So the clients will not need to search through the entire set of DAOS servers in the system to find the pool service.
Today the management service when serving a list-pools request will iterate through the key-value store and return the records directly to the client. On pool creation the management service establishes a record for the pool, and also selects a set of pool service replica ranks among the set of DAOS servers that will provide storage for the pool. In this initial state the management service is now ready to directly respond to clients with the information. When changes to the pool occur such as addition of a new pool service replica, or removal/replacement of a service replica then the pool’s entry in the key-value store will need to be updated with the latest list of pool service replica ranks. Keeping the management and pool service in sync for these changes is on the to-do list, part of the work that will be done when removing the requirement that applications/users remember and provide the pool service replica rank list.
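For illustration, here is a rough C sketch of the kind of record described above. The field names are hypothetical; the real struct pool_rec in DAOS may differ.

/*
 * Illustrative sketch only: hypothetical field names for the per-pool
 * record the management service keeps (the real struct pool_rec in
 * DAOS may differ).
 */
#include <stdint.h>
#include <uuid/uuid.h>

struct pool_rec_sketch {
	uuid_t   pr_uuid;          /* pool UUID: the key-value store key      */
	uint32_t pr_nreplicas;     /* number of pool service replicas         */
	uint32_t pr_replicas[8];   /* ranks hosting the pool service replicas */
};

/* Conceptual client step: given the record returned by the management
 * service, try the listed ranks (a modest-sized list) to find the current
 * pool service leader, instead of probing every DAOS server in the system. */
static uint32_t first_candidate_rank(const struct pool_rec_sketch *rec)
{
	return rec->pr_nreplicas > 0 ? rec->pr_replicas[0] : (uint32_t)-1;
}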
Thanks,
Ken
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Colin Ngam
Sent: Monday, February 3, 2020 6:48 PM
To: daos@daos.groups.io
Subject: Re: [daos] Pool Service List
Hi Ken,
Thanks for the info.
Given just a Pool’s UUID, does the client library (in the future) have to call each rank/daos_server until it hits one that is the Pool’s service replica? I assume the management code now has to hit every rank/daos_server to get all the pools in the system?
Thanks.
Colin
From: <daos@daos.groups.io> on behalf of "Cain, Kenneth C" <kenneth.c.cain@...>
Hello Colin,
Here is a summary:
It is currently the application’s responsibility to remember this initial list of service replica ranks. The dmg utility has a command “system list-pools” that requests the DAOS management service return a list of all pools in the DAOS system, and for each its current list of pool service replica ranks. This can be useful when dmg pool create may have been performed earlier but without recording the pool UUID and/or the list of replica ranks. The development plan includes implementing, within the client library, an automatic interaction with the DAOS management service to, given a pool UUID, retrieve its current list of service replicas. Once this is implemented then there will no longer be a need for the application/user to remember and provide the list of service ranks returned by pool create.
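To make the current flow concrete, here is a minimal C sketch of the connect step that relies on the remembered rank list. The daos_pool_connect() signature and the "daos_server" group name follow the DAOS client API of this era (as also visible in a backtrace later in this digest); the wrapper function itself is just illustrative.

/* Minimal sketch (not a complete program) of today's flow: the caller
 * remembers the service replica rank list from pool create and supplies
 * it again at connect time. Once the client library can ask the
 * management service for this list, only the UUID will be needed. */
#include <daos.h>

int connect_with_saved_svc(const uuid_t pool_uuid, d_rank_list_t *svc,
			   daos_handle_t *poh)
{
	/* svc: the rank list returned earlier by pool create (or recovered
	 * via dmg system list-pools). */
	return daos_pool_connect(pool_uuid, "daos_server", svc,
				 DAOS_PC_RW, poh, NULL /* info */,
				 NULL /* ev: synchronous call */);
}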
Here is a little more information about how the current DAOS client library works, dynamically maintaining a cached list of pool service replica ranks, starting from that initial list (for now) remembered and supplied by the application/administrator.
Ken
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Colin Ngam
Hi,
The create pool api returns the pool service list. This list is passed in to the pool connect.
Is it the application responsibility to remember this list? Or can you get the list given the pool’s uuid anytime?
Thanks.
Colin
Re: Pool Service List
Colin Ngam
Hi Ken,
Thanks for the info.
Given just a Pool’s UUID, does the client library (in the future) have to call each rank/daos_server until it hits one that is the Pool’s service replica? I assume the management code now has to hit every rank/daos_server to get all the pools in the system?
Thanks.
Colin
From: <daos@daos.groups.io> on behalf of "Cain, Kenneth C" <kenneth.c.cain@...>
Hello Colin,
Here is a summary:
It is currently the application’s responsibility to remember this initial list of service replica ranks. The dmg utility has a command “system list-pools” that requests the DAOS management service return a list of all pools in the DAOS system, and for each its current list of pool service replica ranks. This can be useful when dmg pool create may have been performed earlier but without recording the pool UUID and/or the list of replica ranks. The development plan includes implementing, within the client library, an automatic interaction with the DAOS management service to, given a pool UUID, retrieve its current list of service replicas. Once this is implemented then there will no longer be a need for the application/user to remember and provide the list of service ranks returned by pool create.
Here is a little more information about how the current DAOS client library works, dynamically maintaining a cached list of pool service replica ranks, starting from that initial list (for now) remembered and supplied by the application/administrator.
Ken
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Colin Ngam
Sent: Saturday, February 1, 2020 12:23 PM
To: daos@daos.groups.io
Subject: [daos] Pool Service List
Hi,
The create pool api returns the pool service list. This list is passed in to the pool connect.
Is it the application responsibility to remember this list? Or can you get the list given the pool’s uuid anytime?
Thanks.
Colin
Re: Pool Service List
Cain, Kenneth C
Hello Colin,
Here is a summary:
It is currently the application’s responsibility to remember this initial list of service replica ranks. The dmg utility has a command “system list-pools” that requests the DAOS management service return a list of all pools in the DAOS system, and for each its current list of pool service replica ranks. This can be useful when dmg pool create may have been performed earlier but without recording the pool UUID and/or the list of replica ranks. The development plan includes implementing, within the client library, an automatic interaction with the DAOS management service to, given a pool UUID, retrieve its current list of service replicas. Once this is implemented then there will no longer be a need for the application/user to remember and provide the list of service ranks returned by pool create.
Here is a little more information about how the current DAOS client library works, dynamically maintaining a cached list of pool service replica ranks, starting from that initial list (for now) remembered and supplied by the application/administrator.
Ken
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Colin Ngam
Sent: Saturday, February 1, 2020 12:23 PM
To: daos@daos.groups.io
Subject: [daos] Pool Service List
Hi,
The create pool api returns the pool service list. This list is passed in to the pool connect.
Is it the application responsibility to remember this list? Or can you get the list given the pool’s uuid anytime?
Thanks.
Colin
Re: Install problem
Bill Katz <bkatz@...>
Thanks for the reply Jeff. I do not have libfabric-devel installed. I am installing from master. The command I ran is:
docker build -t daos -f Dockerfile.centos.7 github.com/daos-stack/daos#:utils/docker
Pool Service List
Colin Ngam
Hi,
The create pool api returns the pool service list. This list is passed in to the pool connect.
Is it the application responsibility to remember this list? Or can you get the list given the pool’s uuid anytime?
Thanks.
Colin
Re: Install problem
Hi Bill,
Can you tell us what version of DAOS you are using? Is it the latest master? Also, do you have the libfabric-devel package installed? (DAOS doesn’t need this package to be installed.) And what build command are you using? If you have a build log, that would also be helpful.
-Jeff
From:
<daos@daos.groups.io> on behalf of Bill Katz <bkatz@...>
Hi there. I’m attempting to do an install into a Docker container running on top of a CentOS 7 host. Close to the end of the process, I get the errors below and the install process aborts. Can someone shed any light on what the cause might be, and how to get around it?
Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_dupinfo@...' Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_freeinfo@...' Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_getinfo@...' Jan 30 13:27:28 daos1 journal: collect2: error: ld returned 1 exit status Jan 30 13:27:28 daos1 journal: scons: building terminated because of errors. Jan 30 13:27:28 daos1 journal: scons: *** [build/src/tests/suite/io_conf/daos_gen_io_conf] Error 1 Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29427-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29427-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Created slice libcontainer_29427_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Removed slice libcontainer_29427_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29443-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29443-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Created slice libcontainer_29443_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Removed slice libcontainer_29443_systemd_test_default.slice. Jan 30 13:27:28 daos1 dockerd-current: time="2020-01-30T13:27:28.521914798-08:00" level=error msg="containerd: deleting container" error="exit status 1: \"container d64664c59abb475e8ee16508740f26487fb9dd2b070ff32779c2273a2ff5ea61 is not exist\\none or more of the container deletions failed\\n\"" Jan 30 13:27:28 daos1 NetworkManager[1929]: <info> [1580419648.5699] manager: (veth8f6c8cc): new Veth device (/org/freedesktop/NetworkManager/Devices/42) Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 kernel: device veth6dc55af left promiscuous mode Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 NetworkManager[1929]: <info> [1580419648.6321] device (veth6dc55af): released from master device docker0 Jan 30 13:27:31 daos1 dockerd-current: time="2020-01-30T13:27:31.102343294-08:00" level=warning msg="d64664c59abb475e8ee16508740f26487fb9dd2b070ff32779c2273a2ff5ea61 cleanup: failed to unmount secrets: invalid argument" Jan 30 13:28:17 daos1 dbus[1857]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Jan 30 13:28:17 daos1 dbus[1857]: [system] Successfully activated service 'org.freedesktop.problems'
Thanks, Bill
Install problem
bkatz@...
Hi there. I’m attempting to do an install into a Docker container running on top of a CentOS 7 host. Close to the end of the process, I get the errors below and the install process aborts. Can someone shed any light on what the cause might be, and how to get around it?
Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_dupinfo@...' Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_freeinfo@...' Jan 30 13:27:28 daos1 journal: //usr/lib/libna.so.2: undefined reference to `fi_getinfo@...' Jan 30 13:27:28 daos1 journal: collect2: error: ld returned 1 exit status Jan 30 13:27:28 daos1 journal: scons: building terminated because of errors. Jan 30 13:27:28 daos1 journal: scons: *** [build/src/tests/suite/io_conf/daos_gen_io_conf] Error 1 Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29427-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29427-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Created slice libcontainer_29427_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Removed slice libcontainer_29427_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29443-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Scope libcontainer-29443-systemd-test-default-dependencies.scope has no PIDs. Refusing. Jan 30 13:27:28 daos1 systemd: Created slice libcontainer_29443_systemd_test_default.slice. Jan 30 13:27:28 daos1 systemd: Removed slice libcontainer_29443_systemd_test_default.slice. Jan 30 13:27:28 daos1 dockerd-current: time="2020-01-30T13:27:28.521914798-08:00" level=error msg="containerd: deleting container" error="exit status 1: \"container d64664c59abb475e8ee16508740f26487fb9dd2b070ff32779c2273a2ff5ea61 is not exist\\none or more of the container deletions failed\\n\"" Jan 30 13:27:28 daos1 NetworkManager[1929]: <info> [1580419648.5699] manager: (veth8f6c8cc): new Veth device (/org/freedesktop/NetworkManager/Devices/42) Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 kernel: device veth6dc55af left promiscuous mode Jan 30 13:27:28 daos1 kernel: docker0: port 1(veth6dc55af) entered disabled state Jan 30 13:27:28 daos1 NetworkManager[1929]: <info> [1580419648.6321] device (veth6dc55af): released from master device docker0 Jan 30 13:27:31 daos1 dockerd-current: time="2020-01-30T13:27:31.102343294-08:00" level=warning msg="d64664c59abb475e8ee16508740f26487fb9dd2b070ff32779c2273a2ff5ea61 cleanup: failed to unmount secrets: invalid argument" Jan 30 13:28:17 daos1 dbus[1857]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Jan 30 13:28:17 daos1 dbus[1857]: [system] Successfully activated service 'org.freedesktop.problems'
Thanks, Bill
Re: Is there a DAOS public release / roadmap available on the git (as per the DAOS brief reference)?
Harms, Kevin
https://wiki.hpdd.intel.com/display/DC/Roadmap
kevin
________________________________________
From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Nitta Mackay, Dan <dan.nitta.mackay@...>
Sent: Wednesday, January 29, 2020 8:31 AM
To: daos@daos.groups.io
Subject: [daos] Is there a DAOS public release / roadmap available on the git (as per the DAOS brief reference)?
Hello Daos group id,
I’ve been searching around on the github and can’t find the public roadmap details as referenced in the DAOS brief https://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/high-performance-storage-brief.pdf
I’m thinking I’m just bad at finding it, so if you can send the link that would help a lot, as I’m looking for the publicly available info.
Cheers,
Dan Nitta Mackay Intel Canada Ltd. D 613 576 1205 ; cell: 613-697-3506 Ottawa, Ontario Canada
www.intel.com
Is there a DAOS public release / roadmap available on the git (as per the DAOS brief reference)?
Nitta Mackay, Dan
Hello Daos group id,
I’ve been searching around on the github and can’t find the public roadmap details as referenced in the DAOS brief https://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/high-performance-storage-brief.pdf
I’m thinking I’m just bad at finding it, so if you can send the link that would help a lot, as I’m looking for the publicly available info.
Cheers,
Dan Nitta Mackay Intel Canada Ltd. D 613 576 1205 ; cell: 613-697-3506 Ottawa, Ontario Canada
Re: infinite loop in daos_test
Wang, Di
Hello,
This is a known issue. Ideally, SWIM is supposed to detect these two dead servers, DAOS should then remove them from the system map, and the MSR can then skip these dead servers during pool creation.
But this is missing at the moment. I am not sure if this is already on someone’s plate; otherwise I will cook up a patch.
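To illustrate the missing step in rough C (hypothetical types and names, not DAOS code): pool creation would pick service replicas only from ranks that SWIM still reports as alive.

/* Conceptual sketch only: once SWIM marks ranks dead and they are removed
 * from the system map, pool creation should choose service replicas only
 * from ranks still marked alive. */
#include <stdbool.h>
#include <stdint.h>

struct member_sketch {
	uint32_t rank;
	bool     alive;   /* as reported by SWIM */
};

/* Copy only live ranks into 'out'; returns how many were kept. */
static int pick_live_ranks(const struct member_sketch *members, int n,
			   uint32_t *out)
{
	int kept = 0;

	for (int i = 0; i < n; i++)
		if (members[i].alive)
			out[kept++] = members[i].rank;
	return kept;
}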
Thanks
WangDi
From: <daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io> Date: Wednesday, January 29, 2020 at 10:21 AM To: "daos@daos.groups.io" <daos@daos.groups.io> Subject: [daos] infinite loop in daos_test I have run into an infinite-loop problem with daos_test. Is this already a known problem? If not, I’m willing to open a Jira on it, but I’d like some input first on how the code is intended to work. If there is a work-around, I’d be interested in that as well.
My config is verbs;ofi_rxm, 6 server nodes, 1 client node, fake SCM (ram), fake NVMe (file). I run daos_test by hand on the client node.
The infinite loop is in run_daos_degraded_test() in daos_test. Just prior to this test, the previous test intentionally kills two of the six daos_io_servers. The test startup code for run_daos_degraded_test() then tries to create a pool. This fails because the Management Service Replica is unable to communicate with the two dead servers, it reports “No route to host” in its log, which makes sense. It returns DER_UNREACH as the result of the failed RPC attempt to create a pool.
In the client, routine mgmt._rsvc_client_complete_rpc() calls rsvc_client_complete_rpc() which returns RSVC_CLIENT_PROCEED because it took the branch:

	} else if (hint == NULL || !(hint->sh_flags & RSVC_HINT_VALID)) {
		/* This may happen if the service wasn't found. */
		D_DEBUG(DB_MD, "\"leader\" reply without hint from rank %u: "
			"rc_svc=%d\n", ep->ep_rank, rc_svc);
		return RSVC_CLIENT_PROCEED;

Because of the above, routine mgmt._rsvc_client_complete_rpc() then enters the following if statement:

	if (rc == RSVC_CLIENT_RECHOOSE ||
	    (rc == RSVC_CLIENT_PROCEED && daos_rpc_retryable_rc(rc_svc))) {
		rc = tse_task_reinit(task);
		if (rc != 0)
			return rc;
		return RSVC_CLIENT_RECHOOSE;
	}

because DER_UNREACH is considered to be a retryable RC code by daos_rpc_retryable_rc(). The task gets rescheduled, the dc_pool_create() routine gets called again, this goes round and round forever.
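As a condensed illustration (not the actual DAOS code, with hypothetical helper names and error values), the combination above behaves like this when there is only one management service replica:

#include <stdbool.h>
#include <stdio.h>

#define DER_UNREACH_SKETCH  (-1006)   /* illustrative value only */

static bool retryable_rc_sketch(int rc) { return rc == DER_UNREACH_SKETCH; }
/* Stand-in for the create RPC to the (single) management service replica. */
static int send_pool_create_sketch(int rank) { (void)rank; return DER_UNREACH_SKETCH; }

int main(void)
{
	int ms_rank = 0;   /* the only MS replica, so "rechoose" picks it again */
	int attempts = 0;

	while (attempts < 5) {   /* capped here only so the demo terminates */
		int rc = send_pool_create_sketch(ms_rank);
		if (!retryable_rc_sketch(rc))
			break;
		attempts++;
		printf("attempt %d: rc=%d is retryable, reinit task, rechoose rank %d\n",
		       attempts, rc, ms_rank);
	}
	return 0;
}

With a single replica and an error code that stays retryable, nothing ever breaks the cycle, which matches the behaviour described above.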
Note that the DER_UNREACH is for one of the servers that the MSR is trying to contact, not the MSR itself. The RECHOOSE is not selecting a different server, it picks the same (only) MSR every time. Which is surprising to me at least.
So what is the bug exactly? Was the MSR supposed to go ahead and create the pool anyway with two missing servers? Is the DER_UNREACH the wrong error code for the MSR to return? Does it make sense for RECHOOSE to pick the same server over and over?
Comments welcome,
Kevan
infinite loop in daos_test
Kevan Rehm
I have run into an infinite-loop problem with daos_test. Is this already a known problem? If not, I’m willing to open a Jira on it, but I’d like some input first on how the code is intended to work. If there is a work-around, I’d be interested in that as well.
My config is verbs;ofi_rxm, 6 server nodes, 1 client node, fake SCM (ram), fake NVMe (file). I run daos_test by hand on the client node.
The infinite loop is in run_daos_degraded_test() in daos_test. Just prior to this test, the previous test intentionally kills two of the six daos_io_servers. The test startup code for run_daos_degraded_test() then tries to create a pool. This fails because the Management Service Replica is unable to communicate with the two dead servers, it reports “No route to host” in its log, which makes sense. It returns DER_UNREACH as the result of the failed RPC attempt to create a pool.
In the client, routine mgmt._rsvc_client_complete_rpc() calls rsvc_client_complete_rpc() which returns RSVC_CLIENT_PROCEED because it took the branch:

	} else if (hint == NULL || !(hint->sh_flags & RSVC_HINT_VALID)) {
		/* This may happen if the service wasn't found. */
		D_DEBUG(DB_MD, "\"leader\" reply without hint from rank %u: "
			"rc_svc=%d\n", ep->ep_rank, rc_svc);
		return RSVC_CLIENT_PROCEED;

Because of the above, routine mgmt._rsvc_client_complete_rpc() then enters the following if statement:

	if (rc == RSVC_CLIENT_RECHOOSE ||
	    (rc == RSVC_CLIENT_PROCEED && daos_rpc_retryable_rc(rc_svc))) {
		rc = tse_task_reinit(task);
		if (rc != 0)
			return rc;
		return RSVC_CLIENT_RECHOOSE;
	}

because DER_UNREACH is considered to be a retryable RC code by daos_rpc_retryable_rc(). The task gets rescheduled, the dc_pool_create() routine gets called again, this goes round and round forever.
Note that the DER_UNREACH is for one of the servers that the MSR is trying to contact, not the MSR itself. The RECHOOSE is not selecting a different server, it picks the same (only) MSR every time. Which is surprising to me at least.
So what is the bug exactly? Was the MSR supposed to go ahead and create the pool anyway with two missing servers? Is the DER_UNREACH the wrong error code for the MSR to return? Does it make sense for RECHOOSE to pick the same server over and over?
Comments welcome,
Kevan
Re: current DAOS master deadlocks in daos_test when using verbs;ofi_rxm
Oganezov, Alexander A
Thanks for info Kevan,
We will update it locally and once it passes internal testing we will make build.config update
~~Alex.
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Kevan Rehm
Sent: Tuesday, January 28, 2020 1:22 PM
To: daos@daos.groups.io
Subject: [daos] current DAOS master deadlocks in daos_test when using verbs;ofi_rxm
All,
There is a bug in the version of ofi that CaRT is picking up in its build.config file. A new pthread was added in verbs;ofi_rxm that handles unmap memory events so that the NIC can be notified when the user unmaps memory that is registered with the NIC. See https://github.com/ofiwg/libfabric/issues/5580 for details on how the deadlock occurs; it happens every time in the Array test section.
Sean Hefty suggested updating ofi to https://github.com/ofiwg/libfabric/commit/3d01df7716d099ba222f99865345d6767ae9e686 in order to fix the problem. It was merged on Jan 18, 2020.
I did something slightly different, I downloaded a fresh daos, then quickly did a ‘git rebase master’ in the ofi subdirectory before the compilation began, and the problem is definitely fixed, daos_test no longer hangs at the same point.
cart/build.config needs to be updated to this new commit or newer in order to avoid the problem.
Regards, Kevan
current DAOS master deadlocks in daos_test when using verbs;ofi_rxm
Kevan Rehm
All,
There is a bug in the version of ofi that CaRT is picking up in its build.config file. A new pthread was added in verbs;ofi_rxm that handles unmap memory events so that the NIC can be notified when the user unmaps memory that is registered with the NIC. See https://github.com/ofiwg/libfabric/issues/5580 for details on how the deadlock occurs; it happens every time in the Array test section.
Sean Hefty suggested updating ofi to https://github.com/ofiwg/libfabric/commit/3d01df7716d099ba222f99865345d6767ae9e686 in order to fix the problem. It was merged on Jan 18, 2020.
I did something slightly different, I downloaded a fresh daos, then quickly did a ‘git rebase master’ in the ofi subdirectory before the compilation began, and the problem is definitely fixed, daos_test no longer hangs at the same point.
cart/build.config needs to be updated to this new commit or newer in order to avoid the problem.
Regards, Kevan
Re: does this IB problem look familiar to anyone?
Kevan Rehm
All,
This is to report back on the infiniband failures seen in my original email below.
The bottom line is, if a DAOS client is using ‘verbs;ofi_rxm’ as the fabric transport, and if there is any chance that the client’s node will have hugepages configured, then the environment variable RDMAV_HUGEPAGES_SAFE must be set or the program will fail like mine did below.
The problem was originally reported on April 2019 with libfabric issue 4969. Also see issue 4974. The same issue was raised with Mercury issue 280. Some code changes were made with libfabric PR 4973, but the end result is still that RDMAV_HUGEPAGES_SAFE is needed to prevent client failures when hugepages are present.
The problem arises in libibverbs.so and how it is called from ofi_rxm. It has to do with providing fork safety in verbs; see the man page ibv_fork_init(3) for details. The requirement for the environment variable COULD be avoided if fork safety were not needed, but that is not an option for DAOS because MPI_Init() (in my case openmpi3) calls ibv_fork_init(), the function that enables fork safety, multiple times early in the client program. The problem does not occur in DAOS servers because DPDK calls a function that executes before main() which sets RDMAV_HUGEPAGES_SAFE=1 and then calls ibv_fork_init(); there is no way to disable fork safety there either (although doing so could reduce latency, see below).
When fork safety is enabled, the routine ibv_reg_mr() for registering memory must internally make a madvise(addr, length, MADV_DONTFORK) system call to tell the kernel not to clone the registered memory pages to a forked client process in order to prevent the possibility of that memory being moved in the parent during the fork process, causing silent memory corruption. The addr and length parameters must be page-aligned, or the request will fail. That is what caused the failure in my case.
One can call ibv_reg_mr() with an address and length that fall on any byte boundary. ibv_reg_mr() has code that will round the address down to a page boundary and round the length up so that addr+length also ends on a page boundary; I will call that “rounding out” here. The code works fine for regular memory where it is doing 4 KiB rounding out, but it does not work for hugepages, which might be 2 MiB aligned, or perhaps some other alignment. RDMAV_HUGEPAGES_SAFE was added to deal with this: ibv_fork_init checks for this variable and sets a huge_pages_enabled flag so that ibv_reg_mr() knows that it must check the addr/length fields to see if they are hugepages and round out accordingly.
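For illustration, here is a minimal C sketch of the rounding-out plus madvise() step, assuming the page size from sysconf(); it is not the actual libibverbs code.

#define _GNU_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static int dontfork_rounded(void *addr, size_t length)
{
	size_t    page  = (size_t)sysconf(_SC_PAGESIZE);  /* 4 KiB normally */
	uintptr_t start = (uintptr_t)addr & ~(page - 1);
	uintptr_t end   = ((uintptr_t)addr + length + page - 1) & ~(page - 1);

	/* madvise() requires a page-aligned start and length; with hugepage
	 * mappings the 4 KiB rounding here is exactly what goes wrong unless
	 * the true page size is detected (what RDMAV_HUGEPAGES_SAFE enables). */
	return madvise((void *)start, end - start, MADV_DONTFORK);
}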
The RDMAV_HUGEPAGES_SAFE solution comes at high cost, it adds three system calls to each ibv_reg_mr() memory registration. The code has to open(2) file /proc/<my-pid>/smaps, read(2) the file, parse the lines to find the entry which includes the specified address/length byte range, then parse the following lines to determine the page size used by that range. close(2) must be called as well.
This additional overhead doesn’t matter much during process initialization when registering buffers in the memory registration cache in ofi_rxm, but in cases where memory must be registered at the time of an I/O request it adds directly to the latency of the I/O. See April 12, 2019 comment in PR 4973 from James Swaro where he documents up to a 10X increase in latency when RDMAV_HUGEPAGES_SAFE is used.
I have looked at the code in ofi_rxm, and it looks like this issue could be fixed, such that RDMAV_HUGEPAGES_SAFE use and the resulting latency increase could be avoided, at least in DAOS clients. The problem is in routine ofi_bufpool_grow() in libfabrics file prov/util/src/util_buf.c. If the call to ofi_alloc_hugepage_buf() succeeds, then the resulting buf_region->alloc_region and pool->alloc_size are properly hugepage-aligned. If those values would be used in the ibv_reg_mr() call, the default 4 KiB rounding-out done by that code would be a no-op, and so the memory registration would succeed without having to set RDMAV_HUGEPAGES_SAFE. Instead, the routine uses buf_region->mem_region = buf_region->alloc_region + pool->entry_size; as the starting range address, which is not hugepage-aligned. The 4 KiB rounding-out done by ibv_reg_mr() will not result in correct hugepage alignment.
Specifically, the line ret = pool->attr.alloc_fn(buf_region); calls routine rxm_buf_reg(); the fix is to change the call to function rxm_msg_mr_reg_internal() to use pool->alloc_region and pool->alloc_size as parameters in place of region->mem_region and region->pool->region_size. There should be no harm in registering the additional pool->entry_size bytes at the beginning of pool->alloc_region even though they will not be used for I/O.
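As a self-contained illustration of the alignment argument (with made-up sizes, not libfabric code): no amount of 4 KiB rounding of mem_region recovers 2 MiB alignment, whereas alloc_region from a hugepage allocation is already aligned.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	const uintptr_t hugepage     = 2UL * 1024 * 1024;  /* 2 MiB */
	const uintptr_t alloc_region = 16 * hugepage;      /* pretend allocation address */
	const uintptr_t entry_size   = 16384;              /* hypothetical pool entry size */
	const uintptr_t mem_region   = alloc_region + entry_size;

	printf("alloc_region %% 2MiB = %lu (aligned)\n",
	       (unsigned long)(alloc_region % hugepage));
	printf("mem_region   %% 2MiB = %lu (not aligned)\n",
	       (unsigned long)(mem_region % hugepage));
	printf("4KiB round-down of mem_region %% 2MiB = %lu (still not aligned)\n",
	       (unsigned long)((mem_region & ~0xFFFUL) % hugepage));
	return 0;
}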
I’m interested in hearing any feedback on this before I pursue this proposed change with the libfabric community.
Kevan
From:
<daos@daos.groups.io> on behalf of Kevan Rehm <kevan.rehm@...>
Still no luck getting infiniband to work. I rebased to commit b5b51a329bdb5afa50eef2a948fa2ed7f7cf49fc, and am now seeing new, different problems. This post is just in case someone recognizes what I am seeing and could point me in the right direction.
I built and installed daos on all the server and client nodes. The NICs are mlx4. Servers come up fine, storage (fake SCM, fake NVMe) is formatted, servers are waiting for clients to connect. When I start daos_test on a client node, I get an immediate infinite loop in daos_test, daos.log grows extremely rapidly (debug-enabled). I believe there are several cascading problems. Here are the interesting daos.log bits:
daos@hl-d109 /tmp $ grep 'fi_tsend(' daos.log | more
# na_ofi_msg_send_unexpected(): fi_tsend() unexpected failed, rc: -12(Cannot allocate memory)
# na_ofi_msg_send_unexpected(): fi_tsend() unexpected failed, rc: -113(No route to host)
# na_ofi_msg_send_unexpected(): fi_tsend() unexpected failed, rc: -113(No route to host)
# na_ofi_msg_send_unexpected(): fi_tsend() unexpected failed, rc: -113(No route to host)
# na_ofi_msg_send_unexpected(): fi_tsend() unexpected failed, rc: -113(No route to host)
The first fi_tsend() call in Mercury failed because the ofi_rxm layer first tried to allocate and register memory so that it could post a receive buffer for the results of the connect call to the server. It calls fi_mr_reg() for this. Eventually this bubbles down to an ibv_reg_mr() call, which returns NULL, and errno has the value EINVAL. This problem is my first focus; does anyone know why ibv_reg_mr() might fail? I can’t breakpoint into it, as it appears to be in libibverbs.so.
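For anyone who wants to reproduce the registration step outside DAOS, here is a minimal standalone check (assuming libibverbs is installed and a verbs device is present); it registers an ordinary malloc'd buffer against the first device's protection domain and prints errno if ibv_reg_mr() fails.

/* Build: gcc check_reg_mr.c -libverbs */
#include <errno.h>
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	int num;
	struct ibv_device **devs = ibv_get_device_list(&num);
	if (!devs || num == 0) { fprintf(stderr, "no verbs devices\n"); return 1; }

	struct ibv_context *ctx = ibv_open_device(devs[0]);
	struct ibv_pd      *pd  = ibv_alloc_pd(ctx);

	size_t len = 1 << 20;
	void  *buf = malloc(len);
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
				       IBV_ACCESS_LOCAL_WRITE |
				       IBV_ACCESS_REMOTE_READ |
				       IBV_ACCESS_REMOTE_WRITE);
	if (!mr)
		fprintf(stderr, "ibv_reg_mr failed: %s\n", strerror(errno));
	else
		ibv_dereg_mr(mr);

	ibv_dealloc_pd(pd);
	ibv_close_device(ctx);
	ibv_free_device_list(devs);
	free(buf);
	return mr ? 0 : 1;
}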
The failure calling sequence is fi_tsend -> rxm_ep_tsend -> rxm_ep_prepare_tx -> rxm_cmap_connect -> rxm_conn_connect -> rxm_msg_ep_open -> rxm_msg_ep_prepost_recv -> rxm_rx_buf_alloc -> ofi_buf_alloc -> ofi_bufpool_grow -> rxm_buf_reg -> rxm_msg_mr_reg_internal -> fi_mr_reg -> fi_ibv_mr_cache_reg -> ofi_mr_cache_reg -> fi_ibv_mr_cache_add_region -> fi_ibv_mr_reg_common -> ibv_reg_mr
Next problem: the error bubbles back up to rxm_cmap_connect in the RXM_CMAP_IDLE branch of the switch statement. ‘ret’ is non-zero, so the handle state is updated to RXM_CMAP_SHUTDOWN by routine rxm_cmap_del_handle, and the handle is scheduled for deletion. This seems reasonable, but the higher code levels don’t seem to notice the failure, and so fi_tsend gets called again.
This time when rxm_ep_prepare_tx get called, routine rxm_cmap_acquire_handle returns NULL because the handle was deleted, so the routine returns -FI_EHOSTUNREACH. Mercury keeps calling fi_tsend over and over forever, not recognizing the failure. Seems like this is a Mercury bug, libfabric failures aren’t being seen and handled?
Regards, Kevan
Re: DAOS Agent: connection refused
nfortne2@...
Patrick,
I tried removing all specification of the working directory, socket file, and log file from the agent and server, these things then being placed in /var/run/daos_agent/server, same result. David, Here is the output of the server. It looks like it starts successfully: :~/daos-vol/build> "/home/nfortne2/spack/opt/spack/linux-opensuse_tumbleweed20200105-ivybridge/gcc-9.2.1/openmpi-3.1.5-m4l3bopzh77d6lwnjzvfppwx5zg3chbl/bin/orterun" "-n" "1" "--enable-recovery" "-x" "DAOS_DISABLE_REQ_FWD=1" "/home/nfortne2/spack/opt/spack/linux-opensuse_tumbleweed20200105-ivybridge/gcc-9.2.1/daos-master-msndixzwmrrej4jnywawwqxb2nyhy2ae/bin/daos_server" "start" "--recreate-superblocks" "-o" "/home/nfortne2/daos-vol/build/test/scripts/daos_server.yml" /home/nfortne2/spack/opt/spack/linux-opensuse_tumbleweed20200105-ivybridge/gcc-9.2.1/daos-master-msndixzwmrrej4jnywawwqxb2nyhy2ae/bin/daos_server logging to file /home/nfortne2/daos-vol/build/test/daos_control.log DEBUG 16:24:41.850690 start.go:142: Switching control log level to DEBUG DEBUG 16:24:41.850769 netdetect.go:768: Calling ValidateProviderConfig with lo, ofi+sockets DEBUG 16:24:41.850793 netdetect.go:812: Input provider string: ofi+sockets DEBUG 16:24:41.857843 netdetect.go:844: There are 0 hfi1 devices in the system DEBUG 16:24:41.857886 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.857906 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.857921 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.857946 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.857968 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.857988 netdetect.go:634: NUMA Node data is unavailable. Using NUMA 0 DEBUG 16:24:41.858006 netdetect.go:783: Device lo supports provider: ofi+sockets DEBUG 16:24:41.858434 config.go:378: Active config saved to /home/nfortne2/daos-vol/build/test/scripts/.daos_server.active.yml (read-only) DEBUG 16:24:41.858607 server.go:117: automatic NVMe prepare req: {ForwardableRequest:{Forwarded:false} HugePageCount:1 PCIWhitelist: TargetUser:nfortne2 ResetOnly:false} ERROR: automatic NVMe prepare failed (check configuration?) SPDK setup reset: spdk reset failed (): must be run with root privileges DEBUG 16:24:41.858759 class.go:199: spdk : bdev_list empty in config, no nvme.conf generated for server EAL: No free hugepages reported in hugepages-2048kB EAL: No free hugepages reported in hugepages-2048kB EAL: FATAL: Cannot get hugepage information. EAL: Cannot get hugepage information. Failed to initialize DPDK DEBUG 16:24:41.865231 ctl_storage.go:101: Warning, NVMe Scan: failed to initialize SPDK: spdk_env_opts_init: 1 DEBUG 16:24:41.865477 ctl_storage.go:112: Warning, SCM Scan: failed to discover SCM modules: get_number_of_devices: rc=268 DAOS Control Server (pid 19684) listening on 0.0.0.0:10001 Waiting for DAOS I/O Server instance storage to be ready... 
DEBUG 16:24:41.865739 instance.go:180: /mnt/daos: checking formatting DEBUG 16:24:41.865762 ipmctl.go:102: discovered 0 DCPM modules DEBUG 16:24:41.865976 instance.go:198: /mnt/daos (ram) needs format: false DEBUG 16:24:41.866000 superblock.go:110: /mnt/daos: checking superblock DEBUG 16:24:41.866018 instance.go:180: /mnt/daos: checking formatting DEBUG 16:24:41.866216 instance.go:198: /mnt/daos (ram) needs format: false DEBUG 16:24:41.866408 superblock.go:114: /mnt/daos: needs superblock (doesn't exist) DEBUG 16:24:41.866936 superblock.go:159: creating /mnt/daos/superblock: (rank: 0, uuid: 056dfc0c-1116-471e-bd06-67a24b5f2fa4) DEBUG 16:24:41.867257 harness.go:209: starting instances SCM @ /mnt/daos: 8.00GB Total/8.00GB Avail DEBUG 16:24:41.867467 harness.go:191: bootstrapping system member: rank 0, addr [::1]:10001 DEBUG 16:24:41.867489 harness.go:229: waiting for instances to be ready DEBUG 16:24:41.867720 exec.go:113: daos_io_server:0 args: [-t 4 -x 0 -g daos_server -d /home/nfortne2/daos-vol/build/test -s /mnt/daos -i 19685 -I 0] DEBUG 16:24:41.867755 exec.go:114: daos_io_server:0 env: [FI_SOCKETS_CONN_TIMEOUT=2000 D_LOG_MASK=ERR ABT_ENV_MAX_NUM_XSTREAMS=100 ABT_MAX_NUM_XSTREAMS=100 DAOS_MD_CAP=1024 CRT_CTX_SHARE_ADDR=0 CRT_TIMEOUT=30 FI_SOCKETS_MAX_CONN_RETRY=1 D_LOG_FILE=/home/nfortne2/daos-vol/build/test/daos_server.log OFI_INTERFACE=lo CRT_PHY_ADDR_STR=ofi+sockets OFI_PORT=31416] Starting I/O server instance 0: /home/nfortne2/spack/opt/spack/linux-opensuse_tumbleweed20200105-ivybridge/gcc-9.2.1/daos-master-msndixzwmrrej4jnywawwqxb2nyhy2ae/bin/daos_io_server daos_io_server:0 Using legacy core allocation algorithm 4 target XS(xstream) requested (#cores 2); use (1) target XS DEBUG 16:24:42.149324 drpc_server.go:96: Creating session for connection DEBUG 16:24:42.149606 instance.go:232: DAOS I/O Server instance 0 ready: uri:"ofi+sockets://127.0.0.1:31416" nctxs:3 drpcListenerSock:"/home/nfortne2/daos-vol/build/test/daos_io_server_19691.sock" DEBUG 16:24:42.149709 harness.go:239: instance ready: uri:"ofi+sockets://127.0.0.1:31416" nctxs:3 drpcListenerSock:"/home/nfortne2/daos-vol/build/test/daos_io_server_19691.sock" DEBUG 16:24:42.185023 instance.go:342: create MS (bootstrap=true) DEBUG 16:24:42.432105 instance.go:355: start MS Management Service access point started (bootstrapped) DEBUG 16:24:42.432721 harness.go:263: monitoring instances daos_io_server:0 DAOS I/O server (v0.8.0) process 19691 started on rank 0 with 1 target, 0 helper XS per target, firstcore 0, host ummon.ad.hdfgroup.org. Thanks, -Neil |
Re: DAOS Agent: connection refused
Quigley, David
Based on the gRPC failure in the previous email the agent is functioning properly. It is saying that there is no server listening on localhost:10001 to receive the getAttachInfo call being made by the agent. Are you sure that daos_server is started fully? If you are not seeing any server logging it might be an indication that the server hasn’t started properly.
Dave
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Patrick Farrell
Sent: Tuesday, January 14, 2020 2:13 PM To: daos@daos.groups.io Subject: Re: [daos] DAOS Agent: connection refused
Hm, interesting. Not obvious to me what’s wrong, yeah. You might try turning on debug (though the agent connection debug is pretty limited), as documented in the Troubleshooting section of the manual, and seeing what turns up.
https://daos-stack.github.io/admin/troubleshooting/
(personally, I had better luck stracing the agent…)
BTW, it is not necessary to use orterun any more; daos_agent can be run directly. (Or, I think, orterun was never needed for daos_agent?)
-Patrick
From: <daos@daos.groups.io> on behalf of "nfortne2@..."
<nfortne2@...>
Patrick,
Re: [External] Re: [daos] Does DAOS support infiniband now?
Oganezov, Alexander A
Hi Shengyu
Good to hear that the cart/mercury issue is resolved. In general, to avoid those issues it is best to remove _build_external/install if daos/utils/build.config changes, as that would indicate changes to dependencies.
As for spdk/nvme issue, someone else from DAOS team would need to help, as I am not familiar with those.
Thanks, ~~Alex.
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of
Shengyu SY19 Zhang
Sent: Tuesday, January 21, 2020 11:24 PM
To: daos@daos.groups.io
Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
That resolved the issue: after completely removing the _build_external folder, those logs disappeared. The NVMe issue after SPDK/DPDK was removed still hasn’t been resolved, though (currently I can use only one NVMe). How do I avoid this type of issue? Is the way to remove the path after a period of time?
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
Hi Shengyu,
After talking to mercury developer, it sounds like you still might have a mismatch between versions of cart and mercury that you are using. Can you do a fresh build of daos by first fully removing install/ and _build_external.Linux/ directories and see if that solves the issue?
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Alex,
Please see the log attached.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
Shengyu,
Can you provide full log from start until first occurrence of this error?
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Alex,
I have removed mercury, and the io_server seems to start normally; however, the daos_server.log grows quickly and eats all my free space. It infinitely repeats these three lines: 01/21-01:40:49.28 afa1 DAOS[72964/73009] hg ERR src/cart/crt_hg.c:1331 crt_hg_trigger() HG_Trigger failed, hg_ret: 18.
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
One other thing Shengyu,
Can you verify that your mercury build is up to date? Based on the domain being used as “mlx5_0/192.168.80.161”, it sounds like there is a mismatch between what CaRT generates and what mercury consumes; there was a change a few weeks ago regarding how the domain is provided to the mercury level, and it feels as if an older mercury is being used.
To ensure a clean mercury build, remove libmercury* and libna* from your install/lib location, remove the _build.external-Linux/mercury directory, and recompile daos with scons --build-deps=yes --config=force install
Thanks, ~~Alex.
From: Oganezov, Alexander A
Hi Shengyu,
What does this command return to you on the node where you see this strange domain name? fi_info --provider="verbs"
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
I have some more information. In the file na_ofi.c, around line 1609, the io_server exits there, and in the code above it, the na_ofi_verify_provider function compares domain names: the domain is “mlx5_0/192.168.80.161”, while prov->domain_attr.name is “mlx5_0”, so it will always return FALSE.
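To illustrate the comparison problem (this is not the actual na_ofi.c code, just a sketch): a full strcmp() of the configured domain against the provider's domain name always fails, while comparing only the part before the '/' would match.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool domain_matches(const char *configured, const char *prov_name)
{
	const char *slash = strchr(configured, '/');
	size_t      len   = slash ? (size_t)(slash - configured)
				  : strlen(configured);

	return strlen(prov_name) == len &&
	       strncmp(configured, prov_name, len) == 0;
}

int main(void)
{
	const char *configured = "mlx5_0/192.168.80.161";
	const char *prov_name  = "mlx5_0";

	printf("strcmp match: %d\n", strcmp(configured, prov_name) == 0);     /* 0 */
	printf("prefix match: %d\n", domain_matches(configured, prov_name));  /* 1 */
	return 0;
}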
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
Network scan: Scanning fabric for YML specified provider: ofi+verbs;ofi_rxm
And log attached, it is standalone server. Additionally, I created two identical virtual machines with IB sr-iov pass-through, however one vm can’t start due to the same problem, while another can start normally. They were using ofi+verbs;ofi_rxm as provider.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Rosenzweig, Joel B
Hello Shengyu,
It appears that your daos_server.yml is specifying the provider as “ofi+verbs” but I think it should be set to “ofi+verbs;ofi_rxm”. Can you configure your daos_server.yml with that and try again? And then, if things still do not work, then please provide:
1) the network scan output again after you make the provider change
2) the portion of the debug log that shows the environment variables being provided to the daos_io_server. This will show us what is being set for OFI_INTERFACE, OFI_DOMAIN in the daos_io_server environment.
3) the daos_server.yml so I can see how you have configured each daos_io_server.
Regards, Joel
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
Here is logs:
Scanning fabric for YML specified provider: ofi+verbs
And please see my inline comments.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi,
Cc’ing Joel. Those errors indicate that you still have some network setup issue. Could you please run daos_server network scan?
e.g.:
[root@wolf-118 ~]# daos_server network scan -p "ofi+verbs;ofi_rxm"
Scanning fabric for cmdline specified provider: ofi+verbs;ofi_rxm
Fabric scan found 3 devices matching the provider spec: ofi+verbs;ofi_rxm
fabric_iface: ib0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
fabric_iface: ib1 provider: ofi+verbs;ofi_rxm pinned_numa_node: 1
fabric_iface: eth0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
As for the NVMe issue, do I understand correctly that:
- The PCI addresses of the 8 NVMe SSDs show up fine via daos_server storage scan
Yes.
- You have reduced the number of huge pages to 4096 pages (8GB) and all the SPDK errors related to failed huge pages allocation are gone as well as this error from the log:
Yes, it only reports that the pci addr was not found and can’t start; reducing nvme to 1 seems to pass this step.
- But the io_server fails to start after dmg storage format?
Yes.
I had a second look at daos_control2.log and noticed that you are using 20 targets while you have 8 SSDs and 10 cores. Could you please try with #targets = #SSDs = 8 and set nr_xs_helpers to 0?
This config file was copied from a physical machine, so maybe there is some problem; several days ago it and the physical server were working. I tried your suggestion, still the same while formatting and it can’t start again.
daos_io_server:0 Using legacy core allocation algorithm
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
For the first issue (daos_control.log), it is a physical machine with 8 NVMes specified, and the daos server reports that it can’t find the PCI address (created the log). Then I used only one NVMe; the daos server can start normally, however it stopped after dmg storage format, just like the second issue (daos_control2.log).
DEBUG 02:18:19.457860 config.go:378: Active config saved to /root/daos/install/etc/.daos_server.active.yml (read-only)
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Shengyu,
Mike looked into the logs you provided and noticed that:
## ERROR: requested 40960 hugepages but only 8585 could be allocated. ## Memory might be heavily fragmented. Please try flushing the system cache, or reboot the machine.
Maybe you meant to specify 4096 in the yaml file instead of 40960 for nr_hugepages? It sounds like you are trying to allocate 82GB of RAM for hugepages.
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
There seems to be multiple messages about not being able to meet the required number of hugepages requested. I was not involved in the work but there have been changes in how huge page allocation is performed for each of the daos_io_server instances and we are now performing automatic storage prepare when starting daos_server.
What I would suggest is to reboot the nodes (in case the issues are to do with memory fragmentation) and try with a smaller number of drives configured in the config file. Don’t try to do a storage prepare before running daos_server (as it will be performed on start-up anyway). And please update to a recent master before trying. You could also try to bump the number of huge pages specified in the server config file to maybe 8192?
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello,
Sorry I forgot attachments.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
I couldn’t check it at this time, since I can’t start the daos server (newest code of master) on my two different machines. I assume there have been lots of modifications in the control path, so there could be some new issues:
1. After daos_server storage prepare --nvme-only, all my NVMe disks switched to uio as expected; then issuing storage scan, I can see those disks as expected as well. However, when I start the daos server, it reports an error that it can’t find the PCI address, and all NVMes switch back to the kernel driver, see daos_control.log
2. On another machine, it just stopped after being formatted, see daos_control2.log
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Thanks for the logs Shengyu. It does not seem to be related to wrong endpoint addresses. The client did find the server, but this latter returned DER_NONEXIST when connecting to the pool. It might be the same problem fixed recently by PR #1701. Could you please apply the patch or try with latest master?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
Here are the log files. My steps:
1. Create a fresh environment, start the daos server, and then format it.
2. Create a pool.
3. Create a container.
4. List the container: daos container query --svc=0 --path=/tmp/mycontainer; that works great.
5. Ctrl+C to kill the daos server.
6. Restart the daos server.
7. Repeat 4; the daos process will be stuck in an infinite loop.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hm, the issue with server restart might be due to the endpoint address of the servers not being persistent. Could you please collect full debug logs for the fresh start with reformat and the subsequent restart?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello,
@Johann, the newest version will work on a newly formatted and created pool; yes, I did set it in the yaml. @Kal, no, I still hit the issue after io_server restart; it seems there are issues after loading existing data.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Kevan & Shengyu,
Could you please advise what commit hash you use? Also, are you specifying “fabric_iface_port” in the yaml file?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Kevan Rehm <kevan.rehm@...>
I have not had any success, I see the same failure sequence as Shengyu, and due to other commitments I've had to set this aside for a few days. Hope to get back to it in a week or so.
Kevan
On 1/8/20, 2:15 PM, "daos@daos.groups.io on behalf of Alfizah, Kurniawan" <daos@daos.groups.io on behalf of kurniawan.alfizah@...> wrote:
Hello Shengyu and Kevan, I'm wondering if you have resolved this problem and that DAOS is working well with IB.
Cheers, Kal
-----Original Message-----
From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang
Sent: Saturday, December 28, 2019 2:46 AM
Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Kevan,
Yes, it’s exactly the same problem that I hit: the rdma_get_cm_event function issues a write system call to receive the completion event from the IB device, however it gets nothing at this time and always returns EAGAIN, which causes the fi_tsend function to loop infinitely.
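For reference, here is a small standalone sketch (not DAOS/Mercury code) of reading a CM event from a non-blocking librdmacm event channel, where EAGAIN just means no event is available yet and the caller should retry later rather than spin.

/* Build: gcc cm_poll.c -lrdmacm */
#include <errno.h>
#include <fcntl.h>
#include <rdma/rdma_cma.h>
#include <stdio.h>

int main(void)
{
	struct rdma_event_channel *ch = rdma_create_event_channel();
	if (!ch) { perror("rdma_create_event_channel"); return 1; }

	/* switch the channel fd to non-blocking */
	int flags = fcntl(ch->fd, F_GETFL);
	fcntl(ch->fd, F_SETFL, flags | O_NONBLOCK);

	struct rdma_cm_event *ev;
	if (rdma_get_cm_event(ch, &ev) == 0) {
		printf("got event %s\n", rdma_event_str(ev->event));
		rdma_ack_cm_event(ev);
	} else if (errno == EAGAIN) {
		printf("no CM event pending yet; retry later\n");
	} else {
		perror("rdma_get_cm_event");
	}

	rdma_destroy_event_channel(ch);
	return 0;
}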
Regards, Shengyu.
-----Original Message-----
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Saturday, December 28, 2019 12:30 AM
Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
I think we are going to need help from the experts, I am not familiar with this code. I tried the same commands that you mentioned in your last email, and they also hang for me. But I do not see an infinite loop; rather the daos process just hangs forever in a write() request. Is that what you see as well?
Experts: is there any documentation on CaRT, what it is for, internals? I have not been able to find anything.
The last entries in the daos_server_srv1.log file are just the ds_mgmt_drpc_get_attach_info() log messages, called from daos_agent.
While the daos command was hung, I sent a kill -6 signal to it to collect the corefile. It seems like the command has attempted to set up a MSG connection to the daos_io_server, but has not received a completion event. The dest_addr=0 looks a little suspicious in the fi_tsend() call. Hopefully others will recognize what the problem is in the backtrace below, otherwise I will keep digging as time permits.
Thanks, Kevan
(gdb) bt #0 0x00007fa7fd5749cd in write () from /lib64/libc.so.6 #1 0x00007fa7fac2d875 in rdma_get_cm_event.part.13 () from /lib64/librdmacm.so.1 #2 0x00007fa7fb7fd856 in fi_ibv_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1 #3 0x00007fa7fb82360f in rxm_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1 #4 0x00007fa7fb8252af in rxm_msg_eq_progress () from /home/users/daos/daos/install/lib/libfabric.so.1 #5 0x00007fa7fb82542d in rxm_cmap_connect () from /home/users/daos/daos/install/lib/libfabric.so.1 #6 0x00007fa7fb82b5eb in rxm_ep_tsend () from /home/users/daos/daos/install/lib/libfabric.so.1 #7 0x00007fa7fc59f3e8 in fi_tsend (context=0x1814558, tag=1, dest_addr=0, desc=0x1811ab0, len=332, buf=0x7fa7f6a10038, ep=0x17f03c0) at /home/users/daos/daos/install/include/rdma/fi_tagged.h:114 #8 na_ofi_msg_send_unexpected (na_class=0x17d6250, context=0x180f760, callback=<optimized out>, arg=<optimized out>, buf=0x7fa7f6a10038, buf_size=332, plugin_data=0x1811ab0, dest_addr=0x18ba5e0, dest_id=0 '\000', tag=1, op_id=0x1811a48) at /home/users/daos/daos/_build.external/mercury/src/na/na_ofi.c:3745 #9 0x00007fa7fc7b79ff in NA_Msg_send_unexpected (op_id=0x1811a48, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0x1811920, callback=0x7fa7fc7b9a60 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/na/na.h:1506 #10 hg_core_forward_na (hg_core_handle=0x1811920) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:2076 #11 0x00007fa7fc7bb5e6 in HG_Core_forward (handle=0x1811920, callback=callback@entry=0x7fa7fc7b0890 <hg_core_forward_cb>, arg=arg@entry=0x1814730, flags=<optimized out>, payload_size=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4775 #12 0x00007fa7fc7b41f7 in HG_Forward (handle=0x1814730, callback=callback@entry=0x7fa7fd8b2980 <crt_hg_req_send_cb>, arg=arg@entry=0x18b9190, in_struct=in_struct@entry=0x18b91b0) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2165 #13 0x00007fa7fd8b9e39 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_hg.c:1191 #14 0x00007fa7fd90a8ea in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104 #15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_rpc.c:1173 #16 0x00007fa7fd90ef23 in crt_req_hg_addr_lookup_cb (hg_addr=0x18ba590, arg=0x18b9190) at src/cart/crt_rpc.c:569 #17 0x00007fa7fd8b1062 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290 #18 0x00007fa7fc7b0985 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:454 #19 0x00007fa7fc7bbce2 in hg_core_trigger_lookup_entry (hg_core_op_id=0x18ba530) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3444 #20 hg_core_trigger (context=0x180d590, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3384 #21 0x00007fa7fc7bca4b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4900 #22 0x00007fa7fc7b44ed in HG_Trigger (context=context@entry=0x17d6370, 
timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2262 #23 0x00007fa7fd8b37ea in crt_hg_trigger (hg_ctx=hg_ctx@entry=0x18083d8) at src/cart/crt_hg.c:1327 #24 0x00007fa7fd8bce5d in crt_hg_progress (hg_ctx=hg_ctx@entry=0x18083d8, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1360 #25 0x00007fa7fd86dfbb in crt_progress (crt_ctx=0x18083c0, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fa7fe61f7f0 <ev_progress_cb>, arg=arg@entry=0x7ffe185f89d0) at src/cart/crt_context.c:1286 #26 0x00007fa7fe624bb6 in daos_event_priv_wait () at src/client/api/event.c:1203 #27 0x00007fa7fe627f96 in dc_task_schedule (task=0x181b4e0, instant=instant@entry=true) at src/client/api/task.c:139 #28 0x00007fa7fe626eb1 in daos_pool_connect (uuid=uuid@entry=0x7ffe185f8c38 "UXh}E<Kĭ<\035\340\332O", <incomplete sequence \325>, grp=0x1805f50 "daos_server", svc=0x1806000, flags=flags@entry=1, poh=poh@entry=0x7ffe185f8c48, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53 #29 0x000000000040590d in pool_query_hdlr (ap=0x7ffe185f8c20) at src/utils/daos_hdlr.c:141 #30 0x0000000000402bc4 in main (argc=5, argv=<optimized out>) at src/utils/daos.c:957 (gdb)
On 12/25/19, 2:11 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
Here are the log files. "Restart server" means daos_server was restarted, whether by Ctrl+C to kill the process or by a server reboot; in any case, after restarting daos_server the existing containers can't be touched. It can be 100% reproduced in my environment.
dmg storage query smd -I -> works.
daos container query --svc=0 --path=/tmp/mycontainer -> no response due to infinite loop
daos container create... -> no response due to infinite loop
With sockets mode, I didn't hit this issue.
Regards, Shengyu.
-----Original Message-----
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm
Sent: Wednesday, December 25, 2019 3:30 AM
Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
When you say "restart server", do you mean that you rebooted the node, or that you just restarted the daos_server process? Could you send another daos_control.log and daos_server.log from when it fails in this way?
Kevan
On 12/23/19, 11:34 PM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
After some more testing, the issue is actually still there. I can get IB to work with the following steps:
restart the subnet manager
rm -rf /mnt/daos
start the daos server
re-format
create a pool
create a container
However, once I restart the server, the infinite-loop problem happens again, and there is no way I can connect to an existing pool via IB.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Saturday, December 21, 2019 6:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
As it happens, I also had a case today using InfiniBand where my daos_test client was in an infinite loop; it generated 200 million lines in daos.log within a minute or so. It turned out that the IB subnet manager process had died. I restarted opensm, then re-ran daos_test, and it started to work. I mention it in case it might be the same problem as yours. Are you sure your subnet manager is working? Try a fi_pingpong test; if it works, then your subnet manager is okay, and that's not the problem.
Thanks, Kevan
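(For reference, a minimal fi_pingpong check could look like the following; fi_pingpong ships with the libfabric fabtests, and the FI_VERBS_IFACE variable, provider string, and interface name here are only examples borrowed from this thread.)
# on one node (acts as the server side)
FI_VERBS_IFACE=ib0 fi_pingpong -p "verbs;ofi_rxm"
# on a second node, pointing at the first node's ib0 IP address
FI_VERBS_IFACE=ib0 fi_pingpong -p "verbs;ofi_rxm" <server_ib0_ip>
If the subnet manager is down, the exchange will typically hang or fail to connect rather than complete the ping-pong.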
On 12/19/19, 12:48 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Joel,
Thanks for your information; those are the outputs of the test with and without rxm specified. Furthermore, I corrected the NUMA setting. I thought there might be a mismatch issue with the IB devices, so I have tried unbinding all the other devices (ib1, ib2, ib3) and also tried removing rxm. The same problem happens in every case: the daos process just hangs in fi_send due to the infinite loop.
After I set FI_LOG_LEVEL=debug and created a container, I found something interesting; it shows:
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
libfabric:54383:verbs:core:ofi_check_domain_attr():525<info> Unknown domain name
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Supported: mlx5_0-xrc
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Requested: mlx5_0
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
Then I changed the domain name in the environment to mlx5_0-xrc; that produces another message, "can't find ofi+verbs provider", just like before. And BTW, I think the daos app should fail when connecting to the pool rather than loop infinitely in fi_send.
Best Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 19, 2019 12:15 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Based on your logs, the control plane is successfully matching ib0 with mlx5_0. It shows "DEBUG 02:06:24.772249 netdetect.go:536: Device alias for ib0 is mlx5_0" As such, it correctly sets OFI_DOMAIN=mlx5_0. This matches with your topology data as reported by lstopo. Your results should not change if you manually are setting OFI_DOMAIN=mlx5_0 or not, because the control plane is already doing the right thing and you are not giving it a conflicting override. If you find that the behavior changes when you specify OFI_DOMAIN=mlx5_0 in your daos_server.yml, that's a problem we would need to debug.
Your topology shows these 4 interface cards / device combinations (mlx5_0:ib0, mlx5_1:ib1, mlx5_2:ib2 and mlx5_3:ib3).
PCI 15b3:1013 (P#548864 busid=0000:86:00.0 class=0207(IB) link=15.75GB/s PCISlot=5) Network L#7 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:48 Port=1) "ib0" OpenFabrics L#8 (NodeGUID=b859:9f03:0005:b548 SysImageGUID=b859:9f03:0005:b548 Port1State=4 Port1LID=0x4 Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b548) "mlx5_0"
PCI 15b3:1013 (P#548865 busid=0000:86:00.1 class=0207(IB) link=15.75GB/s PCISlot=5) Network L#9 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:49 Port=1) "ib1" OpenFabrics L#10 (NodeGUID=b859:9f03:0005:b549 SysImageGUID=b859:9f03:0005:b548 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b549) "mlx5_1"
PCI 15b3:1013 (P#716800 busid=0000:af:00.0 class=0207(IB) link=15.75GB/s PCISlot=6) Network L#11 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:40 Port=1) "ib2" OpenFabrics L#12 (NodeGUID=b859:9f03:0005:b540 SysImageGUID=b859:9f03:0005:b540 Port1State=4 Port1LID=0x1b Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b540) "mlx5_2"
PCI 15b3:1013 (P#716801 busid=0000:af:00.1 class=0207(IB) link=15.75GB/s PCISlot=6) Network L#13 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:41 Port=1) "ib3" OpenFabrics L#14 (NodeGUID=b859:9f03:0005:b541 SysImageGUID=b859:9f03:0005:b540 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b541) "mlx5_3"
I agree that fi_info shows that the provider: ofi+verbs;ofi_rxm is valid for ib0, and the control plane agrees with that. "DEBUG 02:06:20.594910 netdetect.go:764: Device ib0 supports provider: ofi+verbs;ofi_rxm" This is only a guess, but I have to rule it out. I am not certain that the provider ofi_rxm is being handled properly. Can you remove "ofi_rxm" from your provider configuration and try again?
That is, in your daos_server.yml set: provider: ofi+verbs
If you still have an error, then you will want to run the cart diagnostics again that Alex wrote about so we can see the latest results with that.
>> orterun --allow-run-as-root -np 2 -x >> CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x >> OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
and
>> orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs" -x >> OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e >> tests/iv_server -v 3
One last thing, while this won't generate a runtime error _yet_, you have a mismatch between the pinned_numa_node:0 and the actual NUMA node of your ib0 device. Your topology data shows it as NUMA node 1. If you run "daos_server network scan -a" it should show you that the correct pinned_numa_node is 1. By setting it to the wrong NUMA node, you will have a performance impact once this is running because the daos_io_server will bind the threads to cores in the wrong NUMA node. The plan is to make a validation error like this a hard error instead of a warning. There's debug output in your daos_control.log that looks like this:
DEBUG 02:06:20.595012 netdetect.go:901: Validate network config -- given numaNode: 0
DEBUG 02:06:20.872053 netdetect.go:894: ValidateNUMAConfig (device: ib0, NUMA: 0) returned error: The NUMA node for device ib0 does not match the provided value 0. Remove the pinned_numa_node value from daos_server.yml then execute 'daos_server network scan' to see the valid NUMA node associated with the network device
I'll keep working on it with Alex. Let's see what you find.
Regards, Joel
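(As an illustration of the two suggestions above, the ofi+verbs provider string and the NUMA fix, a daos_server.yml fragment might look like this; only the relevant keys are shown, the section layout follows the daos_server.yml format of that time, and the values are the ones discussed in this thread rather than a complete config.)
provider: ofi+verbs
servers:
-
  fabric_iface: ib0
  pinned_numa_node: 1   # the NUMA node reported for ib0; alternatively remove this line and re-run "daos_server network scan"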
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 18, 2019 4:25 AM Subject: FW: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I'm wondering if you have any updates on this? I'm also a developer and am very familiar with SPDK/DPDK/ibverbs, but I'm not familiar with the other projects that DAOS depends on. If you can give me some hints or guidance, I would like to try troubleshooting this issue as well. Also, if you need an environment to reproduce the issue, you may connect to our machine to debug.
Regards, Shengyu
-----Original Message----- From: Shengyu SY19 Zhang Sent: Thursday, December 12, 2019 4:05 PM Subject: RE: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Yes, please see it in the attachment.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 12, 2019 11:08 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you generate a topology file for me, similar to what I asked Kevan to provide? You can generate it with "lstopo --of xml > topology.xml". I am interested in seeing if your system also has device siblings with same/different ports as the one specified as the OFI_INTERFACE.
Your debug log shows that DAOS chose the sibling of ib0 as mlx5_0. It's correct that it picked something in the mlxN_N family, but, depending on your topology there could be a better device to choose, possibly one that has a port match. Your topology file will show if mlx5_0 matches the port or not, and similarly to Kevan's, it will help me develop a better function to find the correct matching sibling.
I don't know why cart failed once it had the OFI_DEVICE that we thought was correct. Alex's experiment with Kevan will help with that.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 11, 2019 2:09 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see those files in the attachment. After changing some configurations, daos gets stuck (no return from rdma_get_cm_event) while creating a container; here is the bt log:
(gdb) bt
#0  0x00007fb8d6ef59cd in write () from /lib64/libc.so.6
#1  0x00007fb8d45a99ad in rdma_get_cm_event.part.18 () from /lib64/librdmacm.so.1
#2  0x00007fb8d517f836 in fi_ibv_eq_read () from /root/daos/install/lib/libfabric.so.1
#3  0x00007fb8d51a556f in rxm_eq_read () from /root/daos/install/lib/libfabric.so.1
#4  0x00007fb8d51a720f in rxm_msg_eq_progress () from /root/daos/install/lib/libfabric.so.1
#5  0x00007fb8d51a738d in rxm_cmap_connect () from /root/daos/install/lib/libfabric.so.1
#6  0x00007fb8d51ad54b in rxm_ep_tsend () from /root/daos/install/lib/libfabric.so.1
#7  0x00007fb8d5f1e468 in fi_tsend (context=0xbc4018, tag=1, dest_addr=0, desc=0xbaf6f0, len=384, buf=0x7fb8d235c038, ep=0xbb03a0) at /root/daos/install/include/rdma/fi_tagged.h:114
#8  na_ofi_msg_send_unexpected (na_class=0xacf220, context=0xbbf200, callback=<optimized out>, arg=<optimized out>, buf=0x7fb8d235c038, buf_size=384, plugin_data=0xbaf6f0, dest_addr=0xc6a1b0, dest_id=0 '\000', tag=1, op_id=0xbc14e8) at /root/daos/_build.external/mercury/src/na/na_ofi.c:3622
#9  0x00007fb8d613885f in NA_Msg_send_unexpected (op_id=0xbc14e8, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0xbc13c0, callback=0x7fb8d613a8c0 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /root/daos/_build.external/mercury/src/na/na.h:1485
#10 hg_core_forward_na (hg_core_handle=0xbc13c0) at /root/daos/_build.external/mercury/src/mercury_core.c:2076
#11 0x00007fb8d613c3a6 in HG_Core_forward (handle=0xbc13c0, callback=callback@entry=0x7fb8d6131740 <hg_core_forward_cb>, arg=arg@entry=0xbc41f0, flags=<optimized out>, payload_size=<optimized out>) at /root/daos/_build.external/mercury/src/mercury_core.c:4748
#12 0x00007fb8d6135057 in HG_Forward (handle=0xbc41f0, callback=callback@entry=0x7fb8d7233930 <crt_hg_req_send_cb>, arg=arg@entry=0xc68cb0, in_struct=in_struct@entry=0xc68cd0) at /root/daos/_build.external/mercury/src/mercury.c:2147
#13 0x00007fb8d723ade9 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_hg.c:1190
#14 0x00007fb8d728b89a in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104
#15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_rpc.c:1173
#16 0x00007fb8d728fed3 in crt_req_hg_addr_lookup_cb (hg_addr=0xc6a150, arg=0xc68cb0) at src/cart/crt_rpc.c:569
#17 0x00007fb8d7232012 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290
#18 0x00007fb8d6131835 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /root/daos/_build.external/mercury/src/mercury.c:454
#19 0x00007fb8d613caa2 in hg_core_trigger_lookup_entry (hg_core_op_id=0xc6a0f0) at /root/daos/_build.external/mercury/src/mercury_core.c:3444
#20 hg_core_trigger (context=0xbbd030, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:3384
#21 0x00007fb8d613d80b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:4873
#22 0x00007fb8d613534d in HG_Trigger (context=context@entry=0xacf1e0, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury.c:2244
#23 0x00007fb8d723479a in crt_hg_trigger (hg_ctx=hg_ctx@entry=0xbb7e78) at src/cart/crt_hg.c:1326
#24 0x00007fb8d723de0d in crt_hg_progress (hg_ctx=hg_ctx@entry=0xbb7e78, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1359
#25 0x00007fb8d71eef6b in crt_progress (crt_ctx=0xbb7e60, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fb8d7fa0230 <ev_progress_cb>, arg=arg@entry=0x7fff3d703980) at src/cart/crt_context.c:1286
#26 0x00007fb8d7fa55f6 in daos_event_priv_wait () at src/client/api/event.c:1203
#27 0x00007fb8d7fa89d6 in dc_task_schedule (task=0xbcafa0, instant=instant@entry=true) at src/client/api/task.c:139
#28 0x00007fb8d7fa78f1 in daos_pool_connect (uuid=uuid@entry=0x7fff3d703b68 "\265\215%f\036AN\354\203\002\317I\035\362\067\273", grp=0xbb59f0 "daos_server", svc=0xbb5aa0, flags=flags@entry=2, poh=poh@entry=0x7fff3d703b78, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53
#29 0x0000000000404eb0 in cont_op_hdlr (ap=ap@entry=0x7fff3d703b50) at src/utils/daos.c:610
#30 0x0000000000402b94 in main (argc=8, argv=<optimized out>) at src/utils/daos.c:957
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Tuesday, December 10, 2019 8:48 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
It's hard to further diagnose without the logs. Can you share your latest daos_server.yml, full daos_control.log and full server.log? In the daos_server.yml, please set control_log_mask: DEBUG and in the io server section, set log_mask: DEBUG
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Tuesday, December 10, 2019 1:42 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I didn't set OFI_DOMAIN=mlx5_0 before; following Alex's hint, I set it yesterday. Now there is another error while creating a container: mgmt ERR src/mgmt/cli_mgmt.c:325 get_attach_info() GetAttachInfo unsuccessful: 2
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Monday, December 9, 2019 11:47 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With your latest setup, can you launch DAOS and send your latest daos_control.log? In particular, I want to see how the daos_io_server environment variables are set. For example these two lines below show the command line args and the environment variables in use with an ofi+sockets/ib1 config. I'm looking to see if we are setting OFI_DOMAIN=mlx5_0 in your environment. I seem to recall that your earlier logs did have this set, but since builds have changed since then, it's worth checking out one more time.
DEBUG 16:33:15.711949 exec.go:112: daos_io_server:1 args: [-t 8 -x 2 -g daos_server -d /tmp/daos_sockets -s /mnt/daos1 -i 1842327892 -p 1 -I 1]
DEBUG 16:33:15.711963 exec.go:113: daos_io_server:1 env: [CRT_TIMEOUT=30 FI_SOCKETS_CONN_TIMEOUT=2000 D_LOG_FILE=/tmp/server.log CRT_CTX_SHARE_ADDR=0 FI_SOCKETS_MAX_CONN_RETRY=1 D_LOG_MASK=ERR CRT_PHY_ADDR_STR=ofi+sockets OFI_INTERFACE=ib1 OFI_PORT=31416 DAOS_MD_CAP=1024]
If we are not setting OFI_DOMAIN=mlx5_0 automatically, go ahead and edit your daos_server.yml and add OFI_DOMAIN=mlx5_0 to the env_vars section (per below) and see what you get. If that doesn't get you going, please send the log for that run, too.
env_vars:
- OFI_DOMAIN=mlx5_0
Regards, Joel
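(Shown in context, the env_vars addition sits inside the per-server section of daos_server.yml, roughly as below; the surrounding keys are illustrative and your existing section should simply gain the env_vars entry.)
servers:
-
  fabric_iface: ib0
  env_vars:
  - OFI_DOMAIN=mlx5_0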
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 5:03 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Actually this is really good result - two servers were able to exchange rpcs. That's the end of test -- just ctrl+C out of it.
I'll let Joel take over for daos help on how to make daos use mlx5_0 domain.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Please see the new log below, and the test seems dead (no more response):
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x7fdd21092400 valid_mask = 0x3) [afa1][[44100,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x195aad0 valid_mask = 0x3) [afa1][[44100,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument -------------------------------------------------------------------------- WARNING: There was an error initializing an OpenFabrics device.
Local host: afa1 Local device: mlx5_0 -------------------------------------------------------------------------- 12/09-03:53:20.32 afa1 CaRT[101464/101464] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically 12/09-03:53:20.32 afa1 CaRT[101465/101465] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically SRV [rank=1 pid=101465] Server starting, self_rank=1 SRV [rank=0 pid=101464] Server starting, self_rank=0 SRV [rank=1 pid=101465] >>>> Entered iv_set_ivns SRV [rank=1 pid=101465] <<<< Exited iv_set_ivns:773
[afa1:101458] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init [afa1:101458] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:30 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Given your output can you try this now?
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
Please note additional OFI_DOMAIN envariable.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:25 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here is outputs of fio_info related to verbs:
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
Regards, Shengyu
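(A narrower query can confirm a single provider/domain pair instead of reading the full list; the options below are those of the libfabric fi_info utility, and exact option support may vary between libfabric versions.)
fi_info --provider="verbs;ofi_rxm" -d mlx5_0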
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:08 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Thanks, Can you also provide full fi_info output?
~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:05 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here are the outputs:
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3 ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x2094f90 valid_mask = 0x3) [afa1][[18544,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0xacda60 valid_mask = 0x3) [afa1][[18544,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument -------------------------------------------------------------------------- WARNING: There was an error initializing an OpenFabrics device.
Local host: afa1 Local device: mlx5_0 -------------------------------------------------------------------------- 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0" 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0" 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class. 12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class. 12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020. [afa1:92262] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init [afa1:92262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Saturday, December 7, 2019 1:14 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With latest daos code and mofed 4.6 installed can you rerun this and show what that one gives you?
source scons_local/utils/setup_local.sh cd install/Linux/TESTING orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Thursday, December 05, 2019 10:25 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Yes; however, with 4.6 the result is the same. After I upgraded the daos code to the newest master branch, I got some different results: the daos io server seems to start OK, since I can see lots of fds pointing to rdma_cm. But the daos client seems unable to connect to the server due to the same error (can't find ofi+verbs provider on ib0), as the log shows. You may find the log in the attachment; it was created via "create container".
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Wednesday, December 4, 2019 4:02 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you try installing MOFED 4.6 packages on your system? In general MOFED is required to get verbs over Mellanox working. Those packages can be found at: https://www.mellanox.com/page/mlnx_ofed_matrix?mtag=linux_sw_drivers
There is also 4.7 version available, however there seem to be few longevity issues currently when using 4.7 (according to verbs ofi maintainers).
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 9:55 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Thanks for your suggestion, here is the log:
mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored) 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1407 # na_ofi_getinfo(): fi_getinfo() failed, rc: -61(No data available) 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2816 # na_ofi_check_protocol(): na_ofi_getinfo() failed 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:302 # NA_Initialize_opt(): No suitable plugin found that matches ofi+verbs;ofi_rxm://192.168.80.120 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR src/cart/crt_hg.c:521 crt_hg_init() Could not initialize NA class. 11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Tuesday, November 26, 2019 12:21 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
In order to figure out what is the issue on your system could you run cart standalone test instead and provide the output that you get?
cd daos_dir source scons_local/utils/setup_local.sh cd install/Linux/TESTING orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Note: depending on how you installed daos your paths might be different, so instead of cd install/Linux/TESTING you might have to cd into a different directory first, wherever tests/iv_server is located. I think in your env it will be cd /root/daos/install/TESTING/ or cd /root/daos/install/cart/TESTING.
Expected output: 11/25-15:51:48.39 wolf-55 CaRT[53295/53295] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically 11/25-15:51:48.40 wolf-55 CaRT[53296/53296] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically SRV [rank=0 pid=53295] Server starting, self_rank=0 SRV [rank=1 pid=53296] Server starting, self_rank=1 SRV [rank=1 pid=53296] >>>> Entered iv_set_ivns SRV [rank=1 pid=53296] <<<< Exited iv_set_ivns:773
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 3:28 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
As shown in output.log, there is only one version of libfabric installed on my machine, and actually I don't have any other software installed that depends on libfabric. Following your guidance to set FI_LOG_LEVEL=debug, I can see the following messages, which may be helpful:
libfabric:123445:verbs:fabric:fi_ibv_set_default_attr():1263<info> Ignoring provider default value for tx rma_iov_limit as it is greater than the value supported by domain: mlx5_0 libfabric:123445:verbs:fabric:fi_ibv_get_matching_info():1365<info> hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints ERROR: daos_io_server:0 libfabric:123445:verbs:core:fi_ibv_check_hints():231<info> Unsupported capabilities libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Supported: FI_MSG, FI_RECV, FI_SEND, FI_LOCAL_COMM, FI_REMOTE_COMM libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Requested: FI_MSG, FI_RMA, FI_READ, FI_RECV, FI_SEND, FI_REMOTE_READ ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_rxd libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_mrail
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Saturday, November 23, 2019 3:20 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The debug output showed me that when daos_server is started via orterun, libfabric is not finding provider support for ofi_rxm at least. I'm still wondering if you have two different versions of libfabric installed on your machine.
Can you run these commands and provide the output?
1) ldd install/bin/daos_server 2) modify your orterun command to run ldd on daos_server. For example, I run this command locally: orterun --allow-run-as-root --map-by node --mca btl tcp,self --mca oob tcp -np 1 --hostfile /home/jbrosenz/daos/hostfile --enable-recovery --report-uri /tmp/urifile ldd /home/jbrosenz/daos/install/bin/daos_server 3) which fi_info 4) ldd over each version of fi_info found
From the data you provide, I'll understand if the libfabric being used by daos_server when executed directly by you in the shell is the same libfabric being used by daos_server when executed via orterun. Your original "daos_server network scan" output showed support for ofi+verbs;ofi_rxm but your debug output showed that when daos_server was started (via orterun), libfabric could not find support for the very same providers. If there are two different versions being used with different configurations, it would explain the failure. If it's a single installation/configuration, then that will lead the debug in another direction.
Depending on what you find through 1-4, you might find it helpful to export the environment variable FI_LOG_LEVEL=debug which will instruct libfabric to output a good deal of debug info.
Regards, Joel
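(For example, the variable can simply be exported in the shell that launches daos_server, or forwarded through orterun with -x; how you then start daos_server is unchanged.)
export FI_LOG_LEVEL=debug
# then start daos_server as usual (directly, or via orterun with an extra "-x FI_LOG_LEVEL")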
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Friday, November 22, 2019 12:59 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see those files in the attachment. I have tried two machines: one shows the full provider list in fi_info (verbs and rxm), the other doesn't show verbs, but both are the same in that they can't start io_server. I found the project conflicts with the Mellanox drivers, therefore I removed them and use the yum packages only; however, it still doesn't work.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Friday, November 22, 2019 6:35 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you share your daos_server.yml so we can see how you enabled the provider? And, can you share the log files daos_control.log and server.log so we can see more context?
Thank you, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, November 20, 2019 9:23 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello,
Thank you for your help, Alex, Joel and Kevin. I have checked the steps that you provided:
Ibstat: State: Active Physical state: LinkUp
Ifconfig: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
fi_info: verbs: version: 1.0 ofi_rxm: version: 1.0 ofi_rxd: version: 1.0
And the network is good, since I can run SPDK NVMe-oF over InfiniBand without problems. I also specified "ofi+verbs;ofi_rxm"; the same error occurred, the io_server stops after a while, and it prints the log I provided previously.
And I noticed that whether I specify ofi+verbs, ofi_rxm, or ofi+verbs;ofi_rxm, the log keeps showing No provider found for "verbs;ofi_rxm" provider on domain "ib0". Is that the cause?
BTW: it is working under ofi+sockets.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Thursday, November 21, 2019 7:13 AM Subject: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
> However if I specify either ofi+verbs or ofi_rxm, the same error will happen, and io_server will stop. > na_ofi.c:1609 > # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
To use supported verbs provider you need to have "ofi+verbs;ofi_rxm" in the provider string.
~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Rosenzweig, Joel B Sent: Wednesday, November 20, 2019 7:37 AM Subject: Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The daos_server network scan uses information provided by libfabric to determine available devices and providers. It then cross references that list of devices with device names obtained from hwloc to convert libfabric device names (as necessary) to those you'd find via ifconfig. Therefore, if "daos_server network scan" displays a device and provider, it means that support for that via libfabric has been provided. However, as Kevin pointed out, it's possible that the device itself was down, and that could certainly generate an error like what you encountered. There's another possibility, that you might have more than one version of libfabric installed in your environment. I have run into this situation in our lab environment. You might check your target system to see if it has more than one libfabric library with different provider support.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Harms, Kevin via Groups.Io Sent: Wednesday, November 20, 2019 10:04 AM Subject: Re: [daos] Does DAOS support infiniband now?
Shengyu,
I have tried IB and it works. Verify the libfabric verbs provider is available.
fi_info -l
you should see these:
ofi_rxm: version: 1.0
verbs: version: 1.0
See here for details:
You might also want to confirm ib0 is in the UP state:
[root@daos01 ~]# ifconfig ib0 ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092 inet 172.25.6.101 netmask 255.255.0.0 broadcast 172.25.255.255
kevin
________________________________________ From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...> Sent: Wednesday, November 20, 2019 2:54 AM Subject: [daos] Does DAOS support infiniband now?
Hello,
I ran daos_server network scan; it shows the following:
fabric_iface: ib0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 1
However if I specify either ofi+verbs or ofi_rxm, the same error will happen, and io_server will stop. na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
The ib0 is Mellanox nic over Infiniband network.
Regards, Shengyu.
Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu SY19 Zhang
Hello Alex,
That resolved the issue: after completely removing the _build_external folder, those logs disappeared. However, the NVMe issue after removing SPDK/DPDK still isn't resolved (currently I can use only one NVMe). How can we avoid this type of issue? Is the answer to remove that path after a period of time?
Regards, Shengyu.
From: daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
Sent: Wednesday, January 22, 2020 2:00 AM To: daos@daos.groups.io Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
After talking to mercury developer, it sounds like you still might have a mismatch between versions of cart and mercury that you are using. Can you do a fresh build of daos by first fully removing install/ and _build_external.Linux/ directories and see if that solves the issue?
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Alex,
Please see the log attached.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
Shengyu,
Can you provide full log from start until first occurrence of this error?
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Alex,
I have removed mercury; the io_server seems to start normally. However, daos_server.log grows quickly and eats all my free space. It infinitely repeats these three lines: 01/21-01:40:49.28 afa1 DAOS[72964/73009] hg ERR src/cart/crt_hg.c:1331 crt_hg_trigger() HG_Trigger failed, hg_ret: 18.
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
One other thing Shengyu,
Can you verify that your mercury build is up to date? Based on the domain being used as “mlx5_0/192.168.80.161”, it sounds like there is a mismatch between what CaRT generates and what mercury consumes; there was a change a few weeks ago regarding how the domain is provided to the mercury level, and it feels as if an older mercury is being used.
To ensure a clean mercury build, remove libmercury* and libna* from your install/lib location, remove the _build.external-Linux/mercury directory, and recompile daos with scons --build-deps=yes --config=force install
Thanks, ~~Alex.
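(Spelled out as commands, assuming the build tree lives under the daos source directory; adjust the paths to your actual install location before running.)
cd daos_dir
rm -f install/lib/libmercury* install/lib/libna*
rm -rf _build.external-Linux/mercury
scons --build-deps=yes --config=force install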
From: Oganezov, Alexander A
Hi Shengyu,
What does this command return to you on the node where you see this strange domain name? fi_info --provider="verbs"
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
I have some more information. In the file na_ofi.c at line 1609, the io_server exits there; in the code just above, the na_ofi_verify_provider function compares domain names. The domain is “mlx5_0/192.168.80.161” while prov->domain_attr.name is “mlx5_0”, so it will always return FALSE.
Regards, Shengyu
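(One quick way to see exactly which domain strings the verbs provider reports, for comparison with the “mlx5_0/192.168.80.161” value mentioned above, assuming fi_info is on the PATH:)
fi_info --provider="verbs" | grep -i domain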
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
Network scan: Scanning fabric for YML specified provider: ofi+verbs;ofi_rxm
And the log is attached; it is a standalone server. Additionally, I created two identical virtual machines with IB SR-IOV pass-through; however, one VM can't start due to the same problem, while the other starts normally. They were both using ofi+verbs;ofi_rxm as the provider.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Rosenzweig, Joel B
Hello Shengyu,
It appears that your daos_server.yml is specifying the provider as “ofi+verbs” but I think it should be set to “ofi+verbs;ofi_rxm”. Can you configure your daos_server.yml with that and try again? If things still do not work, please provide:
1) the network scan output again after you make the provider change
2) the portion of the debug log that shows the environment variables being provided to the daos_io_server. This will show us what is being set for OFI_INTERFACE and OFI_DOMAIN in the daos_io_server environment.
3) the daos_server.yml so I can see how you have configured each daos_io_server.
Regards, Joel
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
Here is logs:
Scanning fabric for YML specified provider: ofi+verbs
And please see my inline comments.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi,
Cc’ing Joel. Those errors indicate that you still have some network setup issue. Could you please run daos_server network scan?
e.g.:
[root@wolf-118 ~]# daos_server network scan -p "ofi+verbs;ofi_rxm"
Scanning fabric for cmdline specified provider: ofi+verbs;ofi_rxm
Fabric scan found 3 devices matching the provider spec: ofi+verbs;ofi_rxm
fabric_iface: ib0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
fabric_iface: ib1 provider: ofi+verbs;ofi_rxm pinned_numa_node: 1
fabric_iface: eth0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
As for the NVMe issue, do I understand correctly that:
- The PCI addresses of the 8 NVMe SSDs show up fine via daos_server storage scan?
[Shengyu] Yes.
- You have reduced the number of huge pages to 4096 pages (8GB) and all the SPDK errors related to failed huge pages allocation are gone, as well as this error from the log:
[Shengyu] Yes; it only reported that the PCI addr was not found and couldn't start, and reducing to 1 NVMe seems to get past this step.
- But the io_server fails to start after dmg storage format?
[Shengyu] Yes.
I had a second look at daos_control2.log and noticed that you are using 20 targets while you have 8 SSDs and 10 cores. Could you please try with #targets = #SSDs = 8 and set nr_xs_helpers to 0?
[Shengyu] This config file was copied from a physical machine, so maybe there is some problem; several days ago it and the physical server were working. I tried your suggestion; it is still the same while formatting and can't start again. daos_io_server:0 Using legacy core allocation algorithm
Cheers, Johann
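(Johann's suggestion expressed as a daos_server.yml fragment; only the two relevant keys of the io server section are shown, and the rest of the existing section stays unchanged.)
servers:
-
  targets: 8
  nr_xs_helpers: 0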
From:
<daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
For the first issue (daos_control.log), it is a physical machine with 8 NVMes specified; the daos server reports that it can't find the PCI address (which produced the log). When I use only one NVMe, the daos server can start normally; however, it stops after dmg storage format, just like the second issue (daos_control2.log).
DEBUG 02:18:19.457860 config.go:378: Active config saved to /root/daos/install/etc/.daos_server.active.yml (read-only)
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Shengyu,
Mike looked into the logs you provided and noticed that:
## ERROR: requested 40960 hugepages but only 8585 could be allocated. ## Memory might be heavily fragmented. Please try flushing the system cache, or reboot the machine.
Maybe you meant to specify 4096 in the yaml file instead of 40960 for nr_hugepages? It sounds like you are trying to allocate 82GB of RAM for hugepages.
Cheers, Johann
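(If 4096 pages was the intent, the corresponding daos_server.yml setting would be the line below; with 2 MB huge pages that is roughly 8 GB.)
nr_hugepages: 4096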
From:
<daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
There seem to be multiple messages about not being able to meet the required number of hugepages requested. I was not involved in the work, but there have been changes in how huge page allocation is performed for each of the daos_io_server instances, and we are now performing an automatic storage prepare when starting daos_server.
What I would suggest is to reboot the nodes (in case the issues are to do with memory fragmentation) and try with a smaller number of drives configured in the config file. Don't try to do a storage prepare before running daos_server (as it will be performed on start-up anyway). And please update to recent master before trying. You could also try to bump the number of huge pages specified in the server config file to maybe 8192?
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello,
Sorry I forgot attachments.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
I couldn't check it at this time, since I can't start the daos server (newest master code) on my two different machines. I assume there have been lots of modifications in the control path, so there could be some new issues:
1. daos_server storage prepare --nvme-only: all my NVMe disks switched to uio as expected, and a subsequent storage scan can see those disks as expected as well. However, when I start the daos server, it reports an error that it can't find the PCI address, and all NVMes switch back to the kernel driver; see daos_control.log.
2. On another machine, it just stopped after being formatted; see daos_control2.log.
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Thanks for the logs Shengyu. It does not seem to be related to wrong endpoint addresses. The client did find the server, but the latter returned DER_NONEXIST when connecting to the pool. It might be the same problem fixed recently by PR #1701. Could you please apply the patch or try with latest master?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
Here are the log files. My steps:
1. Create a fresh environment, start the daos server, and then format it.
2. Create a pool.
3. Create a container.
4. List the container: daos container query --svc=0 --path=/tmp/mycontainer; that works great.
5. Ctrl+C to kill the daos server.
6. Restart the daos server.
7. Repeat step 4; the daos process hangs in an infinite loop (the commands are written out below).
Regards, Shengyu.
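(The reproduction steps above, reduced to the commands that appear in this thread; pool and container creation in steps 2-3 are done the usual way and are omitted here.)
daos container query --svc=0 --path=/tmp/mycontainer    # step 4: works on the freshly created container
# steps 5-6: Ctrl+C the daos_server process, then start it again
daos container query --svc=0 --path=/tmp/mycontainer    # step 7: no response, the client spins in fi_tsend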
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hm, the issue with server restart might be due to the endpoint address of the servers not being persistent. Could you please collect full debug logs for the fresh start with reformat and the subsequent restart?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello,
@Johann, the newest version works on a newly formatted and created pool; yes, I did set it in the yaml. @Kal, no, I still hit the issue after io_server restart; there seem to be issues after loading existing data.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Kevan & Shengyu,
Could you please advise what commit hash you use? Also, are you specifying in “fabric_iface_port” in the yaml file?
Cheers, Johann
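(For reference, fabric_iface_port sits next to fabric_iface in the per-server section of daos_server.yml; the port number below is the one seen earlier in this thread and is only an example.)
servers:
-
  fabric_iface: ib0
  fabric_iface_port: 31416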
From:
<daos@daos.groups.io>
on behalf of Kevan Rehm <kevan.rehm@...>
I have not had any success, I see the same failure sequence as Shengyu, and due to other commitments I've had to set this aside for a few days. Hope to get back to it in a week or so.
Kevan
On 1/8/20, 2:15 PM, "daos@daos.groups.io on behalf of Alfizah, Kurniawan" <daos@daos.groups.io on behalf of kurniawan.alfizah@...> wrote:
Hello Shengyu and Kevan, I'm wondering if you have resolved this problem and that DAOS is working well with IB.
Cheers, Kal
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang Sent: Saturday, December 28, 2019 2:46 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Kevan,
Yes, it's exactly the same problem that I hit: the rdma_get_cm_event function issues a write system call to receive the completion event from the IB device, but it gets nothing at this time and always returns EAGAIN, which causes the fi_tsend function to loop infinitely.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Saturday, December 28, 2019 12:30 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
I think we are going to need help from the experts, I am not familiar with this code. I tried the same commands that you mentioned in your last email, and they also hang for me. But I do not see an infinite loop; rather the daos process just hangs forever in a write() request. Is that what you see as well?
Experts: is there any documentation on CaRT, what it is for, internals? I have not been able to find anything.
The last entries in the daos_server_srv1.log file are just the ds_mgmt_drpc_get_attach_info() log messages, called from daos_agent.
While the daos command was hung, I sent a kill -6 signal to it to collect the corefile. It seems like the command has attempted to set up a MSG connection to the daos_io_server, but has not received a completion event. The dest_addr=0 looks a little suspicious in the fi_tsend() call. Hopefully others will recognize what the problem is in the backtrace below, otherwise I will keep digging as time permits.
Thanks, Kevan
(gdb) bt
#0  0x00007fa7fd5749cd in write () from /lib64/libc.so.6
#1  0x00007fa7fac2d875 in rdma_get_cm_event.part.13 () from /lib64/librdmacm.so.1
#2  0x00007fa7fb7fd856 in fi_ibv_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1
#3  0x00007fa7fb82360f in rxm_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1
#4  0x00007fa7fb8252af in rxm_msg_eq_progress () from /home/users/daos/daos/install/lib/libfabric.so.1
#5  0x00007fa7fb82542d in rxm_cmap_connect () from /home/users/daos/daos/install/lib/libfabric.so.1
#6  0x00007fa7fb82b5eb in rxm_ep_tsend () from /home/users/daos/daos/install/lib/libfabric.so.1
#7  0x00007fa7fc59f3e8 in fi_tsend (context=0x1814558, tag=1, dest_addr=0, desc=0x1811ab0, len=332, buf=0x7fa7f6a10038, ep=0x17f03c0) at /home/users/daos/daos/install/include/rdma/fi_tagged.h:114
#8  na_ofi_msg_send_unexpected (na_class=0x17d6250, context=0x180f760, callback=<optimized out>, arg=<optimized out>, buf=0x7fa7f6a10038, buf_size=332, plugin_data=0x1811ab0, dest_addr=0x18ba5e0, dest_id=0 '\000', tag=1, op_id=0x1811a48) at /home/users/daos/daos/_build.external/mercury/src/na/na_ofi.c:3745
#9  0x00007fa7fc7b79ff in NA_Msg_send_unexpected (op_id=0x1811a48, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0x1811920, callback=0x7fa7fc7b9a60 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/na/na.h:1506
#10 hg_core_forward_na (hg_core_handle=0x1811920) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:2076
#11 0x00007fa7fc7bb5e6 in HG_Core_forward (handle=0x1811920, callback=callback@entry=0x7fa7fc7b0890 <hg_core_forward_cb>, arg=arg@entry=0x1814730, flags=<optimized out>, payload_size=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4775
#12 0x00007fa7fc7b41f7 in HG_Forward (handle=0x1814730, callback=callback@entry=0x7fa7fd8b2980 <crt_hg_req_send_cb>, arg=arg@entry=0x18b9190, in_struct=in_struct@entry=0x18b91b0) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2165
#13 0x00007fa7fd8b9e39 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_hg.c:1191
#14 0x00007fa7fd90a8ea in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104
#15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_rpc.c:1173
#16 0x00007fa7fd90ef23 in crt_req_hg_addr_lookup_cb (hg_addr=0x18ba590, arg=0x18b9190) at src/cart/crt_rpc.c:569
#17 0x00007fa7fd8b1062 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290
#18 0x00007fa7fc7b0985 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:454
#19 0x00007fa7fc7bbce2 in hg_core_trigger_lookup_entry (hg_core_op_id=0x18ba530) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3444
#20 hg_core_trigger (context=0x180d590, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3384
#21 0x00007fa7fc7bca4b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4900
#22 0x00007fa7fc7b44ed in HG_Trigger (context=context@entry=0x17d6370, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2262
#23 0x00007fa7fd8b37ea in crt_hg_trigger (hg_ctx=hg_ctx@entry=0x18083d8) at src/cart/crt_hg.c:1327
#24 0x00007fa7fd8bce5d in crt_hg_progress (hg_ctx=hg_ctx@entry=0x18083d8, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1360
#25 0x00007fa7fd86dfbb in crt_progress (crt_ctx=0x18083c0, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fa7fe61f7f0 <ev_progress_cb>, arg=arg@entry=0x7ffe185f89d0) at src/cart/crt_context.c:1286
#26 0x00007fa7fe624bb6 in daos_event_priv_wait () at src/client/api/event.c:1203
#27 0x00007fa7fe627f96 in dc_task_schedule (task=0x181b4e0, instant=instant@entry=true) at src/client/api/task.c:139
#28 0x00007fa7fe626eb1 in daos_pool_connect (uuid=uuid@entry=0x7ffe185f8c38 "UXh}E<Kĭ<\035\340\332O", <incomplete sequence \325>, grp=0x1805f50 "daos_server", svc=0x1806000, flags=flags@entry=1, poh=poh@entry=0x7ffe185f8c48, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53
#29 0x000000000040590d in pool_query_hdlr (ap=0x7ffe185f8c20) at src/utils/daos_hdlr.c:141
#30 0x0000000000402bc4 in main (argc=5, argv=<optimized out>) at src/utils/daos.c:957
(gdb)
On 12/25/19, 2:11 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
Here are the log files. By "restart server" I mean that daos_server was restarted, whether by killing the process with Ctrl+C or by rebooting the node; either way, after daos_server restarts the existing containers can no longer be touched. This is 100% reproducible in my environment.
dmg storage query smd -I -> works.
daos container query --svc=0 --path=/tmp/mycontainer -> no response (infinite loop).
daos container create ... -> no response (infinite loop).
With the sockets provider I did not hit this issue.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Wednesday, December 25, 2019 3:30 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
When you say "restart server", do you mean that you rebooted the node, or that you just restarted the daos_server process? Could you send another daos_control.log and daos_server.log from when it fails in this way?
Kevan
On 12/23/19, 11:34 PM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
After some more testing the issue is actually still there. I can get IB to work with the following steps: restart the subnet manager, rm -rf /mnt/daos, start daos_server, re-format, create a pool, create a container.
However, once I restart the server the infinite-loop problem happens again, and there is no way I can connect to an existing pool via IB.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Saturday, December 21, 2019 6:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
As it happens, I also had a case today using infiniband where my daos_test client was in an infinite loop, it generated 200 million lines in daos.log within a minute or so. It turned out that the IB subnet manager process had died. I restarted opensm, then re-ran daos_test, and it started to work. I mention it in case it might be the same problem as yours. Are you sure your subnet manager is working? Try a fi_pingpong test; if it works, then your subnet manager is okay, that's not the problem.
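For anyone hitting the same thing, a rough way to run those checks (assuming opensm is managed by systemd and the verbs provider is in use; adjust for your own fabric manager) is:
systemctl status opensm        # is the subnet manager actually running?
ibstat                         # ports should report State: Active, Physical state: LinkUp
fi_pingpong -p verbs           # on one node (acts as the server side)
fi_pingpong -p verbs <server>  # on a second node, pointing at the first node's address
If the pingpong completes its iterations, basic verbs connectivity and the subnet manager are fine.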
Thanks, Kevan
On 12/19/19, 12:48 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Joel,
Thanks for your information; attached are the outputs of the test with and without rxm specified. I also corrected the NUMA setting. I suspected a mismatch issue with the IB devices, so I tried unbinding all the other devices (ib1, ib2, ib3) and also tried removing rxm. The same problem happens in every case: the daos process just hangs in fi_send due to the infinite loop.
After setting FI_LOG_LEVEL=debug and creating a container, I found something interesting; it shows:
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
libfabric:54383:verbs:core:ofi_check_domain_attr():525<info> Unknown domain name
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Supported: mlx5_0-xrc
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Requested: mlx5_0
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
Then I changed the environment domain name to mlx5_0-xrc, and there is another message saying it can't find the ofi+verbs provider, just like before. And by the way, I think the daos app should fail when connecting to the pool rather than looping infinitely in fi_send.
Best Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 19, 2019 12:15 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Based on your logs, the control plane is successfully matching ib0 with mlx5_0. It shows "DEBUG 02:06:24.772249 netdetect.go:536: Device alias for ib0 is mlx5_0" As such, it correctly sets OFI_DOMAIN=mlx5_0. This matches with your topology data as reported by lstopo. Your results should not change if you manually are setting OFI_DOMAIN=mlx5_0 or not, because the control plane is already doing the right thing and you are not giving it a conflicting override. If you find that the behavior changes when you specify OFI_DOMAIN=mlx5_0 in your daos_server.yml, that's a problem we would need to debug.
Your topology shows these 4 interface cards / device combinations (mlx5_0:ib0, mlx5_1:ib1, mlx5_2:ib2 and mlx5_3:ib3).
PCI 15b3:1013 (P#548864 busid=0000:86:00.0 class=0207(IB) link=15.75GB/s PCISlot=5) Network L#7 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:48 Port=1) "ib0" OpenFabrics L#8 (NodeGUID=b859:9f03:0005:b548 SysImageGUID=b859:9f03:0005:b548 Port1State=4 Port1LID=0x4 Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005 :b548) "mlx5_0"
PCI 15b3:1013 (P#548865 busid=0000:86:00.1 class=0207(IB) link=15.75GB/s PCISlot=5) Network L#9 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:49 Port=1) "ib1" OpenFabrics L#10 (NodeGUID=b859:9f03:0005:b549 SysImageGUID=b859:9f03:0005:b548 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03: 0005:b549) "mlx5_1"
PCI 15b3:1013 (P#716800 busid=0000:af:00.0 class=0207(IB) link=15.75GB/s PCISlot=6) Network L#11 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:40 Port=1) "ib2" OpenFabrics L#12 (NodeGUID=b859:9f03:0005:b540 SysImageGUID=b859:9f03:0005:b540 Port1State=4 Port1LID=0x1b Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:00 05:b540) "mlx5_2"
PCI 15b3:1013 (P#716801 busid=0000:af:00.1 class=0207(IB) link=15.75GB/s PCISlot=6) Network L#13 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:41 Port=1) "ib3" OpenFabrics L#14 (NodeGUID=b859:9f03:0005:b541 SysImageGUID=b859:9f03:0005:b540 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03: 0005:b541) "mlx5_3"
I agree that fi_info shows that the provider: ofi+verbs;ofi_rxm is valid for ib0, and the control plane agrees with that. "DEBUG 02:06:20.594910 netdetect.go:764: Device ib0 supports provider: ofi+verbs;ofi_rxm" This is only a guess, but I have to rule it out. I am not certain that the provider ofi_rxm is being handled properly. Can you remove "ofi_rxm" from your provider configuration and try again?
That is, in your daos_server.yml, set: provider: ofi+verbs
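For reference, a minimal sketch of how that looks in daos_server.yml (the neighbouring keys are illustrative, taken from elsewhere in this thread rather than from your file):
provider: ofi+verbs
fabric_iface: ib0
fabric_iface_port: 31416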
If you still have an error, then you will want to run the cart diagnostics again that Alex wrote about so we can see the latest results with that.
>> orterun --allow-run-as-root -np 2 -x >> CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x >> OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
and
>> orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs" -x >> OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e >> tests/iv_server -v 3
One last thing, while this won't generate a runtime error _yet_, you have a mismatch between the pinned_numa_node:0 and the actual NUMA node of your ib0 device. Your topology data shows it as NUMA node 1. If you run "daos_server network scan -a" it should show you that the correct pinned_numa_node is 1. By setting it to the wrong NUMA node, you will have a performance impact once this is running because the daos_io_server will bind the threads to cores in the wrong NUMA node. The plan is to make a validation error like this a hard error instead of a warning. There's debug output in your daos_control.log that looks like this:
DEBUG 02:06:20.595012 netdetect.go:901: Validate network config -- given numaNode: 0 DEBUG 02:06:20.872053 netdetect.go:894: ValidateNUMAConfig (device: ib0, NUMA: 0) returned error: The NUMA node for device ib0 does not match the provided value 0. Remove the pinned_numa_node value from daos_server.yml then execute 'daos_server network scan' to see the valid NUMA node associated with the network device
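In other words, either delete the pinned value or set it to the node reported by the scan; as a sketch (verify against your own "daos_server network scan" output):
pinned_numa_node: 1    # ib0 is on NUMA node 1 according to the topology above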
I'll keep working on it with Alex. Let's see what you find.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 18, 2019 4:25 AM Subject: FW: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I'm wondering if you have any updates on this. I'm also a developer and am very familiar with SPDK/DPDK/ibverbs, but I'm not familiar with the other projects that DAOS depends on; if you can give me some hints or guidance, I would like to try troubleshooting this issue as well. Also, if you need an environment to reproduce the issue, you may connect to our machine to debug.
Regards, Shengyu
-----Original Message----- From: Shengyu SY19 Zhang Sent: Thursday, December 12, 2019 4:05 PM Subject: RE: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Yes, please see it in the attachment.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 12, 2019 11:08 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you generate a topology file for me, similar to what I asked Kevan to provide? You can generate it with "lstopo --of xml > topology.xml". I am interested in seeing if your system also has device siblings with same/different ports as the one specified as the OFI_INTERFACE.
Your debug log shows that DAOS chose the sibling of ib0 as mlx5_0. It's correct that it picked something in the mlxN_N family, but, depending on your topology there could be a better device to choose, possibly one that has a port match. Your topology file will show if mlx5_0 matches the port or not, and similarly to Kevan's, it will help me develop a better function to find the correct matching sibling.
I don't know why cart failed once it had the OFI_DEVICE that we thought was correct. Alex's experiment with Kevan will help with that.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 11, 2019 2:09 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see the files in the attachment. After changing some configurations, daos is now trapped while creating a container (no return from rdma_get_cm_event); here is the backtrace:
(gdb) bt #0 0x00007fb8d6ef59cd in write () from /lib64/libc.so.6 #1 0x00007fb8d45a99ad in rdma_get_cm_event.part.18 () from /lib64/librdmacm.so.1 #2 0x00007fb8d517f836 in fi_ibv_eq_read () from /root/daos/install/lib/libfabric.so.1 #3 0x00007fb8d51a556f in rxm_eq_read () from /root/daos/install/lib/libfabric.so.1 #4 0x00007fb8d51a720f in rxm_msg_eq_progress () from /root/daos/install/lib/libfabric.so.1 #5 0x00007fb8d51a738d in rxm_cmap_connect () from /root/daos/install/lib/libfabric.so.1 #6 0x00007fb8d51ad54b in rxm_ep_tsend () from /root/daos/install/lib/libfabric.so.1 #7 0x00007fb8d5f1e468 in fi_tsend (context=0xbc4018, tag=1, dest_addr=0, desc=0xbaf6f0, len=384, buf=0x7fb8d235c038, ep=0xbb03a0) at /root/daos/install/include/rdma/fi_tagged.h:114 #8 na_ofi_msg_send_unexpected (na_class=0xacf220, context=0xbbf200, callback=<optimized out>, arg=<optimized out>, buf=0x7fb8d235c038, buf_size=384, plugin_data=0xbaf6f0, dest_addr=0xc6a1b0, dest_id=0 '\000', tag=1, op_id=0xbc14e8) at /root/daos/_build.external/mercury/src/na/na_ofi.c:3622 #9 0x00007fb8d613885f in NA_Msg_send_unexpected (op_id=0xbc14e8, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0xbc13c0, callback=0x7fb8d613a8c0 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /root/daos/_build.external/mercury/src/na/na.h:1485 #10 hg_core_forward_na (hg_core_handle=0xbc13c0) at /root/daos/_build.external/mercury/src/mercury_core.c:2076 #11 0x00007fb8d613c3a6 in HG_Core_forward (handle=0xbc13c0, callback=callback@entry=0x7fb8d6131740 <hg_core_forward_cb>, arg=arg@entry=0xbc41f0, flags=<optimized out>, payload_size=<optimized out>) at /root/daos/_build.external/mercury/src/mercury_core.c:4748 #12 0x00007fb8d6135057 in HG_Forward (handle=0xbc41f0, callback=callback@entry=0x7fb8d7233930 <crt_hg_req_send_cb>, arg=arg@entry=0xc68cb0, in_struct=in_struct@entry=0xc68cd0) at /root/daos/_build.external/mercury/src/mercury.c:2147 #13 0x00007fb8d723ade9 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_hg.c:1190 #14 0x00007fb8d728b89a in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104 #15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_rpc.c:1173 #16 0x00007fb8d728fed3 in crt_req_hg_addr_lookup_cb (hg_addr=0xc6a150, arg=0xc68cb0) at src/cart/crt_rpc.c:569 #17 0x00007fb8d7232012 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290 #18 0x00007fb8d6131835 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /root/daos/_build.external/mercury/src/mercury.c:454 #19 0x00007fb8d613caa2 in hg_core_trigger_lookup_entry (hg_core_op_id=0xc6a0f0) at /root/daos/_build.external/mercury/src/mercury_core.c:3444 #20 hg_core_trigger (context=0xbbd030, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:3384 #21 0x00007fb8d613d80b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:4873 #22 0x00007fb8d613534d in HG_Trigger (context=context@entry=0xacf1e0, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury.c:2244 #23 0x00007fb8d723479a 
in crt_hg_trigger (hg_ctx=hg_ctx@entry=0xbb7e78) at src/cart/crt_hg.c:1326 #24 0x00007fb8d723de0d in crt_hg_progress (hg_ctx=hg_ctx@entry=0xbb7e78, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1359 #25 0x00007fb8d71eef6b in crt_progress (crt_ctx=0xbb7e60, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fb8d7fa0230 <ev_progress_cb>, arg=arg@entry=0x7fff3d703980) at src/cart/crt_context.c:1286 #26 0x00007fb8d7fa55f6 in daos_event_priv_wait () at src/client/api/event.c:1203 #27 0x00007fb8d7fa89d6 in dc_task_schedule (task=0xbcafa0, instant=instant@entry=true) at src/client/api/task.c:139 #28 0x00007fb8d7fa78f1 in daos_pool_connect (uuid=uuid@entry=0x7fff3d703b68 "\265\215%f\036AN\354\203\002\317I\035\362\067\273", grp=0xbb59f0 "daos_server", svc=0xbb5aa0, flags=flags@entry=2, poh=poh@entry=0x7fff3d703b78, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53 #29 0x0000000000404eb0 in cont_op_hdlr (ap=ap@entry=0x7fff3d703b50) at src/utils/daos.c:610 #30 0x0000000000402b94 in main (argc=8, argv=<optimized out>) at src/utils/daos.c:957
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Tuesday, December 10, 2019 8:48 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
It's hard to further diagnose without the logs. Can you share your latest daos_server.yml, full daos_control.log and full server.log? In the daos_server.yml, please set control_log_mask: DEBUG and in the io server section, set log_mask: DEBUG
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Tuesday, December 10, 2019 1:42 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I hadn't set OFI_DOMAIN=mlx5_0 before; following Alex's hint I set it yesterday, and now there is another error while creating a container: mgmt ERR src/mgmt/cli_mgmt.c:325 get_attach_info() GetAttachInfo unsuccessful: 2
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Monday, December 9, 2019 11:47 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With your latest setup, can you launch DAOS and send your latest daos_control.log? In particular, I want to see how the daos_io_server environment variables are set. For example these two lines below show the command line args and the environment variables in use with an ofi+sockets/ib1 config. I'm looking to see if we are setting OFI_DOMAIN=mlx5_0 in your environment. I seem to recall that your earlier logs did have this set, but since builds have changed since then, it's worth checking out one more time.
DEBUG 16:33:15.711949 exec.go:112: daos_io_server:1 args: [-t 8 -x 2 -g daos_server -d /tmp/daos_sockets -s /mnt/daos1 -i 1842327892 -p 1 -I 1] DEBUG 16:33:15.711963 exec.go:113: daos_io_server:1 env: [CRT_TIMEOUT=30 FI_SOCKETS_CONN_TIMEOUT=2000 D_LOG_FILE=/tmp/server.log CRT_CTX_SHARE_ADDR=0 FI_SOCKETS_MAX_CONN_RETRY=1 D_LOG_MASK=ERR CRT_PHY_ADDR_STR=ofi+sockets OFI_INTERFACE=ib1 OFI_PORT=31416 DAOS_MD_CAP=1024]
If we are not setting OFI_DOMAIN=mlx5_0 automatically, go ahead and edit your daos_server.yml and add OFI_DOMAIN=mlx5_0 to the env_vars section (per below) and see what you get. If that doesn't get you going, please send the log for that run, too.
env_vars:
  - OFI_DOMAIN=mlx5_0
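For context, a sketch of where that sits in the io server section of daos_server.yml (indentation and surrounding keys are illustrative, not copied from your file):
servers:
  - fabric_iface: ib0
    env_vars:
      - OFI_DOMAIN=mlx5_0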
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 5:03 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Actually this is a really good result - the two servers were able to exchange RPCs. That's the end of the test -- just Ctrl+C out of it.
I'll let Joel take over for daos help on how to make daos use mlx5_0 domain.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Please see the new log below, and the test seems dead (no more response):
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x7fdd21092400 valid_mask = 0x3) [afa1][[44100,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x195aad0 valid_mask = 0x3) [afa1][[44100,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument -------------------------------------------------------------------------- WARNING: There was an error initializing an OpenFabrics device.
Local host: afa1 Local device: mlx5_0 -------------------------------------------------------------------------- 12/09-03:53:20.32 afa1 CaRT[101464/101464] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically 12/09-03:53:20.32 afa1 CaRT[101465/101465] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically SRV [rank=1 pid=101465] Server starting, self_rank=1 SRV [rank=0 pid=101464] Server starting, self_rank=0 SRV [rank=1 pid=101465] >>>> Entered iv_set_ivns SRV [rank=1 pid=101465] <<<< Exited iv_set_ivns:773
[afa1:101458] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init [afa1:101458] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:30 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Given your output can you try this now?
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
Please note the additional OFI_DOMAIN environment variable.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:25 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here are the outputs of fi_info related to verbs:
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:08 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Thanks, Can you also provide full fi_info output?
~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:05 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here are the outputs:
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3 ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x2094f90 valid_mask = 0x3) [afa1][[18544,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0xacda60 valid_mask = 0x3) [afa1][[18544,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument -------------------------------------------------------------------------- WARNING: There was an error initializing an OpenFabrics device.
Local host: afa1 Local device: mlx5_0 -------------------------------------------------------------------------- 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0" 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0" 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin 12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class. 12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin 12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class. 12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020. [afa1:92262] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init [afa1:92262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Saturday, December 7, 2019 1:14 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With the latest daos code and MOFED 4.6 installed, can you rerun this and show what it gives you?
source scons_local/utils/setup_local.sh cd install/Linux/TESTING orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Thursday, December 05, 2019 10:25 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Yes; however, with 4.6 the result is the same. After I upgraded the daos code to the newest master branch I got some different results: daos_io_server seems to start OK, since I can see lots of fds pointing to rdma_cm. But the daos client apparently can't connect to the server, due to the same error as before (can't find the ofi+verbs provider on ib0), as the log shows. You may find the log in the attachment; it was created via "create container".
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Wednesday, December 4, 2019 4:02 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you try installing MOFED 4.6 packages on your system? In general MOFED is required to get verbs over Mellanox working. Those packages can be found at: https://www.mellanox.com/page/mlnx_ofed_matrix?mtag=linux_sw_drivers
There is also a 4.7 version available; however, there currently seem to be a few longevity issues when using 4.7 (according to the verbs ofi maintainers).
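As a rough sketch of the install steps (the tarball name and options are typical for MOFED releases, not verified on this particular system):
tar xzf MLNX_OFED_LINUX-4.6-*.tgz
cd MLNX_OFED_LINUX-4.6-*
./mlnxofedinstall --add-kernel-support
/etc/init.d/openibd restart
ofed_info -s    # confirm the installed MOFED version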
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 9:55 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Thanks for your suggestion, here is the log:
mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored) 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1407 # na_ofi_getinfo(): fi_getinfo() failed, rc: -61(No data available) 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2816 # na_ofi_check_protocol(): na_ofi_getinfo() failed 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:302 # NA_Initialize_opt(): No suitable plugin found that matches ofi+verbs;ofi_rxm://192.168.80.120 11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR src/cart/crt_hg.c:521 crt_hg_init() Could not initialize NA class. 11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020. 11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Tuesday, November 26, 2019 12:21 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
In order to figure out what the issue is on your system, could you run the cart standalone test instead and provide the output that you get?
cd daos_dir source scons_local/utils/setup_local.sh cd install/Linux/TESTING orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Note: Depending on how you installed daos your paths might be different, so instead of cd install/Linux/TESTING you might have to cd into a different directory first, one that contains tests/iv_server. I think in your environment it will be cd /root/daos/install/TESTING/ or cd /root/daos/install/cart/TESTING.
Expected output: 11/25-15:51:48.39 wolf-55 CaRT[53295/53295] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically 11/25-15:51:48.40 wolf-55 CaRT[53296/53296] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically SRV [rank=0 pid=53295] Server starting, self_rank=0 SRV [rank=1 pid=53296] Server starting, self_rank=1 SRV [rank=1 pid=53296] >>>> Entered iv_set_ivns SRV [rank=1 pid=53296] <<<< Exited iv_set_ivns:773
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 3:28 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
As shown in output.log, there is only one version of libfabric installed on my machine, and I don't actually have any other software installed that depends on libfabric. Following your guidance to set FI_LOG_LEVEL=debug, I can see the following messages, which may be helpful:
libfabric:123445:verbs:fabric:fi_ibv_set_default_attr():1263<info> Ignoring provider default value for tx rma_iov_limit as it is greater than the value supported by domain: mlx5_0 libfabric:123445:verbs:fabric:fi_ibv_get_matching_info():1365<info> hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints ERROR: daos_io_server:0 libfabric:123445:verbs:core:fi_ibv_check_hints():231<info> Unsupported capabilities libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Supported: FI_MSG, FI_RECV, FI_SEND, FI_LOCAL_COMM, FI_REMOTE_COMM libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Requested: FI_MSG, FI_RMA, FI_READ, FI_RECV, FI_SEND, FI_REMOTE_READ ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22) ERROR: daos_io_server:0 libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_rxd libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_mrail
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Saturday, November 23, 2019 3:20 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The debug output showed me that when daos_server is started via orterun, libfabric is not finding provider support for ofi_rxm at least. I'm still wondering if you have two different versions of libfabric installed on your machine.
Can you run these commands and provide the output?
1) ldd install/bin/daos_server
2) modify your orterun command to run ldd on daos_server. For example, I run this command locally: orterun --allow-run-as-root --map-by node --mca btl tcp,self --mca oob tcp -np 1 --hostfile /home/jbrosenz/daos/hostfile --enable-recovery --report-uri /tmp/urifile ldd /home/jbrosenz/daos/install/bin/daos_server
3) which fi_info
4) ldd over each version of fi_info found
From the data you provide, I'll understand if the libfabric being used by daos_server when executed directly by you in the shell is the same libfabric being used by daos_server when executed via orterun. Your original "daos_server network scan" output showed support for ofi+verbs;ofi_rxm but your debug output showed that when daos_server was started (via orterun), libfabric could not find support for the very same providers. If there are two different versions being used with different configurations, it would explain the failure. If it's a single installation/configuration, then that will lead the debug in another direction.
Depending on what you find through 1-4, you might find it helpful to export the environment variable FI_LOG_LEVEL=debug which will instruct libfabric to output a good deal of debug info.
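For example (set it in the same shell or service environment that launches daos_server; a sketch, not a DAOS-specific requirement):
export FI_LOG_LEVEL=debug
# then start daos_server the same way as before; the libfabric debug output will show which providers are probed and why they are rejected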
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Friday, November 22, 2019 12:59 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see the files in the attachment. I have tried two machines: one shows the full set of providers in fi_info (verbs and rxm), the other doesn't show verbs, but both fail to start io_server in the same way. I found the project conflicts with the Mellanox drivers, so I removed them and used the yum packages only; however, it still doesn't work.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Friday, November 22, 2019 6:35 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you share your daos_server.yml so we can see how you enabled the provider? And, can you share the log files daos_control.log and server.log so we can see more context?
Thank you, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, November 20, 2019 9:23 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello,
Thank you for your help Alex, Joel and Kevin, I have checked those steps that you provided:
Ibstat: State: Active Physical state: LinkUp
Ifconfig: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
fi_info: verbs: version: 1.0 ofi_rxm: version: 1.0 ofi_rxd: version: 1.0
And the network is good, since I can run SPDK NVMe-oF over InfiniBand without problems. I also specified "ofi+verbs;ofi_rxm"; the same error occurred, the io_server stopped after a while, and it printed the log I provided previously.
And I noticed that no matter whether I specify ofi+verbs, ofi_rxm, or ofi+verbs;ofi_rxm, the log keeps showing: No provider found for "verbs;ofi_rxm" provider on domain "ib0". Is that the cause?
BTW: it is working under ofi+sockets.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Thursday, November 21, 2019 7:13 AM Subject: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
> However if I specify either ofi+verbs or ofi_rxm, the same error will happen, and io_server will stop. > na_ofi.c:1609 > # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
To use supported verbs provider you need to have "ofi+verbs;ofi_rxm" in the provider string.
~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Rosenzweig, Joel B Sent: Wednesday, November 20, 2019 7:37 AM Subject: Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The daos_server network scan uses information provided by libfabric to determine available devices and providers. It then cross references that list of devices with device names obtained from hwloc to convert libfabric device names (as necessary) to those you'd find via ifconfig. Therefore, if "daos_server network scan" displays a device and provider, it means that support for that via libfabric has been provided. However, as Kevin pointed out, it's possible that the device itself was down, and that could certainly generate an error like what you encountered. There's another possibility, that you might have more than one version of libfabric installed in your environment. I have run into this situation in our lab environment. You might check your target system to see if it has more than one libfabric library with different provider support.
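One quick way to check for multiple installed copies (generic commands, not DAOS-specific, and paths depend on your install) is something like:
ldconfig -p | grep libfabric                 # every libfabric.so the dynamic linker knows about
ldd $(which daos_server) | grep libfabric
ldd $(which fi_info) | grep libfabric
If the two ldd lines resolve to different libfabric paths, the network scan and the server are not using the same library.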
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Harms, Kevin via Groups.Io Sent: Wednesday, November 20, 2019 10:04 AM Subject: Re: [daos] Does DAOS support infiniband now?
Shengyu,
I have tried IB and it works. Verify the libfabric verbs provider is available.
fi_info -l
you should see these:
ofi_rxm: version: 1.0
verbs: version: 1.0
See here for details:
You might also want to confirm ib0 is in the UP state:
[root@daos01 ~]# ifconfig ib0 ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092 inet 172.25.6.101 netmask 255.255.0.0 broadcast 172.25.255.255
kevin
________________________________________ From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...> Sent: Wednesday, November 20, 2019 2:54 AM Subject: [daos] Does DAOS support infiniband now?
Hello,
I ran daos_server network scan; it shows the following: fabric_iface: ib0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 1
However if I specify either ofi+verbs or ofi_rxm, the same error will happen, and io_server will stop. na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
ib0 is a Mellanox NIC on an InfiniBand network.
Regards, Shengyu.
Re: [External] Re: [daos] Does DAOS support infiniband now?
Oganezov, Alexander A
Hi Shengyu,
After talking to mercury developer, it sounds like you still might have a mismatch between versions of cart and mercury that you are using. Can you do a fresh build of daos by first fully removing install/ and _build_external.Linux/ directories and see if that solves the issue?
Thanks, ~~Alex.
From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of
Shengyu SY19 Zhang
Sent: Tuesday, January 21, 2020 12:22 AM To: daos@daos.groups.io Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Please see the log attached.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
Shengyu,
Can you provide full log from start until first occurrence of this error?
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Alex,
I have removed mercury and the io_server seems to start normally; however, daos_server.log grows quickly and eats all my free space. It infinitely repeats these lines: 01/21-01:40:49.28 afa1 DAOS[72964/73009] hg ERR src/cart/crt_hg.c:1331 crt_hg_trigger() HG_Trigger failed, hg_ret: 18.
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Oganezov, Alexander A
One other thing Shengyu,
Can you verify that your mercury build is up to date? Based on the domain being used ("mlx5_0/192.168.80.161"), it sounds like there is a mismatch between what CaRT generates and what mercury consumes; there was a change a few weeks ago regarding how the domain is provided to the mercury level, and it feels as if an older mercury is being used.
To ensure a clean mercury build, remove libmercury* and libna* from your install/lib location, remove the _build.external-Linux/mercury directory, and recompile daos with scons --build-deps=yes --config=force install
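Spelled out as shell commands, run from the top of the daos source tree (paths follow the ones mentioned above; adjust to where your tree actually lives):
rm -f install/lib/libmercury* install/lib/libna*
rm -rf _build.external-Linux/mercury
scons --build-deps=yes --config=force install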
Thanks, ~~Alex.
From: Oganezov, Alexander A
Hi Shengyu,
What does this command return to you on the node where you see this strange domain name? fi_info --provider="verbs"
Thanks, ~~Alex.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
I have some more information. In the file na_ofi.c at line 1609, the io_server exits; in the code just above it, the na_ofi_verify_provider function compares domain names. The domain is "mlx5_0/192.168.80.161" while prov->domain_attr.name is "mlx5_0", so it will always return FALSE.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Joel,
Network scan: Scanning fabric for YML specified provider: ofi+verbs;ofi_rxm
The log is attached; it is a standalone server. Additionally, I created two identical virtual machines with IB SR-IOV pass-through; however, one VM can't start due to the same problem, while the other starts normally. Both were using ofi+verbs;ofi_rxm as the provider.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Rosenzweig, Joel B
Hello Shengyu,
It appears that your daos_server.yml is specifying the provider as "ofi+verbs", but I think it should be set to "ofi+verbs;ofi_rxm". Can you configure your daos_server.yml with that and try again? If things still do not work, then please provide:
1) the network scan output again after you make the provider change
2) the portion of the debug log that shows the environment variables being provided to the daos_io_server. This will show us what is being set for OFI_INTERFACE, OFI_DOMAIN in the daos_io_server environment.
3) the daos_server.yml so I can see how you have configured each daos_io_server.
Regards, Joel
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
Here are the logs:
Scanning fabric for YML specified provider: ofi+verbs
And please see my inline comments.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi,
Cc’ing Joel. Those errors indicate that you still have some network setup issue. Could you please run daos_server network scan?
e.g.:
[root@wolf-118 ~]# daos_server network scan -p "ofi+verbs;ofi_rxm"
Scanning fabric for cmdline specified provider: ofi+verbs;ofi_rxm
Fabric scan found 3 devices matching the provider spec: ofi+verbs;ofi_rxm
fabric_iface: ib0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
fabric_iface: ib1 provider: ofi+verbs;ofi_rxm pinned_numa_node: 1
fabric_iface: eth0 provider: ofi+verbs;ofi_rxm pinned_numa_node: 0
As for the NVMe issue, do I understand correctly that:
- The PCI addresses of the 8 NVMe SSDs show up fine via daos_server storage scan? [Shengyu] Yes.
- You have reduced the number of huge pages to 4096 pages (8GB) and all the SPDK errors related to failed huge page allocation are gone, as well as this error from the log? [Shengyu] Yes; it only reported that the PCI address could not be found and would not start, and reducing to 1 NVMe seems to get past this step.
- But the io_server fails to start after dmg storage format? [Shengyu] Yes.
I had a second look at daos_control2.log and noticed that you are using 20 targets while you have 8 SSDs and 10 cores. Could you please try with #targets = #SSDs = 8 and set nr_xs_helpers to 0? [Shengyu] This config file was copied from the physical machine, so maybe that is the problem; several days ago both it and the physical server were working. I tried your suggestion and it is still the same: it fails while formatting and can't start again, with daos_io_server:0 Using legacy core allocation algorithm
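For reference, the suggested change (one target per SSD, no helper XS) would look roughly like this in the per-server section of daos_server.yml (a sketch; the rest of the section stays as it is):
targets: 8
nr_xs_helpers: 0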
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
For the first issue (daos_control.log): it is a physical machine with 8 NVMes specified, and daos_server reports that it can't find the PCI address (that run created the log). When I then use only one NVMe, daos_server can start normally; however, it stops after dmg storage format, just like in the second issue (daos_control2.log).
DEBUG 02:18:19.457860 config.go:378: Active config saved to /root/daos/install/etc/.daos_server.active.yml (read-only)
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Shengyu,
Mike looked into the logs you provided and noticed that:
## ERROR: requested 40960 hugepages but only 8585 could be allocated. ## Memory might be heavily fragmented. Please try flushing the system cache, or reboot the machine.
Maybe you meant to specify 4096 in the yaml file instead of 40960 for nr_hugepages? It sounds like you are trying to allocate 82GB of RAM for hugepages.
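That is, in daos_server.yml (assuming 2 MB hugepages, so 4096 pages is about 8 GB):
nr_hugepages: 4096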
Cheers, Johann
From:
<daos@daos.groups.io> on behalf of "Nabarro, Tom" <tom.nabarro@...>
There seem to be multiple messages about not being able to allocate the required number of hugepages. I was not involved in the work, but there have been changes in how huge page allocation is performed for each of the daos_io_server instances, and we now perform an automatic storage prepare when starting daos_server.
What I would suggest is to reboot the nodes (in case the issue is due to memory fragmentation) and try with a smaller number of drives configured in the config file. Don't do a storage prepare before running daos_server (it will be performed on start-up anyway), and please update to a recent master before trying. You could also try bumping the number of huge pages specified in the server config file to maybe 8192.
Regards, Tom Nabarro – DCG/ESAD M: +44 (0)7786 260986 Skype: tom.nabarro
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello,
Sorry I forgot attachments.
Regards, Shengyu
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Shengyu SY19 Zhang
Hello Johann,
I couldn’t check it at this time, since I can’t start daos server (newest code of master) on my two different machines, I assume there are lots of modification in control path, there could be some new issues: 1. Daos_server storage prepare –-nvme-only, all my NVMe disks switched to uio as expected, then issue storage scan, can see those disks as expected as well. However, when I start daos server, it report error that reporting can’t find PCI address, and all NVMs switched to kernel driver, see daos_control.log 2. On another machine, it just stopped after being formatted, see doas_control2.log
Regards, Shengyu,
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Thanks for the logs Shengyu. It does not seem to be related to wrong endpoint addresses. The client did find the server, but the latter returned DER_NONEXIST when connecting to the pool. It might be the same problem fixed recently by PR #1701. Could you please apply the patch or try with the latest master?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello Johann,
Here are the log files. My steps:
1. Create a fresh environment, start daos_server, and then format it.
2. Create a pool.
3. Create a container.
4. List the container: daos container query --svc=0 --path=/tmp/mycontainer; that works great.
5. Ctrl+C to kill daos_server.
6. Restart daos_server.
7. Repeat step 4; the daos process hangs in an infinite loop.
Regards, Shengyu.
From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hm, the issue with server restart might be due to the endpoint address of the servers not being persistent. Could you please collect full debug logs for the fresh start with reformat and the subsequent restart?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Shengyu SY19 Zhang <zhangsy19@...>
Hello,
@Johann: the newest version does work with a newly formatted and newly created pool; and yes, I did set it in the yaml. @Kal: no, I still hit the issue after an io_server restart; there seem to be issues after loading existing data.
Regards, Shengyu. From:
daos@daos.groups.io <daos@daos.groups.io>
On Behalf Of Lombardi, Johann
Hi Kevan & Shengyu,
Could you please advise what commit hash you are using? Also, are you specifying "fabric_iface_port" in the yaml file?
Cheers, Johann
From:
<daos@daos.groups.io>
on behalf of Kevan Rehm <kevan.rehm@...>
I have not had any success, I see the same failure sequence as Shengyu, and due to other commitments I've had to set this aside for a few days. Hope to get back to it in a week or so.
Kevan
On 1/8/20, 2:15 PM, "daos@daos.groups.io on behalf of Alfizah, Kurniawan" <daos@daos.groups.io on behalf of kurniawan.alfizah@...> wrote:
Hello Shengyu and Kevan, I'm wondering whether you have resolved this problem and whether DAOS is now working well with IB.
Cheers, Kal
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang Sent: Saturday, December 28, 2019 2:46 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Kevan,
Yes, it's exactly the same problem that I am hitting: the rdma_get_cm_event function issues a write system call to receive the completion event from the IB device, but it gets nothing back at that point and always returns EAGAIN, which causes the fi_tsend function to loop infinitely.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Saturday, December 28, 2019 12:30 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
I think we are going to need help from the experts, I am not familiar with this code. I tried the same commands that you mentioned in your last email, and they also hang for me. But I do not see an infinite loop; rather the daos process just hangs forever in a write() request. Is that what you see as well?
Experts: is there any documentation on CaRT, what it is for, internals? I have not been able to find anything.
The last entries in the daos_server_srv1.log file are just the ds_mgmt_drpc_get_attach_info() log messages, called from daos_agent.
While the daos command was hung, I sent a kill -6 signal to it to collect the corefile. It seems like the command has attempted to set up a MSG connection to the daos_io_server, but has not received a completion event. The dest_addr=0 looks a little suspicious in the fi_tsend() call. Hopefully others will recognize what the problem is in the backtrace below, otherwise I will keep digging as time permits.
Thanks, Kevan
(gdb) bt #0 0x00007fa7fd5749cd in write () from /lib64/libc.so.6 #1 0x00007fa7fac2d875 in rdma_get_cm_event.part.13 () from /lib64/librdmacm.so.1 #2 0x00007fa7fb7fd856 in fi_ibv_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1 #3 0x00007fa7fb82360f in rxm_eq_read () from /home/users/daos/daos/install/lib/libfabric.so.1 #4 0x00007fa7fb8252af in rxm_msg_eq_progress () from /home/users/daos/daos/install/lib/libfabric.so.1 #5 0x00007fa7fb82542d in rxm_cmap_connect () from /home/users/daos/daos/install/lib/libfabric.so.1 #6 0x00007fa7fb82b5eb in rxm_ep_tsend () from /home/users/daos/daos/install/lib/libfabric.so.1 #7 0x00007fa7fc59f3e8 in fi_tsend (context=0x1814558, tag=1, dest_addr=0, desc=0x1811ab0, len=332, buf=0x7fa7f6a10038, ep=0x17f03c0) at /home/users/daos/daos/install/include/rdma/fi_tagged.h:114 #8 na_ofi_msg_send_unexpected (na_class=0x17d6250, context=0x180f760, callback=<optimized out>, arg=<optimized out>, buf=0x7fa7f6a10038, buf_size=332, plugin_data=0x1811ab0, dest_addr=0x18ba5e0, dest_id=0 '\000', tag=1, op_id=0x1811a48) at /home/users/daos/daos/_build.external/mercury/src/na/na_ofi.c:3745 #9 0x00007fa7fc7b79ff in NA_Msg_send_unexpected (op_id=0x1811a48, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0x1811920, callback=0x7fa7fc7b9a60 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/na/na.h:1506 #10 hg_core_forward_na (hg_core_handle=0x1811920) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:2076 #11 0x00007fa7fc7bb5e6 in HG_Core_forward (handle=0x1811920, callback=callback@entry=0x7fa7fc7b0890 <hg_core_forward_cb>, arg=arg@entry=0x1814730, flags=<optimized out>, payload_size=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4775 #12 0x00007fa7fc7b41f7 in HG_Forward (handle=0x1814730, callback=callback@entry=0x7fa7fd8b2980 <crt_hg_req_send_cb>, arg=arg@entry=0x18b9190, in_struct=in_struct@entry=0x18b91b0) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2165 #13 0x00007fa7fd8b9e39 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_hg.c:1191 #14 0x00007fa7fd90a8ea in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104 #15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0x18b9190) at src/cart/crt_rpc.c:1173 #16 0x00007fa7fd90ef23 in crt_req_hg_addr_lookup_cb (hg_addr=0x18ba590, arg=0x18b9190) at src/cart/crt_rpc.c:569 #17 0x00007fa7fd8b1062 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290 #18 0x00007fa7fc7b0985 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:454 #19 0x00007fa7fc7bbce2 in hg_core_trigger_lookup_entry (hg_core_op_id=0x18ba530) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3444 #20 hg_core_trigger (context=0x180d590, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:3384 #21 0x00007fa7fc7bca4b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury_core.c:4900 #22 0x00007fa7fc7b44ed in HG_Trigger (context=context@entry=0x17d6370, 
timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7ffe185f88dc) at /home/users/daos/daos/_build.external/mercury/src/mercury.c:2262 #23 0x00007fa7fd8b37ea in crt_hg_trigger (hg_ctx=hg_ctx@entry=0x18083d8) at src/cart/crt_hg.c:1327 #24 0x00007fa7fd8bce5d in crt_hg_progress (hg_ctx=hg_ctx@entry=0x18083d8, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1360 #25 0x00007fa7fd86dfbb in crt_progress (crt_ctx=0x18083c0, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fa7fe61f7f0 <ev_progress_cb>, arg=arg@entry=0x7ffe185f89d0) at src/cart/crt_context.c:1286 #26 0x00007fa7fe624bb6 in daos_event_priv_wait () at src/client/api/event.c:1203 #27 0x00007fa7fe627f96 in dc_task_schedule (task=0x181b4e0, instant=instant@entry=true) at src/client/api/task.c:139 #28 0x00007fa7fe626eb1 in daos_pool_connect (uuid=uuid@entry=0x7ffe185f8c38 "UXh}E<Kĭ<\035\340\332O", <incomplete sequence \325>, grp=0x1805f50 "daos_server", svc=0x1806000, flags=flags@entry=1, poh=poh@entry=0x7ffe185f8c48, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53 #29 0x000000000040590d in pool_query_hdlr (ap=0x7ffe185f8c20) at src/utils/daos_hdlr.c:141 #30 0x0000000000402bc4 in main (argc=5, argv=<optimized out>) at src/utils/daos.c:957 (gdb)
On 12/25/19, 2:11 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
Here are the log files. By "restart server" I mean restarting daos_server, whether the process is killed with Ctrl+C or the server is rebooted; either way, after daos_server restarts the existing containers can't be touched. It is 100% reproducible in my environment.
dmg storage query smd -I -> works.
daos container query --svc=0 --path=/tmp/mycontainer -> no response due to infinite loop.
daos container create... -> no response due to infinite loop.

With sockets mode, I didn't hit this issue.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Wednesday, December 25, 2019 3:30 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
When you say "restart server", do you mean that you rebooted the node, or that you just restarted the daos_server process? Could you send another daos_control.log and daos_server.log from when it fails in this way?
Kevan
On 12/23/19, 11:34 PM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Kevan,
After some more testing, the issue is actually still there. I can get IB to work with the following steps:
restart the subnet manager
rm -rf /mnt/daos
start daos_server
re-format
create pool
create container

However, once I restart the server, the infinite-loop problem happens again, and I can't connect to an existing pool via IB by any means.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Kevan Rehm Sent: Saturday, December 21, 2019 6:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Shengyu,
As it happens, I also had a case today using InfiniBand where my daos_test client was in an infinite loop; it generated 200 million lines in daos.log within a minute or so. It turned out that the IB subnet manager process had died. I restarted opensm, then re-ran daos_test, and it started to work. I mention it in case it might be the same problem as yours. Are you sure your subnet manager is working? Try a fi_pingpong test; if it works, then your subnet manager is okay and that's not the problem.
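If you want a quick way to run that check, it is roughly this (a sketch; the provider string and the address placeholder below are just examples, adjust them to your fabric):

# on the first node, start the server side
fi_pingpong -p "verbs;ofi_rxm"
# on the second node, point the client at the first node's ib0 address
fi_pingpong -p "verbs;ofi_rxm" <ib0-address-of-first-node>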
Thanks, Kevan
On 12/19/19, 12:48 AM, "daos@daos.groups.io on behalf of Shengyu SY19 Zhang" <daos@daos.groups.io on behalf of zhangsy19@...> wrote:
Hello Joel,
Thanks for the information; attached are the outputs of the test with and without rxm specified. I also corrected the NUMA setting. I suspected a mismatch issue with the IB devices, so I tried unbinding all the other devices (ib1, ib2, ib3) and also tried removing rxm. The same problem happens in every case: the daos process just hangs in fi_send due to the infinite loop.
After setting FI_LOG_LEVEL=debug and creating a container, I found something interesting; it shows:
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
libfabric:54383:verbs:core:ofi_check_domain_attr():525<info> Unknown domain name
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Supported: mlx5_0-xrc
libfabric:54383:verbs:core:ofi_check_domain_attr():526<info> Requested: mlx5_0
libfabric:54383:verbs:core:ofi_check_ep_type():629<info> unsupported endpoint type
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Supported: FI_EP_DGRAM
libfabric:54383:verbs:core:ofi_check_ep_type():630<info> Requested: FI_EP_MSG
Then I changed the domain name in the environment to mlx5_0-xrc, and there was another message, can't find ofi+verbs provider, just like before. And BTW, I think the daos app should fail when connecting to the pool rather than loop infinitely in fi_send.
Best Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 19, 2019 12:15 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Based on your logs, the control plane is successfully matching ib0 with mlx5_0. It shows "DEBUG 02:06:24.772249 netdetect.go:536: Device alias for ib0 is mlx5_0". As such, it correctly sets OFI_DOMAIN=mlx5_0. This matches your topology data as reported by lstopo. Your results should not change whether or not you manually set OFI_DOMAIN=mlx5_0, because the control plane is already doing the right thing and you are not giving it a conflicting override. If you find that the behavior changes when you specify OFI_DOMAIN=mlx5_0 in your daos_server.yml, that's a problem we would need to debug.
Your topology shows these 4 interface cards / device combinations (mlx5_0:ib0, mlx5_1:ib1, mlx5_2:ib2 and mlx5_3:ib3).
PCI 15b3:1013 (P#548864 busid=0000:86:00.0 class=0207(IB) link=15.75GB/s PCISlot=5)
  Network L#7 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:48 Port=1) "ib0"
  OpenFabrics L#8 (NodeGUID=b859:9f03:0005:b548 SysImageGUID=b859:9f03:0005:b548 Port1State=4 Port1LID=0x4 Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b548) "mlx5_0"

PCI 15b3:1013 (P#548865 busid=0000:86:00.1 class=0207(IB) link=15.75GB/s PCISlot=5)
  Network L#9 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:49 Port=1) "ib1"
  OpenFabrics L#10 (NodeGUID=b859:9f03:0005:b549 SysImageGUID=b859:9f03:0005:b548 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b549) "mlx5_1"

PCI 15b3:1013 (P#716800 busid=0000:af:00.0 class=0207(IB) link=15.75GB/s PCISlot=6)
  Network L#11 (Address=20:00:10:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:40 Port=1) "ib2"
  OpenFabrics L#12 (NodeGUID=b859:9f03:0005:b540 SysImageGUID=b859:9f03:0005:b540 Port1State=4 Port1LID=0x1b Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b540) "mlx5_2"

PCI 15b3:1013 (P#716801 busid=0000:af:00.1 class=0207(IB) link=15.75GB/s PCISlot=6)
  Network L#13 (Address=20:00:18:8b:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:05:b5:41 Port=1) "ib3"
  OpenFabrics L#14 (NodeGUID=b859:9f03:0005:b541 SysImageGUID=b859:9f03:0005:b540 Port1State=1 Port1LID=0xffff Port1LMC=0 Port1GID0=fe80:0000:0000:0000:b859:9f03:0005:b541) "mlx5_3"
I agree that fi_info shows that the provider: ofi+verbs;ofi_rxm is valid for ib0, and the control plane agrees with that. "DEBUG 02:06:20.594910 netdetect.go:764: Device ib0 supports provider: ofi+verbs;ofi_rxm" This is only a guess, but I have to rule it out. I am not certain that the provider ofi_rxm is being handled properly. Can you remove "ofi_rxm" from your provider configuration and try again?
That is, in your daos_server.yml set: provider: ofi+verbs
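For reference, the relevant line in daos_server.yml would then look roughly like this (a minimal sketch showing only the field being changed; everything else in your file stays as it is):

provider: ofi+verbs
# fabric_iface: ib0 stays as you already have it, in the io server section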
If you still have an error, then you will want to run the cart diagnostics again that Alex wrote about so we can see the latest results with that.
>> orterun --allow-run-as-root -np 2 -x >> CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x >> OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
and
>> orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs" -x >> OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e >> tests/iv_server -v 3
One last thing: while this won't generate a runtime error _yet_, you have a mismatch between the pinned_numa_node:0 and the actual NUMA node of your ib0 device. Your topology data shows it as NUMA node 1. If you run "daos_server network scan -a" it should show you that the correct pinned_numa_node is 1. By setting it to the wrong NUMA node, you will see a performance impact once this is running, because the daos_io_server will bind its threads to cores in the wrong NUMA node. The plan is to make a validation error like this a hard error instead of a warning. There's debug output in your daos_control.log that looks like this:
DEBUG 02:06:20.595012 netdetect.go:901: Validate network config -- given numaNode: 0
DEBUG 02:06:20.872053 netdetect.go:894: ValidateNUMAConfig (device: ib0, NUMA: 0) returned error: The NUMA node for device ib0 does not match the provided value 0. Remove the pinned_numa_node value from daos_server.yml then execute 'daos_server network scan' to see the valid NUMA node associated with the network device
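In other words, once you know the right node from the scan, the io server section would carry something like this (a sketch; alternatively just drop the line and let the scan guide you):

pinned_numa_node: 1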
I'll keep working on it with Alex. Let's see what you find.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 18, 2019 4:25 AM Subject: FW: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I'm wondering if you have any updates on this? I'm also a developer and I'm very familiar with spdk/dpdk/ibverbs; however, I'm not familiar with the other projects that DAOS depends on. If you can give me some hints or guidance, I would like to try troubleshooting this issue as well. Also, if you need an environment to reproduce the issue, you may connect to our machine for debugging.
Regards, Shengyu
-----Original Message----- From: Shengyu SY19 Zhang Sent: Thursday, December 12, 2019 4:05 PM Subject: RE: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Yes, please see it in the attachment.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Thursday, December 12, 2019 11:08 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you generate a topology file for me, similar to what I asked Kevan to provide? You can generate it with "lstopo --of xml > topology.xml". I am interested in seeing whether your system also has device siblings with the same/different ports as the one specified as the OFI_INTERFACE.
Your debug log shows that DAOS chose the sibling of ib0 as mlx5_0. It's correct that it picked something in the mlxN_N family, but, depending on your topology there could be a better device to choose, possibly one that has a port match. Your topology file will show if mlx5_0 matches the port or not, and similarly to Kevan's, it will help me develop a better function to find the correct matching sibling.
I don't know why cart failed once it had the OFI_DEVICE that we thought was correct. Alex's experiment with Kevan will help with that.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, December 11, 2019 2:09 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see those files in the attachment. After changing some configurations, daos gets stuck (no return from rdma_get_cm_event) while creating a container; here is the bt log:
(gdb) bt
#0 0x00007fb8d6ef59cd in write () from /lib64/libc.so.6
#1 0x00007fb8d45a99ad in rdma_get_cm_event.part.18 () from /lib64/librdmacm.so.1
#2 0x00007fb8d517f836 in fi_ibv_eq_read () from /root/daos/install/lib/libfabric.so.1
#3 0x00007fb8d51a556f in rxm_eq_read () from /root/daos/install/lib/libfabric.so.1
#4 0x00007fb8d51a720f in rxm_msg_eq_progress () from /root/daos/install/lib/libfabric.so.1
#5 0x00007fb8d51a738d in rxm_cmap_connect () from /root/daos/install/lib/libfabric.so.1
#6 0x00007fb8d51ad54b in rxm_ep_tsend () from /root/daos/install/lib/libfabric.so.1
#7 0x00007fb8d5f1e468 in fi_tsend (context=0xbc4018, tag=1, dest_addr=0, desc=0xbaf6f0, len=384, buf=0x7fb8d235c038, ep=0xbb03a0) at /root/daos/install/include/rdma/fi_tagged.h:114
#8 na_ofi_msg_send_unexpected (na_class=0xacf220, context=0xbbf200, callback=<optimized out>, arg=<optimized out>, buf=0x7fb8d235c038, buf_size=384, plugin_data=0xbaf6f0, dest_addr=0xc6a1b0, dest_id=0 '\000', tag=1, op_id=0xbc14e8) at /root/daos/_build.external/mercury/src/na/na_ofi.c:3622
#9 0x00007fb8d613885f in NA_Msg_send_unexpected (op_id=0xbc14e8, tag=<optimized out>, dest_id=<optimized out>, dest_addr=<optimized out>, plugin_data=<optimized out>, buf_size=<optimized out>, buf=<optimized out>, arg=0xbc13c0, callback=0x7fb8d613a8c0 <hg_core_send_input_cb>, context=<optimized out>, na_class=<optimized out>) at /root/daos/_build.external/mercury/src/na/na.h:1485
#10 hg_core_forward_na (hg_core_handle=0xbc13c0) at /root/daos/_build.external/mercury/src/mercury_core.c:2076
#11 0x00007fb8d613c3a6 in HG_Core_forward (handle=0xbc13c0, callback=callback@entry=0x7fb8d6131740 <hg_core_forward_cb>, arg=arg@entry=0xbc41f0, flags=<optimized out>, payload_size=<optimized out>) at /root/daos/_build.external/mercury/src/mercury_core.c:4748
#12 0x00007fb8d6135057 in HG_Forward (handle=0xbc41f0, callback=callback@entry=0x7fb8d7233930 <crt_hg_req_send_cb>, arg=arg@entry=0xc68cb0, in_struct=in_struct@entry=0xc68cd0) at /root/daos/_build.external/mercury/src/mercury.c:2147
#13 0x00007fb8d723ade9 in crt_hg_req_send (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_hg.c:1190
#14 0x00007fb8d728b89a in crt_req_send_immediately (rpc_priv=<optimized out>) at src/cart/crt_rpc.c:1104
#15 crt_req_send_internal (rpc_priv=rpc_priv@entry=0xc68cb0) at src/cart/crt_rpc.c:1173
#16 0x00007fb8d728fed3 in crt_req_hg_addr_lookup_cb (hg_addr=0xc6a150, arg=0xc68cb0) at src/cart/crt_rpc.c:569
#17 0x00007fb8d7232012 in crt_hg_addr_lookup_cb (hg_cbinfo=<optimized out>) at src/cart/crt_hg.c:290
#18 0x00007fb8d6131835 in hg_core_addr_lookup_cb (callback_info=<optimized out>) at /root/daos/_build.external/mercury/src/mercury.c:454
#19 0x00007fb8d613caa2 in hg_core_trigger_lookup_entry (hg_core_op_id=0xc6a0f0) at /root/daos/_build.external/mercury/src/mercury_core.c:3444
#20 hg_core_trigger (context=0xbbd030, timeout=<optimized out>, timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:3384
#21 0x00007fb8d613d80b in HG_Core_trigger (context=<optimized out>, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury_core.c:4873
#22 0x00007fb8d613534d in HG_Trigger (context=context@entry=0xacf1e0, timeout=timeout@entry=0, max_count=max_count@entry=4294967295, actual_count=actual_count@entry=0x7fff3d70388c) at /root/daos/_build.external/mercury/src/mercury.c:2244
#23 0x00007fb8d723479a in crt_hg_trigger (hg_ctx=hg_ctx@entry=0xbb7e78) at src/cart/crt_hg.c:1326
#24 0x00007fb8d723de0d in crt_hg_progress (hg_ctx=hg_ctx@entry=0xbb7e78, timeout=timeout@entry=1000) at src/cart/crt_hg.c:1359
#25 0x00007fb8d71eef6b in crt_progress (crt_ctx=0xbb7e60, timeout=timeout@entry=-1, cond_cb=cond_cb@entry=0x7fb8d7fa0230 <ev_progress_cb>, arg=arg@entry=0x7fff3d703980) at src/cart/crt_context.c:1286
#26 0x00007fb8d7fa55f6 in daos_event_priv_wait () at src/client/api/event.c:1203
#27 0x00007fb8d7fa89d6 in dc_task_schedule (task=0xbcafa0, instant=instant@entry=true) at src/client/api/task.c:139
#28 0x00007fb8d7fa78f1 in daos_pool_connect (uuid=uuid@entry=0x7fff3d703b68 "\265\215%f\036AN\354\203\002\317I\035\362\067\273", grp=0xbb59f0 "daos_server", svc=0xbb5aa0, flags=flags@entry=2, poh=poh@entry=0x7fff3d703b78, info=info@entry=0x0, ev=ev@entry=0x0) at src/client/api/pool.c:53
#29 0x0000000000404eb0 in cont_op_hdlr (ap=ap@entry=0x7fff3d703b50) at src/utils/daos.c:610
#30 0x0000000000402b94 in main (argc=8, argv=<optimized out>) at src/utils/daos.c:957
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Tuesday, December 10, 2019 8:48 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
It's hard to further diagnose without the logs. Can you share your latest daos_server.yml, full daos_control.log and full server.log? In the daos_server.yml, please set control_log_mask: DEBUG and in the io server section, set log_mask: DEBUG
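Concretely, that is just these two settings (a sketch; the rest of your daos_server.yml stays unchanged):

control_log_mask: DEBUG
# and in the io server section:
log_mask: DEBUG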
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Tuesday, December 10, 2019 1:42 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
I didn't set OFI_DOMAIN=mlx5_0 before; following Alex's hint, I set it yesterday. Now there is another error while creating a container:

mgmt ERR src/mgmt/cli_mgmt.c:325 get_attach_info() GetAttachInfo unsuccessful: 2
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Monday, December 9, 2019 11:47 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With your latest setup, can you launch DAOS and send your latest daos_control.log? In particular, I want to see how the daos_io_server environment variables are set. For example these two lines below show the command line args and the environment variables in use with an ofi+sockets/ib1 config. I'm looking to see if we are setting OFI_DOMAIN=mlx5_0 in your environment. I seem to recall that your earlier logs did have this set, but since builds have changed since then, it's worth checking out one more time.
DEBUG 16:33:15.711949 exec.go:112: daos_io_server:1 args: [-t 8 -x 2 -g daos_server -d /tmp/daos_sockets -s /mnt/daos1 -i 1842327892 -p 1 -I 1]
DEBUG 16:33:15.711963 exec.go:113: daos_io_server:1 env: [CRT_TIMEOUT=30 FI_SOCKETS_CONN_TIMEOUT=2000 D_LOG_FILE=/tmp/server.log CRT_CTX_SHARE_ADDR=0 FI_SOCKETS_MAX_CONN_RETRY=1 D_LOG_MASK=ERR CRT_PHY_ADDR_STR=ofi+sockets OFI_INTERFACE=ib1 OFI_PORT=31416 DAOS_MD_CAP=1024]
If we are not setting OFI_DOMAIN=mlx5_0 automatically, go ahead and edit your daos_server.yml and add OFI_DOMAIN=mlx5_0 to the env_vars section (per below) and see what you get. If that doesn't get you going, please send the log for that run, too.
env_vars:
  - OFI_DOMAIN=mlx5_0
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 5:03 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Actually, this is a really good result - the two servers were able to exchange RPCs. That's the end of the test -- just Ctrl+C out of it.
I'll let Joel take over for daos help on how to make daos use mlx5_0 domain.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:57 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Please see the new log below; the test seems to be dead (no more response):
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x7fdd21092400 valid_mask = 0x3)
[afa1][[44100,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x195aad0 valid_mask = 0x3)
[afa1][[44100,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

Local host: afa1
Local device: mlx5_0
--------------------------------------------------------------------------
12/09-03:53:20.32 afa1 CaRT[101464/101464] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically
12/09-03:53:20.32 afa1 CaRT[101465/101465] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically
SRV [rank=1 pid=101465] Server starting, self_rank=1
SRV [rank=0 pid=101464] Server starting, self_rank=0
SRV [rank=1 pid=101465] >>>> Entered iv_set_ivns
SRV [rank=1 pid=101465] <<<< Exited iv_set_ivns:773

[afa1:101458] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
[afa1:101458] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:30 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Given your output can you try this now?
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 -x OFI_DOMAIN=mlx5_0 ../bin/crt_launch -e tests/iv_server -v 3
Please note the additional OFI_DOMAIN environment variable.
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:25 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here are the outputs of fi_info related to verbs:
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_RC
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-xrc version: 1.0 type: FI_EP_MSG protocol: FI_PROTO_RDMA_CM_IB_XRC
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD
provider: verbs fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_DGRAM protocol: FI_PROTO_IB_UD
provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_2 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: verbs;ofi_rxm fabric: IB-0xfe80000000000000 domain: mlx5_0 version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: tcp;ofi_rxm fabric: TCP-IP domain: tcp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXM
provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_2-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: verbs;ofi_rxd fabric: IB-0xfe80000000000000 domain: mlx5_0-dgram version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
provider: UDP;ofi_rxd fabric: UDP-IP domain: udp version: 1.0 type: FI_EP_RDM protocol: FI_PROTO_RXD
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Monday, December 9, 2019 4:08 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Thanks. Can you also provide the full fi_info output?
~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, December 09, 2019 12:05 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Here are the outputs:
orterun --allow-run-as-root -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3

ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0x2094f90 valid_mask = 0x3)
[afa1][[18544,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0xacda60 valid_mask = 0x3)
[afa1][[18544,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Invalid argument
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

Local host: afa1
Local device: mlx5_0
--------------------------------------------------------------------------
12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0
12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin
12/09-03:01:03.53 afa1 CaRT[92269/92269] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class.
12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020.
12/09-03:01:03.53 afa1 CaRT[92269/92269] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020.
12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2975 # na_ofi_initialize(): Could not open domain for verbs;ofi_rxm, ib0
12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:324 # NA_Initialize_opt(): Could not initialize plugin
12/09-03:01:03.53 afa1 CaRT[92268/92268] hg ERR src/cart/crt_hg.c:525 crt_hg_init() Could not initialize NA class.
12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020.
12/09-03:01:03.53 afa1 CaRT[92268/92268] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020.
[afa1:92262] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
[afa1:92262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Saturday, December 7, 2019 1:14 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
With the latest daos code and MOFED 4.6 installed, can you rerun this and show what it gives you?
source scons_local/utils/setup_local.sh
cd install/Linux/TESTING
orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Thursday, December 05, 2019 10:25 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Yes; however, with 4.6 the result is the same. After I upgraded the daos code to the newest master branch, I got somewhat different results: the daos io server seems to start OK, since I can see lots of fds pointing to rdma_cm. But the daos client seems unable to connect to the server due to the same error (can't find ofi+verbs provider on ib0), as the log shows. You can find the log in the attachment; it was created via "create container".
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Wednesday, December 4, 2019 4:02 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you try installing MOFED 4.6 packages on your system? In general MOFED is required to get verbs over Mellanox working. Those packages can be found at: https://www.mellanox.com/page/mlnx_ofed_matrix?mtag=linux_sw_drivers
There is also a 4.7 version available; however, there seem to be a few longevity issues currently when using 4.7 (according to the verbs ofi maintainers).
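Roughly, installing from the downloaded bundle looks like this (a sketch; the exact archive name depends on the MOFED version and distro you pick):

tar xzf MLNX_OFED_LINUX-4.6-*.tgz
cd MLNX_OFED_LINUX-4.6-*
./mlnxofedinstall
# restart the driver stack (or reboot) afterwards, e.g. /etc/init.d/openibd restart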
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 9:55 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Alex,
Thanks for your suggestion, here is the log:
mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:1407 # na_ofi_getinfo(): fi_getinfo() failed, rc: -61(No data available)
11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na_ofi.c:2816 # na_ofi_check_protocol(): na_ofi_getinfo() failed
11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR # NA -- Error -- /root/daos/_build.external/mercury/src/na/na.c:302 # NA_Initialize_opt(): No suitable plugin found that matches ofi+verbs;ofi_rxm://192.168.80.120
11/26-00:40:22.65 afa1 CaRT[365504/365504] hg ERR src/cart/crt_hg.c:521 crt_hg_init() Could not initialize NA class.
11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:347 crt_init_opt() crt_hg_init failed rc: -1020.
11/26-00:40:22.65 afa1 CaRT[365504/365504] crt ERR src/cart/crt_init.c:421 crt_init_opt() crt_init failed, rc: -1020.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Tuesday, November 26, 2019 12:21 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
In order to figure out what the issue is on your system, could you run the cart standalone test instead and provide the output that you get?
cd daos_dir
source scons_local/utils/setup_local.sh
cd install/Linux/TESTING
orterun -np 2 -x CRT_PHY_ADDR_STR="ofi+verbs;ofi_rxm" -x OFI_INTERFACE=ib0 ../bin/crt_launch -e tests/iv_server -v 3
Note: Depending on how you installed daos, your paths might be different, so instead of cd install/Linux/TESTING you might have to cd into a different directory first, wherever tests/iv_server is located. I think in your env it will be cd /root/daos/install/TESTING/ or cd /root/daos/install/cart/TESTING.
Expected output:
11/25-15:51:48.39 wolf-55 CaRT[53295/53295] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically
11/25-15:51:48.40 wolf-55 CaRT[53296/53296] crt WARN src/cart/crt_init.c:270 crt_init_opt() PMIX disabled. Disabling LM automatically
SRV [rank=0 pid=53295] Server starting, self_rank=0
SRV [rank=1 pid=53296] Server starting, self_rank=1
SRV [rank=1 pid=53296] >>>> Entered iv_set_ivns
SRV [rank=1 pid=53296] <<<< Exited iv_set_ivns:773
Thanks, ~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Shengyu SY19 Zhang Sent: Monday, November 25, 2019 3:28 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
As shown in output.log, there is only one version of libfabric installed on my machine, and I don't have any other software installed that depends on libfabric. Following your guidance to set FI_LOG_LEVEL=debug, I can see the following messages, which may be helpful:
libfabric:123445:verbs:fabric:fi_ibv_set_default_attr():1263<info> Ignoring provider default value for tx rma_iov_limit as it is greater than the value supported by domain: mlx5_0
libfabric:123445:verbs:fabric:fi_ibv_get_matching_info():1365<info> hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints
ERROR: daos_io_server:0 libfabric:123445:verbs:core:fi_ibv_check_hints():231<info> Unsupported capabilities
libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Supported: FI_MSG, FI_RECV, FI_SEND, FI_LOCAL_COMM, FI_REMOTE_COMM
libfabric:123445:verbs:core:fi_ibv_check_hints():232<info> Requested: FI_MSG, FI_RMA, FI_READ, FI_RECV, FI_SEND, FI_REMOTE_READ
ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19)
ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: No such device(19)
ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22)
ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22)
ERROR: daos_io_server:0 libfabric:123445:verbs:fabric:fi_ibv_get_rai_id():179<info> rdma_bind_addr: Invalid argument(22)
ERROR: daos_io_server:0 libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_rxd
libfabric:123445:core:core:ofi_layering_ok():795<info> Need core provider, skipping ofi_mrail
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Saturday, November 23, 2019 3:20 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The debug output showed me that when daos_server is started via orterun, libfabric is not finding provider support for ofi_rxm at least. I'm still wondering if you have two different versions of libfabric installed on your machine.
Can you run these commands and provide the output?
1) ldd install/bin/daos_server
2) modify your orterun command to run ldd on daos_server. For example, I run this command locally: orterun --allow-run-as-root --map-by node --mca btl tcp,self --mca oob tcp -np 1 --hostfile /home/jbrosenz/daos/hostfile --enable-recovery --report-uri /tmp/urifile ldd /home/jbrosenz/daos/install/bin/daos_server
3) which fi_info
4) ldd over each version of fi_info found
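If it helps to narrow the comparison, you can filter both outputs down to the libfabric line, for example (trimming the orterun options to whatever you normally use):

ldd install/bin/daos_server | grep libfabric
orterun --allow-run-as-root -np 1 ldd install/bin/daos_server | grep libfabric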
From the data you provide, I'll understand if the libfabric being used by daos_server when executed directly by you in the shell is the same libfabric being used by daos_server when executed via orterun. Your original "daos_server network scan" output showed support for ofi+verbs;ofi_rxm but your debug output showed that when daos_server was started (via orterun), libfabric could not find support for the very same providers. If there are two different versions being used with different configurations, it would explain the failure. If it's a single installation/configuration, then that will lead the debug in another direction.
Depending on what you find through 1-4, you might find it helpful to export the environment variable FI_LOG_LEVEL=debug which will instruct libfabric to output a good deal of debug info.
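For example (a sketch; after exporting the variable, start daos_server, or even just run fi_info, the same way you normally do and libfabric will print its provider-selection details):

export FI_LOG_LEVEL=debug
fi_info -p verbs    # quick local check; the same variable applies when daos_server starts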
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Friday, November 22, 2019 12:59 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello Joel,
Please see those files in the attachment. I have tried two machines: one shows the full set of providers in fi_info (verbs and rxm), the other doesn't show verbs, but neither can start io_server. I found the project conflicts with the Mellanox drivers, so I removed them and used only the yum packages; however, it still doesn't work.
Regards, Shengyu
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Rosenzweig, Joel B Sent: Friday, November 22, 2019 6:35 AM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
Can you share your daos_server.yml so we can see how you enabled the provider? And, can you share the log files daos_control.log and server.log so we can see more context?
Thank you, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Shengyu SY19 Zhang Sent: Wednesday, November 20, 2019 9:23 PM Subject: Re: [External] Re: [daos] Does DAOS support infiniband now?
Hello,
Thank you for your help Alex, Joel and Kevin, I have checked those steps that you provided:
Ibstat:
  State: Active
  Physical state: LinkUp

Ifconfig:
  flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044

fi_info:
  verbs: version: 1.0
  ofi_rxm: version: 1.0
  ofi_rxd: version: 1.0
And the network is good, since I can run SPDK NVMe-oF over InfiniBand without problems. I also specified "ofi+verbs;ofi_rxm"; the same error occurred, and the io_server stops after a while, printing the log I provided previously.
And I noticed that whatever I specify (ofi+verbs, ofi_rxm, or ofi+verbs;ofi_rxm), the log keeps showing No provider found for "verbs;ofi_rxm" provider on domain "ib0". Is that the cause?
BTW: it is working under ofi+sockets.
Regards, Shengyu.
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Oganezov, Alexander A Sent: Thursday, November 21, 2019 7:13 AM Subject: [External] Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
> However if I specify either ofi+verbs or ofi_rxm, the same error will happen, and io_server will stop. > na_ofi.c:1609 > # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
To use supported verbs provider you need to have "ofi+verbs;ofi_rxm" in the provider string.
~~Alex.
-----Original Message----- From: daos@daos.groups.io [mailto:daos@daos.groups.io] On Behalf Of Rosenzweig, Joel B Sent: Wednesday, November 20, 2019 7:37 AM Subject: Re: [daos] Does DAOS support infiniband now?
Hi Shengyu,
The daos_server network scan uses information provided by libfabric to determine available devices and providers. It then cross references that list of devices with device names obtained from hwloc to convert libfabric device names (as necessary) to those you'd find via ifconfig. Therefore, if "daos_server network scan" displays a device and provider, it means that support for that via libfabric has been provided. However, as Kevin pointed out, it's possible that the device itself was down, and that could certainly generate an error like what you encountered. There's another possibility, that you might have more than one version of libfabric installed in your environment. I have run into this situation in our lab environment. You might check your target system to see if it has more than one libfabric library with different provider support.
Regards, Joel
-----Original Message----- From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Harms, Kevin via Groups.Io Sent: Wednesday, November 20, 2019 10:04 AM Subject: Re: [daos] Does DAOS support infiniband now?
Shengyu,
I have tried IB and it works. Verify the libfabric verbs provider is available.
fi_info -l
you should see these:
ofi_rxm: version: 1.0
verbs: version: 1.0
See here for details:
You might also want to confirm ib0 is in the UP state:
[root@daos01 ~]# ifconfig ib0
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
        inet 172.25.6.101 netmask 255.255.0.0 broadcast 172.25.255.255
kevin
________________________________________ From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Shengyu SY19 Zhang <zhangsy19@...> Sent: Wednesday, November 20, 2019 2:54 AM Subject: [daos] Does DAOS support infiniband now?
Hello,
I ran daos_server network scan; it shows the following:

fabric_iface: ib0
provider: ofi+verbs;ofi_rxm
pinned_numa_node: 1
However, if I specify either ofi+verbs or ofi_rxm, the same error happens and io_server stops:

na_ofi.c:1609 # na_ofi_domain_open(): No provider found for "verbs;ofi_rxm" provider on domain "ib0"
ib0 is a Mellanox NIC on an InfiniBand network.
Regards, Shengyu.