CPU NUMA node bind error


Wu Huijun
 

I found that DAOS cannot saturate the bandwidth of the IB network in our setup. We received warnings on the client side saying "No network devices bound to client NUMA node 0", so I guess this caused the sub-optimal performance.

Using commands such as daos_server network scan and daos_agent net-scan, I noticed that the only IB card is attached to NUMA node 1, while the client is somehow bound to NUMA node 0. This is also the case for the server. I tried to use the pinned_numa_node option in the server config.yml to force the server to bind to NUMA node 0. However, I got the following errors. Is there a good way to control the NUMA bindings for both the clients and the servers? Thanks to anyone who can help.

ERROR: daos_io_server:0 10/17-09:24:09.69 len-cn3 DAOS[6970/7024] bio  EMRG src/bio/bio_monitor.c:196 get_spdk_identify_ctrlr_completion() Assertion 'dev_health->bdh_io_channel != NULL' failed

daos_io_server: src/bio/bio_monitor.c:196: get_spdk_identify_ctrlr_completion: Assertion `dev_health->bdh_io_channel != ((void *)0)' failed.

ERROR: daos_io_server:0 *** Process 6970 received signal 6 ***

Associated errno: Success (0)

/usr/lib64/libpthread.so.0(+0xf5f0)[0x7f8e5adb45f0]

/usr/lib64/libc.so.6(gsignal+0x37)[0x7f8e5a164337]

/usr/lib64/libc.so.6(abort+0x148)[0x7f8e5a165a28]

ERROR: daos_io_server:0 /usr/lib64/libc.so.6(+0x2f156)[0x7f8e5a15d156]

ERROR: daos_io_server:0 /usr/lib64/libc.so.6(+0x2f202)[0x7f8e5a15d202]

/usr/local/daos/lib64/daos_srv/libbio.so(+0x1322d)[0x7f8e5b3dd22d]

/usr/local/daos/lib64/daos_srv/../../prereq/dev/spdk/lib/libspdk_thread.so.2.0(spdk_thread_poll+0xc6)[0x7f8e58dd7036]

/usr/local/daos/lib64/daos_srv/libbio.so(bio_xsctxt_free+0x28d)[0x7f8e5b3e309d]

ERROR: daos_io_server:0 /usr/local/daos/bin/daos_io_server[0x41b439]

/usr/local/daos/bin/../prereq/dev/argobots/lib/libabt.so.0(+0x1313b)[0x7f8e5ab9613b]

ERROR: daos_io_server:0 /usr/local/daos/bin/../prereq/dev/argobots/lib/libabt.so.0(+0x13811)[0x7f8e5ab96811]

instance 0 exited: instance 0 exited prematurely: /usr/local/daos/bin/daos_io_server (instance 0) exited: signal: aborted (core dumped)

ERROR: removing socket file: removing instance 0 socket file: no dRPC client set (data plane not started?)


Farrell, Patrick Arthur
 

Client NUMA binding is not controlled by DAOS; it is a function of where your client application process is running (since DAOS is just a library linked into that process).

You will have to control client NUMA binding using whatever technique you would normally use independent of DAOS. mpirun implementations generally support NUMA binding, or if you're not running an MPI app, you can use something like numactl to run your app.

For the server, the NUMA node option you describe (pinned_numa_node) is the correct method. The error you shared is not obviously related to that setting; you may want to confirm whether or not it is caused by changing the NUMA node setting.

-Patrick
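
A minimal sketch of the numactl approach mentioned above, assuming a Linux client where the interface's NUMA node can be read from sysfs. The interface name "ib0" and the binary "./my_daos_app" are placeholders, not values confirmed in this thread; the helper names are made up for illustration.

```python
from pathlib import Path
from typing import List, Optional


def nic_numa_node(iface: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the NUMA node a network interface is attached to, or None if unknown."""
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel reports -1 when the device has no NUMA affinity.
    return node if node >= 0 else None


def numactl_command(node: int, app_argv: List[str]) -> List[str]:
    """Build a numactl command line that pins CPUs and memory to one NUMA node."""
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", *app_argv]


if __name__ == "__main__":
    node = nic_numa_node("ib0")  # "ib0" is an assumed interface name
    if node is None:
        print("could not determine the NUMA node of ib0")
    else:
        # "./my_daos_app" is a placeholder for the real client binary
        print(" ".join(numactl_command(node, ["./my_daos_app"])))
```

The printed command can then be used to launch a non-MPI client; for MPI jobs the launcher's own binding options are usually the better fit.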


Wu Huijun
 

Patrick, thanks for your reply. I see. But for the server, the only change I made that triggered this error was to add "pinned_numa_node: 1" in the server config yml file...

Cheers,
Huijun


Niu, Yawei
 

The “bdh_io_channel != NULL” assertion fails because a bio poll is called after the context has been freed during error cleanup. Could you open a ticket for it? Thanks!

 

Thanks

-Niu

 

Rosenzweig, Joel B
 

Hi Huijun,

 

On the client side, the daos_agent examines the NUMA binding associated with the PID of the client application and automatically assigns the client an interface that matches that NUMA affinity. If the client is bound to a NUMA node that has no compatible network interface, or isn’t bound at all, then the agent assigns an interface from the default NUMA node. To get the best performance, you’d want to bind your client application to a NUMA node that matches one of the network interfaces available to the daos_agent running on your client node.

 

If the client is bound to a NUMA node without a compatible interface, then performance will suffer. I wrote about this in more detail in the /doc/admin/performance_tuning.md file, and there is additional information about this mechanism in the “Get Attach Info” section of /src/control/cmd/daos_agent/README.md. That said, you can choose a specific interface for the client and override the automatic selection by setting OFI_INTERFACE=… in the client environment.

 

The pinned_numa_node setting on the daos_server is separate from the settings that affect the client side. It only controls how the daos_io_server processes are bound. In the ideal case, a daos_server launches up to one daos_io_server process per NUMA node and matching network interface, and the ‘pinned_numa_node’ setting instructs that daos_io_server process to bind itself to cores with that NUMA affinity.

 

Regards,

Joel
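
The selection behaviour described above can be sketched as a toy model. This is not the daos_agent code: the function names, the dictionary layout, and the choice of the first interface on a node are all assumptions made for illustration. Only the OFI_INTERFACE variable name comes from the message above.

```python
import os
from typing import Dict, List, Optional


def pick_interface(
    client_numa: Optional[int],
    ifaces_by_numa: Dict[int, List[str]],
    default_numa: int,
) -> Optional[str]:
    """Pick a fabric interface for a client, preferring its own NUMA node."""
    # Preferred case: the client is bound to a node that has an interface.
    if client_numa is not None and ifaces_by_numa.get(client_numa):
        return ifaces_by_numa[client_numa][0]
    # Fallback: unbound client, or no interface on its node.
    fallback = ifaces_by_numa.get(default_numa)
    return fallback[0] if fallback else None


def client_env(override_iface: Optional[str] = None) -> Dict[str, str]:
    """Copy the current environment, optionally forcing OFI_INTERFACE."""
    env = dict(os.environ)
    if override_iface:
        env["OFI_INTERFACE"] = override_iface
    return env
```

With the topology reported in this thread (one IB card on NUMA node 1), pick_interface(0, {1: ["ib0"]}, default_numa=1) falls through to the default node, which corresponds to the cross-node case that the warning flags.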

 

Wu Huijun
 

Thanks Niu, but I don't think I fully understand. Does this mean that the pinned_numa_node option cannot be used in my setup? Is there any way to solve this problem? What do you mean by 'open a ticket for it'? Do you mean opening a ticket here in the daos group?

Cheers,
Huijun


Wu Huijun
 

Thanks Joel!

I am running DAOS with MPI. What I did was use -bind-to ib0 in mpirun to bind the processes to the NUMA node that hosts ib0 (am I doing this right?).

However, although the previous warning message 'No network devices bound to client NUMA node 0, using response from NUMA 1' is gone, the performance didn't improve at all... It seems the NUMA affinity of the network devices might not be what caused the performance degradation.

I was expecting the write performance to easily saturate the IB bandwidth... Could you please give some advice about what I should check, and which configuration settings may matter?

Cheers,
Huijun
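
One way to check whether a binding like the one above took effect is to have each process report which NUMA nodes its CPU affinity covers. The sketch below assumes Linux (it relies on os.sched_getaffinity and the sysfs node layout); the helper names are hypothetical.

```python
import glob
import os
from typing import Dict, Set


def parse_cpulist(text: str) -> Set[int]:
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def numa_topology(sysfs_root: str = "/sys") -> Dict[int, Set[int]]:
    """Map each NUMA node id to the CPUs it contains, read from sysfs."""
    nodes: Dict[int, Set[int]] = {}
    pattern = os.path.join(sysfs_root, "devices/system/node/node[0-9]*/cpulist")
    for path in glob.glob(pattern):
        node_dir = os.path.basename(os.path.dirname(path))  # e.g. "node1"
        with open(path) as f:
            nodes[int(node_dir[4:])] = parse_cpulist(f.read())
    return nodes


def nodes_in_use(allowed_cpus: Set[int], topology: Dict[int, Set[int]]) -> Set[int]:
    """Return the NUMA nodes that overlap the given CPU affinity mask."""
    return {node for node, cpus in topology.items() if cpus & allowed_cpus}


if __name__ == "__main__":
    # os.sched_getaffinity is Linux-only; 0 means "the calling process".
    allowed = os.sched_getaffinity(0)
    print("NUMA nodes this process may run on:",
          sorted(nodes_in_use(allowed, numa_topology())))
```

Run under the same mpirun options as the real job, each rank should report only the node that hosts the IB card if the binding worked as intended.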


Niu, Yawei
 

Hi, Huijun

 

Sorry for the confusion. I wasn’t referring to the NUMA question (which I believe was answered by others); I was just asking you to create a ticket for the particular assertion failure you observed. I then realized that the bug tracking system may not be accessible to you, so I’ll open a ticket myself and fix the assertion failure. Thanks a lot for reporting this issue.

 

Thanks

-Niu

 

Lombardi, Johann
 

Hi Huijun,

 

On our side, we are definitely able to saturate the link bandwidth with SSDs. I actually tested a similar configuration last Friday and it worked well. Maybe you should try with a single SSD first and then add more SSDs to see how the bandwidth scales?

 

Cheers,

Johann
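
The single-SSD-first test suggested above can be summarised with a small helper that compares each measurement against ideal linear scaling. This is only a sketch: the function name is hypothetical, the units are whatever the benchmark reports, and the measurements must come from your own runs.

```python
from typing import Dict, List, Tuple


def scaling_report(
    measured: Dict[int, float], link_bandwidth: float
) -> List[Tuple[int, float, float]]:
    """For each SSD count, return (count, measured, efficiency vs. ideal scaling).

    Ideal scaling is the single-SSD bandwidth times the SSD count, capped at
    the link bandwidth, since the network cannot deliver more than that.
    """
    if 1 not in measured or measured[1] <= 0:
        raise ValueError("need a positive single-SSD baseline measurement")
    baseline = measured[1]
    report = []
    for count in sorted(measured):
        ideal = min(baseline * count, link_bandwidth)
        report.append((count, measured[count], measured[count] / ideal))
    return report
```

An efficiency that drops well below 1.0 before the ideal value reaches the link bandwidth points at the storage side rather than the network.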

 

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Wu Huijun
 

Thanks, Johann,

I tried using different nodes and the bandwidth was saturated easily. It seems some SSDs were performing sub-optimally...

Cheers,
Huijun
