MPIO: Poor performance using StarWind 6.0


Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Tue Apr 02, 2013 2:00 pm

Dear Support Forum,

I have a simple setup to evaluate the iSCSI SAN, but I cannot manage to get reasonable performance results.

Hardware:
- iSCSI client: HP DL165, W2k8R2, using 3 dedicated NICs via MPIO. The NICs have 192.168.0.1, 192.168.1.1, 192.168.2.1; only the TCP/IP protocol is active; Microsoft iSCSI initiator
- iSCSI server: Core i3, W2k12 Storage Server, 3 dedicated NICs (192.168.0.2, 192.168.1.2, 192.168.2.2); StarWind 6.0.5228 manages a virtual image file that resides on a RAID0 of 2 Samsung 840 SSDs. Write-back cache 512 MB.

I thoroughly followed the StarWind MPIO PDF guide. Nevertheless, the ATTO benchmark results attached to this post (acquired directly on the client) are far from the results measured directly on the server. In particular, for small transfer sizes, performance is only about 10% of the native server results.
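For completeness, the MPIO claim and load-balance policy can also be expressed with the built-in mpclaim tool; the lines below are just a sketch of the equivalent command-line steps in case it helps to double-check my setup (policy number 4 corresponds to Least Queue Depth):

# enable automatic MPIO claiming of iSCSI-attached devices (requires a reboot)
mpclaim -r -i -d "MSFT2005iSCSIBusType_0x9"

# after the reboot: list the claimed MPIO disks, then set Least Queue Depth (policy 4) for all of them
mpclaim -s -d
mpclaim -l -m 4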

I am thankful for any tip...

Best
Matthias
Attachments
atto results SERVER drive: atto benchmark SERVER DRIVE.PNG
atto results CLIENT: atto benchmark.PNG
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Tue Apr 02, 2013 4:15 pm

Hi Matthias,
Could you please show how the 3 paths are displayed in the iSCSI initiator device properties?
Also, do you achieve full gigabit performance with each of the cards individually?

PS: I would also appreciate it if you could run a few IOmeter tests when you get a chance; IOmeter results are usually closer to real-load values.
Max Kolomyeytsev
StarWind Software
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Tue Apr 02, 2013 4:41 pm

Hi Max

Thank you for your reply and for this extraordinary responsiveness!

Please find attached some screenshots of the iscsi initiator.
Please note: although it is currently configured as "Least Queue Depth", we also tried "Round Robin"; it showed slightly lower performance than the former.

I no longer think the issue is related to the MPIO configuration. If I disable two of my three NIC connections in the iSCSI initiator, the results for large transfer sizes are, as expected, limited by the single 1 Gbps link; however, the small transfer sizes are still just as oddly slow (see attached screenshot).
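For completeness, this is the kind of check I used to make sure only one session/path was left active (standard Microsoft initiator tooling, nothing StarWind-specific):

# list the active iSCSI sessions and their connections
iscsicli SessionList

# list the MPIO-claimed disks and the number of paths per disk
mpclaim -s -d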

Could it be some issue related to the use of Win Server 2012 on the server machine?

Anyway, I will check out IOmeter, although the performance issue is obvious regardless of the benchmarking tool.

Thank you

Matthias
Attachments
DISABLED MPIO: Atto NO mpio.PNG
mpio config (iSCSI initiator): mpio config.PNG
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Tue Apr 02, 2013 5:30 pm

Maybe it is time to add some further network information:

- Server: 3 Realtek 8168 NICs using the latest driver (8.012, 2013/3/26), Jumbo enabled.
- Switch: Cisco SGS2000 48-port, jumbo enabled. No teaming, VLANs, etc.
- Client: In the meantime I substituted my own workstation for the W2k8R2 client; my Win7 PC shows the same slow results for small transfer sizes.

Also, I tested a localhost iSCSI connection. I am not sure whether this is a meaningful test, but the results are good in this case, i.e. 5 to 10 times faster for small transfer sizes (see attached).

Best regards
Matthias
Attachments
localhost iSCSI connection: Atto localhost iSCSI connection.PNG
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Tue Apr 02, 2013 6:11 pm

Hi,
I am not very familiar with IOmeter, but here are some results.

Hope this helps.

Best regards
Matthias
Attachments
results.zip (IOmeter results)
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Apr 03, 2013 2:40 pm

Hi Matthias,
Localhost iSCSI connection looks good.
If you're running ATTO, it's also a good idea to increase the queue depth to 10 and limit the block sizes to 4k-256k; bigger block sizes are not really relevant.
A few more questions:
1. Have you provisioned any cache for the iSCSI device in StarWind?
2. Do you get the same performance results with jumbo frames turned off? (A quick end-to-end check is sketched below.)
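A simple way to verify that jumbo frames really work end to end is a non-fragmenting ping just below the 9000-byte MTU (192.168.0.2 is taken from your first post; substitute the NIC you are testing):

# 8972 bytes of ICMP payload + 28 bytes of IP/ICMP headers = 9000 bytes; -f sets the don't-fragment bit
ping -f -l 8972 192.168.0.2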
Max Kolomyeytsev
StarWind Software
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Wed Apr 03, 2013 4:39 pm

Hi Max,

Thank you for your reply.

Yes, I have provisioned 512 MB write-back cache in StarWind.
Yes, I get the same results with jumbo frames turned off. To be clear: I used a crossover LAN cable to bypass the Cisco switch and disabled jumbo frames on both NICs; no performance increase.
Attachments
atto depth 10, write-back cache 512 MB: atto depth 10 - mpio -write-back.PNG
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Apr 04, 2013 11:44 am

Matthias,
The last test looks like it is using 3 connections; could you please exclude MPIO from the equation to check whether a single link is getting saturated in these tests?
Also, could you please test the link itself following the guidelines from this document (ntttcp/iperf)?
http://www.starwindsoftware.com/starwin ... ice-manual
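For a single link, an iperf run along these lines is a quick sanity check (window size and thread count are only suggestions; 192.168.0.2 is one of the server's iSCSI NICs from your first post):

# on the StarWind server (receiver):
iperf -s -w 256k

# on the client (sender), one iSCSI NIC at a time:
iperf -c 192.168.0.2 -w 256k -t 30 -P 4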

Best regards,
Max
Max Kolomyeytsev
StarWind Software
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Thu Apr 04, 2013 1:53 pm

Max,

Thank you for coming back.

I disabled MPIO and ran some benchmarks with just one NIC (see attached). Small transfer sizes remain very slow.
As I stated earlier, I also tried a different client (my workstation) without MPIO and saw the same low performance at small sizes.

I used ntttcp to check the link: 890 Mbit/s throughput and an average frame size of 8128 bytes, which looks good (see attached). At the moment it seems to me that either the StarWind / W2k12 combination or some other server-side factor (e.g. bad server NICs) is causing the issue...

At the moment, I have no clue what else could be checked.

Matthias
Attachments
ntttcp: ntttcp.PNG
atto single NIC, no MPIO: atto - SINGLE NIC - no mpio.PNG
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Apr 04, 2013 2:07 pm

4k performance is below average.
iSCSI uses 64k blocks and thus shows good results in the ATTO benchmark, although write performance is not really good either.
Normally I see 102-105 MB/s for writes using an iSCSI-attached RAM device (there should be no big difference with SSDs).
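As a rough sanity check on those numbers (back-of-the-envelope, assuming 5-10% protocol overhead): 1 Gbit/s = 125 MB/s raw, and 125 MB/s x 0.90-0.95 gives roughly 112-119 MB/s usable, so 102-105 MB/s of sequential writes is already close to line rate for a single gigabit path.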

I share your suspicion about the NICs.
I've never seen a Realtek NIC in a production environment, so it would be great if you could test the same configuration with different NICs (Intel/Broadcom preferred).
Max Kolomyeytsev
StarWind Software
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Thu Apr 04, 2013 2:17 pm

Thanks. I will come back after I have installed some other NICs...

Regards
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Thu Apr 04, 2013 2:25 pm

Anyway, in my opinion, two things remain strange:

- if it were due to a slow NIC, MPIO should still scale somehow, but it only does so for large blocks
- even for the 64k ATTO results (comparing the same queue depth, see post #1), StarWind is far from the physical performance

Matthias
kmax
Posts: 47
Joined: Thu Nov 04, 2010 3:37 pm

Thu Apr 04, 2013 7:40 pm

Have you tried the delayed ACK (Nagle) registry fix?

It's the TcpAckFrequency key, if I remember correctly.
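Something along these lines, applied to the interface key of each iSCSI NIC and followed by a reboot (the GUID below is just a placeholder for your NIC's interface GUID):

# disable delayed ACK on the iSCSI interface (1 = acknowledge every segment); reboot afterwards
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{your-iSCSI-NIC-GUID}" /v TcpAckFrequency /t REG_DWORD /d 1 /f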
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Apr 08, 2013 10:20 am

Matthias, may I ask you if you have any update for the community?
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
jeddyatcc
Posts: 49
Joined: Wed Apr 25, 2012 11:52 pm

Mon Apr 08, 2013 12:23 pm

I also recommend turning off all of the offload features on both sides of the connection, especially Large Send Offload. This has made a big difference for me as well.
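On the Windows Server 2012 side this can be scripted, for example (the adapter name below is just a placeholder; on W2k8R2/Win7 the same settings live under the NIC driver's Advanced tab):

# disable Large Send Offload (v1/v2) on the iSCSI adapter; "Ethernet 2" is a placeholder name
Disable-NetAdapterLso -Name "Ethernet 2"

# optionally turn off TCP Chimney offload globally
netsh int tcp set global chimney=disabled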