IO Challenge Update 3: 31,323 IOPS
May 19, 2008 – 10:35 am
We are now starting to see some of the more enterprise-league I/O rates… In contrast to the simple NVRAM-cached 3-disk configurations used in the previous posts, we’ve moved to one of the more typical SAN-based setups used for running the bigger applications. In this case, we now have an EMC CX array with over 150 disks.

Some additional commentary on what’s being tested here. The workload is running inside a virtual Windows 64-bit guest on ESX Server 3.5, and is performing I/O to a 4 Gbit fibre channel SAN, to which the CX3 array is connected.
Since the virtualization stack logically resides between the benchmark in the guest virtual machine and the backend storage, it is critical that the virtualization stack’s I/O facilities scale up without any performance ceilings and don’t add any appreciable latency.
The Windows guest is running fully virtualized, which means that it is running a local SCSI device driver atop an emulated hardware device. In the case of VMware ESX, this device is an LSI SCSI controller. The ESX virtual machine monitor emulates the LSI hardware device and then redirects those I/Os into the I/O stack in the ESX kernel. Inside the ESX kernel is the VMFS clustered file system, layered on a QLogic fibre-channel driver for the physical HBA.
The I/O subsystem in VMware ESX uses a direct driver model, so that there is minimal latency added by the virtualization stack. This is possible because I/O requests can be handled in-line by the same processor as the requesting virtual machine (other architectures add substantial latency and CPU overhead when I/O is proxied via a heavy-weight domain-0 or parent partition).
On this configuration, we saw 31,323 8 KB I/Os per second (we upped the I/O size from 4 KB in the previous runs). That’s approximately enough I/O to support 30,000 Microsoft Exchange 2003 users, or 60,000 Exchange 2007 users, using the standard “Heavy” profile in the Exchange “loadgen” benchmark. The 2007 user count is higher because the I/O rate per mailbox drops by approximately 50%, largely as a result of the additional disk cache available in the 64-bit 2007 version.
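As a rough sanity check on those numbers, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are made up for this example, the 8 KB I/O size is assumed to mean 8 KiB, and the per-user I/O rates are simply what the user counts quoted above imply, not official Exchange sizing figures.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the post.
# Assumption: "8 KB" means 8 KiB (8192 bytes) per I/O.
IOPS = 31_323
IO_SIZE_BYTES = 8 * 1024


def throughput_mib_per_s(iops: int, io_size_bytes: int) -> float:
    """Aggregate data rate implied by an I/O rate and a fixed I/O size."""
    return iops * io_size_bytes / (1024 * 1024)


def implied_iops_per_user(iops: int, users: int) -> float:
    """Per-mailbox I/O rate implied by a quoted user count (hypothetical helper)."""
    return iops / users


bandwidth = throughput_mib_per_s(IOPS, IO_SIZE_BYTES)
per_user_2003 = implied_iops_per_user(IOPS, 30_000)  # Exchange 2003 user count from the post
per_user_2007 = implied_iops_per_user(IOPS, 60_000)  # Exchange 2007 user count from the post

print(f"Throughput: {bandwidth:.1f} MiB/s")
print(f"Implied IOPS per Exchange 2003 user: {per_user_2003:.2f}")
print(f"Implied IOPS per Exchange 2007 user: {per_user_2007:.2f}")
```

Under these assumptions the run moves roughly 245 MiB/s, and the quoted user counts work out to about one I/O per second per Exchange 2003 mailbox and about half that for Exchange 2007, which matches the 50% drop described above.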


One Response to “IO Challenge Update 3: 31,323 IOPS”
Hi Richard,
I’m an Oracle DBA, one of 4 administering 24×7 Production databases totalling 4 terabytes of data.
I’ve recently come across your writings, particularly one PDF titled ‘File Systems’ for Solaris, and I am highly appreciative of the great technical depth and knowledge you displayed in your deliberation.
As an Oracle DBA, I rely heavily on our Solaris Administrators for their contributions and insights. Therefore, my knowledge and experience with Solaris is at best ‘basic’.
I intend to get your publication from Amazon together with some other definitive books like Jonathan Lewis’ on Oracle Optimizer.
Would you mind giving me your considered opinion on what is happening with our Solaris I/O sub-system?
We have asked the question several times and it seems to have gone ‘under the carpet’ for some reason. Maybe it’s too challenging, I suspect.
Oracle’s v$filestat and v$tempfile views are telling us that the number of Oracle-related I/Os over a given time (after extensive SQL tuning) has gone down significantly. However, using iostat and sar, we don’t see a corresponding decrease in demand at the OS level. Even when Oracle is going through a quiet period, the OS is still reporting high kr/s when we expect to see proportionally reduced rates.
The question is, what’s causing the apparent discrepancy? We are using a cooked file system, i.e. Veritas FS. So, is it the read-ahead that is giving us more data from the SAN than we need?
What else could we do, besides using iostat, sar, vmstat, i.e. the usual tools, to find out what might be causing the apparent discrepancy?
Your feedback much appreciated.
Thanks.
Alex
By Alex Low on May 2, 2009