IO Challenge Update 2: 4013 IOPS

May 16, 2008 – 2:50 pm
This result is on a fairly small system -- we are able to get 4013 IOPS from a Dell 1950 server with just 3 disks. You might expect a somewhat smaller number with just 3 disks (like ~750), but these servers have a RAID controller with a non-volatile cache, which is able to cache some reads and writes in memory, thus giving more IOPS than is possible from the 3 disk spindles alone. The team is off running the same tests on a much bigger array, so this certainly isn't a heroic result. To put this result in perspective, however, the average disk IOPS we see for databases (from capacity planner) is about 1200/s for 4 vCPUs. As another data point, the average number of IOPS I often saw on big SPARC/Solaris servers running transaction databases was in this order of magnitude -- sure, there are applications with much higher requirements, but more ...
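As a rough back-of-the-envelope check on the numbers above, here is a minimal sketch comparing a spindle-only estimate with the measured result. The per-disk figure of 250 IOPS is an assumption (simply the ~750 estimate divided by 3 disks), and the function name is illustrative only.

```python
# Back-of-the-envelope comparison of spindle-only IOPS vs. the measured result.
# PER_DISK_IOPS is an assumption (~750 / 3 disks), not a measured value.

PER_DISK_IOPS = 250      # assumed random IOPS for a single spindle
DISKS = 3                # disks in the server under test
MEASURED_IOPS = 4013     # result reported in the post


def spindle_iops(disks, per_disk_iops):
    """Aggregate random IOPS if every I/O had to be serviced by a spindle."""
    return disks * per_disk_iops


expected = spindle_iops(DISKS, PER_DISK_IOPS)
ratio = MEASURED_IOPS / expected

print(f"spindle-only estimate: {expected} IOPS")
print(f"measured:              {MEASURED_IOPS} IOPS")
print(f"measured / estimate:   {ratio:.1f}x")
```

Under that assumption the measured figure is roughly five times the spindle-only estimate, which is consistent with the controller cache absorbing a good share of the I/O.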

I/O Challenge Update 1 - 231 iops

May 15, 2008 – 3:44 pm
Our engineers have been working on the I/O challenge... They are using Iometer as the workload generator -- it can faithfully produce a random I/O load on Windows, and small random I/O is typical of database access. I'd love to use filebench, but we don't have a Windows port yet (any takers?). The Iometer tool has a variety of workload settings, and can be configured to drive random and sequential mixes of any desired combination of I/O sizes. Here you can see the workload selection screen which we are using to configure random 4k I/Os: For the first test, we benchmark a single disk. For this experiment, we test the random IOPS from a Seagate 15k rpm disk. As a simple baseline, we get about 230 IOPS from a single disk. At this point, we are seeing about 4ms service time, and ESX is contributing about 0.12ms of additional latency, less than ...
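The relationship between per-I/O service time and IOPS can be sketched as follows. This assumes a single outstanding I/O at a time (the excerpt does not state the queue depth), and the helper names are hypothetical; only the 230 IOPS, 4ms and 0.12ms figures come from the post.

```python
# Relating per-I/O latency to IOPS for one disk.
# Assumption: one I/O outstanding at a time (queue depth 1), so
# IOPS is simply the reciprocal of the per-I/O latency.

def iops_from_latency(latency_ms):
    """IOPS achievable when I/Os are issued strictly one after another."""
    return 1000.0 / latency_ms


def latency_from_iops(iops):
    """Per-I/O latency in milliseconds implied by a measured IOPS figure."""
    return 1000.0 / iops


DISK_SERVICE_MS = 4.0     # approximate disk service time from the post
ESX_OVERHEAD_MS = 0.12    # additional latency attributed to ESX
MEASURED_IOPS = 230       # single-disk baseline from the post

disk_only = iops_from_latency(DISK_SERVICE_MS)
with_esx = iops_from_latency(DISK_SERVICE_MS + ESX_OVERHEAD_MS)
implied_ms = latency_from_iops(MEASURED_IOPS)

print(f"disk only:          {disk_only:.0f} IOPS")
print(f"disk + ESX latency: {with_esx:.0f} IOPS")
print(f"latency implied by {MEASURED_IOPS} IOPS: {implied_ms:.2f} ms")
```

Since the post quotes "about" 4ms, these figures should be read as the same ballpark as the measured 230 IOPS rather than an exact reconciliation.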

The Grand Virtualized I/O Challenge

May 14, 2008 – 9:18 pm
We all know that the performance of I/O is critical to many enterprise-style applications. Typically, it's the throughput and latency of the I/O system that is critical to performance, especially to that of online web or transaction processing systems. Since this style of application operates on small data items at random places in the dataset, it’s more important to have good random I/O throughput (measured in I/O operations per second) than bandwidth (MB/s). Going back to disk basics, we typically look at a disk's performance in two dimensions -- the amount of data we can stream to/from the disk (its bandwidth) and the number of I/Os per second the disk can perform. Streaming data from disk is primarily limited by how fast it can be read from the head as the disk surface spins underneath -- and is related to the rotational speed of the disk and the density that the ...
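To make the IOPS-versus-bandwidth distinction concrete, here is a small sketch that converts a random I/O rate into the bandwidth it represents. The 230 IOPS and 4k I/O size are borrowed from the single-disk test in the Update 1 post; the function name is illustrative, and MB is taken as 10^6 bytes.

```python
# Converting a random I/O rate into the bandwidth it actually moves.
# Figures are borrowed from the single-disk test (random 4k I/Os, ~230 IOPS).

def bandwidth_mb_per_s(iops, io_size_bytes):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for a given I/O rate and size."""
    return iops * io_size_bytes / 1_000_000


RANDOM_IOPS = 230        # single-disk random I/O rate
IO_SIZE_BYTES = 4096     # 4k I/Os

random_mb = bandwidth_mb_per_s(RANDOM_IOPS, IO_SIZE_BYTES)
print(f"{RANDOM_IOPS} random 4k IOPS moves about {random_mb:.2f} MB/s")
```

A disk that is completely busy doing small random I/O moves under 1 MB/s here, which is why MB/s alone says little about how well a system serves a transaction-style workload.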

FileBench has a twin…

May 12, 2008 – 1:56 pm
I just saw that Neel at Sun has released a sister project to FileBench -- uPerf. uPerf uses a similar model-based approach to allow flexible, application-realistic workloads to be synthesized. The source for uPerf is available too... We look forward to researching how uPerf can be used to synthesize virtual network workloads...

Virtualization Performance Tutorial @ Usenix 2008

May 6, 2008 – 10:22 pm
I'm starting a new tutorial this year at Usenix -- all about performance and tuning of VMware ESX server. The session is on Tuesday, June 24, 2008 in Boston, and is an all-day class. Who should attend: Anyone who is involved in planning or deploying virtualization on VMware ESX and wants to understand the performance characteristics of applications in a virtualized environment. We will walk through the implications for performance and capacity planning in a virtualized world to learn how to achieve the best performance in a VMware ESX environment. Take back to work: How to plan, understand, characterize, diagnose, and tune for best application performance on VMware ESX. Topics include: Introduction to virtualization; Understanding different hardware acceleration techniques for virtualization; Diagnosing performance using VMware tools; Diagnosing performance using guest OS tools in a virtual environment; Practical limits and overheads for virtualization; Storage performance; Network throughput and options; Using Virtual-SMP; Guest Operating System Types; Understanding the characteristics of key applications, including Oracle, MS ...

Realistic Virtualization Benchmarks

May 6, 2008 – 10:01 pm
A recent comment about 'bench-marketing' caught my attention. I much prefer to see performance analysis of real-world benchmarks, because well-designed studies serve as reference examples to guide solid decisions when planning for virtualization. The specific comments were about wanting to see more examples of scale-out performance -- a term I use to describe when multiple VMs are stacked onto a single server. Given my background, one of the first things I asked about when I arrived at VMware was scaling. I was highly skeptical about how well the scheduler and virtual machine monitor could scale up and out. We ran some simple microbenchmarks using components of the SPECcpu suite on a 16-core Sun x4600 system, using 4-vcpu guests. The results were surprising, to me at least, since I know how hard it is to scale operating system algorithms. It wasn't, however, much of a surprise to the people who had ...

FileBench for Linux

December 31, 2007 – 3:36 pm
Eric and Drew's recent FileBench updates have been exceptionally helpful in making the package more complete and easier to use. We're using FileBench to do database I/O simulations on virtual machines, and I've just completed a significant update to the package to allow it to be used in full anger on Linux. The main objective of this work was two-fold: to finish an updated port to Linux, and to enable native Linux async I/O, in a mode almost identical to that used by the major database vendors. A brief summary of what was done in this release is: Updated the automake Makefile.am tree, to build correct Linux-based Makefiles for the new files and directories; Added new backend support in the flowop library, so that aiowrite uses native Linux async I/O (via io_submit and io_getevents); Added a new flowop (aiosubmit), to allow two-phase list I/O submit/reap; Added aiosubmit to the oltp.f database workload; Enabled shared memory ...

Just what are the important performance factors for Virtualization?

November 21, 2007 – 4:20 pm
Steve Wilson does a good job at putting the Oracle benchmarketing claims into perspective. Later in his post, he touches on a few factors which are more relevant in the practical world, including power-performance ratios. There are many aspects by which we can measure virtual performance. I strongly agree that alternate factors like power are becoming very important, and that there are several more than are typically considered. Historically, the primary factors have been: CPU efficiency: how much CPU is used to deliver a prescribed throughput; Price-performance: performance vs. cost of the CPU. Going forward, I think the following will be more important: Throughput: can the application deliver the required levels of throughput, in terms of real-world transactions? Latency: is the latency of each transaction within tolerances, or affected by virtualization? Scalability: does throughput/latency change as load is increased (often asked in the context of “do I have enough future headroom?”)? Memory efficiency: ...

Oracle performance on VMware ESX

November 15, 2007 – 2:03 pm
Since I've been at VMware, I've tried to meet as many customers as schedules permit, to better understand performance needs. I love talking to customers, and I've managed to meet some great people so far, from 35 separate companies in two countries. One of the questions that always comes up is about the "big apps" -- i.e. can I virtualize them? We have a continuing focus on these apps, and I've started to share some of that in my first post to VROOM (most appropriate, given my keen interest in fun cars) -- the VMware performance blog.

What happens when toddler curiosity meets ZFS?

November 15, 2007 – 1:18 pm
It's always entertaining and revealing to look at performance statistics in the frequency domain. Since most performance anomalies are driven by some regular external event, you can often learn about what might be causing repetitive performance or response time patterns. Here's an interesting graph of the traffic to solarisinternals.com: The traffic to the site is amazingly circadian -- no surprise there. So what caused the obvious dip on October 29? Here's my confession -- the youngest member of the mcdougallplex administrator team entered the server room, and couldn't resist the big bright green light on the 5TB ZFS server :-) Needless to say, the physical security perimeter has been increased...
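For readers who want to try a frequency-domain look at their own traffic, here is a minimal sketch that finds the dominant period in a series using a plain discrete Fourier transform. The hourly series is synthetic (an assumed 24-hour cycle plus a small ripple), not the solarisinternals.com data, and the function name is hypothetical.

```python
import cmath
import math


def dominant_period(samples):
    """Return the period (in samples) of the strongest non-DC frequency."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]   # remove the DC component
    best_k, best_power = 1, 0.0
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient for frequency bin k (O(n) per bin).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic hourly hit counts over 7 days: a daily cycle plus a small ripple.
HOURS = 24 * 7
traffic = [1000
           + 400 * math.sin(2 * math.pi * h / 24)
           + 50 * math.sin(2 * math.pi * h / 5)
           for h in range(HOURS)]

period = dominant_period(traffic)
print(f"dominant period: {period:.1f} hours")
```

The naive transform is quadratic in the number of samples, which is fine for a week of hourly data; for longer series an FFT library would be the usual choice.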