Our Elephant Grows Up – New Serengeti Capabilities for Hadoop

October 23, 2012 – 2:59 pm
Since the release of Serengeti, VMware has learned a tremendous amount from our customers about using virtualization as the platform for big data workloads and Hadoop. These customer conversations provided us with solid reasons to virtualize Hadoop and other big data workloads... Read More...

Towards an Elastic Elephant: Enabling Hadoop for the Cloud

October 22, 2012 – 3:00 pm
In his joint presentation at Hadoop Summit 2012, titled “Hadoop in Virtual Machines,” Richard McDougall talked about the benefits and challenges of virtualizing Hadoop. In particular, he introduced the idea of separating Hadoop’s compute runtime from data storage... Read More...
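As a rough illustration of that compute/data separation (a toy model only; the node names, roles, and sizing rule below are invented for the sketch and are not Serengeti's implementation), one way to picture it is a cluster whose storage tier stays fixed while compute-only workers grow and shrink with job demand:

```python
# Toy model of separating Hadoop's compute runtime from data storage:
# DataNode VMs hold HDFS blocks and are never scaled down, while compute-only
# worker VMs (running just the task-execution role) are added or removed as
# the job queue changes. Names and thresholds are illustrative only.
from dataclasses import dataclass, field
from math import ceil

@dataclass
class VirtualHadoopCluster:
    data_nodes: list = field(default_factory=lambda: ["data-0", "data-1", "data-2"])
    compute_nodes: list = field(default_factory=list)

    def scale_compute(self, pending_tasks: int, slots_per_node: int = 4) -> None:
        """Resize the compute-only tier; the storage tier is untouched."""
        needed = max(1, ceil(pending_tasks / slots_per_node))
        while len(self.compute_nodes) < needed:
            self.compute_nodes.append(f"compute-{len(self.compute_nodes)}")
        while len(self.compute_nodes) > needed:
            self.compute_nodes.pop()

cluster = VirtualHadoopCluster()
cluster.scale_compute(pending_tasks=37)   # burst of work: compute tier grows to 10 nodes
cluster.scale_compute(pending_tasks=3)    # quiet period: shrinks back to 1 node
print(cluster.data_nodes, cluster.compute_nodes)
```

The point of the separation is exactly this asymmetry: compute capacity can follow demand because shrinking it never risks losing HDFS data.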

Big Data and Virtual Hadoop at VMworld 2012

October 2, 2012 – 3:02 pm
VMworld 2012 has come and gone, and VMworld Europe is on the horizon. We had several big data-oriented sessions this year and saw a significant rise in activity in this important area. During the keynote, we demoed the next version of Serengeti, which allows Hadoop to be elastically scaled on a virtual platform... Read More...

Project Serengeti: There’s a Virtual Elephant in my Datacenter

June 12, 2012 – 3:04 pm
There’s no question that the amount of value being extracted from data is increasing – almost every customer I speak with is building new technology to gain new or competitive insights by tapping large volumes or high rates of data. In the last few posts, I have introduced VMware technologies and products that provide data services to new applications... Read More...

Analyzing Hadoop’s internals with Analytics

May 10, 2012 – 3:05 pm
As part of our Big Data efforts, we have a team focused on Hadoop that is working hard to ensure Hadoop runs well on vSphere. We published a paper last year on Hadoop performance, and have a lot more in the pipeline. More recently, I took up a challenge to see how much we could learn about Hadoop I/O in a very short time... Read More...

New Year, New Assignment: Data

February 10, 2012 – 3:07 pm
I’ve been spending a growing amount of time in the data space. Mostly this has been a strategic planning activity for VMware’s product investments as we extend into the data space; in 2012, I’ll take the lead on our big data development efforts... Read More...

Is Cloud Storage going to disrupt Traditional Storage? – Part 2: What is Blob Storage?

November 5, 2010 – 5:53 pm
Cloud-scale blob storage is a new category of storage designed specifically to meet a relaxed set of requirements, chosen to match the needs of the majority of bytes of data for new media types. Gone are the complex constraints of consistency... Read More...
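To make the "relaxed requirements" concrete, here is a minimal sketch of a key-addressed blob interface of the kind the post describes: whole-object put/get, no in-place update, no byte-range locking, no POSIX directory semantics. The API and names are invented for illustration and do not correspond to any particular provider:

```python
# Toy key-addressed blob store: objects are written and read whole, addressed
# by key, with a content hash standing in for an ETag. No partial writes,
# no locking, no rename -- the constraints that traditional filesystems carry.
import hashlib

class SimpleBlobStore:
    def __init__(self):
        self._objects = {}   # key -> immutable object bytes

    def put(self, key: str, data: bytes) -> str:
        """Store a whole object; return an ETag-style content hash."""
        self._objects[key] = bytes(data)
        return hashlib.sha256(data).hexdigest()

    def get(self, key: str) -> bytes:
        """Read the whole object back; there is no seek or partial-read API."""
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

store = SimpleBlobStore()
etag = store.put("media/2010/clip.mp4", b"example payload")
print(etag[:12], len(store.get("media/2010/clip.mp4")))
```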

VMware performance for Gurus

November 5, 2010 – 9:06 am
I'm running my 11th annual USENIX performance class, VMware performance for Gurus, next week in San Jose: http://tinyurl.com/2g2t37q

Is Cloud Storage going to disrupt Traditional Storage? – Part 1: The demise of expensive datacenter storage

October 25, 2010 – 5:35 pm
Recently, I've been looking at the changes unfolding in how personal users store and manage data, and how server applications use, store, and manage data. I see some significant trends that will radically change the way new types of storage systems are structured... Read More...

Flash & Phase Change Memory Talks

October 26, 2009 – 5:46 pm
This week, I attended talks at the international high performance transaction workshop. Following are my rough notes from the flash memory discussions. First up, Steve Kleiman from Network Appliance spoke about their intent to move flash into the clients that access NAS, so that it can intelligently cache and interact with the backend storage: NetApp is building host-side caches for VMware and Hyper-V; the cache will be a block-based read cache, and it is write-through. Andy Bechtolsheim, co-founder of Sun and now at Arista Networks, talked about trends affecting system design and how flash could be leveraged: 3D chip packaging will become common, to solve the power and latency fundamentals of higher clock-rate memory systems. This will mean that systems become very NUMA, since memory is attached directly to the CPU cores. This is the decade where we switch to optical in the datacenter; the cutover is at 10 Gbit (copper), 20 Gbit requires optical and will move to volume commodity ...
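The write-through, block-based read cache described in Kleiman's talk can be sketched briefly; the backing-store interface, LRU policy, and capacity below are assumptions for illustration, not NetApp's design:

```python
# Minimal sketch of a write-through block read cache: reads are served from
# the cache when possible, writes always go to the backing store immediately
# and the cached copy is updated so the cache never holds dirty data. The
# backing object, LRU eviction, and block counts here are illustrative assumptions.
from collections import OrderedDict

class WriteThroughBlockCache:
    def __init__(self, backing, capacity_blocks: int = 1024):
        self.backing = backing               # must expose read_block / write_block
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()          # block number -> bytes, in LRU order

    def read(self, block_no: int) -> bytes:
        if block_no in self.blocks:          # hit: refresh LRU position
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        data = self.backing.read_block(block_no)   # miss: fetch from backend
        self._insert(block_no, data)
        return data

    def write(self, block_no: int, data: bytes) -> None:
        self.backing.write_block(block_no, data)   # write-through: backend first
        self._insert(block_no, data)               # keep the cached copy current

    def _insert(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data
        self.blocks.move_to_end(block_no)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)        # evict least recently used
```

Because every write reaches the backing store synchronously, losing the host-side cache loses only clean read data, which is what makes a client-side flash cache in front of NAS a relatively safe design.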