What happens when Facebook, Yahoo, Google and Oracle get together? The answer: you get big data. That’s no exaggeration, because it is exactly what happened when three engineers (Jeff Hammerbacher from Facebook, Amr Awadallah from Yahoo and Christophe Bisciglia from Google) teamed up with Mike Olsen, an executive from Oracle. In 2008, the four formed Cloudera, which builds products aimed at solving the problem of big data for enterprises. “We knew this was not a temporary problem and more enterprises would face the same problem of grappling with more data than they can handle,” says Mike Olsen, chairman, Cloudera.
So, what exactly is big data? It is typically any data that traditional systems cannot handle. When consumer internet companies such as Google, Yahoo and Facebook exploded on the web, they discovered there were no systems that could process and manage the massive amounts of data their users were generating. So, they went ahead and built their own. Google was the first to publish a paper on how to handle the avalanche of data generated. Inspired by Google’s technology, Doug Cutting, who was then with Yahoo, came up with open-source software called Hadoop, named after his son’s yellow stuffed toy elephant.
What does Hadoop do? Let’s say you are looking for a tiny needle in a large haystack. If you go through it all by yourself, you’d be there for a while. But if five friends joined you, divided the stack and searched simultaneously, chances are you would find the needle much faster. Hadoop does that with big data. It can handle all kinds of data: structured, unstructured, log files, pictures, audio files and e-mail, irrespective of format. It takes a complex data problem, breaks it into smaller pieces and sends them to different servers for analysis. Once all the pieces have been processed in parallel, it integrates the answers using a technology called MapReduce.
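The divide-and-combine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hadoop's actual API: threads on a single machine stand in for separate servers, the task is a simple word count over log lines, and every function name here (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Divide the haystack into roughly equal piles, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: each worker counts the words in its own pile only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial answers into one."""
    return left + right


def word_count(records, n_workers=5):
    """Split the data, process the pieces in parallel, then combine.

    Assumption: threads stand in for the separate servers a real
    cluster would use.
    """
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical sample data: six log lines shared among five "friends".
logs = [
    "error disk full",
    "login ok",
    "error timeout",
    "login ok",
    "error disk full",
    "logout ok",
]
totals = word_count(logs)
```

Here `totals["error"]` is 3, the same answer a single pass over all six lines would give. The point of the design is that no worker ever needs to see the whole haystack: each one returns a small partial result, and only those partial results are merged at the end.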
Cloudera, which now has around 500 employees, sells its own software that runs on top of Hadoop. It helped that Cutting also joined Cloudera. When they started, Olsen says, no one knew what Hadoop was, nor did anyone feel that big data was a huge problem. But a few companies in financial services and the media soon started to see merit in Cloudera’s argument. For its part, Cloudera realised that for enterprises, it was not enough just to store a lot of data and process it in a particular way; they also needed security, data governance and capabilities for real-time data access and analysis. Cloudera offered all of that through its solutions and soon became a clear market leader in its space. Today, eBay, Samsung, CBS, AOL and Box are among Cloudera’s top clients.
Though Cloudera is keeping a close eye on the intensifying competition in the space as new companies increasingly make their presence felt, it believes it has significant first-mover advantage. “Cloudera is defining a new business model to drive innovation in enterprise software infrastructure,” says Ping Li, partner at Accel Partners. “Unlike other open-source distributors that just provide basic support and services, Cloudera also builds proprietary software that enables enterprises to deploy, operate and manage Hadoop in their complex environments.”
Cloudera believes its latest product offering, the enterprise data hub, which it describes as the first unified platform for big data where companies can store, process and analyse all their data, will revolutionise enterprise data management. The hub lets executives enter queries on the data directly rather than wait for engineers to translate their questions into instructions that Hadoop can understand. That provides real-time data access and analysis and leads to quicker business decisions. “The enterprise data hub will emerge as the dominant data platform and you will see the rest of industry embrace the idea and launch their own versions in the next couple of years,” predicts Olsen.
He believes that the hub will bring in enormous growth not only for Cloudera but also for the industry. Accel’s Li couldn’t agree more. “The founders had the unique vision that a next-generation data platform needed to emerge in order to transform enterprises that were grappling with the explosion of big data. We feel the new platform will yield a massive market opportunity to build a great company.”
The company has set its sights on billion-dollar revenues over the next couple of years. The entire market is seeing exponential growth, and the space that Cloudera operates in is likely to grow multifold over the same period.
Big is beautiful
As one of the hottest cloud companies in Silicon Valley, Cloudera has found raising money comparatively easy. Last December, it mopped up $65 million in a Series E round of funding led by Accel Partners, which had already invested $5 million in Series A funding in 2009. “They have the courage to change the world with technology. Who else would decide to start a company to change the multi-billion dollar data management industry when the world was melting down in the fall of 2008?” asks Accel’s Li. Other investors include Greylock Partners, Ignition Partners, In-Q-Tel and Meritech Capital Partners. With the latest round of funding, Cloudera has raised about $141 million in capital in a span of four-and-a-half years. The recent infusion values the company at about $700 million.
The Enterprise data hub: One unified system
More importantly, in July 2013, Mike Olsen made way for Tom Reilly as chief executive officer (CEO). Reilly is the former CEO of security company ArcSight, which he joined in 2006 and took public two years later. Four years after joining, he sold the company to HP for a staggering $1.5 billion. He is likely to take Cloudera down the same path as he manages the company’s rapid growth.
According to Olsen, the value proposition for Cloudera’s offerings only moves higher as it solves business problems with applications that are enabled on its platform. For instance, in financial services, it can lead to much better analytics and risk management. In healthcare, it can lead to better outcomes at lower cost. Solving business problems rather than just providing software will not only lead to faster adoption but also higher transaction value.
“I am a database guy and I can’t cure cancer or feed 2 billion people, but I am pretty excited that we are building a platform that researchers are going to use to solve problems,” says Olsen. He says that while Cloudera will continue to solve business problems better, solutions to larger global issues such as how to generate and distribute clean energy, distribute clean water to those who are in need and improve agricultural yield across the world can be found through better use of data. If that indeed turns out to be true, then Cloudera’s enterprise data hub will soon be the inflection point in big data.