The latest struggles of Cloudera and MapR are proof that Hadoop has seen better days. Cloudera and MapR are the last remaining independent distributors of Hadoop software. After years of unmet expectations, some customers have decided to give up on Hadoop and move on. Others in the industry still have faith in Hadoop and think it will survive into the next decade, albeit with lower expectations.
Hadoop started over a decade ago as part of Nutch, a distributed search engine created by Mike Cafarella and Doug Cutting, and was eventually put to use at Yahoo. Hadoop runs on a cluster of x86 servers equipped with hard drives. It changed the math on storage, ushered in the big data era, and brought parallel computing into the mainstream.
As use cases grew, Cloudera became a preeminent Hadoop platform for enterprise data analytics. Hadoop would allow companies to extract value from big data, the thinking went. This centralized approach would let customers discard standalone analytics systems and apply a common set of tools and techniques to build multiple applications, such as fraud detection and product recommendation.
Enterprises watched Hadoop technology evolve at an incredible rate. The ecosystem flourished with the release of Hadoop 2.0. More open source projects, with names like Impala, Hive, Giraph, Storm, Spark, and Tez, were introduced to handle specific big data challenges and opportunities. Machine learning, graph analytics, real-time analytics: everything would run on a single platform.
Cracks in Hadoop-land
Trouble started to appear around 2015, when customers began complaining about software that wasn’t integrated and projects that never made it into production. Keeping the entire software stack integrated and in sync was a major challenge.
The most troubling problem, however, was the difficulty of finding people with the technical skills needed to build finished applications on Hadoop. Eventually Hortonworks, which had emerged as the second distributor alongside Cloudera, realized the software was moving too fast, broke Hadoop down into its core elements, and adjusted its release cadence. Even so, concerns about Hadoop's complexity in practice remain.
Public cloud vendors brought out their own big data solutions, which apply the same underlying technology as mainstream Hadoop implementations in a simpler package. Taken together, Amazon Web Services, Google Cloud, and Microsoft Azure have captured a large share of big data processing and storage.
Hadoop has been likened to a specialized operating system for distributed data storage and analytic compute workloads. Operating systems are complex, and building applications for them is a job for experts. Hadoop, however, was widely sold as a platform that already included those applications, which contributed to the gap between expectation and reality.
Cloudera actually exceeded revenue expectations in its most recent quarter. What hurt the company was its estimate of future revenue. That doesn’t change the fact that some companies are pulling out of Hadoop in favor of the cloud, for reasons ranging from sound economics to fear, uncertainty, and doubt.
“Hadoop is unable to walk along with cloud capabilities. There is no need for HDFS. It was an inferior file system after all,” says Oliver Ratzesberger, CEO of Teradata.
Some still find Hadoop to be valuable software. But instead of treating Hadoop as the solution to every data problem, Cloudera and others should shift their focus to one core area, data warehousing, because this is where the Hadoop distributors lost sight of their goals.