Archive for category: Hadoop

Enterprise Ready Hadoop Infrastructure from EMC – Isilon

With increased reliance on technology and large-scale use of applications and IT systems, the amount of structured and unstructured data stored and processed by a typical modern enterprise has been growing very rapidly. Organizations that don't want to be left behind in the race require highly efficient, effective and scalable storage solutions to manage this growth.

Modern organizations also require high-end storage systems because such systems power analytics: they can draw meaningful information out of data. EMC Isilon scale-out network-attached storage (NAS) with native Hadoop Distributed File System (HDFS) support gives Hadoop users access to shared storage infrastructure that helps bridge the gap between big data Hadoop workloads and enterprise IT analytics.

Isilon NAS integrated with HDFS offers customers a way to accelerate enterprise-ready deployment of Apache Hadoop. Until now, Hadoop customers have had to make do with storage infrastructure that wasn't really optimized for big data, which limited the scope of Hadoop's applicability in large enterprises. EMC Isilon with native HDFS tackles this challenge well and offers an all-inclusive, enterprise-ready storage system to collect, protect, analyze and share data in a Hadoop environment.

By integrating Hadoop natively into an enterprise-class storage solution, Isilon enables customers to benefit from comprehensive data protection, irrespective of the size of the Hadoop data. By combining Isilon scale-out NAS with native HDFS, EMC reduces the complications of running Hadoop and lets enterprises extract valuable insight from gigantic heaps of structured and unstructured data.

EMC Isilon gives Hadoop customers a built-in path to enterprise data protection through the integration of the Isilon scale-out NAS storage system and native HDFS. This integration eliminates the single point of failure found in the open source Apache Hadoop deployments that enterprises are using; further, it allows customers to use the Hadoop distribution of their choice, accelerating Hadoop adoption in enterprise environments.

The industry's first scale-out storage system with native HDFS offers the following advantages:

  • Lets enterprises capture more of Hadoop's benefits
  • Reduces risk
  • Increases organizational knowledge

The reason enterprises should consider 'HDFS plus Isilon' is that no ingest step is necessary anymore. It is comparatively cheaper, and performance is still better. With multiple enterprise features, multi-protocol access and Hadoop multi-tenancy, 'HDFS on Isilon' supports nearly every distribution you'd possibly want to work with, such as Pivotal, Apache, Cloudera and Hortonworks. It also eliminates two key challenges of DAS-based Hadoop: the NameNode single point of failure (SPOF) and 3x mirroring.
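To make the 'no ingest' point concrete, a Hadoop client simply points its default file system at the storage cluster's HDFS endpoint rather than at a DAS NameNode. The sketch below uses the standard Hadoop FileSystem API; the Isilon host name is a hypothetical placeholder, and the real endpoint would come from your cluster's configuration.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class IsilonHdfsClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical HDFS endpoint exposed by the NAS; with DAS Hadoop this
        // would instead be the address of a single NameNode.
        conf.set("fs.defaultFS", "hdfs://isilon.example.com:8020");

        // List files that other applications may have written over NFS or SMB;
        // no separate ingest into a dedicated Hadoop cluster is required.
        FileSystem fs = FileSystem.get(URI.create("hdfs://isilon.example.com:8020"), conf);
        for (FileStatus status : fs.listStatus(new Path("/data"))) {
            System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
        }
        fs.close();
    }
}
```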

Advantages of EMC Isilon storage implementation over a traditional implementation

  • Offers scale-out storage to support multiple workflows and applications
  • No NameNode downtime, because the NameNode function is distributed across the cluster
  • Provides matchless storage efficiency
  • Scales compute and storage independently
  • Provides end-to-end data protection using SnapshotIQ, SyncIQ and NDMP backup

Benefits an enterprise derives from Hadoop as a data storage & analytics solution

Hadoop, as an enterprise-ready big data analytics solution, can help store, analyze, structure and visualize large amounts of structured and unstructured data. Hadoop is especially valuable because it lets users process unstructured big data and give it structure, so that it can be used to the enterprise's advantage.
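As a concrete illustration of giving structure to raw data, here is the canonical MapReduce word-count job (the standard introductory Hadoop example, not anything specific to this article's products): the mapper emits a count for every word in unstructured input files, and the reducer aggregates those counts into a structured word/frequency table.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the unstructured input.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts, producing a structured word/frequency table.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // raw text in
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // structured counts out
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```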

a)   Benefits an enterprise derives

  • Enhanced business agility
  • Easier data management
  • Faster and more convenient data analytics
  • Reduction in the time and cost of infrastructure and maintenance
  • Ability to accommodate and analyze data irrespective of type or size

b)   Advantages of enterprise-ready Hadoop on EMC Isilon:

  • Dependable security
  • Scalable storage solution
  • Continuous availability
  • Simple integration with existing infrastructure
  • Easy deployment and faster administration

EMC Hadoop Starter Kit (HSK)

If you are an enterprise that uses VMware vSphere and/or EMC Isilon and you want to extract insights on customer sentiment and similar information from big data, you will need Hadoop integration. Hadoop with Isilon integration becomes enterprise-ready and helps your data architecture take on the new opportunities data provides alongside its existing tasks.

Now, to make things even simpler for an organization that uses VMware vSphere and EMC Isilon, an EMC Hadoop Starter Kit has been developed (video). This HSK step-by-step guide is designed to help enterprises learn and discover the full potential of Hadoop.

VMware has also started an open source project (called Serengeti) that can help automate the management and deployment of Hadoop clusters on vSphere. With a virtualized infrastructure, Hadoop can be run as a service.

Whether you are a seasoned Hadoop user or a newbie, you can benefit from the HSK for the following reasons:

Rapid provisioning: Much of Hadoop cluster deployment can be automated. The guide takes you through creating Hadoop nodes and setting up and starting the Hadoop services on a cluster, making the whole process simple to execute.

High availability: Protection built into the virtualization platform guards against single points of failure in the Hadoop storage solution.

Profitability: Enterprises can use and benefit from any Hadoop distribution throughout the big data application lifecycle, with zero data migration.

Elasticity: The same physical infrastructure can be shared between Hadoop and other applications, since Hadoop capacity can be scaled up and down according to demand.

Multi-tenancy: The Hadoop infrastructure supports multi-tenancy, meaning different tenants can be given their own virtual machines, which enhances data security.

EMC Hadoop Starter Kit combines the benefits of VMware vSphere with Isilon scale-out NAS in order to help achieve big data storage goals and added analytics solution.

Some of the reasons why the HSK can be considered an outright solution are listed above. The merits, especially 'profitability,' mean that users can run any Hadoop distribution, including Hortonworks, Pivotal HD, Cloudera and open source Apache, throughout the big data application lifecycle with zero data migration.

This means that by starting a Hadoop project with EMC Isilon scale-out NAS, enterprises avoid any data migration when they move from one Hadoop distribution to another, and users can run multiple Hadoop distributions against the same data without duplicating it.

EMC Isilon’s Notable Collaborations

In addition, Isilon collaborates with companies like Splunk, Rackspace and RainStor. EMC Isilon scale-out NAS is no doubt a superb storage system, offering users the ability to scale data capacity and performance to meet their needs, and these partnerships bring Hadoop users additional benefits.

Isilon and Splunk: The Splunk for Isilon app integrates EMC scale-out NAS with Splunk. The pairing of EMC and Splunk helps enterprises manage the avalanche of data across virtual, cloud and physical environments and transform it into real-time insight.

Isilon and Rackspace: EMC Isilon helps enterprises store, consolidate, analyze and use data and applications exceeding 100 TB. Rackspace builds its services on the EMC Isilon NL400 and X400 high-density, large-capacity models to perform these tasks for the greater benefit of enterprises.

Isilon and RainStor: The combination of EMC and RainStor helps enterprises run their Hadoop distribution anywhere. RainStor's unique data compression technology lets enterprises analyze their large data sets with more efficiency and greater predictability.

Gartner Big Data 2013: Highlights

Gartner's annual big data survey report for 2013 was released recently, and its highlights are striking. The survey backed several widely held beliefs about big data with evidence.

The biggest revelation of this year's Gartner survey was that 64 percent of companies globally have already implemented or are planning to implement big data systems. Within that figure, nearly 30 percent of companies have already invested in big data systems and 19 percent are on the verge of investing in the technology over the next year. Additionally, the survey shows that another 15 percent of companies are willing to shell out money over the next couple of years.

That is a significant number, and it goes to show there is genuine interest among companies in adopting big data systems. A large chunk of enterprises are re-examining how they manage their data and hunting for new ways to get the best out of the ever-growing data industry.

The Survey

According to Gartner, the survey covered 720 Gartner Research Circle member companies and was carried out in June 2013. It was designed primarily to understand organizations' investment plans for big data technologies, what stage of implementation they have reached and how big data is helping them solve problems.

Despite its confined sample, the variety of companies surveyed makes it a broad and effective representation of how the world of big data is shaping up and how enterprises, big and small, are adopting it.

The Prominent Findings

The survey reveals that the industries leading big data investment in 2013 are media, communications and banking.

According to Gartner, about 39 percent of media and communications organizations said they had already invested heavily in big data technologies, and 34 percent of banking organizations said they had made investments as well. According to the survey, investments over the next couple of years are mostly lined up in the transportation, healthcare and insurance sectors.

What Is Driving Companies To Invest In Big Data?

Following the strong precedent set by billion-dollar companies like Google and Facebook, enterprises worldwide have come to understand that big data can have a significant impact on revenue. It is no surprise, then, that more and more organizations are looking to invest in big data.

Big data, if analyzed and used properly, can help companies learn about customer experience and customer expectations. Big data analysis produces highly useful insights that let companies make smarter business decisions.

When Facebook Concluded Largest Hadoop Data Migration Ever

Since the inception of Facebook in particular, the era of storing massive data on servers has been with us. The content shared on the internet grows enormously with every passing day, and managing it is becoming a problem for organizations across the globe.

Facebook recently undertook the largest data migration ever. The Facebook infrastructure team moved dozens of petabytes of data to a new data center, a task that was far from easy but was nonetheless well executed.

Over the past couple of years, the amount of data stored and processed by Facebook servers has grown exponentially, increasing the need for warehouse infrastructure and superior IT architecture.

Facebook stores its data on HDFS — the Hadoop distributed file system. In 2011, Facebook had almost 60 petabytes of data on Hadoop, which posed serious power and storage shortage issues. Geeks at Facebook were then compelled to move this data to a larger data center.

Data Move

The amount of content exchanged on Facebook daily has created demand for a large team of data infrastructure professionals who can analyze all that data and serve it back in the quickest and most convenient way. Handling data at this scale requires large data centers.

So, considering the amount of data that had piled up, Facebook's infrastructure team concluded the largest data migration ever, moving petabytes of data to a new center.

For the move, Facebook set up a replication system to mirror changes from the smaller cluster to the larger cluster, which allowed all the files to be transferred.

First, the infrastructure team used the replication clusters to bulk-copy data from the source to the destination cluster. Then the smaller files, Hive objects and user directories were copied over to the new cluster.

The process was complex, but because the replication system minimizes downtime (measured by how quickly the old and new clusters can be brought to an identical state), it became possible to transfer data on a large scale without a glitch.
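As a rough sketch of the bulk-copy step, the snippet below uses Hadoop's standard FileSystem API to copy a directory between two clusters. Facebook's replication system was custom-built, and the cluster URIs and path here are hypothetical; at petabyte scale a parallel tool such as DistCp would do the copying, but the idea is the same.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class BulkWarehouseCopy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Hypothetical source (old, smaller) and destination (new, larger) clusters.
        FileSystem srcFs = FileSystem.get(URI.create("hdfs://old-cluster:8020"), conf);
        FileSystem dstFs = FileSystem.get(URI.create("hdfs://new-cluster:8020"), conf);

        // Copy the warehouse directory; deleteSource=false leaves the old
        // cluster intact so it can keep serving while changes are mirrored.
        FileUtil.copy(srcFs, new Path("/warehouse"),
                      dstFs, new Path("/warehouse"),
                      false, conf);
    }
}
```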

Learning curve

According to Facebook, the infrastructure team had used a replication system like this before. But those earlier clusters were smaller and could not keep up with the rate of data creation, so they were no longer enough.

The team worked day in and day out for the data transfer. With the use of the replication approach, the migration of data became a seamless process.

Now, with the massive data transferred to a bigger cluster, Facebook can keep delivering relevant data to all its users.

All Are Valuable Members of Hadoop Community says Cloudera CEO

Within three months of taking over the leadership of the company, Cloudera CEO Tom Reilly has already laid out a vision of where the company is headed.

According to him, one needs a strong and far-sighted vision for the company if it is to compete against the likes of Hortonworks and MapR for a share of the pie in the fast-evolving Hadoop market.

Despite the tough competition, Reilly remains a well-wisher to his rivals, whom he views as valuable members of the Hadoop community. His message to his employees is the same: consider all your competitors valuable contributors to the success of the community.

Interestingly, Reilly credits Hortonworks, a fellow startup and rival, with driving the development of YARN, which has given much-needed impetus to every major player in Hadoop.

He also affirmed that the real competition his company faces comes from information giants such as Pivotal and IBM, not from other startup rivals.

Cloudera's CEO was a little shy about sharing details of his company's change of focus, keeping them under wraps for a public announcement at next week's Hadoop World conference.

Nevertheless, industry watchers estimate that Reilly’s plans for Cloudera are bigger than before. He doesn’t want the company to become just another Hadoop distribution company. With an ever growing list of features and over 700 partners, he aims to make it a data giant that delivers real value to enterprises.

When confronted with the question of Hortonworks luring Spotify away from Cloudera, Reilly has an altogether different take. He concedes that the development hurt from a public relations perspective, but says it's not something that will drag the company down.

He explained that Spotify wanted comprehensive enterprise support and was no longer interested in using the free version of Cloudera's software. Along with Hortonworks, his company also quoted a price for the deal. Hortonworks put up a slightly better contract, and Cloudera deliberately chose not to match it, claiming it didn't make good business sense.

Reilly sounds almost cavalier when he states that the deal didn't matter much to Cloudera and that Spotify only gained a low-priced vendor from the contract. But deep down, Reilly knows that to steer his company toward profitability, he needs to outdo his competitors.

Experts believe that even though Cloudera has a lot on its plate, the 800-pound Hadoop startup can't distance itself from the present competition.

Unless the company takes a big leap to stand alongside the information giants, it will have to live with the image of a Hadoop startup.

Data Management & Analysis at LinkedIn

LinkedIn is the world's largest professional network, with over 187 million members from more than 200 countries. Its members include everyone from freelancers to CEOs of Fortune 500 companies. The company started out in Mountain View, California, in 2003 with the mission of connecting the world's professionals, and over the last 10 years it has surely achieved that.

Today LinkedIn earns $252 million in revenue a year and employs over 3,200 people worldwide. It has become the go-to resource for HR executives whenever they need someone to fill a position. Members' profiles are online resumes that any employer can see and access, and the site gives people opportunities to connect with the right persons to move their careers ahead.

All this is possible because of data collection and management. All information a member provides in their profile is collected, analyzed and sorted so that whenever anyone wants to access it, they can do so quickly and effortlessly. This data management enables LinkedIn to provide hiring solutions, marketing solutions and networking opportunities to its members.

The data is invaluable not only to employers; individuals too can use it to search for talent matches, similar jobs, interesting events and networking opportunities. This huge amount of data also allows LinkedIn to customize products throughout the world.

LinkedIn employs data scientists to analyze the data it collects so the company can rapidly make sense of it, recognize opportunities and take advantage of them. These data scientists are qualified to analyze data and statistics, and they also need the business skills and knowledge to interpret what the data means.

LinkedIn's success can be attributed to its decision to develop its own data management applications. The company took market solutions and customized them for its own use to collect, sort and analyze data. It stores data online using Oracle and Espresso, and runs services such as Voldemort, Zoie, Bobo, Sensei, D-Graph, Kafka and Databus. Its offline data store uses Hadoop for machine learning and ranking & relevance, alongside Teradata. It also uses MapReduce analytics and clickstream data for A/B site testing.
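Kafka, which LinkedIn built and open-sourced, is the pipe that carries activity data between these systems. As a hedged illustration (this uses today's Java client API rather than the 2013-era one, and the broker address and topic name are hypothetical), a minimal producer publishing a member-activity event looks like this:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ActivityEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address; serializers turn keys/values into bytes.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Downstream consumers (analytics jobs, search indexes, etc.)
            // read events like this one from the "member-activity" topic.
            producer.send(new ProducerRecord<>("member-activity",
                    "member-123", "viewed-profile:member-456"));
        }
    }
}
```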

Corporations and businesses use LinkedIn to find people for key positions and to identify people with social influence to test new products. Analyzing viral marketing results and optimizing recommendation engines are two other services LinkedIn offers businesses, helping it create specialized marketing services for different clients.

LinkedIn's value creation rests on this data management and on delivering its analysis to key players in a short amount of time. As long as it can analyze and manage all this data, it will continue to grow and to market new, customized products. To maintain its edge, LinkedIn needs to find ways to handle this ever-growing stream of data and keep improving the quality of its analysis.

The Demand for Hadoop & NoSQL Skills Goes Up

Ever since organizations began using big data to their advantage, demand for data analytics specialists, or data scientists, has grown manifold. An increase in demand for big data experts means an automatic increase in demand for experts with Hadoop and NoSQL skills.

The rise of big data has compelled companies and organizations, both big and small, to look urgently for IT professionals who can maintain and monitor their databases for them.

The big data market is still in a very early phase and has a long way to go, but businesses have realized there is no future in failing to manage this data adequately. Demand for database management skills has therefore spread to many industries beyond web and software, where it started. Today industries like retail, healthcare and even government seek professionals with the skills to manage and analyze large data sets.

When we talk about big data experts, some of the most desired skills are NoSQL and Hadoop knowledge; an individual cannot be a data expert without a thorough grounding in both. Data experts are in real demand, and Hadoop and NoSQL knowledge adds to the prowess of an individual, who can command highly competitive salaries with such expertise.

Thanks to companies like Amazon and Apple looking for big data experts, there has been a significant jump in data experts' salaries, and the profession has suddenly become a dream job for many.

Some careers that need NoSQL and Hadoop skills

Some of the careers where NoSQL and Hadoop skills are being put to good use include:

Data Scientist: A data scientist, or big data analytics specialist, needs a variety of data-driven skills. Data scientists gather, analyze, present and predict from data. With data volumes ever increasing, data scientists are in high demand.

Data Architect: Data architects create data models, analyze data and assist with data warehousing and migration. Being a data architect requires DBA and Hadoop skills.

DBA: Database administrator is a career that has been massively in demand lately. Companies hiring DBAs look for professionals who can handle platforms like Oracle and MongoDB. The more familiar an individual is with NoSQL and Hadoop, the better the package he or she can seek.

Strata + Hadoop World 2013 an event for big data junkies

Strata + Hadoop World will see some of the most influential decision makers, developers, analysts and architects of big data come together to shape the future of business and technology. Anyone who wishes to tap into the opportunities promised by big data needs to be at Strata + Hadoop World, since the event is one of the biggest gatherings of the Hadoop community anywhere in the world.

The future definitely belongs to companies and organizations that learn how to manage the influx of large data to their advantage. And to understand the significance of big data and how to channel it to one's advantage, it is well worth making an appearance at Strata + Hadoop World.

The event is part of the NYC DataWeek celebration, which showcases the people and organizations using big data to fuel innovation in New York City. NYC DataWeek invites everyone to attend its data-related events, most of which are open and free.

Why you should attend the event

  • It is an opportunity to understand the advantages and challenges of big data
  • Find new and innovative ways to channel data assets
  • Understand and learn how to use data from science projects to your advantage in business applications
  • Learn about career opportunities for data scientists, how they are hired and what training is necessary
  • Meet people from the same walk of life in person and learn from their data management skills

Such is the event's popularity that, with five-odd days to go, Strata + Hadoop World 2013 is completely sold out. It is going to be a great affair, so don't miss it. If you cannot attend in person, you can participate in any of the following ways:

  • Follow @strataconf on Twitter for news and updates
  • Watch it live, including keynotes and interviews, beginning October 29
  • Take home the video compilation of Strata + Hadoop World 2013, complete with keynotes, sessions, interviews and tutorials

No, Big Data is not for Big Businesses Alone

Who on earth can keep track of data in terabytes, petabytes and exabytes? These terms may sound strange, but that is the amount of data businesses have been dealing with of late, which prompts us to believe we have entered the age of Big Data. It's important for businesses both big and small to understand big data, because big data isn't confined to big businesses.

Organizations must understand that Big Data is not limited to large enterprises. Small and medium-sized enterprises need not feel intimidated by the data challenges that lie ahead of them; rather, they need to be versed in ways to handle Big Data judiciously to their advantage.

Importance of Big Data

The data sets that organizations create, big and small, even those that are not global players, are by and large substantial. This is why companies of every size, in every industry, need to start unlocking the value of their data and working with Big Data.

There is an influx of data from so many available platforms. The problem with all this data isn't its size, but how organizations should analyze and manage it to cut costs, save time and make smarter decisions. With proper Big Data analytics, organizations can:

  1. Figure out who their potential customers are, and which ones really matter the most
  2. Analyze stock against the potential consumer base and set prices to maximize profits
  3. Calculate risk in seconds and easily find the root cause of failures or issues that could lead to losses

We believe everybody can gain, in some way or other, from Big Data. But businesses need to check whether they meet the following standards before embarking on the Big Data journey.

Expertise

Dealing with Big Data requires industry knowledge across most sectors, along with deep technical and analytical skills. So, as an organization, you need to ask whether you have the in-house expertise to deliver. The task is to pull data from many different sources and piece it together to build knowledge about your customer base and more.

Understanding and finances

As an organization, you need to gauge whether you have a deep enough understanding of how Big Data can work in your favor, and whether you have the finances to handle it. If the answer is no, it is better to rethink your business's Big Data strategy.

Technologies

Certain technologies can help organizations of all kinds make the most of Big Data analysis; some of these include:

  • Faster processors
  • Large storage space and equipment
  • Cloud computing

10 Reasons Why Businesses Choose to Build Cloud on EMC & VMware

With Amazon Web Services dominating the cloud, there was very little room for any other name to gain ground. But EMC and VMware, with the launch of Pivotal, have given customers and businesses a platform to steer their cloud away from Amazon's service.

VMware and EMC developed their big data and platform-as-a-service (PaaS) cloud platform with the formation of Pivotal, led by former VMware chief Paul Maritz. Pivotal is a joint venture between VMware and EMC that aims squarely to oust the current king of cloud computing.

The initiative's objective is to consolidate technology and bring the employees and programs scattered around EMC and VMware under one umbrella, which has produced the following advantages prompting businesses to build their cloud on EMC and VMware instead of Amazon Web Services.

10 Reasons Why Businesses Choose to Build Cloud on EMC & VMware

  1. One-stop shop: The alliance creates a one-stop shop for cloud services, enabling organizations to speed up their move to the cloud.
  2. Foolproof solution: Pivotal combines two big names to deliver the best of cloud computing to its users.
  3. Total solution: The EMC and VMware partnership lets businesses streamline their cloud architecture to their needs.
  4. Ultimate cloud computing solution: EMC brings vast and varied solutions for corporations and governments of all sizes.
  5. Swift: The platform enables speedy delivery and helps businesses quickly realize the benefits of cloud computing.
  6. Flexible architecture: With Pivotal, you can use EMC's three-path cloud infrastructure to meet your organization's specific needs.
  7. Protection of data: EMC's partnership with VMware lets you rapidly protect your desktops, virtual servers and applications with maximum competence.
  8. Trustworthy security: Pivotal helps manage risk and maintain security and integrity.
  9. Counseling by experts: VMware and EMC consulting services put experts to work helping business units at every step of the cloud transformation, from analysis to cloud design and implementation.
  10. Training: The new platform helps businesses take on new roles and absorb new skills.

7 Big Names in the Big Data World

The big data world is no longer territory accessible only to big, well-established database and data warehouse companies. Pure-play big data startups are emerging as innovative, creative and technically sound enough to create a buzz in the marketplace.

Anyhow, in this post, we’re going to talk about the big shots in the game.

Here’s the list of 7 BIG Names in the Big Data World:

IBM

The biggest big data vendor by 2012 revenue, IBM took in about $1.3 billion from big data related services and products, according to a report from Wikibon. IBM's product range includes a warehouse with its own built-in data mining and cubing capability, and its PureData systems ship with packaged analytic integration.

IBM's best-known products include DB2, the InfoSphere warehouse, the Informix database platform, SPSS statistical software designed to support real-time predictive analysis, and the Cognos business intelligence application with its big data platform capabilities.

Oracle

Famous for its flagship database, Oracle is among the big players in the big data space. Oracle generated approximately $415 million in big data revenue in 2012, making it the fifth-biggest big data vendor for the year. Oracle's Big Data Appliance combines Intel servers, Oracle's NoSQL database and Cloudera's Hadoop distribution.

Oracle also has a wide range of tools to complement its big data platform, Oracle Exadata. These include advanced analytics via the R programming language, an in-memory database option with Oracle's Exalytics in-memory machine, and Oracle's data warehouse.

Splunk

Specializing in machine data analysis, Splunk had the biggest market share of all the pure-play big data vendors in 2012, with total revenue of about $186 million, according to the Wikibon report.

Google

Google effortlessly earned its place among the top seven names in the big data world. Google's big data offering centers on BigQuery, a cloud-based big data analytics platform. Google's big data related revenue in 2012 was about $36 million, per the Wikibon report.
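BigQuery is queried with plain SQL over the network. As a hedged sketch (this uses Google's current Java client library, much newer than the 2013-era API, and assumes default application credentials are configured), a query against one of Google's public sample datasets looks like this:

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class BigQueryTopWords {
    public static void main(String[] args) throws InterruptedException {
        // The client picks up the project and credentials from the environment.
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Aggregate a public sample table; BigQuery scans it server-side.
        QueryJobConfiguration query = QueryJobConfiguration.newBuilder(
                "SELECT word, SUM(word_count) AS total "
              + "FROM `bigquery-public-data.samples.shakespeare` "
              + "GROUP BY word ORDER BY total DESC LIMIT 10")
            .build();

        TableResult result = bigquery.query(query);
        for (FieldValueList row : result.iterateAll()) {
            System.out.println(row.get("word").getStringValue()
                + ": " + row.get("total").getLongValue());
        }
    }
}
```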

10Gen

10Gen is best known for its leading NoSQL database, the open source MongoDB, distinguished as the premier document-oriented database. MongoDB handles semi-structured information encoded as JavaScript Object Notation (JSON) style documents. What sets it apart is its ease of use, speed and flexibility.
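To illustrate that document model, here is a hedged sketch using MongoDB's Java driver (the modern client API, newer than 10Gen-era drivers; the server address, database and field names are hypothetical). Note that two documents in the same collection can carry different fields with no schema migration:

```java
import java.util.Arrays;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import static com.mongodb.client.model.Filters.eq;

public class MongoDocumentDemo {
    public static void main(String[] args) {
        // Hypothetical local server.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> profiles =
                client.getDatabase("demo").getCollection("profiles");

            // Semi-structured JSON-style documents; fields can differ per document.
            profiles.insertOne(new Document("name", "Ada")
                .append("skills", Arrays.asList("Hadoop", "NoSQL")));
            profiles.insertOne(new Document("name", "Grace")
                .append("title", "Rear Admiral"));

            // Query by field and print the stored document as JSON.
            System.out.println(profiles.find(eq("name", "Ada")).first().toJson());
        }
    }
}
```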

10Gen's strategic investors include Intel, In-Q-Tel and Red Hat. 10Gen ranked third among Hadoop- and NoSQL-only vendors last year, generating about $36 million in revenue in 2012, according to the Wikibon report.

Hortonworks

Another big name in the big data world is Hortonworks. A Hadoop vendor, Hortonworks received over $70 million in venture capital investment after spinning off from Yahoo in 2011. Hortonworks runs its own certification courses and has a legion of developers behind it.

Hortonworks is growing rapidly in competition with Cloudera and is known for its partnerships with Rackspace, Microsoft, Red Hat and other companies.

MapR

Best known for its NoSQL database M7, MapR works with Google Compute Engine and Amazon's cloud platform. The Wikibon report ranked MapR fourth on its list of Hadoop- and NoSQL-only vendors last year, with total 2012 revenue of about $23 million.