Archive for category: Blog

Data Management & Analysis at LinkedIn

Data management enables LinkedIn to provide hiring solutions, marketing solutions and networking opportunities to its members

LinkedIn is the world’s largest professional network. It has over 187 million members from over 200 countries, ranging from freelancers to CEOs of Fortune 500 companies. The company started out in Mountain View, California, in 2003 with the mission to connect the world’s professionals, and it has largely achieved that in the last 10 years.

Today LinkedIn earns $252 million in revenues every year and employs over 3200 people worldwide. It has become the go-to resource for HR executives whenever they need to fill a position. Members’ profiles serve as online resumes that any employer can see and access. LinkedIn also gives people opportunities to connect with the right people to advance their careers.

All this is possible because of data collection and management. All information provided by a member in their profile is collected, analyzed and sorted so that whenever anyone wants to access it, they can do so quickly and effortlessly. This data management enables LinkedIn to provide hiring solutions, marketing solutions and networking opportunities to its members.

The data is invaluable not only to employers; individuals too can use it to search for talent matches, similar jobs, interesting events and networking opportunities. This huge amount of data also allows LinkedIn to customize its products throughout the world.

LinkedIn employs data scientists to analyze the data it collects, so that the company can quickly make sense of it, recognize opportunities and take advantage of them. These data scientists are trained in data analysis and statistics, and they also need business skills and knowledge to interpret what the data shows.

LinkedIn’s success can be attributed to its decision to develop its own data management application. The company took solutions available on the market and customized them for its own particular needs to collect, sort and analyze data. It stores data online using Oracle and Espresso. It uses services such as Voldemort, Zoie, Bobo, Sensei, D-Graph, Kafka and Databus. The offline data store uses Hadoop for machine learning and for ranking and relevance, along with Teradata and other tools. It also uses MapReduce analytics and clickstream data for A/B site testing, among other things.
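To give a feel for what MapReduce-style analysis of clickstream data for an A/B test can look like, here is a minimal sketch in plain Python. It is not LinkedIn’s code and does not use Hadoop; the event layout, the variant names "A" and "B" and the function names are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical clickstream events: (member_id, variant, clicked)
EVENTS = [
    ("m1", "A", True),
    ("m2", "A", False),
    ("m3", "B", True),
    ("m4", "B", True),
    ("m5", "A", False),
    ("m6", "B", False),
]


def map_event(event):
    """Map step: emit (variant, (views, clicks)) for a single event."""
    _member_id, variant, clicked = event
    return variant, (1, 1 if clicked else 0)


def reduce_by_variant(pairs):
    """Reduce step: sum views and clicks for each variant."""
    totals = defaultdict(lambda: [0, 0])
    for variant, (views, clicks) in pairs:
        totals[variant][0] += views
        totals[variant][1] += clicks
    return {variant: tuple(counts) for variant, counts in totals.items()}


def click_through_rates(events):
    """Run map then reduce, and turn the totals into a rate per variant."""
    totals = reduce_by_variant(map_event(event) for event in events)
    return {variant: clicks / views for variant, (views, clicks) in totals.items()}


rates = click_through_rates(EVENTS)
```

On a real cluster the map and reduce steps would run in parallel across many machines; the sketch only shows the shape of the computation.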

Corporations and businesses use LinkedIn to search for people to fill key positions and for people with social influence to test new products. Analyzing viral marketing results and optimizing recommendation engines are two other services that LinkedIn offers to businesses. It helps create specialized marketing services for different businesses.

LinkedIn’s value creation is based on this data management and on making its analysis of the data available to key players in a short amount of time. As long as the company is able to analyze and manage all this data, it will continue to grow and market new and customized products. To maintain its edge, LinkedIn needs to find ways to handle this ever-growing stream of data and to improve the quality of its data analysis.

The Demand for Hadoop & NoSQL Skills Goes Up

Ever since organizations began using Big Data to their advantage, the demand for data analytics specialists, or data scientists, has grown manifold. An increase in demand for big data experts automatically means an increase in demand for experts with Hadoop and NoSQL skills.

The rise of big data has compelled companies and organizations both big and small to start looking urgently for IT professionals who can help them maintain and monitor their databases.

The big data market is still in a very early phase and has a long way to go, but businesses have realized that there is no future for them if they do not manage this large volume of data adequately. As a result, the demand for database management skills has spread to many industries beyond web and software, where it started. Today industries like retail, healthcare and even government are seeking professionals with the skills to manage and analyze large data sets for them.

When we talk about big data experts, knowledge of NoSQL and Hadoop is among the most desired skills. It is hard to be a data expert without a thorough knowledge of both. Data experts are very much in demand, and knowledge of Hadoop and NoSQL adds to the prowess of an individual, who can earn a highly competitive salary with data expertise.

Thanks to companies like Amazon and Apple that are looking for big data experts, there has been a significant jump in the salaries of data experts, and the profession has suddenly become a dream job for many.

Some careers that need NoSQL and Hadoop skills

Some of the careers where NoSQL and Hadoop skills are being put to good use include:

Data Scientist: A data scientist, or big data analytics specialist, needs a variety of data-driven skills. Data scientists gather, analyze and present data and make predictions from it. With the size of data ever increasing, data scientists are in high demand.

Data Architect: Data architects are professionals who create data models, analyze data and assist in data warehousing and migration. To be a data architect, an individual requires DBA and Hadoop skills.

DBA: DBA, or Database Administrator, is a career that has been massively in demand lately. Companies that hire DBAs look for professionals with the skill sets to handle platforms like Oracle and MongoDB. The more familiar an individual is with NoSQL and Hadoop, the better the package he or she can seek.

Strata + Hadoop World 2013 an event for big data junkies

Strata + Hadoop World will see some of the most influential decision makers, developers, analysts and architects of big data come together to shape the future of business and technology. Anyone who wishes to tap into the opportunities opened up by big data needs to be at Strata + Hadoop World, since the event is one of the biggest gatherings of the Hadoop community anywhere in the world.

The future definitely belongs to companies and organizations that learn how to manage the influx of large data to their advantage. To understand the significance of big data and how it can be put to use, it is well worth making an appearance at Strata + Hadoop World.

The event is part of the NYC DataWeek celebration, which explores how people and organizations are using big data to fuel innovation in the city of New York. NYC DataWeek invites everyone to attend data-related events, most of which are open and free to attend.

Why you should attend the event

  • It is an opportunity to understand the advantages and challenges of big data
  • Find new and innovative ways to channel data assets
  • Understand and learn how to use data from science projects to your advantage in business applications
  • Learn about career opportunities for data scientists, how they can be hired and what training is necessary
  • Meet people from the same walk of life in person and learn from their data management skills

The popularity of the event is such that, with about five days to go, Strata + Hadoop World 2013 is completely sold out. The event is going to be a great affair, so you should not miss it. If you cannot attend the event in person, you can participate in any of the following ways:

  • Follow @strataconf on Twitter for news and updates
  • Watch it live, including keynotes and interviews beginning October 29
  • Take home the video compilation of Strata + Hadoop World 2013, complete with keynotes, sessions, interviews and tutorials.

No, Big Data is not for Big Businesses Alone

Who on earth can keep track of data in terabytes, petabytes and exabytes? These terms may sound strange, but that is the amount of data businesses have been dealing with of late, which prompts us to believe that we have entered the age of Big Data. It is important for businesses both big and small to understand big data, because it isn’t confined to big businesses.

It has become vital for organizations to understand that Big Data is not limited to large organizations. Small and medium-sized enterprises don’t need to feel intimidated by the data challenges that lie ahead of them, but they do need to be versed in ways to handle Big Data judiciously for their advantage.

Importance of Big Data

The data sets that organizations both big and small create (even if they are not global players) are, for the most part, large. This is why companies of any size, irrespective of industry, need to work on Big Data and start unlocking its value.

There is an influx of large amounts of data because so many platforms are available. The problem with so much data isn’t its size but how organizations should analyze and manage it to reduce cost and time and to make smarter decisions. With proper analytics of Big Data, organizations can:

  1. Figure out who their potential customers are, and which ones really matter the most
  2. Analyze stock against the potential consumer base and determine prices to maximize profits
  3. Calculate risk in seconds and figure out the root cause of failures or issues that can lead to losses

We believe straight up that everybody can gain (in some way or the other) from Big Data. But when it comes to businesses, they need to look to see if they can meet the following standards before embarking on the journey of Big Data.


Dealing with Big Data requires industry knowledge across most sectors; it also requires deep technical and analytical skills. So you as an organization need to work out whether you have the in-house expertise to deliver. The task will be to gather data from many different sources and then piece it together to gain knowledge about your customer base and more.

Understanding and finances

As an organization, you need to recognize whether you have a deep enough understanding of how Big Data can work in your favor. You also need to be sure that you have enough finances to handle it. If the answer to either question is no, it is better to rethink your business’s Big Data strategy.


Certain technologies can help all kinds of organizations make the most of Big Data analysis. Some of these include:

  • Faster processors
  • Large storage space and equipment
  • Cloud computing

Big Data Analysts Have a Great Career ahead

Looking for a career, one might not always consider becoming a data scientist. But given the demand for Big Data analytic specialists or data scientists, a career in Big Data can seem a good choice or even the best one for most new age junkies.

Since an increasingly large number of medium-scale and multi-billion dollar companies are now using Big Data, the demand for big data specialists or Data scientists (as they are better known) has risen tremendously.

Today, all businesses, small and big, require data scientists who know how to manage the influx of huge amounts of information and draw conclusions and insights from the tsunami of data. So when you are choosing a college course or considering a career change, you can give Big Data a chance.

Data scientist is by far one of the most sought after career options, not just because of its demand, but also because data analysts are commanding impressive salaries, which are at par with some big career positions.

Career in Big Data

Big Data is basically an extremely large amount of structured and/or unstructured information, too much for traditional databases and tools to handle. This data comes from all possible sources, such as social media, posts, multimedia and files (to name a few). Businesses need to set up state-of-the-art technologies to manage and comprehend all this data. They need someone who can help them manage the data and draw insights from it, insights that help increase profits in one way or another.

This is why the position of a data scientist becomes all-encompassing in an organization. The position is spread across three specialist fields: technologists, statisticians and quantification experts.

Technologists are data scientists who are experts at writing algorithms and code to traverse such large amounts of data. Statisticians and quantification experts, on the other hand, are creative people expected to navigate content and find things others can miss.

Skill set required to be a data scientist

Having seen that Big Data analysts can have a great career, since the field is driving job growth, we need to understand how one can become a data analyst. It is important to know the skills needed to pursue a big data career.

Seen from a larger perspective, Big Data jobs need a wide range of skills. In practice, though, many Big Data jobs do not require major programming skills; strong analytical skills and knowledge of analytical tools are probably more important.

Education to acquire requisite skills

You may be able to acquire many skills on the job, but if you still need to enhance your data analysis skills, you can enroll with a good big data training school or institution for a well-planned course program. Vendors like EMC and IBM also offer courses on Big Data. In addition, colleges and universities offer degree programs in analytics and related fields to prepare aspirants for a Big Data career.

10/18/2013

10 Reasons Why Businesses Choose to Build Cloud on EMC & VMware

With Amazon Web Services dominating the cloud, there was very little scope for any other name to gain ground. But EMC and VMware, with the launch of Pivotal, have provided a platform for customers and businesses to steer their cloud away from Amazon’s service.


VMware and EMC have developed their big data and platform-as-a-service (PaaS) cloud platform through the formation of Pivotal, led by former VMware chief Paul Maritz. Pivotal is a joint venture between VMware and EMC that aims squarely at ousting the current king of cloud computing.

The objective of the initiative is to bring the technology, employees and programs scattered around EMC and VMware under one umbrella. This has led to the following advantages, which are prompting businesses to build their cloud on EMC and VMware instead of Amazon Web Services.

10 Reasons why Businesses choose to build cloud on EMC & VMware

  1. One-stop shop: The alliance creates a one-stop shop for cloud services. This will enable organizations to speed up their move to the cloud.
  2. Foolproof solution: Pivotal is a one-stop solution for businesses because it combines two big names to deliver the best in cloud computing to its users.
  3. Total solution: The EMC and VMware partnership allows businesses to tailor their cloud architecture to their needs.
  4. Ultimate cloud computing solution: EMC brings vast and varied solutions for corporations and governments of all sizes.
  5. Swift: The platform enables speedy delivery and helps businesses quickly take advantage of the benefits of cloud computing.
  6. Flexible architecture: With Pivotal, you can benefit from EMC’s three-path cloud infrastructure to meet the specific needs of your organization.
  7. Protection of data: EMC’s partnership with VMware allows you to rapidly protect your desktops, virtual servers and applications with maximum competence.
  8. Trustworthy security: Pivotal suits businesses because the platform helps manage risks and maintain security and integrity.
  9. Counseling by experts: VMware and EMC consulting services have experts at work to help business units at every step of the cloud transformation, including analysis, implementation and cloud design.
  10. Training: The new platform helps businesses tackle new roles and pick up new skills.

7 Big Names in the Big Data World

The big data world today is no longer a territory reserved for big, well-established database and data warehouse companies. Pure-play big data startups are also emerging as innovative thinkers, creative and technically sound enough to create a buzz in the marketplace.

Anyhow, in this post, we’re going to talk about the big shots in the game.

Here’s the list of 7 BIG Names in the Big Data World:


The biggest Big Data vendor by 2012 revenue, IBM earned about $1.3 billion from Big Data related services and products, according to Wikibon’s reports. IBM’s product range includes a warehouse that has its own built-in data mining and cubing capacity. Its PureData systems also include packaged analytic integration features.

IBM’s best-known products include the DB2, InfoSphere warehouse and Informix database platforms; the SPSS statistical software, designed to support real-time predictive analysis; and the Cognos Business Intelligence application with its big data platform capabilities.


Famous for its flagship database, Oracle is amongst the big players in the Big Data space. Oracle’s Big Data revenue in 2012 was approximately $415 million, making it the fifth biggest Big Data vendor for the year. Oracle’s Big Data Appliance combines Intel servers, Oracle’s NoSQL database and Cloudera’s Hadoop distribution.

Oracle has a wide range of tools to complement its Big Data platform, known as Oracle Exadata. These tools include Advanced Analytics via the R programming language, an in-memory database option with Oracle’s Exalytics in-memory machine, and Oracle’s data warehouse.


Specializing in machine data analysis, Splunk had the biggest market share of all the Big Data vendors in 2012, with total revenue of about $186 million, according to the Wikibon report.


Google effortlessly earned its place amongst the top 7 names in the Big Data world. Google’s Big Data offering includes BigQuery, a cloud-based Big Data analytics platform. The Big Data related revenues generated by Google in 2012 were about $36 million, as per the Wikibon report.


10Gen is best known for its leading NoSQL database, the open source MongoDB, which is regarded as the prime document-oriented database. MongoDB can handle semi-structured information encoded as JavaScript Object Notation (JSON)-style documents. What sets it apart is its ease of use, speed and flexibility.
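To make "semi-structured" and "document-oriented" a little more concrete, here is a minimal sketch that uses only Python’s standard json module. It does not talk to MongoDB and is not MongoDB’s API; the sample records, field names and the find helper are hypothetical. The point it illustrates is that two documents in the same collection need not share the same fields.

```python
import json

# Hypothetical JSON documents with different shapes
RAW_DOCS = [
    '{"name": "Ada", "skills": ["Hadoop", "NoSQL"]}',
    '{"name": "Grace", "title": "DBA", "location": {"city": "New York"}}',
]


def find(docs, field, value):
    """Return documents whose top-level field equals value or, for list fields, contains it."""
    matches = []
    for doc in docs:
        current = doc.get(field)  # None when the document lacks the field
        if current == value or (isinstance(current, list) and value in current):
            matches.append(doc)
    return matches


docs = [json.loads(raw) for raw in RAW_DOCS]
hadoop_people = find(docs, "skills", "Hadoop")
dbas = find(docs, "title", "DBA")
```

A document that lacks a field is simply skipped by the query rather than causing an error, which is the flexibility a fixed table schema does not offer.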

10Gen’s strategic investors include Intel, In-Q-Tel and Red Hat. 10Gen was ranked third amongst the Hadoop- and NoSQL-only vendors last year. It generated about $36 million in revenue in 2012, according to the Wikibon report.


Another big name in the Big Data world is Hortonworks. A Hadoop vendor, Hortonworks received over $70 million in venture capital investment after spinning off from Yahoo in 2011. Hortonworks has its own certification courses with a legion of developers within its virtual box.

Hortonworks is going up against Cloudera and is known for its partnerships with Rackspace, Microsoft, Red Hat and other companies.


Best known for its NoSQL database M7, MapR works with Google Compute Engine and Amazon’s cloud platform. MapR was ranked fourth in the Wikibon report’s list of Hadoop- and NoSQL-only vendors last year. According to Wikibon, the total revenue generated by MapR in 2012 was about $23 million.