Archive for category: Exalytics

Unabated Experimentation is Way Forward in Big Data

Big Data Experimentation

While analytical modeling does call for nonstop testing of big data, the equation isn't that straightforward and poses real challenges.

The need of the hour is active experimentation in the big data space, to help evolving analytical models make precise correlations. But since statistical models carry their own risks, they must be applied astutely if the results are to be trusted.

While a few groups remain hesitant, most large organizations have come to realize that big data calls for incessant experimentation, and they support the change. At the same time, they know that the booming field of big data carries risks tied to statistical models, especially when those models are implemented imperfectly.

Statistical Modeling – Practicality and Risks

Statistical models are simplified tools that data science employs to recognize and validate the major correlative factors at work in a particular field. At times, however, they can give data scientists a false sense of validation.

Despite fitting the observational data quite well, many such models have been found to miss the real causative factors in action. This is why the apparent insight such a model offers often lacks predictive validity!

What May go Wrong?

Even though applying a statistical model is practical in business, there is always a need to scrutinize the true, underlying causative factors.

Lack of confidence may prove the biggest risk, particularly when you doubt that the historical correlations underpinning your statistical model will remain relevant in the near future. And obviously, a predictive model of product demand and customer response that you have low confidence in will never pull in huge investments during a product launch!

What is the Scope?

Even though certain risks are involved, statistical modeling is far from dead. To detect causative factors more quickly and effectively, statistical modeling will need to be grounded in real-world experimentation. This approach, built on an ongoing series of real-world experiments, will go a long way toward making the big data business model and economy more credible and reliable.

So How’s Real-world Experimentation Going to Be Possible? 

Just as data scientists have developed advanced operational functions for ceaseless experimentation, big organizations are looking to encourage their business executives to lead the charge in running nonstop experiments for better output. Adding to the convenience, the big data revolution has already delivered in-database platforms for executing models, along with economical yet high-throughput computing power, making real-world experimentation feasible everywhere, in scientific and business domains alike.

The basic idea is to spend time, capital and other resources on conducting more low-risk experiments rather than on rebuilding the same models over and over again!

Be Smart With Big Data

Some companies get scared of big data. They think that since data is inherently dumb, a lot of it must be dumber still. But by being smart about big data, analysts can make sure they get the most out of it. Big data can also be a security risk, so it needs to be handled smartly.

The Present Way of Doing Things

Companies usually handle data in one of three ways. The first is the Heroic Model, in which individuals take charge of requests and make decisions on their own without consulting others. This model can work well for small businesses, where individuals are usually aware of most situations across the business, but in bigger businesses it can lead to confusion and chaos.

The Culture of Discipline, on the other hand, is one where individuals don't make any decisions and instead follow a set of rules laid down by management. Employees in this model can't use data for their own decision making and simply have to follow the processes set up for them.

The best way to handle data is the Data Smart Model, in which data is managed through evidence-based management. A combination of the first two approaches, it relies on disciplined processes while still allowing decision making at the individual level. This is the method that should be used for big data, and it can result in smooth operation without much hassle.

How to Cultivate the Data Smart Culture

Certain steps need to be taken to create the data smart culture.

  • There should be a single source of truth. Decision making can be moved to the employee level, but the guiding principles should come from a single source.
  • Use ways to keep track of progress. A scorecard system, even on a daily basis, can help managers across different branches see how they are performing relative to other departments, and they can then feed better data back to record their progress.
  • Rules are important, but there should be enough flexibility. Rules and guiding principles are needed, along with the flexibility to know when to bend them and when to break them. Sometimes what works in most parts of the country might not be best for a certain area, and businesses need to be able to adapt and change their rules accordingly.
  • Work on cultivating human resources. People are a company's biggest asset, and it is important to educate them and provide the know-how to handle data. Managers need to be trained to educate the people working under them and to engage with them one on one.

These steps can help businesses handle big data smartly and without much confusion. Every level needs to be trained to handle big data as the future is going to be all about big data.

Is Big Data a Threat to Your Privacy?

Big Data is growing bigger every day, and along with it grows the concern over invasion of privacy. Tracking the data generated by your mobile and other devices, and your interactions on social media, helps advertisers tailor their ads to suit you. But there's more to the story than that: companies have now begun to come up with very creative ways to use real-time data.

Let’s look at some interesting examples.

Smart Rubbish Bins in London

An advertising firm in London came up with the idea of using strategically placed dustbins to track the Wi-Fi signals of phones carried by passers-by. Using each phone's unique identifier, they could track the movement of every individual, and then use this data to show advertisements, targeted at the person passing by, on the bins' screens.
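The mechanics are simple enough to sketch. The toy example below (a hypothetical illustration, not Renew's actual system, with made-up device identifiers) shows how sightings of a phone's hardware identifier can be pseudonymised and still used to count and follow devices across locations:

```python
import hashlib

# Hypothetical Wi-Fi sighting log: (bin_id, device identifier) pairs.
# The identifiers below are invented for illustration.
sightings = [
    ("bin-01", "a4:5e:60:01:02:03"),
    ("bin-01", "a4:5e:60:01:02:03"),   # same phone seen twice
    ("bin-01", "f0:99:b6:aa:bb:cc"),
    ("bin-02", "a4:5e:60:01:02:03"),   # same phone at a second bin
]

def pseudonymise(identifier: str) -> str:
    """Replace the raw identifier with a short one-way hash."""
    return hashlib.sha256(identifier.encode()).hexdigest()[:12]

# Count unique devices per bin.
per_bin: dict[str, set[str]] = {}
for bin_id, ident in sightings:
    per_bin.setdefault(bin_id, set()).add(pseudonymise(ident))

print({b: len(devs) for b, devs in per_bin.items()})  # {'bin-01': 2, 'bin-02': 1}
```

Note the privacy catch: even after hashing, the same pseudonym reappears at both bins, so an individual's movements can still be traced, which is exactly what raised the outcry.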

Now even dustbins are becoming smart!

Officials have since asked Renew, the ad firm responsible, to take down the smart dustbins amid widespread concern about the invasion of people's privacy.

Police Cars in Australia get Number Plate Recognition Cameras

The Aussies have come up with another notable use of Big Data: number-plate recognition cameras that can read multiple number plates simultaneously and search police databases for information about each driver. They can tell whether a car is stolen, or whether you have unpaid parking tickets, just by looking at your car's number plate.

The hand of the law gets longer.

Are Such Examples a Threat to Your Privacy?

When CCTV cameras first came on the scene, the public responded with outrage similar to what we now see around Big Data. But once people got used to the new technology and saw its benefits in solving crimes and catching miscreants swiftly, the fear of Big Brother always watching them subsided.

The truth is that people will allow the collection of almost any data as long as it is collected with their permission and used to create value for them. Instead of shoving ads in people's faces, companies should find other ways to use Big Data, not only to reduce their own costs but also to deliver quality to the customer.

One great example of the creative potential of Big Data lies with insurance companies. Today, natural and man-made calamities generate a great deal of data on social media.

Data about Hurricane Sandy

Insurance companies can use this data along with before and after images on Google Maps Street View, Flickr, Instagram etc. to find out how much destruction of property their clients have suffered.

They can estimate the number and amount of claims they will have to deal with, and then provide quick settlements to their customers. This will be appreciated by all, and people will readily agree to data collection if they are told of such rewards.

Great Opportunities

A Westpac survey showed that it took only 30 months for mobile usage to reach the 1 million mark, compared to the 80 months it took online usage to get there.

This means there are great opportunities to use this rapidly growing Big Data, but it will have to be done with care, keeping the interests of the consumer in mind.

7 Big Names in the Big Data World

The big data world is no longer a territory accessible only to big, well-established database and data warehouse companies. Pure-play big data startups, too, are proving innovative, creative and technically sound enough to create a buzz in the marketplace.

In this post, though, we're going to talk about the big shots in the game.

Big Names in the Big Data Industry – 2013

Here’s the list of 7 BIG Names in the Big Data World:

IBM

The biggest Big Data vendor by 2012 revenue, IBM earned about $1.3 billion from Big Data-related services and products, according to the Wikibon report. IBM's product range includes a warehouse with its own built-in data mining and cubing capability, and its PureData systems include packaged analytic integration.

IBM's best-known products include the DB2, InfoSphere warehouse and Informix database platforms; SPSS statistical software, designed to support real-time predictive analysis; and the Cognos Business Intelligence application with its big data platform capabilities.

Oracle

Famous for its flagship database, Oracle is amongst the big players in the Big Data space. Oracle's Big Data revenue in 2012 was approximately $415 million, making it the fifth biggest Big Data vendor for the year. Oracle's Big Data Appliance combines Intel servers, Oracle's NoSQL database and Cloudera's Hadoop distribution.

Oracle has a wide range of tools to complement its Big Data platform, Oracle Exadata. These include advanced analytics via the R programming language, along with an in-memory database option in Oracle's Exalytics in-memory machine and Oracle's data warehouse.

Splunk

Specializing in machine data analysis, Splunk had the biggest market share of all the pure-play Big Data vendors in 2012, with total revenue of about $186 million, according to the Wikibon report.

Google

Google effortlessly made its place amongst the top 7 names in the Big Data world. Google's Big Data offering includes BigQuery, a cloud-based Big Data analytics platform. Google's Big Data-related revenue in 2012 was about $36 million, as per the Wikibon report.

10Gen

10Gen is best known for its leading NoSQL database, the open-source MongoDB, distinguished as the prime document-oriented database. MongoDB can handle semi-structured information encoded in JavaScript Object Notation (JSON). What makes it different is its ease of use, speed and flexibility.
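To make the document-oriented idea concrete, here is a minimal sketch in plain Python using only JSON (not MongoDB's actual driver API; the field names are invented for illustration). Unlike rows in a relational table, two documents in the same collection can carry different fields:

```python
import json

# Two "documents" in the same logical collection; a document store like
# MongoDB does not force them into one rigid schema.
docs = [
    {"_id": 1, "name": "Alice", "tags": ["admin"]},
    {"_id": 2, "name": "Bob", "address": {"city": "London"}},  # extra nested field
]

# Each document is plain JSON (MongoDB actually stores a binary form, BSON).
payload = json.dumps(docs)
restored = json.loads(payload)

# A simple query: find the documents that carry a given field at all.
with_address = [d for d in restored if "address" in d]
print([d["name"] for d in with_address])  # ['Bob']
```

That schema flexibility is what "semi-structured" means in practice: new fields can appear per document without a migration, which is a large part of MongoDB's ease-of-use appeal.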

The list of 10Gen's strategic investors includes Intel, In-Q-Tel and Red Hat. 10Gen was ranked third amongst Hadoop- and NoSQL-only vendors last year, and it generated about $36 million in revenue in 2012, according to the Wikibon report.

Hortonworks

Another big name in the Big Data world is Hortonworks. A Hadoop vendor, Hortonworks received over $70 million in venture capital after spinning off from Yahoo in 2011. It runs its own certification courses and offers a virtual sandbox that has attracted a legion of developers.

Hortonworks is growing rapidly in competition with Cloudera, and is known for its partnerships with Rackspace, Microsoft, Red Hat and other companies.

MapR

Best known for its NoSQL database M7, MapR works with Google Compute Engine and Amazon's cloud platform. MapR was ranked fourth amongst Hadoop- and NoSQL-only vendors in last year's Wikibon report, which put its total 2012 revenue at about $23 million.