
Archive for category: Hadoop

3 Big Problems Big Data Will Probably Create in Near Future

Big Data has undoubtedly been the biggest buzzword of the past year. Looking back at the just-concluded 2013, one can consider it the breakthrough year for the term Big Data.

Big Data may not be a leader in outright innovation, but it certainly is in awareness. In spite of Big Data receiving more mainstream attention, there are businesses and individuals who still confuse the term and use it inappropriately.

All things said, business enterprises are investing heavily in Big Data with the aim of getting the most out of advanced data analytics. As mobile data, internet data and cloud data trends multiply, the need for sounder Big Data platforms such as Hadoop is being felt. Though the real potential of Big Data is still too abstract to nail down, the ramifications and business challenges it will create have already begun to show.

Let us look at the three most important problems Big Data analytics will probably create in the near future.

1.    Legal and privacy risk issues

Big Data can be used for good, and it can obviously be harnessed for the betterment of society. But it can also be abused! So not everything is sunny about Big Data. Since the accumulation of data means a greater threat to privacy, privacy challenges around Big Data are nothing new. It may be the dark side of Big Data, but the average consumer has begun to understand the implications.

This becomes a challenge since enterprises use Big Data to benefit from advanced analytics. According to a Sand Hill survey, almost 62 percent of enterprises use Hadoop for the advanced analytics it can provide.

In 2014, Big Data, with the rise of the Internet of Things leading to more mobile data, drone data, sensor data and even image data, is bound to create more legal concerns over privacy. This is, as explained, because consumers are becoming more aware of the real impact of Big Data on their lives. It is therefore important for enterprises to stay ahead on compliance and keep themselves up to date with changing data protection laws.

2.    Human decision making vs. data-driven decision making

As more businesses pursue Big Data to drive their decision making, there is soon going to be a clash between ways of doing things. As MIT Sloan School of Management research scientist Andrew McAfee points out, most management education programs train employees to trust their gut. Trusting gut feeling is the old way of making decisions, so replacing it with data-driven decision making can lead to conflict. Becoming data-driven will require businesses to undergo a paradigm shift, since whether a company is data-driven or not will become the competitive differentiator between successful and not-so-successful businesses.

3.    Big Data used for discrimination

Many research projects based on the use of Big Data have raised concerns about data being used for discrimination, in addition to the looming privacy concerns.

Researchers including Kate Crawford of Microsoft suggest that Big Data is rapidly being put to use for precise forms of discrimination. We are not new to discrimination, but Big Data creates a new, automated form of it. Researchers suggest that social media and health care are the most vulnerable areas.

To safeguard against the issue of discrimination, organizations can create transparent Big Data usage policies in order to protect consumer data.

 

Unabated Experimentation is Way Forward in Big Data

While it is true that analytical modeling calls for nonstop testing of big data, the equation isn't that straightforward and holds certain potential challenges.

The need of the hour is active experimentation in the big data space to help in-progress analytical models make precise correlations. But since statistical models carry their own risks, their astute application is going to be a must, especially if we want the results to be positive.

While a few groups are still hesitant, most large organizations have honed their insight enough to realize that big data calls for incessant experimentation, and they are all in support of the change. At the same time, they also know that the practical reality of the booming field of big data involves certain risks associated with statistical models, especially when their implementation is not flawless.

Statistical Modeling – Practicality and Risks

Statistical models are simplified tools employed by data science to recognize and validate the major correlations at work in a particular field. They can, however, give data scientists a false sense of validation at times.

Despite fitting the observational data quite well, many such models have been found to miss the real causative factors in action. This is why predictive validity is often missing from the delusion of insight offered by such a model!
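
To make the trap concrete, here is a minimal illustrative Java sketch, with entirely hypothetical numbers: a simple least-squares line is fitted on one slice of the observations and its error is then measured on a held-out slice. A good in-sample fit paired with a poor held-out error is exactly the delusion of insight described above.

```java
// Illustrative only: in-sample fit vs. held-out predictive validity.
public class HoldoutCheck {

    // Ordinary least squares for y = a + b*x over the index range [from, to).
    static double[] fitLine(double[] x, double[] y, int from, int to) {
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        int n = to - from;
        for (int i = from; i < to; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double a = (sy - b * sx) / n;
        return new double[] { a, b };
    }

    // Mean squared error of the fitted line over the index range [from, to).
    static double mse(double[] coef, double[] x, double[] y, int from, int to) {
        double err = 0;
        for (int i = from; i < to; i++) {
            double pred = coef[0] + coef[1] * x[i];
            err += (y[i] - pred) * (y[i] - pred);
        }
        return err / (to - from);
    }

    public static void main(String[] args) {
        // Hypothetical observations, e.g. marketing spend vs. sales.
        double[] x = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        double[] y = { 2.1, 4.3, 5.9, 8.2, 9.8, 12.5, 13.9, 30.0, 31.5, 33.0 };

        int split = 7; // first 7 points for fitting, the rest held out
        double[] coef = fitLine(x, y, 0, split);

        System.out.println("In-sample MSE: " + mse(coef, x, y, 0, split));
        System.out.println("Held-out MSE:  " + mse(coef, x, y, split, x.length));
        // A large gap between the two warns that the model has captured past
        // correlations rather than the underlying causative factors.
    }
}
```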

What May Go Wrong?

Even though the application of a statistical model is practical in business, there is always a need to scrutinize the true, fundamental causative factors.

A lack of confidence may prove to be the biggest risk, particularly when you doubt that the standard (past) correlations constituting your statistical model will remain relevant in the near future. And obviously, a predictive model of product demand and customer response in an area in which you have low confidence will never pull in large investments during a product launch!

What is the Scope?

Even though certain risks are involved, statistical modeling is by no means dead. To detect causative factors more quickly and effectively, statistical modeling will need to be grounded in real-world experimentation. This approach, employing a boundless series of real-world experiments, will be highly helpful in making the big data business model and economy more authentic and reliable.

So How’s Real-world Experimentation Going to Be Possible? 

Just as data scientists have developed advanced operational functions for ceaseless experimentation, big organizations look forward to encouraging their expert business executives to lead the charge in running nonstop experiments for better output. Adding to their convenience, the big data revolution has already delivered in-database platforms for proper execution of a model, and economical yet high-output computing power that makes real-world experimentation feasible everywhere, in scientific as well as business domains.

The basic idea is to prefer spending time, capital and other resources on conducting more low-risk experiments rather than putting extra effort into building the same models over and over again!

Are Businesses Already Expecting Healthy Big Data ROI?

Businesses in the UK mostly use big data to support their sales and marketing campaigns, reveals the Big Data Survey 2013 carried out by MBN Recruitment Solutions.

The survey maintains that more than 80% of respondents look forward to harnessing and leveraging their data to generate new revenue.

At the same time, over 95% of respondents agree that generating more revenue is going to be the only purpose of businesses using big data in the near future!

Use of Big Data Till Now and In Future

The first annual survey by MBN also reveals that over 71% of respondents have been using data analytics to anticipate major functions and aspects of future business dealings around the world.

MBN's non-executive chairperson, Paul Forrest, concludes that companies use big data as they grow larger and need to stay competitive against potential rivals.

Most survey respondents believe that the ROI from leveraging big data is too low right now, but they expect greater ROI prospects in the future. Also, over 40% of respondents think that current initiatives will eventually fetch the desired results for businesses. Forrest said that one of the biggest issues has been the emphasis on tools, but 72% of respondents believe that tools matter only at the beginning and it is people who unlock the value at a later stage.

Monetizing Big Data: What 2014 Might Have in Store

Once we are able to invest in big data technology after successfully analyzing it, the next move will be to monetize it. To learn the scope of big data monetization in 2014 and beyond, read on!

‘Big Data’ is already a familiar term for most of us, especially those involved in serious business. It was a hot topic in the media almost throughout 2013.

All businesses, small and big, however, are still trying to build their knowledge of what big data actually is, what they should be doing about it and how. And what seems to add to the complications are the challenges involved in the process of investing in big data.

For the most part, businesses don't know how to extract value from data and have a long way to go before they can define the much-awaited big data policy. Even more importantly, they will have to acquire the required skills and then execute them in a nifty manner to make the most of the strategies they're working on!

Big Data – Future and Monetary Equivalent

While we are already in the first phase of the grand big data revolution, where we've seen big investments in the technology, the next important step would be to generate revenue through big data.

With a lot in store, the year 2014 is ready to play an important role in this regard:

Revenue Generation

Though businesses are all for huge investments in big data, they still need to predict how quickly it can generate revenue. The need for an effective way to measure ROI over a specific period may prove to be one of the biggest challenges!

Despite all these reservations, most business leaders expect big data to be highly helpful in making the right business decisions. However, they believe it won't be possible to predict the time and money associated with an ROI target without a guiding hand. This may cause large businesses to opt for big data-based solutions rather than using big data directly as the only solution in 2014. The ultimate goal would be to boost overall revenue by saving on costly technologies and data consultants.

Big Data as a Marketing Investment 

While it is true that big data has been more of a technology investment so far, we will see it treated as a marketing investment in 2014 and beyond, and retail brands will lead the charge.

The key will be to persuade people to ‘buy’ by making all offers directly customer-oriented. Big companies have already begun preparing for the shift by motivating their CMOs, technology officers and information executives to work in unison to derive the best results.

Utilization of Big Data-based Solutions

With big data-based solutions surfacing quickly, all businesses will have to go for data analytics sooner or later. Though Google Analytics has been used for the same purpose for years, the latest big data-based solutions will allow companies small and big to access solutions and methods that can ‘practically improve revenue.’ Hopefully, 2014 will be big for both start-ups and well-established businesses in terms of using big data to get the best results!

Hadoop Security: Present and Future

While the current generation of Hadoop systems can be relied upon for data protection and processing, there is still a need to improve Hadoop security to ensure foolproof big data security in the coming times. To stay updated on the scope of a secure Hadoop cluster, today and in times to come, one needs to know a few important things about it.

Security is the foremost item on the agenda and represents almost all major requirements within an organization, especially for tasks like big data processing. Hadoop has registered remarkable progress in the last couple of years and has successfully addressed the most common worries, such as authorization, authenticity and, above all, data protection. With more security-enhanced Hadoop clusters in the pipeline, those using these systems can bank on the safety of all their vital data in the future as well.

Hadoop is currently engaged at the cutting edge, providing secure support to countless financial service applications and big private healthcare projects that operate in highly security-sensitive environments. Recent upgrades of Hadoop systems meet the key requirements of organizations demanding some of the world's toughest security norms. With all the tight security controls incorporated into Hadoop, the final objective remains flexibility and smooth data processing, now and in the future.

Security Controls for Hadoop at Present

Securing a Hadoop cluster presents challenges both small and big, including its distributed nature, which to a large extent is also responsible for its success. A layered approach is best for securing a system, and distribution happens to be one of the most complex barriers to it.

Following are the major layers that are in place to secure a cluster:

Authentication

Authentication is responsible for verifying the identity of both the system and the users accessing it. Pseudo authentication and Kerberos are the two authentication modes Hadoop provides. While the former relies on trust among users, the latter secures the overall Hadoop cluster.
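
As a rough illustration (not from the original post), a Java client might be switched to the Kerberos mode through Hadoop's UserGroupInformation API roughly as sketched below; the principal and keytab path are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLogin {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Switch the client from the default "simple" (pseudo) mode to Kerberos.
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Authenticate against the KDC using a keytab instead of a password.
        // The principal and keytab path below are hypothetical.
        UserGroupInformation.loginUserFromKeytab(
                "analyst@EXAMPLE.COM",
                "/etc/security/keytabs/analyst.keytab");

        System.out.println("Logged in as: "
                + UserGroupInformation.getCurrentUser().getUserName());
    }
}
```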

Authorization

Authorization defines the access freedom of users and the system. Hadoop relies on resource-level access control, file permissions in HDFS, and service-level access control.
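
As a minimal sketch of what resource-level control looks like in practice (the directory, user and group names are hypothetical), HDFS permissions can be managed through the FileSystem API, while service-level access control is enabled separately via the hadoop.security.authorization property:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class HdfsPermissions {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical directory holding sensitive data.
        Path salesData = new Path("/data/sales");

        // Owner gets full access, the group read/execute, everyone else nothing (rwxr-x---).
        fs.setPermission(salesData, new FsPermission((short) 0750));
        fs.setOwner(salesData, "analyst", "analytics");

        fs.close();
    }
}
```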

Accounting

Accounting makes it possible to track resource use in a system. MapReduce and HDFS, which are parts of Apache Hadoop, offer base audit support. Apache Oozie, which functions as a workflow engine, offers an audit trail for its services.

Data Protection 

This takes care of the privacy of information. HDP protects data in motion, and HDFS supports encryption at the operating-system level.
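
As a hedged sketch of what protecting data in motion involves, the two standard Hadoop switches below encrypt RPC traffic and HDFS block transfers. They are shown here programmatically for illustration only; in a real cluster they normally live in core-site.xml and hdfs-site.xml:

```java
import org.apache.hadoop.conf.Configuration;

public class WireEncryptionConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Encrypt Hadoop RPC traffic (authentication + integrity + privacy).
        conf.set("hadoop.rpc.protection", "privacy");

        // Encrypt HDFS block data as it moves between clients and DataNodes.
        conf.setBoolean("dfs.encrypt.data.transfer", true);

        System.out.println("RPC protection: " + conf.get("hadoop.rpc.protection"));
    }
}
```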

Security Controls for Hadoop in Future

Newer innovations in Hadoop security are focused mainly on making the various security frameworks work in collaboration so that they can be easily managed. Here is where the Hadoop security system is going to get big:

Granular Authorization and Enhanced Authentication

The verification technique in most Hadoop modules is in the process of being improved. It is being developed and fortified mainly because users are demanding a security-hardened authorization model. Token-based validation will soon replace Kerberos to enhance the authentication process.

Encryption-based Data Protection and Improved Accounting

A more advanced encryption algorithm is a must for most channels. The focus will be on better encryption, mostly in HBase, HDFS and Hive. Another important step is going to be high-tech audit record correlation for easier reporting. With such a system, an auditor would be able to trace the sequence of Hadoop component operations without having to rely on any external tools.
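
To make the idea of audit record correlation concrete, here is a small illustrative sketch, not part of any Hadoop release: it groups HDFS-style audit entries by user so the sequence of operations can be reviewed without external tools. The log lines and their key=value layout are assumptions, since the exact audit format varies between Hadoop versions:

```java
import java.util.*;

public class AuditCorrelation {
    public static void main(String[] args) {
        // Hypothetical HDFS audit log entries (format varies by version).
        List<String> lines = Arrays.asList(
            "allowed=true ugi=analyst ip=/10.0.0.5 cmd=open src=/data/sales/q4.csv dst=null",
            "allowed=true ugi=etl ip=/10.0.0.9 cmd=rename src=/staging/raw dst=/staging/done",
            "allowed=false ugi=analyst ip=/10.0.0.5 cmd=delete src=/data/sales dst=null");

        // Correlate: collect the sequence of commands issued by each user.
        Map<String, List<String>> opsByUser = new LinkedHashMap<>();
        for (String line : lines) {
            Map<String, String> fields = new HashMap<>();
            for (String token : line.split("\\s+")) {
                int eq = token.indexOf('=');
                if (eq > 0) {
                    fields.put(token.substring(0, eq), token.substring(eq + 1));
                }
            }
            String user = fields.getOrDefault("ugi", "unknown");
            String op = fields.get("cmd") + " " + fields.get("src")
                    + " (allowed=" + fields.get("allowed") + ")";
            opsByUser.computeIfAbsent(user, k -> new ArrayList<>()).add(op);
        }

        opsByUser.forEach((user, ops) -> System.out.println(user + " -> " + ops));
    }
}
```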

Be Smart With Big Data

Some companies are scared of big data. They think that since data is inherently dumb, a lot of it would be dumber still. But by being smart about big data, analysts can make sure they get the most out of it. Handling big data can be a security risk and needs to be done smartly.

The Present Way of Doing Things

Usually companies handle data in one of three ways. The first is the Heroic Model, in which individuals take charge of requests and make decisions on their own without consulting others. This model can work well for small businesses, where individuals are usually aware of most situations across all areas of the business. But in bigger businesses it can lead to confusion and chaos.

The Culture of Discipline, on the other hand, is one where individuals don't make any decisions and follow a set of rules laid down by management. Employees in this model can't use data for their own decision making and simply have to follow the processes set up for them.

The best way to handle data is to have a Data Smart Model, in which data is managed through an evidence-based management system. It combines the first two methods: it works on a disciplined process, but decision making is allowed at the individual level. This is the method that should be used to handle big data, and it can result in smooth operation without much hassle.

How to Cultivate the Data Smart Culture

Certain steps need to be taken to create the data smart culture.

  • There should be a single source of truth. Decision making can be moved to the employee level but the guiding principles should be set from a single source.
  • Use ways to keep track of progress. A scorecard system, even on a daily basis, can help managers across different branches know how they are performing relative to other departments, and they can then send in better data to record their progress.
  • Rules are important but there should be enough flexibility. Rules and guiding principles are needed but there should be flexibility to know when to bend the rules and when to break them. Sometimes what works in most parts of the country might not be best for a certain area. Businesses need to be able to adapt to such situations and change their rules accordingly.
  • Work on cultivating human resources. People are a company's biggest asset, and it is important to educate them and provide them with the proper know-how to handle data. Managers need to be trained to educate the people working under them and give them one-to-one engagement.

These steps can help businesses handle big data smartly and without much confusion. Every level needs to be trained to handle big data, as the future is going to be all about big data.

Hadoop Can Come Handy Even When You are Not Dealing with Big Data

Hadoop was developed to cater to the needs of web and media companies managing big data. But even if you don't have to deal with big data, you can still use Hadoop in many ways to enhance your data and resource management. Today Hadoop is being used by businesses of every size, whether they have big data or small, to manage their data.

The Main Features of Hadoop

The main feature of Hadoop is its storage system, HDFS, the Hadoop Distributed File System, which operates on low-cost hardware.

MapReduce was originally responsible for both resource management and data processing, but since Hadoop 2.0 it focuses purely on data processing, while YARN handles resource management.

These features of Hadoop can be utilized in many innovative ways by big and small businesses.

Data Archive

One straightforward use of Hadoop is to archive data files. Since HDFS runs on commodity hardware, it is simple and cheap to scale, so businesses can start small and expand as they grow. They can store all their data at a very low cost.

Instead of destroying data after the regulatory period is over, companies can store decades of data and analyze it in real time to help their decision making process.
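
A minimal sketch of the archive idea, assuming a hypothetical local export directory and a hypothetical /archive path in HDFS, using the standard FileSystem API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsArchive {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical paths: data past its active lifetime is copied into
        // cheap, replicated HDFS storage instead of being destroyed.
        Path local = new Path("/var/exports/reports-2013");
        Path archive = new Path("/archive/reports/2013");

        fs.mkdirs(archive);
        fs.copyFromLocalFile(false, true, local, archive); // keep source, overwrite destination

        System.out.println("Archived under: " + archive);
        fs.close();
    }
}
```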

Data Staging Area

Traditionally, ETL tools are used for extracting and transforming data. When Hadoop came onto the scene, it could have killed ETL forever if ETL providers hadn't been smart enough to provide HDFS connectors so that Hadoop could be used alongside their ETL software.

By using Hadoop you can store the application data and the transformed data in the same place. This makes it easier to process the data later and reduces processing time. Hadoop can thus help ETL by improving data processing.

Data Processing

Instead of sending data to the warehouse and then using costly warehouse resources to update it, you can use Hadoop and its MapReduce engine to process and update it before it goes to the warehouse. Hadoop's low-cost processing power can be used not just for your warehouse data but for other operational and analytical systems as well.
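
As an illustrative sketch of that pattern (the input format, field positions and paths are hypothetical), a MapReduce job of this kind might roll up raw transaction records by customer so that only the summarized output is loaded into the warehouse:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SalesRollup {

    // Input lines are assumed to look like: customerId,amount
    public static class ParseMapper
            extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws java.io.IOException, InterruptedException {
            String[] parts = value.toString().split(",");
            if (parts.length >= 2) {
                ctx.write(new Text(parts[0]),
                          new DoubleWritable(Double.parseDouble(parts[1].trim())));
            }
        }
    }

    // Sum the amounts per customer so only the rollup reaches the warehouse.
    public static class SumReducer
            extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values, Context ctx)
                throws java.io.IOException, InterruptedException {
            double total = 0;
            for (DoubleWritable v : values) {
                total += v.get();
            }
            ctx.write(key, new DoubleWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "sales-rollup");
        job.setJarByClass(SalesRollup.class);
        job.setMapperClass(ParseMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path("/staging/transactions")); // hypothetical
        FileOutputFormat.setOutputPath(job, new Path("/staging/rollup"));     // hypothetical
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```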

Hadoop is a very powerful tool that can help all businesses handle their data in a better way. You don't have to be sitting on top of big data to use Hadoop. You can start even when you have small data; Hadoop will let you collect decades of data until it becomes big data, and then you can start making use of it with big data analytics.

Is Big Data a Threat to Your Privacy?

Big Data is growing bigger every day, and along with it the concern over invasion of privacy is also growing. Tracking all the data generated by your mobile and other devices, and your interactions on social media, helps advertisers tailor their ads to suit you. But there is more to the story than that. Companies have now begun to come up with very creative ways to use real-time data.

Let’s look at some interesting examples.

Smart Rubbish Bins in London

An advertising firm in London came up with the idea of using strategically placed rubbish bins to track the wifi signals of the phones of people passing by. They could use the phones' serial numbers to track the movement of each individual, and then use this data to show advertisements, targeted at the person passing by, on the bins' screens.

Now even dustbins are becoming smart!

Officials have asked Renew, the ad firm responsible, to take down the smart bins, as there has been a lot of concern about the invasion of people's privacy.

Police Cars in Australia get Number Plate Recognition Cameras

The Aussies have come up with another striking use of Big Data: number plate recognition cameras that can read multiple number plates simultaneously and search their database to pull up information about each driver. They can tell if a car is stolen or if you have unpaid parking tickets just by looking at your car's number plate.

The hand of the law gets longer.

Are Such Examples a Threat to Your Privacy?

When CCTV cameras first came on the scene, the public responded with outrage similar to what we now see around Big Data. But once people got used to the new technology and saw its benefits in solving crimes and catching miscreants swiftly, the fear of Big Brother always watching subsided.

The truth is that people will allow the collection of almost any data as long as it is collected with their permission and used to create value for them. Instead of shoving ads in people's faces, companies should try to find other ways to use Big Data, not only to reduce their own costs but also to provide quality to the customer.

One great example highlighting the creative use of Big Data is its potential for insurance companies. Today every natural or man-made calamity generates a lot of data on social media.

Data about Hurricane Sandy

Insurance companies can use this data along with before and after images on Google Maps Street View, Flickr, Instagram etc. to find out how much destruction of property their clients have suffered.

They can estimate the number and amount of claims they will have to deal with, and provide quick claim settlements to their customers, which will be appreciated by all. People will readily agree to data collection if they are told of such rewards.

Great Opportunities

A Westpac survey showed that it took only 30 months for mobile usage to reach the 1 million mark, compared with the 80 months it took online usage to get there.

This means there are great opportunities to use this rapidly growing Big Data, but it will have to be done with care, keeping the interests of the consumer in mind.

New course to handle Big Data on Hadoop using R software

Jigsaw Academy is introducing an all-new course in big data analytics using R and Hadoop. The course has been specifically designed to give students the knowledge, and hone the skills, needed to handle the big data environment of Hadoop using the R software.

It has been just days since we learnt of the Cloudera and Udacity partnership to offer open Hadoop and MapReduce courses, specially designed to equip students with the technical and analytical skills for a brighter career in the emerging data market. Following that lead, Jigsaw Academy, a premier online analytics training academy, has introduced new courses in Big Data Analytics using R and Hadoop.

Jigsaw Academy has made a good name for itself in online analytics training. It offers both intermediate and advanced-level big data analytics courses. With a vision to extend its reach as a premier academy, Jigsaw Academy has specifically designed the new course to provide the knowledge, and help develop the skills, needed to deal with big data analytics on Hadoop using the R software.

Sarita Digumarti, co-founder of Jigsaw Academy, says:

This new course is specifically designed for those looking to enhance their knowledge and skill sets in Big Data, specifically that of handling the big data environment of Hadoop using R software.

Who is the course for?

Since Jigsaw Academy thrives on a continuous commitment to expanding its offerings, the new course will really help industry experts around the world who lack big data handling skills to gain significant expertise in the big data analytics environment. The primary target group for the course is analytics experts who want to learn and build on their big data analytics skills.

It is also beneficial to students planning to pursue a career in data science, and to database professionals who plan to make an entry into the big data analytics industry.

Requirements for enrollment?

To gain entry into the course, professionals and students are required to have working knowledge of the R software. They should also have a beginner's understanding of statistics and SQL.

Those not versed in R will have to take a separate R skills course, which will be offered by Jigsaw Academy for free.

What to expect during the course?

The course can be really beneficial for all of the aforementioned groups because the instructors at Jigsaw Academy will use real-time big data case studies. This allows the instructors to demonstrate and clarify the concepts of Hadoop, in addition to providing training in applying big data technologies to large volumes of data.

What to expect on completion?

  • A working knowledge of Hadoop
  • An ability to analyze big data using R software
  • Complete knowledge of big data analytics
  • And practical application of big data analytics

Via: PRWeb

Cloudera and Udacity partner to deliver Hadoop and Data Science training

Data education giants Cloudera and Udacity have formed a strategic partnership to address the shortage of big data skills by offering easily accessible online training for everyone. The partnership will offer open Hadoop and MapReduce courses tailored to equip students with the technical and analytical skills needed for a great career in the emerging data market.

In the present scenario, as the amount of structured and unstructured data being generated and stored around the globe in various sectors has shot up considerably, there has been a significant rise in the enterprise demand for skilled and qualified workers.

Recently we read about Udacity introducing paid big data courses to bridge this widening gap between demand and supply; today we learn that Cloudera, the Apache Hadoop-powered market leader in enterprise analytic data management, has partnered with Udacity, the online higher education provider, to deliver training on Hadoop and Data Science to anyone through Udacity's easy-to-access online educational portal.

The course curriculum, designed and developed by expert faculty at Cloudera University in collaboration with Udacity, will equip interested students with the fundamental technical and analytical skills. The course is basically an introduction to Hadoop and MapReduce, an understanding of which will help students kick-start their careers in the ever-growing big data economy.

The course has essentially been created to help address the shortage of skilled data professionals in the economy. With it, Cloudera and Udacity are putting open, state-of-the-art big data training within the reach of almost anyone who has access to the Internet and is passionate about learning the basics of Hadoop and MapReduce.

On completing this accessible course, students will have an opportunity to enroll in Cloudera University’s live professional training courses to earn certification for their professional training.

Via: MarketWired
