Why Hadoop projects fail — and how to make yours a success

25 Jun

Without doubt, “big data” is the hottest topic in enterprise IT since cloud computing came to prominence five years ago. And the most concrete technology behind the big data trend is Hadoop.

Most enterprises are at least experimenting with Hadoop, and the potential for transformative business improvement is real. But just as real is the chance of what I call a “Hadoop hangover” if the project fails to meet expectations and instead results in costly failure.

To help you make the most of Hadoop, let’s look at the promise of big data analytics, and how to avoid expensive, disillusioning failure.

Getting from big data to smart algorithms

For most businesses, big data is an attempt to emulate the advanced data-driven business techniques that propelled Amazon and Google to the forefront of their respective industries.

This is not business intelligence as we have known it in the past: the primary aim is not to facilitate executive decision making through charts and reports, but to entwine data-driven algorithms directly into the business processes that drive customer experience.

Hadoop, essentially an open source implementation of core Google technologies, is the most concrete technology behind big data. Hadoop enables big data projects by providing an economical way to store and process masses of raw data. Hadoop has been proven at scale at Facebook and Yahoo, and was the basis of the most impressive artificial intelligence project to date: IBM’s Watson, the supercomputer that won Jeopardy! in 2011.

Most, if not all, Fortune 500 companies have at least a Hadoop pilot project in place. Many are still in the initial data capture stage: setting up the workflows to capture raw business data, demographics and the “data exhaust” flowing from websites and social media. These data capture projects entail significant risk in their own right.

Of course, collecting the data is only the beginning. There’s an old adage: “data is not knowledge and knowledge isn’t information”, and this remains true even if you have “big” data. Indeed, we might add a new clause for our big data world: “information isn’t action”. In other words, determining the meaning of the data is no longer enough: we have to establish the mechanisms, implemented as complex adaptive algorithms, that drive a more effective business.

It’s a tenet of big data analytics that the more data you have, the less complex your algorithms need be. It’s the difference between predicting the outcome of an election from a polling sample and counting the votes on election night. The election night count is always more accurate.
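To make the polling analogy concrete, here is a minimal Python sketch. The electorate (100,000 voters, 52% for candidate "A"), the sample size and the function names are all made-up assumptions for illustration. It compares an estimate from a random sample with a simple count over all the data.

```python
import random

def full_count_share(votes, candidate):
    """Share of all votes won by `candidate` (the 'election night' count)."""
    return sum(1 for v in votes if v == candidate) / len(votes)

def poll_estimate(votes, candidate, sample_size, rng):
    """Estimate the same share from a random sample (the 'polling' approach)."""
    sample = rng.sample(votes, sample_size)
    return sum(1 for v in sample if v == candidate) / sample_size

# Hypothetical electorate: 100,000 voters, 52% of them voting for "A".
rng = random.Random(42)
votes = ["A"] * 52_000 + ["B"] * 48_000
rng.shuffle(votes)

true_share = full_count_share(votes, "A")
estimate = poll_estimate(votes, "A", 1_000, rng)
print(f"full count: {true_share:.3f}, poll of 1,000 voters: {estimate:.3f}")
```

The full count needs nothing cleverer than addition, while the poll carries sampling error that a real pollster would have to model with weighting and margins of error.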

Furthermore, machine learning techniques allow algorithms to be “trained” from the data itself. Essentially the data drives and refines the algorithms.

So having lots of data is an advantage. But, at the end of the day, it still requires a lot of human intelligence to come up with the best answers. Indeed, sometimes it’s a matter of asking the right question. Collecting the data is necessary but not sufficient. Getting from big data to smart algorithms is a unique challenge in its own right.

With all that in mind, let’s look at the key challenges facing successful Big Data analytic projects:

Data scientists are critical, but in short supply

The Googles and Amazons of this world succeeded in their big data projects largely because they were able to attract and retain some of the world’s most gifted computer scientists. These were individuals who brought to the table not just programming skills; they were also able to bring to bear complex statistical analysis techniques, business insight, cognitive psychology and incredible innovative problem solving abilities.

We’ve come to call these types of people “data scientists” and it’s well understood that the base skills (statistics, algorithms, parallel programming, and so on) are in short supply. Academia is only just responding with curricula to produce suitably qualified graduates. It will be years before we see a significant increase in qualified data scientists.

If and when we see the supply of data scientists increase, we will still be faced with a more fundamental issue. This stuff is hard. It requires the ability to think across at least three fundamentally complex specializations, including competitive business strategy, machine learning algorithms, and massively parallel data programming. This unique combination of skills is likely to be the limiting factor for big data in the enterprise for the foreseeable future.

At the core of any big data project is the data scientist — acquiring or developing data science capability is a critical factor in a big data project.

The shortage of big data tools

Compounding the problem of the data science talent gap, but perhaps also offering a possible solution, is the lack of suitable tools for the data scientist.

Hadoop and other data stores supply a brute force engine for computation and data storage. Hadoop clusters can consist of potentially thousands of commodity servers, each with its own disk storage and CPUs. Data is stored redundantly across nodes in the cluster. The MapReduce algorithm allows processing to be distributed across all the nodes in the cluster. The result is an amazingly cost effective way of distributing processing across potentially thousands of CPUs and disks.
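As a rough illustration of the MapReduce idea, the following single-process Python sketch counts words in three phases: map, shuffle and reduce. It does not use Hadoop's actual API, and the function names are hypothetical. On a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # Each line could be mapped on a different node; here we simply loop.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data big value", "Big clusters"])
print(counts)
```

Because each map call sees only its own line and each reduce call only its own key, the work can be spread across as many machines as there are lines and keys.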

But programming in MapReduce is akin to programming in Assembly language – it’s not a practical way of creating big data algorithms. To turn big data into big value, the data scientist needs tools that can support statistical hypothesis testing, creating and training predictive models, as well as reporting and visualization. Open source projects such as Mahout, Weka and R provide a starting point, but none are easy to use, and often they are insufficiently scalable or otherwise unsuitable to be at the core of Big Data enterprise solutions.

Higher level toolkits – which might leverage Mahout, R and the like, but which make them accessible to a wider audience and allow them to be used as building blocks in more complex workflows – are the next stage of evolution for data science products. Without these Big Data analytic platforms, fully leveraging big data will only be possible in the largest enterprises, who have the budget and reputation sufficient to attract the limited supply of truly capable data scientists.

Data scientists need a more effective analysis framework and toolkit than is provided by Hadoop and its ecosystem. Producing these tools should be a priority for the software community.

The reduction in data quality

Hadoop succeeds as the basis for so many big data projects not just because it can economically store and process large quantities of data, but also because it can accept data in any form. In a traditional database, data must be converted to a pre-defined structure (a schema) before being loaded.

These ETL (Extract-Transform-Load) projects are typically expensive and time consuming. Furthermore, the economics of data warehousing typically required that the data be aggregated and pruned before loading, and therefore lost the granularity necessary for big data solutions.

Hadoop allows for “schema on read” — you need only define the structure of the data when you come to read it. This allows data to be loaded in its most raw form, without needing to analyze or define the data ahead of time. You load everything at low cost, and then only “pay” for the schemas you need.
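A minimal Python sketch of the schema-on-read idea, using made-up JSON events and a hypothetical `read_with_schema` helper rather than any real Hadoop tooling: the raw lines are stored untouched, and a structure is applied only at the moment they are read.

```python
import json

# Raw events are stored exactly as they arrived; nothing is enforced at load time.
raw_store = [
    '{"user": "alice", "page": "/home", "ms": 120}',
    '{"user": "bob", "page": "/cart", "ms": 340, "referrer": "ad"}',
    '{"user": "carol", "page": "/home"}',
]

def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> (type, default)) only when reading."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        }

# This particular analysis only cares about two fields, so only they are defined.
page_view_schema = {"user": (str, ""), "ms": (int, 0)}
rows = list(read_with_schema(raw_store, page_view_schema))
print(rows)
```

A different analysis could read the same raw lines with a different schema, for example one that picks up the `referrer` field, without reloading anything.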

However, this approach has some fairly obvious risks — machine-generated data in particular might be changing structure rapidly and by the time you come to mine the data it might be very hard to determine its structure. Furthermore, any errors in the generated data might not be picked up until it is too late.

So despite the promise of schema on read, success in a big data project may depend on careful vetting of incoming data — not to the extent of a full ETL process to be sure, but more than simply “load and hope”. After all, one of the first lessons of the computer age was GIGO: Garbage In, Garbage Out.

Pay attention to the quality and format of data streaming into Hadoop. Make sure you’ve identified the structure and assured the quality of that data.
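One lightweight way to vet incoming data, short of a full ETL process, is to check each record against a few basic expectations before it is loaded, and to set aside anything that fails. The sketch below is only an illustration: the field names, the JSON format and the helper functions are assumptions, not part of Hadoop.

```python
import json

# Hypothetical expectations for incoming records: field name -> required type.
REQUIRED_FIELDS = {"user": str, "ms": int}

def vet_record(line):
    """Return (record, None) if the line passes basic checks, else (None, reason)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if not isinstance(record, dict):
        return None, "not a JSON object"
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            return None, f"missing field: {field}"
        if not isinstance(record[field], expected_type):
            return None, f"wrong type for field: {field}"
    return record, None

def vet_stream(lines):
    """Split incoming lines into accepted records and rejected (line, reason) pairs."""
    accepted, rejected = [], []
    for line in lines:
        record, reason = vet_record(line)
        if reason is None:
            accepted.append(record)
        else:
            rejected.append((line, reason))
    return accepted, rejected

incoming = [
    '{"user": "alice", "ms": 120}',
    '{"user": "bob"}',
    '{"user": "carol", "ms": "fast"}',
    'not json at all',
]
accepted, rejected = vet_stream(incoming)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the rejected lines together with a reason means a change in a machine-generated format shows up when the data arrives, not months later when someone tries to mine it.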

Hadoop has proven its scalability at places like Yahoo and Facebook, and proven an ability to power the most complex analytics as the basis for IBM’s Watson AI. However, it misses some key features that the enterprise regards as important:

Security in Hadoop is weak. Once authenticated to a Hadoop cluster, a user can typically access all the data in that cluster. Although it’s possible to limit a user’s access to specific files in a Hadoop cluster, it’s not possible to limit access to individual records in a file. Furthermore, because of the cumbersome nature of Hadoop security and the interaction with external tools such as Hive (Hadoop’s native SQL interface), the most common practice is to allow everybody access to everything.

Backup is also difficult. Hadoop is inherently fault tolerant, but enterprises still want to have a disaster recovery plan, or to restore to a point in time backup should some human error result in data corruption. Most distributions do not have these capabilities (the MapR distribution does provide a snapshot capability).

Integration with enterprise monitoring systems is lacking. Hadoop generates metrics, and each Hadoop vendor offers an “Enterprise” console, but these do not integrate properly with Enterprise monitoring systems such as Openview or Foglight.

Resource management is primitive. The ability to manage resources to prevent ad hoc requests from blocking mission critical operations is only just emerging.

Real-time query is not a feature of Hadoop. While an emerging set of SQL-based languages and caching layers have been created, Hadoop is not a suitable basis for real time computing.

None of these issues are show stoppers for Hadoop, but failure to acknowledge these limitations may lead to unrealistic expectations for your Hadoop project that cannot be fulfilled.

Make sure you understand the technical strengths and limitations of Hadoop. Avoid unrealistic expectations for your Hadoop solution.

Organizational challenges

Big data is a complex and potentially disruptive challenge to many organizations. Globalization and e-commerce have flattened the world so much that for many businesses simply competing on price or store locality is no longer an option. Competitive differentiation will derive increasingly from personalization, targeting, predictive recommendations and so on. For many businesses, achieving some form of data-driven operation will be survival itself.

History has shown that when faced with this sort of disruptive threat, many companies “freeze” – clinging ever tighter to outmoded business models and hoping for a return to the competitive landscape of the past.

Big data analytics is an over-hyped, poorly-defined and over-used term. Despite that, and despite the challenges outlined above, I believe that for many businesses, the opportunities presented by the big data revolution are as significant and fundamental as those presented by e-commerce 15 years ago. Companies (particularly retailers) should be bold and determined in reacting to these challenges.

Organizational resistance and scepticism to big data is understandable. But don’t let big data risks blind you to the benefits (and sometimes necessity) of a big data project. Indeed, drinking sensibly seems to be the best way to avoid the hangover without missing the party altogether.

25 Jun

Microsoft and Oracle today announced that they are putting their differences aside to strike a strategic partnership in the enterprise cloud space. The deal covers both the private cloud and the public cloud, encompassing multiple new offerings.

Firstly, Oracle will certify and support customers who already run its software on Windows Server to run that same software on Windows Server Hyper-V or in Windows Azure. Oracle customers also gain the ability to run their Oracle software licenses in Windows Azure “with new license mobility.”

Hyper-V and Azure support starts immediately, but the partnership doesn’t end there. Microsoft and Oracle have agreed to work together to “add properly certified, and fully supported Java into Windows Azure.”

Microsoft will also add Infrastructure Services instances with preconfigured versions of Oracle Database and Oracle WebLogic Server for customers who do not have Oracle licenses, while Oracle will enable customers to obtain and launch Oracle Linux images on Windows Azure. Details regarding when exactly this integration will be available were not shared.

During a conference call today, Microsoft CEO Steve Ballmer talked of the “tipping point” in regards to the partnership. “A lot has happened and we’re going to continue to compete in areas,” he said. “I think both companies have generally, at least for several years, had respect for one another.”

Oracle President Mark Hurd agreed: “I think it just makes sense for us to continue to improve our own capabilities but also allow customers to leverage both of our capabilities together,” he said on the same call. “I think this makes a lot of sense for both of us…because it makes a lot of sense for our customers.”

Oracle first hinted at today’s news during an earnings call on Friday. The enterprise company said it had plans to announce new technology partnerships with not only Microsoft, but Salesforce and Netsuite as well.

Sony Xperia S Update Release Date: New Fix On July 8 To Patch Up NFC Issues From Jelly Bean

19 Jun

Sony Xperia S owners are in for another update, courtesy of leftover issues from the Android 4.1 Jelly Bean upgrade that was recently rolled out. If all goes well, the fix should be out over the air within a few weeks.

Thanks to Xperia Blog, which spotted a forum posting by a Sony representative, we now know that some kind of a software update is scheduled to roll out during week 28, which begins on July 8.

“A new software is planned to start rolling out during week 28. Please let me know if any of you still experience any problems after installing the upcoming update,” said Sony Xperia support team member Johan.

The rep’s comments were in response to a question about NFC connectivity, so one can only assume that this will be one of the fixes found in the patch. Not much else is known about the fix.

There is also the possibility that the week 28 release time frame applies to update 6.2.B.0.211, which was just pushed out last week. Update 6.2.B.0.211 is only available in certain regions so far, so the rep may have simply been pointing to a date when it will become available in more regions around the world. Update 6.2.B.0.211 fixed a number of connectivity issues that were found after Android 4.1 Jelly Bean hit the Xperia S at the end of May.

When the new update does start rolling out, the story will be the same as with all OTA upgrades:

“As always when rolling out new software versions. It might not reach your device upon roll out start since it rolls out gradually. I suggest that you use PC Companion, Update Service or Bridge for mac to check for software updates from time to time,” Sony stated when referring to the 6.2.B.0.211 fix.

Let us know if you’ve experienced any issues with your Xperia S, even if you’ve already downloaded 6.2.B.0.211, in the comments section below.

The current application obviously aims to further improve that

19 Jun

Microsoft is tweaking a number of its built-in apps for Windows 8.1, but the Music app appears to include the most changes. A freshly redesigned copy of the Windows Store in leaked builds of Windows 8.1 has revealed the new user interface for Xbox Music. Microsoft previously revealed to The Verge that the focus with the new Music app is on playing music over surfacing new content, noting you can now play music in two clicks instead of six.

Screenshots of the new Xbox Music update show a two panel interface that appears to improve discoverability of music and the ability to quickly access a set of tracks. Xbox Music originally launched in October for Windows 8 and Windows RT devices, with the ability to access music from an Xbox 360 too. While the service includes the latest songs and albums, the interface has often been slow and clunky on Windows 8 and Windows RT devices. The new app clearly aims to improve that, with a “simplified design and layout” according to Microsoft. There is also a persistent in-app search box with improved search results, support for music files on SD cards, and Play To support for tracks outside of the Xbox Music catalogue.

Microsoft is also adding several new built-in apps, including Calculator, Sound Recorder, Scan, Reading List, Movie Moments, and Help & Tips. Movie Moments allows you to edit videos from a touch-friendly interface, and Reading List acts as a super clipboard to store URLs and other snippets from apps. The Sound Recorder, Calculator, and Scan apps are all simplified touch-friendly versions to replace the legacy desktop variants. Help & Tips will include a number of tutorials for new Windows 8.1 users that should ease some of the confusion associated with the new interface.

Alongside the Xbox Music redesign and additional apps, the new Windows Store interface has also been spotted in Windows 8.1 builds. Microsoft appears to be changing the layout of its Store, alongside improvements to the top paid and top free sections, new releases, and a “picks for you” section that suggests apps based on an existing library. Similar apps are also suggested within app pages, with a general focus on app discoverability. The new Windows Store also works well in portrait mode, designed for upcoming 7- and 8-inch tablets. All the improvements will be made available in Windows 8.1, an update that Microsoft will release as a preview version on June 26th alongside its Build developer conference.

AMD breaks from Windows exclusivity, adopts Android and Chrome OS

7 Jun

After years of Windows OS exclusivity, Advanced Micro Devices is opening the door to design chips to run Google’s Android and Chrome OS in PCs and tablets.

AMD is expanding OS options as it designs chips based on x86 and ARM architecture, which run multiple OSes, said Lisa Su, senior vice president and general manager of global business units at AMD, in an interview at the Computex trade show in Taiwan.

AMD is also expanding its custom-chip business, and Android and Chrome OS offer flexibility for third-party chip design and integration, Su said.

“We are very committed to Windows 8; we think it’s a great operating system, but we also see a market for Android and Chrome developing as well,” Su said.

AMD previously said it had no interest in Android and that its chips would be exclusively tuned for Microsoft’s Windows 8. But now the company will adapt its chips for companies that want to build laptops or tablets with Android or Chrome.

“I think Android and Chrome tend to be in the entry form factors—the tablets, the low-end clamshells,” Su said.

Su did not comment on when AMD-based Android tablets would reach the market. But the company is working with developers on Android applications for AMD chips.

Independent efforts are already under way to bring Android support to AMD-based tablets and PCs. AMD also offers the BlueStacks emulator to run Android apps on Windows PCs. ARM, Intel and MIPS chips are already compatible with Android, though most of the native Android code is written for ARM.

Adoption of Windows 8 on tablets has been weak, and Android support could open up a larger market for AMD. AMD’s previous Z-01 and Z-60 tablet chips were used in just a handful of Windows tablets, none of which sold well.

AMD hopes to get a fresh start in tablets with the latest chips in the product line code-named Temash, the A4 and A6, which were announced last month. The chips offer power consumption as low as 3.9 watts and battery life up to eight hours while Web browsing. Devices with Temash are expected in the second half of the year, and a prototype tablet from Quanta was shown by AMD at the Computex press conference.

The Temash chips are 64-bit and have been designed for Windows 8. The chips are designed to provide PC-like performance on tablets, which is a contrast to Intel’s upcoming Bay Trail tablet chips, which focus more on battery life. Temash includes support for DirectX 11, which improves gaming on Windows.

The Bay Trail chips will go into Windows 8.1 and Android tablets starting at under $199.

Microsoft Sculpt Comfort keyboard made

7 Jun

It’s been over a year since I first installed a build of Windows 8 on a test machine. Since then, I’ve run it on a wide selection of hardware, including slate-format tablets, hybrid touch/pen/keyboard tablet PCs, traditional laptops and multi-monitor desktop PCs: hardware that mixes the old (with Vista and XP-era machines) and the new (a recently upgraded Core i5 desktop system). It’s been on Intel processors, on AMD, on physical, on virtual: on pretty much every machine I could find in the office.

Testing and benchmarking is all very well, but you only really get to know an OS by living with it, using it every day to do everyday tasks on an everyday PC. For me, that means the good old-fashioned desktop PC.

Most of my time is spent in front of a multi-monitor desktop machine, exactly the configuration that many people have worried about in comments on various Windows 8 posts. While desktop users may soon be in the minority, there are still plenty of us around. I rely on tools like Office and Adobe Lightroom and they rely on the desktop, and that’s unlikely to change until the tools change. So for my desktop PC, there’s very little change between 7 and 8 in the way I work.

One thing I’ve noticed in the months since Windows 8 reached RTM: its evolution hasn’t ended, and it’s still getting better. When I upgraded my desktop from Windows 7 to 8 just after RTM it was to all intents and purposes just a slightly faster Windows 7 machine with a new UI. But with the recent 160MB post-RTM update, and with the arrival of some new device drivers and a couple of new pieces of hardware, it’s becoming something a little different.

The one big change, of course, is the Start Screen. As changes go, it’s a big one, but it’s not the showstopper that some have made it out to be. I’ve ended up treating it like a full screen version of the old Start menu, and use it in much the same way. Like the Start menu, the Start Screen ends up full of apps I’ve installed, and I regularly tidy it up. There was a little work in getting it the way I wanted to start with, but again, starting with a fresh install of Windows XP or 7 I’d be doing much the same thing: grouping apps and removing references to features or apps I don’t plan to use.

Launching apps is easy enough. Tap the Windows key and start typing, and when the word wheel filter shows your app, just point and click, or hit return. You’re quickly back on the desktop and in the application you want to use. That’s all there is to it, and if you used Vista or 7’s search box as your main method of navigation you’ll find the Start Screen a little more efficient, as you don’t have to click in the search box to start finding apps or files.

The arrival of the new Microsoft Sculpt Comfort keyboard made some things even easier. While Windows 8’s Charms are just a mouse gesture away, having them on the keyboard is much easier. Four separate Charm keys mean you can get to Search, Share, Devices and Settings without having to move your fingers away from the keyboard. If you’re using a Windows 8 Store-style app, the keyboard also comes with four keys that replicate the main Windows 8 touch gestures. One handles a left swipe application switch, while another toggles the Snap view for the running application. The other two launch the Start Screen task switcher and open the app bar.

Google Glass: It’s not an enterprise product, get over it

26 May

As the niche, still-developing wearable computing market continues to spin, it will be some time yet before consumers embrace this new branch of technology.

When Google Glass was announced in 2012, it was shown off in all kinds of leisure activities — from photo taking to video filming — and a range of personal activities that would bridge the gap between handheld devices and the real world. There was even the occasional skydive, suggesting anyone with Glass can go anywhere and do anything.

But Glass was not pitched to the enterprise or corporate world, and has yet to find its niche within the walls of business. And it likely won’t — at least for the near future.

Google Glass is far from a refined product and has a way to go before it will have any meaningful impact in the consumer space. But while Google continues its public, paid-for and lengthy beta-testing process, it only has the consumer in mind.

It’s an experiment that, like other services it has built on over time, could eventually be developed further to include business-minded types. But even then it would have to be, particularly at this early stage in development, a bring-your-own-device (BYOD) requirement rather than an IT budget spending all-out endeavor.

Yes, you can search things on the go. With developer support you could argue that it could boost e-commerce on the shop floor. Maybe it could act as a second or even third display for number crunching. All of these suggestions bandied around ZDNet’s New York bureau this afternoon seem rather weak, do they not?

There’s no doubt that Google Glass could be big business for the search giant, but in turn how it reflects on other business remains at best minimal, and unlikely to dent any significant usage in the enterprise.

By apparently creating more problems than it actually solves, the primitive device has attracted a significant amount of controversy and concern over whether Glass could breach privacy, record people, invade people’s personal space, and all the encompassing features that define a “glasshole.”

Developers: App makers hold the key to Glass’ success. Its current bare-bones approach to search and access to its own product range circle isn’t enough to bring in the business crowd — even if you’re a Google Apps company. Until there’s a hearty ecosystem that developers can plug into, there’s little point in even taking on the platform. The ecosystem can only thrive with users. It’s a one-way street, which becomes a symbiotic relationship.

There is a case that if enterprises fling open the doors to Glass and develop their own internal apps for the device, there’s a case in point. But again, there are very few reasons why at this early stage in development

Android: Glass supports Android, also iPhones. Android is creeping into cubicles across the land, but it’s still void of any measurable enterprise-grade security. Some Android phones have been certified with FIPS 140-2 government-grade security thanks to the mobile manufacturers themselves — such as HTC — but that’s no thanks to Google. Glass will have to reconsider its position on taking security less than seriously if it wants to make any meaningful impact in business, thanks to the Android factor.

(There is an argument that iPhones and iPads were not pitched to the enterprise either, but the business customer chose Apple after it began to bolster its security and functionality.)

Privacy: Government is a crucial enterprise player, at least in terms of security above other major business sectors, even finance. With varying levels of security clearance in the same office — some with higher access than others — the last thing you’re going to want is documents floating around on camera that may or may not be currently filming away. Unless Google tackles this very basic privacy problem, Glass will remain a problem child in the workplace.

The ‘stand out’ (or lack of): Normally with any enterprise-based product, feature, or service, there’s a pitch. Google isn’t marketing Glass as an enterprise product, nor should it. There’s very little in terms of value that the next-generation specs can actually offer ordinary workers. It doesn’t boost productivity. It’s a gimmick. Consumers love gimmicks because they can choose to use a product or feature when they like.

For the enterprise, it’s a core part of the workflow. Glass doesn’t have one single feature or productivity factor that stands out and screams, “use me.” If there were, we’d be harping on about it. For now, or at least until Google Glass 2.0 begins to embrace the worker population, there’s little to offer in terms of ‘stand out’ quality.

Cost: Considering all of the above, the cost of the device alone is an enterprise turn-off, but so are the weak reasons that could be thrown in Glass’ direction to justify even a small rollout across a corporate base. There may be some industries that benefit from Glass, but if those benefits are limited to having something within your immediate eyesight rather than fetching your smartphone out of your pocket, frankly you need to get less picky, more productive employees.

You don’t need a million reasons to justify Glass. You just need one, and there don’t appear to be any.
