How data has fueled the evolution of enterprise software


This is a guest blogpost by Dave Elkington, CEO, 

The phrase "predictive analytics" has become a trendy buzzword that shows up in almost every investor pitch to elicit a premium valuation. Even legacy software giants, like IBM, Microsoft and HP, are investing or reinvesting in the space and jumping on the bandwagon.

Interestingly, predictive analytics is not as new as these companies would lead you to believe. In fact, it's just a rebrand of machine learning, a branch of computer science that has been around for more than 50 years. What's even more important to understand is that machine-learning algorithms are not the driving force behind the big data revolution.

The real hero of this story is data. The explosion of data began with the advent of the mainframe in the 1950s and has grown in significance thanks to today's massive cloud-computing platforms coupled with big data storage systems.

Machine learning and mainframes

To understand the convergence that led to this revolution, it's important to note two major developments that occurred in the 1950s: Machine learning emerged from attempts to make computers act like humans, and companies began to use mainframes to collect and analyse data. 

In the 1950s, Arthur Samuel (at IBM) developed the first machine-learning game system, which simulated a person playing checkers. As early as 1959, advanced machine-learning algorithms were being applied to real-world problems; in one early example, an artificial neural network was used to remove echo from phone conversations.

In the mid-'50s, the IBM mainframe was born. The mainframe pulled data out of filing cabinets, creating a central repository. During this phase of enterprise software, the amount of data remained relatively small and access to this data was extremely limited.

Client-server platforms

Fast forward to the 1980s, when client-server platforms emerged. These platforms, developed by organisations such as Sun Microsystems and HP, decentralised business applications and distributed them within each enterprise that used the system. The amount of data exploded because it could now be collected from multiple sources throughout the company.

While client servers improved the aggregation of data, they still faced significant limitations. Access to the data remained constrained within a company's networks. People pushed the boundaries of these limits through extensions of internal networks using secure value-added networks. Using general-purpose electronic data interchanges (EDI), these networks introduced inter-enterprise communications and data sharing.

EDI was also the beginning of the important parallel process of normalising data sets and classifying data communications between enterprises. The challenge was that each company had to build a custom value-added network with each major customer, partner or vendor.

Cloud computing and SaaS

Hosted solutions acted as a precursor to cloud computing: they made servers available at colocation sites and provided open access through the Internet.

In the 2000s, cloud computing brought yet another phase of application delivery and of access to the data stored within applications, as companies like Omniture and Workday began to provide software-as-a-service (SaaS). The cloud completely centralised data and offered ubiquitous access.

Cloud computing also made multi-tenancy possible. One way to understand multi-tenancy is to think of it as renting an apartment. Renting an apartment is cheaper than renting a house, which is akin to the client-server model, because you are sharing core infrastructure, like plumbing and electrical wiring, with other tenants.

The advantage of multi-tenancy in enterprise software is that it not only centralises an individual company's data but also consolidates data across multiple companies. This creates the need for massive databases capable of storing data from thousands and even millions of companies - hence the name, big data.
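The apartment analogy maps naturally onto how multi-tenant databases are commonly structured. Below is a minimal sketch (the table, tenant names and data are all hypothetical) of the shared-schema approach, where one database holds every customer's rows and a `tenant_id` column keeps them logically separate:

```python
import sqlite3

# One shared database ("the apartment building") holds every tenant's rows;
# a tenant_id column keeps each company's data logically separate.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contacts (
        tenant_id TEXT NOT NULL,   -- which company owns this row
        name      TEXT NOT NULL,
        email     TEXT NOT NULL
    )
""")
rows = [
    ("acme",   "Ada",  "ada@acme.example"),
    ("acme",   "Bob",  "bob@acme.example"),
    ("globex", "Cara", "cara@globex.example"),
]
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", rows)

# Every query is scoped to a tenant, so companies share infrastructure
# without ever seeing each other's data.
def contacts_for(tenant):
    cur = conn.execute(
        "SELECT name FROM contacts WHERE tenant_id = ? ORDER BY name",
        (tenant,))
    return [r[0] for r in cur]

print(contacts_for("acme"))    # ['Ada', 'Bob']
print(contacts_for("globex"))  # ['Cara']
```

Real multi-tenant platforms add row-level security, per-tenant encryption and quota enforcement on top of this basic pattern, but the core idea - shared infrastructure, logically partitioned data - is the same.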

Big data

The big data phase brought MapReduce and document data stores to enterprise cloud-computing vendors. Companies like Cloudera, MongoDB, Couchbase, Hortonworks and MapR commoditised databases that could accommodate billions of records with complex, non-standard relationships.
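For readers unfamiliar with MapReduce, the programming model behind many of these systems, here is a minimal single-process sketch of its map, shuffle and reduce phases, using word counting as the classic example (in a real cluster, each phase would run in parallel across many machines):

```python
from collections import defaultdict
from itertools import chain

# Map: emit (key, value) pairs from each input record.
def map_phase(record):
    for word in record.split():
        yield (word.lower(), 1)

# Shuffle: group intermediate values by key.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce: fold each key's values into a single result.
def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

records = ["big data big insight", "big opportunity"]
pairs = chain.from_iterable(map_phase(r) for r in records)
counts = reduce_phase(shuffle(pairs))
print(counts["big"])  # 3
```

The appeal of the model is that map and reduce are pure functions over key-value pairs, so the framework can scale them across thousands of machines without the programmer managing that distribution.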

This new method of storing massive amounts of data was just what machine learning needed. Enterprise software vendors have employed data scientists to figure out what to do with all of the data they are collecting.

This development has been coined the "predictive analytics" phase of enterprise software. It materialised because cloud computing enabled mass consolidation of, and universal access to, enterprise applications, and because big-database vendors made it possible to store massive amounts of data in a centralised way.

What's next?

Multi-tenancy not only brings together data across multiple companies, but it also spans domains and industries. This opens exciting new opportunities and will usher in the next phase of enterprise software.

A large number of applications are already consolidating data within specific domains, such as healthcare, dating, travel, consumer goods and anything else you can imagine. However, this data only represents a small slice of life and fails to capture the full picture.

You can't accurately predict who somebody will want to date if all you have is car-buying data, and you can't predict how many pizzas they'll eat this year - and when they'll eat them - if all you have is bowling shoe data, although it would be pretty cool if you could. You need to collect data across domains and put it into the appropriate context.

That's why predictive platforms represent the next generation of enterprise software. Predictive platforms will assemble data from CRM (customer relationship management), ERP (enterprise resource planning), the Internet of Things and other domains and systems to make real-time predictions based on a complete view of the real world.

The futuristic world depicted in Sci-Fi movies isn't as far off as some people think.

Dave Elkington is CEO and founder of, a cloud-based sales acceleration technology company. 

Machine learning is the new SQL


This is a guest blogpost by Oleg Rogynskyy, VP of Marketing & Growth,

Before the emergence of today's massive data sets, organisations primarily stored their data in relational databases produced by the likes of Oracle, Teradata, IBM, etc. Following its emergence in the latter half of the 1980s, SQL quickly became the de facto standard for working with those databases. While there are differences between various vendor flavors of SQL, the language itself follows the same general pattern, allowing business analysts without a developer background to quickly pick it up and leverage the insights from the data stored in their relational databases. Today, I think machine learning is democratising the big data era of Hadoop and Spark in much the same way that SQL did for relational databases.

The problem that SQL solved for relational databases was accessibility. Before SQL, business analysts lacking an engineering background could not work with their data directly. Analysts depended on database admins, much as developers and business analysts depend on data scientists today. That dependency creates a data "traffic jam" in which developers and business analysts cannot work with their data without direct access to a data scientist. The promise of machine learning is that it allows business analysts and developers to run analysis and discover insights on their own.

SQL allowed lay business analysts to quickly comb through large data sets for answers via queries. However, the answer had to be an exact match for the query, requiring that both your data and your query be organised perfectly. Machine learning can comb through even larger data sets and distil them into insights without that requirement. The principle is the same - both SQL and machine learning reduce data sets into answers - but SQL is more "I know what I'm looking for and here is how I find it," while machine learning is more "hey, show me what's interesting in this data and I'll decide what's important." In other words, SQL requires business analysts to know exactly what they're looking for, while machine learning does not. With machine learning, an analyst can use all their data to diagnose the common themes, predict what will happen, and (eventually) prescribe the optimal course of action. I actually believe that, for business analysts, SQL will become as obsolete as the typewriter as machine learning takes its place.
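The contrast can be made concrete with a toy example. The first snippet below is the SQL-style question, where the analyst supplies the threshold; the second uses a tiny hand-rolled one-dimensional k-means (a stand-in for a real machine-learning library) to discover the same split without being told where it is. The data and threshold are, of course, invented:

```python
# "I know what I'm looking for": an exact, SQL-style filter.
orders = [12, 14, 15, 90, 95, 100, 13, 97]
big_orders = [v for v in orders if v > 50]   # analyst specifies the cutoff
print(big_orders)  # [90, 95, 100, 97]

# "Show me what's interesting": a tiny 1-D k-means that discovers the
# two natural groups in the same data without a pre-specified split.
def kmeans_1d(values, k=2, iters=20):
    # Seed centroids with evenly spaced sorted values.
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

low, high = sorted(kmeans_1d(orders), key=lambda c: sum(c) / len(c))
print(sorted(low), sorted(high))  # [12, 13, 14, 15] [90, 95, 97, 100]
```

Both paths reduce the same data set to an answer; the difference is whether the analyst has to know the shape of the answer in advance.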

Today's business analysts and developers are more than capable of building and using applications that sit on top of their data by powering them with machine learning. Importing machine-learning algorithms into applications is a seamless process, but the organisational will has to be there. Too many organisations cling to the antiquated notion that they can't do machine learning, either because it is too compute-intensive or because it requires in-house data science expertise that they can't afford. No one expects the vast majority of organisations to develop an artificial intelligence programme on the scale of Facebook or Google, but they don't need to. Many machine learning platforms are open source and free; all it takes is someone smart and curious enough to begin a pilot test!

Thinking about SaaS risks - data security


This is a guest blog by Larry Augustin, CEO, SugarCRM

The recent cyber-attack on broadband company TalkTalk proved that while not securing your own data can be embarrassing, failing to secure the data of your customers is far more serious.

Headlines about cyber security, database breaches and hacking are becoming commonplace. In the past year, the PlayStation Network and Microsoft's Xbox Live were hacked and taken offline for long periods over Christmas. More recently, British Gas had the email addresses and passwords of 2,200 customers leaked online. Then there were dozens of attacks targeting high-profile companies and banks in North America, including Sony having its confidential data released and telecommunications giant AT&T falling victim to an attack in which more than 68,000 accounts were accessed without authorisation. The latter was fined $25 million for data security and privacy violations.

Even more painful than the costly implications are the remediation and communication efforts with affected customers, and lost business that results when breaches are disclosed.

However, there are ways to protect data from hackers effectively. Deploying your customer relationship management technology through a Software as a Service model means being reinforced by multiple layers of protection and security. It's important to ensure that it's hosted in Tier 1 data centre facilities, wherever it is in the world. Applications hosted in such data centres are protected not just by powerful physical security mechanisms, such as 24/7 secured access with motion sensors, video surveillance and security breach alarms, but also by security and infrastructure components including firewalls, robust encryption and sophisticated user authentication layers.

Data is a critical component of daily business and it's essential to ensure the privacy and protection of data regardless of where it resides. We make a point of taking a holistic, layered and systematic approach to safeguarding that data, constantly evaluating, evolving and improving the privacy and security measures we have in place. We also offer the option to deploy the technology on premise, as well as in hosted and hybrid configurations, flexing to meet the broadest range of security and regulatory requirements.

Gathering and storing good-quality data is now a business-critical activity. Whether that data is being used to highlight customer trends or to tell you how valuable a customer is to your business, the benefits are clear. As data grows in importance, IT professionals are under greater pressure than ever to spare a business the embarrassment of data breaches by ensuring the best IT practices and systems are in place to keep customer information out of reach.

Big data is sword of truth for disruptor brands


This is a guest blogpost by Bill Wilkins, CIO/CTO, First Utility

Despite the energy industry being data-rich, the quality of its data has always been extremely poor and its systems archaic. Customers have been left in the dark, with little engagement and no real choices.

For challenger companies with change and disruption in their DNA, effective use of data is what sets them apart from conservative incumbents.

Energy makes up one of our biggest household bills, yet the fact that 40% of billpayers have never switched supplier speaks volumes about the magnitude of customer disengagement. What efforts have the incumbent energy providers made to mine their available data to understand consumers and serve them better? Evidently, not very much at all.   

Big data, used smartly, can help energy brands transform the marketplace. It's what drives real change. When disenfranchised customers are informed, they become more empowered, confident consumers.

The tricky bit is to successfully target, manage and distribute the most valuable data. You can't do everything. Disruptor brands must pinpoint the areas that differentiate their offering and focus all efforts and resources on implementing their competitive edge.

Like most challengers, First Utility is focused on being a fully data-enabled business. We look to the heroes of Silicon Valley for learnings. We see big potential in applying the principles of Spotify, Netflix and Amazon to strategically and creatively engage energy consumers. It's ambitious, but then so are we.

It's ambitious because the energy market is multi-faceted, made even more difficult by the fact that these complexities are unique to the UK. There's no precedent for getting it right. In fact, it's often about mixing the company's disruptor gene with specialist technical skills.

Data is at the heart of our decision-making process to help us optimise our offering. As the CIO, I work with my team to set the conditions we believe will create an organic and incrementally useful approach to gathering, understanding and acting on this ever-increasing pool of information.

But data should not be owned by just one department or the data chief. It must be intrinsically woven into multiple business functions and used by general management to inform day-to-day decisions. It must be approached with a focused and united vision, implemented by individual operational teams that affect the total customer journey.

Challenger companies can take advantage of their agility to disrupt the landscape for the better. They carry less historical operational baggage than the larger incumbent rivals. They can do things differently and make faster decisions. With data clearly the present and future of business, this is the ideal time for disruptors to design and shape new models. From a relatively blank piece of paper we can connect with consumers in a way that has not been seen in the energy sector before.

Challengers have the opportunity to blindside the competition with the power of information. The truth is, that even the big incumbents could benefit from challenger thinking. It's unfortunate for the market, and specifically for consumers, that there is currently little hint towards a wave of change.

At my company, big data is at the heart of our competitive edge. Our mission is to engage consumers more in their energy usage, so we consistently develop innovative technologies that provide the insights and tools they need to take better control. Our My Energy platform applies smart and highly complex customer usage information, fuelled by big data. Customers can review real-time energy use over time, see their predicted future usage based on current behaviours, and even contextualise their spend by comparing it to similar households in their neighbourhood.
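As a simple illustration of the kind of arithmetic behind such comparisons (the readings and household names below are invented, and a real platform would use far richer models), a predicted-usage figure and a neighbourhood benchmark might be sketched like this:

```python
from statistics import mean

# Hypothetical monthly kWh readings for one customer and for
# similar-sized households in the same neighbourhood.
customer_kwh = [310, 295, 320, 305]
neighbours_kwh = {
    "house_a": [280, 275, 290, 285],
    "house_b": [300, 310, 305, 295],
}

# Naive prediction: next month looks like the recent average.
predicted_next = mean(customer_kwh)

# Contextualise spend against similar households.
neighbour_avg = mean(mean(readings) for readings in neighbours_kwh.values())
difference_pct = 100 * (predicted_next - neighbour_avg) / neighbour_avg

print(f"Predicted usage: {predicted_next:.0f} kWh "
      f"({difference_pct:+.1f}% vs similar homes)")
```

Even this crude average-based comparison shows why the data matters: it turns a raw meter reading into a statement a customer can act on.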

We also use big data to maintain sharp business performance. Our churn dashboard tells us which customers are migrating, from what tariffs and to which suppliers, and from this we learn and grow. We are already seeing the cold hard benefits of putting data at the heart of our business.

Disruptors recognise that knowledge is power. In the right hands and with clear differentiating focus, big data provides the fuel to get ahead.

Can a maths genius save the world?


This is a guest blog post by Laurie Miles, head of analytics, SAS UK & Ireland


If I were to believe the feedback I get, statisticians are among the most difficult people to work with - and, what's more, the only group that should be allowed to work in data analytics. It sounds harsh, but this perception may also explain why big data projects continually fail to launch successfully in so many businesses.

What businesses actually need is statisticians who are easy to work with, because conversations based primarily on maths and statistics do not solve business problems. Far from it, in fact.

Businesses need to overcome the perception that data science - despite the lexicon - is about feeding data into an engine and analysing the statistics to get answers. Delivering real value requires a logical as well as a creative mind. And the starting point should really be 'what are the business challenges we need answered?' The statistics bit comes later and is just part of the process of getting to your business solution.

Creative genius saving the world?

Problem solving is a cognitive function that relies heavily on the creative side of our brains. Humans are curious beings. It's our nature to want to solve mysteries and understand the world around us. It's a rewarding experience that creates a strong motivation for people to want to do more.

Business leaders can tap into this behaviour by giving employees more interesting problems to solve. Data science provides the opportunity to satisfy someone's curiosity, whether they are a genius or not.

That problem solving doesn't have to be at an individual level either. After all, data science is about teamwork. In another blog, I explored the different roles in building a data science team.

Every data analytics project is unique, so every project will have a unique team set-up. Our education system now provides the opportunity for us to nurture new talents that meet what's required for the business manager, business analyst, data management expert, and statistical modeller. Adding together different geniuses is the key to business value.

Geniuses from across the educational spectrum

Young people now have so much more choice over what they study at school and university. 'Data scientist' is seen as a technical role, but the technical side is only a small part of the job: data scientists are also business consultants and creatives. This is why we need to recruit talent from all disciplines, from arts and humanities to STEM subjects.

Businesses need to be open-minded in their approach to data science. Hiring only statisticians is probably the worst thing a business can do.

Changing your approach won't happen overnight, but when building a data science team, first look inside your organisation to assess what skills and interests are already there. Once you have identified your candidates, provide them with training courses and help them carve out a clearly defined learning pathway to develop their role within the business. Then explore the wider circles in universities and other industries to hire new talents that supplement your existing areas of expertise.

For more insight into what makes a great data scientist, check out what we at SAS found when we asked those in the industry.

Bill McDermott: best wishes


At this year's Sapphire, I was part of a group of journalists from outside the US who interviewed Bill McDermott, chief executive officer of SAP.

At the end of the interview he gave each of us a copy of his autobiography to date, signing each in front of us. Winners Dream is the title of the book, co-written with Joanne Gordon, a former Forbes journalist.

I thought at the time it was a gracious thing to do.

I was shocked to learn of Mr McDermott's injury. His own account tells one much about the calibre of the man. I wish him well.

Smart cities and the IoT - not just a load of rubbish


This is a guest post by David Socha, utilities practice leader, Teradata

What really makes a city smart? Because from my perspective, Smart Parking, Smart Homes, Smart Lighting and the like are really just the next steps on a journey that began by replacing the cry of "gardyloo"* with city plumbing.

In fact, many of the things happening in today's Smart Cities could more honestly be labelled as "progress".

What will really make a city become smart is the integration and analysis of data from these otherwise disparate initiatives and all the others like them.  Once that happens, a new intelligence will enable the city to deliver new services to its citizens - from genuinely integrated public, private and personal transport systems to energy profiles that incorporate our homes, workplaces, vehicles and more. 

But ... how does that work, exactly?  Surely it consists of more than attaching sensors to everything?  Yes, of course it does. 

And to understand how all this integration and analytics will bring Smart City citizens some actual benefits, you will have to let me get technical - just briefly, I promise. We need to examine three types of data that we're going to encounter in our Smart City. Here they are:

1. Traditional, unexciting, structured data from enterprise systems. Information like weather forecasts from the Meteorological Office, census analyses from Government and, say, public transport performance statistics.

2. Slightly cooler "big data" from all sorts of social media (and other sources too). This can be valuable for sentiment analysis, for personalising services and offers, and for all manner of business-to-customer - or perhaps city-to-customer - relationships.

3. New and exciting Machine-to-Machine (M2M) data. Now we're talking! This is the stuff the Internet of Things (IoT) is made of, isn't it? This is the future! Well, yes and no.

We can lift the lid on the oft-used example of the smart waste bin to guide us through how we journey from sensors to real benefits for citizens.  The first nugget you'll hear in a typical Case of The Smart Waste Bin story is pretty simple.  If a bin has a sensor that knows it's nearly full, it can call and request someone comes to empty it.  Is that "Smart"?  As I said before, yes and no.  Rubbish might be collected more often, but costs will rocket.  Lorries could be going back to the same street to empty smart bins that transmit their "I'm full!" message just a few hours apart.  Not so smart now, is it? 

Of course we can fix this.  Sensors close to one another could communicate and check if any other bins close-by are nearly full too.  Companies like Smartbin offer both sensors and a route optimisation solution for the teams that have to collect the rubbish.  So here we are, already integrating M2M data and boring old structured data.  Now our citizens will enjoy cleaner streets without having to pay extra for the privilege.  This is merely the beginning.  Additional analytics on the data we have in this example alone, can lead to better planning decisions. For example on where more or fewer bins are required, or how staff and vehicles can be more efficiently deployed.
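A sketch of that coordination logic makes the idea concrete. In this toy version (the bin identifiers, streets and thresholds are all invented, and real products like Smartbin's solve a much harder routing problem), a lorry is dispatched to a street only when some bin there is genuinely full, and while there it also empties any neighbouring bin that is nearly full:

```python
# Hypothetical bins: (bin_id, street, fill_level from 0.0 to 1.0).
bins = [
    ("b1", "High St", 0.92),
    ("b2", "High St", 0.75),
    ("b3", "High St", 0.20),
    ("b4", "Park Rd", 0.95),
    ("b5", "Park Rd", 0.10),
]

FULL = 0.90         # a bin this full shouts "I'm full!"
NEARLY_FULL = 0.70  # worth emptying while the lorry is already there

def collection_plan(bins):
    """Visit a street only if some bin there is full, and while there,
    empty every bin that is at least nearly full - so the lorry doesn't
    have to come back a few hours later for a neighbour."""
    streets = {street for _, street, fill in bins if fill >= FULL}
    return sorted(bin_id for bin_id, street, fill in bins
                  if street in streets and fill >= NEARLY_FULL)

print(collection_plan(bins))  # ['b1', 'b2', 'b4']
```

Note that b3 and b5 are left alone: integrating the sensors' readings, rather than reacting to each one individually, is what keeps the cost from rocketing.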

So let's mix in more data and see what else we can do.  What if we also added Wi-Fi to the bins, as is happening in New York?  Suddenly, citizens will be connecting directly with a "smart solar-powered, connected technology platform that is literally sitting in the streets of New York".  This new service not only delivers a 'connected city' for its citizens - it also offers a chance to learn more about the people our Smart City is serving.  By applying some sentiment analysis, we can even work out just what they think about the new Smart Bin services we're providing. 

We've come a long way from that initial installation of a sensor that occasionally shouts out "I'm nearly full".  And that's the point.  This is just one example of how the Internet of Things will actually deliver benefits to people living in Smart Cities. 

It's not just about sensors.  It's not just about M2M.  Just as important is the integration of many different types of data - the cool stuff and the boring. It's about analysing the data in its entirety to reveal the relationships, dependencies and connections. And it's about taking informed, positive actions based on the new information available. Now that's what I call Smart.

*An Edinburgh phrase, first recorded in 1662.  You can take the boy out of Edinburgh...

Obama supercharges data science


This is a guest blogpost by Mike Weston, CEO of data science consultancy Profusion, in which he discusses supercomputing and its implications for data science.

We are all creating data. A lot of data. The figures involved are mind-blowing. According to Information Service ACI, five exabytes of content were created between what it calls "the birth of the world" and 2003. In 2013, five exabytes of content were created each day. Just so you know, an exabyte is a quintillion bytes. Every minute (on average) we send around 204 million emails, make four million Google searches, and send 277,000 tweets.

With each individual creating and receiving more and more data, computers are in an arms race to keep up. Earlier this month another shot was fired: President Obama issued an executive order designed to ensure that the US leads the field in supercomputers, by building an exascale computer capable of undertaking one quintillion calculations per second. The computer will be used for, among other things, climate science, medicine and aerospace. However, from my perspective, the most exciting proposition is the application of exascale computers to data science.

The first noticeable advantage in having increased computing power is a reduction in the time it will take to carry out data science projects. Reducing the time it takes to receive results will allow for more real-time decision making. This will have a significant impact on industries such as retail, where a shop could automatically alter its pricing strategy instantaneously based on weather data, customer demographics and footfall.
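As a toy sketch of such a rule (the product, signals and multipliers are invented; a real system would learn them from historical data), a repricing function might combine weather and footfall like this, working in pence to avoid floating-point rounding:

```python
# A toy real-time pricing rule: nudge the price of umbrellas up when it
# rains and the shop is busy, and down when demand signals are weak.
BASE_PRICE_PENCE = 1000  # hypothetical base price: 10 pounds

def reprice(base, raining, footfall_per_hour):
    price = base
    if raining:
        price = price * 115 // 100   # weather signal: demand likely up (+15%)
    if footfall_per_hour > 200:
        price = price * 105 // 100   # busy shop: less price-sensitive (+5%)
    elif footfall_per_hour < 50:
        price = price * 90 // 100    # quiet shop: discount to move stock (-10%)
    return price

print(reprice(BASE_PRICE_PENCE, raining=True, footfall_per_hour=250))   # 1207
print(reprice(BASE_PRICE_PENCE, raining=False, footfall_per_hour=30))   # 900
```

The point of faster computing is that a rule like this stops being a nightly batch job and becomes something re-evaluated on every tick of weather and footfall data.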

Next, the processes involved in data science will become ultra-efficient. There will be decreased processing time and less time spent accumulating and preparing data. This will open up data science to data that previously wasn't accessible - for instance, assisting in the mapping of the human brain and combining that information with data on a participant's emotions and lifestyle to obtain a picture of how the brain is affected by external factors.

The advanced computing power will also lead to more accuracy and the ability to create more detailed and advanced models. This will enable data science to answer more complicated questions with a larger range of structured, unstructured, historical and real-time datasets. Machine learning will become much more powerful. More computing power will allow more interactions to be presented to the machine to create artificial intelligence. Eventually, the majority of computations will become automated, with data scientists managing the AI as opposed to carrying out the day to day processes.

These new algorithms could be applied to everyday activities, such as tracking the real-time weather conditions affecting aircraft, along with their locations and speed, the identities of all passengers on board, and overall customer satisfaction as expressed through individuals' social media accounts. All of this information could be combined into one user-friendly interface for airline staff to monitor and respond to.

There will be additional benefits to product design, especially in the field of aeronautics. Proposed designs could be simulated without the need for wind tunnels and other expensive, not readily available tools. Potentially one of the most exciting advances will be the development of personalised medicines. Data science will be able to look at an individual's genome and lifestyle and alter drug properties accordingly to make them more effective.

The analysis of big data has already had a revolutionary impact on the commercial sector and on scientific discovery -- from assisting in relief efforts following natural disasters to tailoring the consumer journey on eBay. In the future we can expect to see more advanced weather forecasts, natural disaster prediction services and more accurate cancer diagnosis. With data science also unlocking key Islamic State military strategies, it's going to play a bigger role within US national security.

In the short term, the biggest impact for consumers will be in relation to the 'Internet of Things'. With more real-time data readily available, the productivity of autonomous vehicles would greatly improve. Imagine a scenario where every vehicle within a city could be mapped onto a central computer, with all those vehicles able to tell each other their locations, speeds and proposed routes. Driving would certainly be better informed and safer than it is currently.

Data science is going to undergo a rapid transformation into a faster, more accurate and more efficient process. The range of tasks undertaken by machines will increase, spurred along by advances in machine learning and faster computer speeds. What we can calculate in a week today will, in the future, take minutes. The scope of data we can deal with will also increase, and a greater variety of data will lead to more insights being found in seemingly disparate data sets.

This will lead to an exciting future in which we are better informed and, by virtue of that, able to make more educated decisions. A master painter is only as good as his brush, and the advent of better computing will create better data scientists who will produce better data insights. More powerful computers will lead to a more empowered society.

3 ways data lakes are transforming analytics


This is a guest blogpost by Suresh Sathyamurthy, senior director, emerging technologies, EMC

Data lakes have arrived, greeted by the tech world with a mix of scepticism and enthusiasm. In the sceptic corner, the data lake is under scrutiny as a "data dump," with all data consolidated in one place. In the enthusiasts' corner, data lakes are heralded as the next big thing for driving unprecedented storage efficiencies in addition to making analytics attainable and usable for every organization.

So who's right?

In a sense, they both are. Data lakes, like any other critical technology deployment, need infrastructure and resources to deliver value. That's nothing new. So a company deploying a data lake without the needed accoutrements is unlikely to realize the promised value.

However, data lakes are changing the face of analytics quickly and irrevocably -- enabling organizations that struggle with "data wrangling" to see and analyze all their data in real time. This results in increased agility and more thoughtful decisions regarding customer acquisition and experience -- and ultimately, increased revenues.

Let's talk about those changes and what they mean for the world today, from IT right on down to the consumer.


Breaking data silos

- Data silos have long been the storage standard -- but they are operationally inefficient and limit the ability to cross-correlate data to drive better insights.

- Cost cutting is also a big driver here. In addition to management complexity, silos require multiple licensing, server and other fees, while the data lake can be powered by a single infrastructure in a cost-efficient way.

- As analytics become progressively faster and more sophisticated, organizations need to evolve in the same way in order to explore all possibilities. Data no longer means one thing; with the full picture of all organizational data, interpretation of analytics can open new doors in ways that weren't previously possible.


Bottom line: by breaking down data silos and embracing the data lake, companies can become more efficient, cost-effective, transparent -- and ultimately smarter and more profitable -- by delivering more personalized customer engagements.


Leveraging real-time analytics (Big Data wrangling)

Here's the thing about data collection and analytics: it keeps getting faster. Requirements like credit card fraud alerts and stock ticker analytics need to be met seconds after the action has taken place. But real-time analytics aren't necessary 100% of the time; some data (such as monthly sales data, quarterly financial data or annual employee performance data) can be stored and analyzed only at specified intervals. Organizations need to be able to build the data lake that offers them the most flexibility for analytics.
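That split -- some records analyzed the moment they arrive, others stored for interval reporting -- can be sketched as a simple event router. The event types, threshold and fraud rule below are purely illustrative assumptions, not any particular vendor's API:

```python
# Sketch: route each incoming event either to an immediate real-time
# check or to a store for interval (batch) analysis.
REAL_TIME = {"card_payment", "stock_tick"}            # act within seconds
BATCH = {"monthly_sales", "quarterly_financials"}     # analyse on a schedule

batch_store = []
alerts = []

def handle(event):
    if event["type"] in REAL_TIME:
        # A deliberately crude fraud rule: flag unusually large payments now.
        if event["type"] == "card_payment" and event["amount"] > 5000:
            alerts.append(event)
    elif event["type"] in BATCH:
        batch_store.append(event)  # kept for later interval analysis

for e in [{"type": "card_payment", "amount": 9000},
          {"type": "card_payment", "amount": 20},
          {"type": "monthly_sales", "amount": 120000}]:
    handle(e)

print(len(alerts), len(batch_store))  # 1 real-time alert, 1 stored event
```

The point is not the rule itself but the routing: a flexible data lake lets both paths draw on the same underlying data.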

Here's what's happening today:

- Companies are generating more data than ever before. This presents the unique problem of equipping themselves to analyze it, instead of just storing it -- and the data lake coupled with the Hadoop platform provides the automation and transparency needed to add value to the data.

- The Internet of Things is both a data-generating beast and a continuous upsell opportunity -- provided that organizations can make compelling offers in real time. Indeed, advertisers are on the bleeding edge of leveraging data lakes for consumer insights, and converting those insights into sales.

- Putting "real-time" in context: data lakes can reduce time-to-value for analytics from months or weeks down to minutes.

Bottom line: Analytics need to move at the speed of data generation to be relevant to the customer and drive results.


The rise of new business models

Data lakes aren't just an in-house tool; they're helping to spawn new business models in the form of Analytics-as-a-Service, which offers self-service analytics by providing access to the data lake.

Analytics-as-a-Service isn't for everyone -- but what are the benefits?

- The cost of analytics plummets due to outsourced infrastructure and automation. This means that companies can try things out and adjust on the fly with regard to customer acquisition and experience, without taking a big hit to the wallet.

- Service providers who store, manage and secure data as part of Analytics-as-a-Service are a helpful avenue for companies looking to outsource.

- Knowledge workers provide different value -- with the manual piece removed or significantly reduced, they can act more strategically on behalf of the business, based on analytics results.

- Analytics-as-a-Service is an effective path to early adoption, and to getting ahead of the competition in sectors such as retail, utilities and sports clubs.

Bottom line: companies don't have to DIY a data lake in order to begin deriving value.

Overall, it's still early days for data lakes, but global adoption is growing. For companies still operating with data silos, perhaps it's time to test the waters of real-time analytics.

App-based approach key to achieving efficient self-service Business Intelligence (BI)


This is a guest blog by Sylvain Pavlowski, senior vice president of European Sales at Information Builders

As workers and business units clamour for more control over data analysis to gain insights at their fingertips, the use of self-service business intelligence (BI) tools is rising to meet demand. But this is not without its challenges, for IT teams in particular.

A gap has opened between business users and IT because, historically, IT departments have created a centralised BI model and taken ownership of BI. They want to maintain control over aspects like performance measures and data definitions, but workers are striving to gain access to the data they want, when they want it, without IT 'hand-holding' them. This is forcing a redistribution of self-service BI, and could inhibit business success if IT departments and business users don't find a happy medium.

Gartner argues that, "Self-service business intelligence and analytics requires a centralised team working in collaboration with a finite number of decentralised teams. IT leaders should create a two-tier organisational model where the business intelligence competency centre collaborates with decentralised teams."

I agree that managing all types of data in one place and one structure is difficult at the best of times. It is all the more difficult these days, with a move towards individualism and personalisation in which users want to help themselves, in real time, to the data they need for their job roles. To manage the push and pull between IT and users, businesses need to look at ways to redefine self-service BI, and it's not just about the IT organisational model. Any approach needs to address more than IT departments' needs.

Implementing an app-based approach to self-service BI can help appease everyone concerned. IT departments can build apps for self-service BI to serve every individual, irrespective of back-end systems and data formats. "Info Apps", for example, is a new term for interactive, purpose-built BI applications designed to make data readily accessible to business users who don't have the skills or technical know-how to satisfy their day-to-day needs with complex reporting and analysis tools. Some studies have even shown that such individuals can make up more than 75% of an organisation's BI user base. An app-based approach is therefore an extremely effective way to give business professionals the exact information they need, through a familiar app paradigm, without requiring any analytical sophistication.

Next-generation BI portals play an important role here too. They can provide enterprises with a way to seamlessly deliver self-service BI apps to business users. By organising and presenting BI apps to users in a way that is simple and intuitive (similar to the Apple App Store), companies can empower workers with faster, easier, more interactive ways to get information.

These next-generation portals also offer high levels of customisation and personalisation so business users have full control over their BI content at all times. They will be empowered with the ability to determine what components they view, how they're arranged, how they're distributed across multiple dashboard pages, and how they interact with them. By offering unparalleled ease and convenience - giving them what they need, when and how they want it - organisations can encourage business users to take advantage of self-service BI in new and exciting ways, whilst having the peace of mind that IT departments are ensuring data quality and integrity in the background. This will all drive higher levels of BI pervasiveness, which in turn, will boost productivity, optimise business performance, and maximise return on investments. 

Understanding data: the difference between leaders and followers


A guest blogpost by Emil Eifrem, CEO of Neo Technology.

Data is vital to running an efficient enterprise. We can all agree on that.

Of course, from there, thoughts and opinions differ widely, and it's no surprise why.

Too much of the data conversation is focused on acquiring and storing information. But the real value of data is derived from collecting customer insights, informing strategic decisions and ultimately taking action in a way that keeps your organisation competitive.

Leaders who conduct this level of analysis distinguish themselves from the rest. Data followers merely collect; data leaders connect.

Yet, with so many ways to analyze data for actionable insights, the challenge is to find the best approach.

The most traditional form of analysis is also the simplest: batch analysis, where raw data is examined for patterns and trends. The results of batch analysis, however, depend heavily on the ingenuity of the user in asking the right questions and spotting the most useful developments.

A more sophisticated approach is relationship analysis. This approach derives insights not from the data points themselves but from a knowledge and understanding of the data's entire structure and its relationships. Relationship analysis is less dependent on an individual user and also doesn't analyse data in a silo.
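As a minimal sketch of the difference, consider a recommendation derived purely from the shape of a friendship network -- counting mutual connections rather than inspecting any individual record. The names and data here are invented for illustration:

```python
# Relationship analysis in miniature: suggest new contacts for a person
# by counting mutual friends, excluding the person and existing friends.
from collections import Counter

def suggest(friends, person, top=1):
    """friends: dict mapping name -> set of names. Returns likely new contacts."""
    candidates = Counter()
    for friend in friends[person]:
        for fof in friends[friend]:
            if fof != person and fof not in friends[person]:
                candidates[fof] += 1  # one more mutual friend in common
    return [name for name, _ in candidates.most_common(top)]

friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
}
print(suggest(friends, "ann"))  # -> ['dan']
```

The insight -- that "dan" is the likeliest new contact for "ann" -- exists in no single data point; it emerges only from the structure of the relationships.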

Real-World Success

Take a look at the biggest and best companies and you'll see a strong investment not only in data analysis but also in analysis of that data's structure and inherent relationships.

For example, Google's PageRank algorithm evaluates the density of links to a given webpage to determine the ranking of search results. Or consider Facebook and LinkedIn: each site evaluates an individual's network to make highly relevant recommendations about other people, companies and jobs.
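The link-analysis idea behind PageRank can be sketched in a few lines -- a simplified power iteration over a toy web graph, not Google's production algorithm:

```python
# Toy PageRank: rank pages purely by link structure, via power iteration.
def pagerank(links, damping=0.85, iters=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:  # dangling page: spread its rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # -> c
```

Page "c" attracts links from "a", "b" and "d", so it ends up with the highest rank -- again, an insight derived from the relationships between data points, not from any page in isolation.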

Together, these three organisations have developed real insight into their customers, markets and future challenges. In turn, they have become leaders in the Internet search, social media and recruitment sectors, respectively.

Every Data Point Matters

When it comes to effective data analysis, your enterprise must be gleaning insight from all of the data at its disposal, not just a portion of it.

With so much data to sift through, it's no surprise that most organisations fall into a similar trap, focusing their data analysis efforts on a small subset of their data instead of looking at the larger whole.

For instance, it's much easier for enterprises to only examine transactional data (the information customers supply when they purchase a product or service). However, this subset of data can only tell you so much.

The vast store of data a typical enterprise doesn't use is known as "dark data." Defined by Gartner as "information assets that organisations collect, process and store during regular business activities, but generally fail to use for other purposes," mining your dark data adds wider context to insights derived from transactional data.

Of course, data only tells part of the story with surface-level analysis. Enterprises need curious and inquiring minds to ask the right questions of their data. That's why so many leading organisations recruit data scientists solely to make sense of their data and then feed these insights back to strategic decision makers.

Ultimately, the real value of data lies in bringing your enterprise closer not only to existing customers but also to prospective ones. And building a better bottom line is something we can all agree on.

"Our data has to be perfect." No, it doesn't.


Data is not really perfectible and ultimately, perfection is the enemy of progress, writes James Richardson, business analytics strategist at Qlik and former Gartner analyst, in a guest blog.

Ask people what slows or stops the use of business intelligence (BI) in their organisations and poor data quality is often one of the first things they say.

Now, I'm not going to argue with that view - I've been around BI too long to do so, and I know that a lack of good-quality data is a very real issue, particularly as organisations begin to focus on BI and on using data to drive decisions. In fact, I've written in the past on how to approach the procedural issues that give rise to much poor-quality data. (If you have a Gartner subscription you can read my 2008 research note 'Establish a Virtuous Cycle of Business Intelligence and Data Quality'.)

There's no doubt that addressing data sourcing processes helps ameliorate basic errors, build trust and overcome the initial resistance that's common in any BI programme's initial stages.

What I do take issue with is when I hear that "data quality needs to be perfect in order for us to roll out our dashboards". In this case the aphorism that "perfection is the enemy of progress" really is true. So why do people set themselves this impossible goal? Well, first you've got to consider who's saying this. In the main, it's IT or technical staff. They often feel exposed because, although they know the data's safe, secure and backed up (hopefully!), they've often no idea whether its content is good or bad (and nor should they - it's not their job). The irony is that it is not until data is exposed to active use by managers, decision makers or analysts that its quality, and therefore its usefulness, becomes truly apparent.

The effort spent improving data has to be balanced against the value of doing so. For most uses, it's not perfect data that's needed, but data within acceptable tolerances. Of course, the level of tolerance varies by function and use case. For financial data the tolerance for error is obviously very low. That's not such an issue, as the perfectibility of financial transaction data is within reach - but only because of the huge effort that goes into its stewardship. The whole practice of accountancy and auditing is fundamentally about data quality, as its aim is to remove as much error from the financial record as possible. The fact that the data is generated and controlled inside our organisations also helps. In other, less regulated, functions the tolerance can afford to be somewhat less rigorous. Why is this? Because people need data to answer business questions right now. Data that's 80%+ accurate may be enough for many operational or tactical decisions. Users may only need a hint at the overall direction the data is taking for it to be valuable, and they may only need it in that instance. Immediacy often trumps purity.

Getting perfect (or even near-perfect) data requires herculean efforts. Zeno's paradox applies: it becomes harder and harder to reach perfection as the volume and diversity of data grow. It simply isn't possible (or cost-effective) to make all sources perfect to the same degree. There's another big question - what does the perfectionist do about data that is not perfect? Conform it, thereby changing it, with the risk of over-cleaning the data or "correcting" errors incorrectly? We have to accept that data is rarely perfectible - and getting less so - and mature our approach to data quality to ensure it's fit for purpose in today's information environment, where burgeoning and varied data flows into our organisations.

I'd go further and say that the myth of perfect data is more dangerous than decision makers knowing that the data they have is dirty, with anomalies and glitches. Finally, reaching perfection, were it even possible, might not be a good thing. Why? Because perfection implies a rigid standard, and a fixed frame of reference can itself limit innovative thinking by stopping people answering really fundamental questions. Blind faith in the certainty of perfect data can never stand up to the shock of the new - to those things we can't or won't see coming.

Why data is more valuable when it's shared


This is a guest blogpost by Glen Rabie, CEO, Yellowfin.

The role of collaboration in decision-making has been a question for academics and business leaders since modern business began - in fact, well before. In ancient Athens, back in 500 BC, the Greeks ran what can arguably be viewed as the world's first formal collaborative decision-making process. Each Athenian (excluding women, slaves and people from the Greek colonies of the time) was invited to vote, not for a representative to make laws, but on the merits of each individual law.

Today, business leaders are also acutely aware of the merits of making decisions collaboratively - involving different stakeholders that can help the company arrive at the best course of action. In a recent Economist Intelligence Unit study titled Decisive Action, 87% of senior decision-makers claimed to involve others when making decisions. Similarly, when asked what single factor would improve their ability to make better decisions, over a third said "being able to take decisions more collaboratively".

 BI's historic shortfalls

In the Business Intelligence (BI) industry, our job is to empower people and organisations to make more of their decisions on a solid foundation of trustworthy, easy-to-interpret data. Unfortunately, though, BI vendors have done a poor job of thinking beyond the initial analysis - the focus was placed on core analytics and the technical community alone. Usability (the ease of data consumption by business users) and, importantly, an enterprise's ability to share those insights tended to be afterthoughts - if they were considered at all. When I founded Yellowfin, after a long career working with inflexible BI tools on behalf of one of Australia's 'Big Four' banks, I wanted to change exactly this situation. I wanted to remove the cost and complexity. I wanted Yellowfin to help make BI easy.

 In a world where BI technology is becoming more pervasive, and insights can be valuable to ever increasing numbers of managers and employees, the trick was surely to make things simple for BI consumers. That is, to empower business users to make better, faster and more independent fact-based decisions by focusing on how data is displayed and shared. Collaboration is a huge part of this. There is very little point in having world-beating analysis if it is the exclusive preserve of a limited number of people and is hard to interpret and act on amongst decision-making groups.

 Moving to a collaborative BI environment

Advances in the Internet - particularly the Web-based interfaces of pervasive social media platforms - have taught me and many others some valuable lessons. The rise of social media tools like Facebook, Twitter and Instagram demonstrates how successful you can be by playing the role of a content facilitator - allowing content generated by users to be shared, distributed and interacted with by an interested base of content consumers. If people on Facebook want to comment on or share a photo, they can. Why should BI be any different?

The omnipresent and collaborative nature of such social media platforms has many people in the modern workforce quite rightly wondering why enterprise BI can't be architected in a similar way. Downloading analysis to a static dashboard or spreadsheet and emailing it to colleagues, then phoning to discuss, then undertaking new analysis based on the new questions and then emailing another static chart simply isn't competitive or efficient practice. It won't deliver better, more accurate decision-making, and it's painfully slow.

 Closing the gap between decision-making and the data

What's needed is an acknowledgement that human decision-making today is, on the whole, taking place outside the BI platform. It could be in a meeting room or a conference call but, all too often, it is not where the data resides. Why not collaborate, and make collective decisions, within the BI platform itself? Why not facilitate the decision-making process alongside the data and data analysis, where stakeholders can interact with live datasets in real-time, add comments, make revisions and collaborate until the correct decision is reached? This is the direction in which our developers have been moving for some years now, and it's consistently been one of the areas our customers have told us they value.

Imagine a scenario where a restaurant manager wants to know how a new product line has been performing to decide if he will continue to stock it. Not only can he instantly view product performance via a self-service chart or dashboard, he can then annotate the chart and share it with other store managers to obtain their thoughts and insights.  He can even start an entire discussion thread around the performance of this new product, allowing others to contribute knowledge and other relevant BI content, to establish the underlying factors impacting performance and to agree on a desired course of action. Allowing such collaboration enables users to connect trends in their data to real-world events more readily, providing more context and deeper, faster insight. Perhaps the product has undersold due to a company-wide stock take shutdown during the end of the financial year? Or perhaps other store managers have experienced more success because they've promoted the new product with a series of discount coupons.

Data alone doesn't deliver ROI; it's the quality of the respective business decisions that yield the benefit. When data is shared - and therefore complemented with a range of appropriate human insights and other contextual information - it is easier to take smarter collective action that delivers better business outcomes. That's why I believe organisations should be focusing attention on collaboration as a means of increasing the value of their data, which improves decision-making processes and enhances the business benefits derived from BI.




Alan Turing Institute head Howard Covington has opportunity to boost UK economy


This is a guest blog by David Richards, President, co-founder and CEO of WANdisco

Alan Turing may not have known it at the time, but he was one of the first pioneers of the data science industry. Seventy years on, we're seeing the rise of the data scientist, fuelled by an increasing realisation by organisations across the world that they require new leadership if they are to get the most out of their data. This is more than just another fickle trend; a quick search on LinkedIn reveals that "data scientist" now appears in roughly 36,000 help-wanted posts.

 If there's one thing big data can teach business leaders, it's this: be prepared to challenge your assumptions.

 Take the case of a major US insurance broker who decided, after years in the business, to stress test the actuarial assumptions which had formed the basis of their policies up until that point. Applying big data analysis to their business model showed them that their assumptions had all been wrong -- that the policies they had been selling had been flawed all along.

Or take a global bank that had identified China as a key market for expansion. After months of planning and investment, its data was too fragmented to run an analysis on the return on investment of the campaign to date - so efforts continued. When the data was finally brought together and analysed using Hadoop, the result was startling: not only had the bank made no profit, it had been running its Chinese expansion at a loss.

 With traditional analytics methods rapidly being replaced by big data science, there is a great opportunity for businesses and governments alike.

Howard Covington has been announced as the first chair of the Alan Turing Institute. His task will not be easy: when the government-backed institute was first announced, George Osborne said it should enable Britain to "out-compete, out-smart and out-do the rest of the world" - no walk in the park. But if Covington plays his cards right, this could be a big step towards realising Britain's big data opportunity.

 The Alan Turing Institute is a move straight from the Silicon Valley playbook, as the government hopes it will be "a world leader in the analysis and application of big data." Covington, whose background lies in investment banking and asset management, has said that the priority of the institute will be "leading-edge scientific research", with industry application of that research a close second.

The Institute won't be able to do this alone. The strength of UK universities' research centres can be amplified tenfold by the involvement of the private sector - industry experts will be able to throw a little Silicon Valley know-how into the mix.

Companies like Hortonworks, Pivotal and WANdisco are helping industries - from banks and utility providers to hospitals and government agencies - deploy big data strategies. It is something we have been doing for years, and rather than developing our products in isolation, we have been part of a continuous dialogue with customers. The experience of such companies will, in my view, be invaluable to the Institute as it sets its priorities.

 This is all the more important as data science is counted in dog years - the industry is accelerating so fast that, in terms of changes, it's like packing seven years into one.

 As Covington considers which industry professionals to consult, it is vital that the private sector representatives include vendors rather than end users alone - he will need to consult the companies designing and operating the technology, rather than those simply benefiting from it. Not doing so would be like investigating national spending habits without consulting a single bank.

It is encouraging to see the UK government recognise the importance of big data science and invest in projects such as the Alan Turing Institute. But it is important that it also appreciates that heavy-handed legislation could do more harm than good in the long run. We need researchers and industry leaders to communicate with policy makers, to ensure innovative thinking is safeguarded from stifling regulation. The Institute is in a prime position to act as a communicator between these bodies.

 The big data industries are set to be worth £216 billion to the UK economy by 2017, and I expect that the Alan Turing Institute will play an important part in ensuring that the UK delivers on its big data promise. But with limited resources, it needs to make every action count. A focused plan, one that speaks to government as much as it does industry, will be critical in doing this.

Dave Goldberg: on history, Silicon Valley, failure as a virtue, and London as a technology hub


Dave Goldberg, who died in May 2015, was someone I was looking forward to talking with more about the history and significance of Silicon Valley; and about the attempt to emulate its success of east London's Tech City.

London Technology Week, this week, is a good moment to reflect.

Mr Goldberg was a Silicon Valley executive - CEO of Survey Monkey at the time of his untimely death - who studied History and Government at Harvard University. His widow, Sheryl Sandberg, is, as is well known, the COO of Facebook and author of Lean In, an influential book about how women can flourish in leadership positions in business and government.

When I met him, I asked Dave who was his favourite historian, and he mentioned, as influential on him, the journalist and historian David Halberstam, author of The Best and the Brightest about the origins of the Vietnam War, and the young consiglieri around John F. Kennedy.

He was kind enough to ask me who mine were. EP Thompson and Hugh Trevor-Roper was my reply - both great English stylists, though at opposite ends of the political spectrum, democratic communist and high Tory respectively. But I digress.

Does Silicon Valley, I asked, lack a sense of history, and does that matter? He responded: "I don't know that it lacks a sense of history, but there is a healthy scepticism about incumbency, and a desire to be disruptive that is unusual. Entrepreneurship is about the triumph of hope over experience.

"It's not that people don't know the history and don't think it is important. But they are willing to go against the odds, and any rational decision making process, and do something that does not look probable.

"There is a lot of utopianism and an idealistic view of the future, and I am not sure I believe in all of that stuff, but I am generally an optimist. One of Silicon Valley's distinguishing strengths is its pervasive optimism in the face of great challenges.

"Also, and I think this is a good thing, failure is a virtue and not a black mark in Silicon Valley. People will look at someone who has started a company and failed and someone who has learned some stuff, and will do better next time. Most other places will say: 'well, that guy failed, so why would we want to invest in him'?

"So, failing fast, yes, but failure - failure is a virtue. You know, Travis [Kalanick], who founded Uber, founded two other companies, one of which failed and one of which was a modest success. Reid Hoffman, who started LinkedIn, founded a previous social networking company that failed. The history is littered with people who had failure before they had success. Even Steve Jobs was fired from Apple.

"The history of Silicon Valley is about not letting failure get in the way of success, and that is different".

Could the UK, and London specifically, replicate that?

"I studied American history and government, and we would talk a lot about institutional memory, and where that was located, whether in companies or government. In Silicon Valley that memory lies in the service professionals around the entrepreneurs - the lawyers, accountants, the PR firms, the real estate agents, the recruiters, and so on. That is an advantage that is not well understood, that connective tissue that transmits the knowledge to the next group of 24-year-old entrepreneurs who come along. Now, London is becoming one of those places, too, with Berlin maybe second. In fact there is probably more of that connective tissue in London than in New York. That is a big change over the past five years.

"Is London going to be bigger than Silicon Valley for technology companies? That is unlikely. But should it be the hub for Europe? Yes. Big global companies can start here [in London]".

Nevertheless, Dave Goldberg registered the negatives of Silicon Valley. "We've got a lot of things to work on. There is terrible infrastructure - roads, poor mass transit, high real estate prices -- rivalling London's!"

"The biggest issue is we don't have enough diversity. Not enough women or ethnic minorities. And there is ageism, speaking as an older person myself! There is a myth that all tech companies are founded by 24-year-old college dropouts, and it is not true. Most of the data shows that the most successful entrepreneurs are those who start companies in their late thirties".

Dave Goldberg was 47 when he died. Too young.

Different horses for different courses: NoSQL and what your business needs


This is a guest post by Manu Marchal, managing director EMEA at Basho Technologies

While the importance of distributed databases has become apparent to a large number of businesses - with more and more enterprises in a wide variety of industries recognising the power of harnessing unstructured data - there are still many misconceptions about NoSQL.

It is a common misconception that NoSQL databases act as all-purpose Swiss-army knives, with each platform able to address each enterprise's specific data needs. This is a myth that should be dispelled - NoSQL platforms conform to the old adage that it takes different horses for different courses, with each one offering a variety of strengths and weaknesses, and each capable of catering to enterprises' own specific needs, whether they require speed, reliability, flexibility, or scalability.

When faced with a multitude of choices, it is a natural human reaction to seek out the quickest option. With databases, there is often an assumption that the platform providing the fastest speeds is the one most suited to the organisation. This, however, is not the case; just as a carpenter wouldn't use a hammer to sand a surface, enterprises should select only the platform that suits their needs best.

In the gaming industry, for example, it is of the utmost importance to process huge amounts of data quickly and reliably, ensuring that customers who want to place a bet at a specific time can do so. This data changes so frequently - from score updates, to red cards, to the number of corners - that it is absolutely imperative that gaming companies can empower users to place bets without delay while also updating odds and processing pay-outs. Needless to say, there is a lot of scope for things to go wrong here, and any failure would cost the company a great deal of money. Of course speed is important to the gaming industry, but arguably not as vital as a platform that can be relied upon to process data smoothly under extreme duress and not falter during failure scenarios.

Our own technology, Riak, for example, is fast but not the fastest on the market. What it does do well - and why it is now used by bet365 to process the enormous amount of data the company relies upon - is scale reliably and ensure performance under pressure, a vital asset for businesses that can't afford any increased latency during peak times. Riak is made for mission-critical applications and gives organisations that rely on such applications peace of mind. Now, we're not saying that this is what your organisation is looking for - perhaps you really do need explosive speed. What we are saying is that enterprises are different, and so are NoSQL platforms.
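One mechanism behind this kind of fault tolerance is quorum-based replication, where a write succeeds once enough replicas acknowledge it. Riak's actual quorum machinery (tunable n, r and w values per bucket) is far richer than this, but the core idea can be sketched in a few lines of plain Python:

```python
def quorum_write(replica_acks, w=2):
    """Toy model of a quorum write: succeed once at least w replicas
    acknowledge, so a single slow or failed node need not fail the request."""
    return sum(1 for ok in replica_acks if ok) >= w

# All three replicas up: the write succeeds.
print(quorum_write([True, True, True]))    # True
# One replica down: two acks still satisfy w=2, so the write succeeds.
print(quorum_write([True, False, True]))   # True
# Two replicas down: quorum lost, the write is rejected rather than lied about.
print(quorum_write([True, False, False]))  # False
```

This is why such systems keep accepting bets while a node is down: the quorum, not any single machine, decides whether a write succeeded.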

By being aware of just how differently each platform can serve their enterprise, IT managers can better cater to their requirements and select the most appropriate platform for them, rather than finding out the hard way that for businesses and NoSQL platforms, there is no one-size-fits-all.

The growing importance of customer data governance


This is a guest post by Sid Banerjee, CEO, Clarabridge

We are living in the "age of the customer," as Forrester Research recently dubbed it. There are more and more communications channels -- from company websites to call centres to social media -- for customers to interact with the companies they do business with. As a result, they also have high expectations when it comes to having their feedback heard and considered.

For businesses, this era of customer-centricity presents both challenges and opportunities. Acting on feedback straight from the customer's mouth can directly impact a company's bottom line by reducing metrics such as customer churn.

But there are a few steps between customer feedback and that impact. Companies must make sure that they have the technical skills and capabilities to connect to all relevant customer experience data sources, and be equipped to bring all that data together for holistic and meaningful analysis. But even before that, companies must have the right data governance in place, which is a foundational piece for any advanced analysis and action.

Traditional business intelligence (BI) data is generally very explicit and structured, focusing on what has already happened. The universe of structured data is vast, including demographic information, purchase history, digital engagement, multiple-choice survey responses, and other CRM data. Businesses have had their hands full analyzing and interpreting these data sets for years, but now the big data challenge is becoming increasingly urgent.

Adding unstructured customer data

When it comes to customer experience management, all of this data must be combined with unstructured customer feedback data. This data includes social media comments, online reviews, call center recordings, agent notes, online chat, inbound emails, and free-form survey responses. Businesses need to consider this data alongside structured data for a complete picture of the customer experience. That's why, in the age of the customer, next-generation experience technology and techniques - like text analytics, sentiment analytics and emotion detection -- are not optional.

Data governance is crucial here, as it ensures that everyone is speaking the same language when it comes to the information's meaning. When you layer customer feedback and sentiment data on top of historical data tied to transactions and promotions, there needs to be a standardized process for interpreting it and distributing it. While many companies have processes already in place to manage high-priority enterprise data, it can be challenging to incorporate new streams of large, unstructured data into those rules and processes. Just consider this estimation from Anne Marie Smith, principal consultant at Alabama Yankee Systems: "I would venture to say that if you took the totality of companies that are engaging in some form of structured data cost governance, not even 1%, maybe one-half of 1% of them, are engaging in any form of unstructured data governance, for a variety of reasons."
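"Everyone speaking the same language" often comes down to something as mundane as a shared vocabulary. The channel names and labels below are hypothetical, but they sketch the kind of mapping layer a governance process would maintain so that feedback from every channel lands on one agreed sentiment scale:

```python
# Hypothetical channel vocabularies: each feedback source labels sentiment
# differently, and a governance layer maps them onto one governed scale.
CHANNEL_VOCAB = {
    "survey": {"very satisfied": "positive", "neutral": "neutral",
               "dissatisfied": "negative"},
    "social": {"pos": "positive", "neu": "neutral", "neg": "negative"},
    "call_centre": {"happy": "positive", "ok": "neutral",
                    "complaint": "negative"},
}

def normalise(channel: str, label: str) -> str:
    """Translate a channel-specific sentiment label into the governed term."""
    try:
        return CHANNEL_VOCAB[channel][label]
    except KeyError:
        return "unmapped"  # surface to data stewards rather than guess

print(normalise("social", "neg"))  # negative
```

The "unmapped" branch matters: governance means flagging labels the vocabulary doesn't cover, not silently forcing them into the nearest bucket.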

Know what you are looking for

One reason that data governance is especially difficult in the omni-channel, unstructured age of the customer is that data governance, at a basic level, requires an understanding of what information and insights you're looking for. Companies shouldn't create high-quality data for its own sake, but should have their eyes on a specific business goal. When it comes to customer data, one of the big challenges we've seen is that folks don't know exactly what data is relevant in the first place; they aren't sure what data to listen to and analyze, much less how to consistently work with it to gain meaningful insights that will impact business performance.

The lesson: when it comes to new sources of data, figure out what you want to get from them before you dive in. Then use industry templates garnered from others' work as a platform to start. Basics like data governance form the foundation for a common understanding of the customer across your business, enabling high-quality, advanced analysis of both explicit and implicit information - which is fast becoming a requirement for delivering the increased level of attention customers demand.

Hadoop - is the elephant packing its trunk for a trip into the mainstream?


This is a guest blog by Zubin Dowlaty, head of innovation and development at Mu Sigma.


Hadoop, the open-source software platform for distributed big data computing, has been making waves recently. The IPO of Hortonworks in December 2014 contributed to that, and the stock market ambitions of the other two main distributors of Hadoop, Cloudera and MapR, have also been fanning the flames. Having these big data technology companies trading as public companies will create greater confidence in the technology, and the increased funding levels will signal that these technologies are now proven, boosting their uptake.

A Schumpeterian wave of creative destruction is occurring in the analytics space right now, triggered by Hadoop. It is quite amazing to witness the speed with which this is happening. Larger enterprises are now eyeing Hadoop up for their corporate infrastructure, and the technology is en route to becoming more accessible to business users rather than just data scientists. However, to exploit this opportunity, enterprises need to be willing to adopt a different mindset.

En route to an enterprise-scale solution?

Over the last year, the industry has seen widespread deployment of Hadoop and associated technologies across many verticals. Furthermore, significant momentum has started building in the enterprise segment, with Fortune 500 companies taking Hadoop more seriously as a data-operating platform for enterprise-scale, enterprise-grade applications. Companies of this size have the muscle to take the technology from the 'early adopter' to the 'early majority' stage and beyond, creating a network effect: as more - and more significant - companies implement Hadoop, others follow.

From the Hadoop solution perspective, the combination of Hadoop 2.0 and YARN is the critical technology component that has enabled Hadoop to become a general-purpose operating and computing platform for an analytics group, and not just a niche computing tool.

Technologies such as Apache Spark, Impala, Solr and Storm, plugged into the YARN component model, have accelerated adoption for running real-time queries and computation. Technologies like ParAccel, Hive on Tez, Spark SQL and Apache Drill, from a range of vendors, have been created to support data exploration and discovery applications. SQL on Hadoop is another area that has seen a lot of traction in terms of development.

Spark stands out because it has given the data science community a programming framework for creating algorithms that run more quickly than on other technologies. It has come a long way towards being considered the new open standard in Hadoop, and with robust developer support it is expected to become the de facto execution engine for batch processing. Batch MapReduce is slow for computation but great for handling big data; with Spark, data scientists have fast in-memory capabilities for running algorithms on Hadoop clusters.
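Conceptually, batch MapReduce and Spark run the same map/shuffle/reduce shape; Spark's speed advantage comes from keeping intermediate results in memory across iterations instead of writing them to disk between stages. This is not Spark or Hadoop API code, just a plain-Python sketch of that computation shape using the classic word-count example:

```python
from collections import defaultdict
from functools import reduce

def map_phase(lines):
    """Map: emit (word, 1) pairs, as a MapReduce mapper would."""
    return [(word, 1) for line in lines for word in line.split()]

def reduce_phase(pairs):
    """Reduce: group counts per key (the shuffle), then sum each group."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return {word: reduce(lambda a, b: a + b, counts)
            for word, counts in grouped.items()}

lines = ["big data on hadoop", "big data on spark"]
counts = reduce_phase(map_phase(lines))
print(counts["big"])  # 2
```

In classic MapReduce the intermediate pairs between the two phases would hit disk; an in-memory engine keeps them resident, which is exactly where iterative machine-learning workloads gain the most.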

Governance and security for Hadoop clusters are still evolving, but these areas have progressed, and the main vendors have recognised them as weaknesses, so they can be expected to improve in the short to medium term.

Wringing a lot more ROI for business people

In 2015, apart from scaling their Hadoop initiatives, companies will also be looking for the return on their data and infrastructure investments.

From a technology perspective, YARN will continue to gain momentum as it can support additional execution and programming engines other than MapReduce. Given the flexibility it brings to the table, it will help build more big data applications for better consumption by business users rather than just data scientists.

Analytical applications leveraging concurrent analysis will push analysts to adopt real-time or near real-time computation over the traditional batch mode.

Adoption of scalable technologies in storage, computing and parallelization will increase as more and more machine-generated data becomes available for analysis. Current BI, hardware and analytics-led software architectures are not suitable for scale. They will need to be revisited and carefully thought through. The industry is looking for standards in this area, and for a unified platform that offers an end-to-end solution.

Toolset, skillset, mindset

When it comes to the adoption of advanced technologies such as Hadoop, an organization can acquire toolsets and skillsets over a period of time, but the largest challenge lies in changing the deeply ingrained mindset of the enterprise community.

For example, large organizations are still struggling with the need to shift from central Enterprise Data Warehouse frameworks towards more distributed data management structures. Similarly, deep-seated trust in paid solutions needs to give way to greater adoption of open source models and technologies, which are now very mature.

It is important to move away from the current 1980s technology and application mindset, and truly scale up, in order for enterprise end users to reap the full benefits of big data insights and make better decisions. A holistic approach bringing math, business and technology together within a 'man-machine' ecosystem will be key to achieving it.

Think scale, think agility, think continuous organizational learning - that is what technologies like Hadoop can make possible.

The Rise of Punk Analytics


This is a guest blog by James Richardson, business analytics strategist at Qlik and former Gartner analyst.

England's dreaming ...

By the mid-1970s rock music was dominated by "prog-rock" and long, complex, concept-laden albums. The music was created using multi-track recording and was very difficult to replicate live without trucks full of equipment and lots of highly skilled session musicians.

But things changed after 1976; the 'anyone can play guitar' do-it-yourself (DIY) ethic of punk altered everything in rock, stripping it back to its essence and making it simple again. Further, the punk attitude then extended to fashion, to art, to design. This empowered people. We all thought "I can do that." Punk made us fearless. Sure, it was stupid sometimes. But it was joyful, and inclusive.

This transition, from exclusive domination by specialists to inclusive accessibility, is a trend repeated in many fields. Take this blog: through the medium of the internet I can write and publish without needing the help of editors, typesetters, printers, distributors and so on. Anyone with an opinion can share it. It's punk publishing. We're all in control of the presses.

So, what have we seen in BI until very recently? A field dominated by mavens - a small number of technical specialists whose role was predicated on arcane skills - and a large number of business people in their thrall. People who, just like rock fans in the early 70s waiting for the next double album to be released, waited months for a data model to be designed and a report coded that would deliver what they needed. These truly were data priests. Like Rick Wakeman behind a stack of expensive keyboards, this approach stacked costly technology on technology. Even the nomenclature was defiantly and deliberately obscure: "yeah, we need an EDW fed via ETL from an ODS, and then a fringed MOLAP hypercube to enable drilling with a hyperbolic tree UI...". And the business people went "wow, it's really complicated" whilst feeling vaguely shut out of the process of creation and remote from the data. The mavens sought virtuosity and aspired to deliver to a high concept - a set of clear user requirements - the whole of which they could deliver in one 'great' work, no matter how long it took. But business decision makers got bored of this, bored of waiting, bored of complexity - it wasn't helping them - and looked for an alternative way, a do-it-yourself way.

So, by now you're likely anticipating where I'm going with this train of thought. In the last few years we've entered the era of punk-style analytics. With the rise of new technologies that circumvent much of the need for mavens anyone can play data nowadays. This new approach displays characteristics shared with punk:

  • No barriers. You can download data discovery products for free, and get started with nothing to stop you except access to the data you want to explore.  There's no need to wait for someone else to provide you with the means to get started.
  • Mistakes are part of the process. Jamming with data is very often a trigger to finding insights. We get better through trying stuff out, both in terms of our use of an analytic software product and our familiarity with the data.
  • Fast is good. Think of a Ramones song: fast and to the point. The fact is that business questions come thick and fast, and being able to riff through data at speed often works best. Many of the questions we want to analyse and answer are transient, and the visualizations and apps are throwaway. Use and discard.
  • Perfection and polish are not the aim. If it's perfect it has likely been manipulated to adhere to an agenda or to push a conclusion. The idea should not be to create flawless visualizations (think infographics) but a more transparent, less processed route to data that can flex.
  • Engagement with issues of the moment. Punk songs are about the world as it is right now. Data discovery prompts engaged debate too. Questioning orthodoxies about how we measure and evaluate the subject being analysed. It does that because the framework used is a starting point for active exploration, not an endpoint for passive consumption.
  • The collective experience is valuable in itself. No solos, thanks! While self-expression and creativity are important, they're secondary to the collaborative act of working and playing together with data to achieve a common result, as this in turn prompts action.

Further, and despite the marketing messages, not all new analytics has a punk ethos. Some approaches are just building a new wave of mavens - the new visualization gurus, often yesterday's Excel gurus, still revelling in their virtuosity. Sitting alone in their 'bedrooms' (cubes), these specialists create beautifully crafted 'songs' (visuals) - which are just so - and then distribute these as perfect 'tapes' (dashboards) to people with 'cassette players' (Reader software) to 'listen to' (look at) and be impressed by. Not punk: perfectly polished, self-publicizing, one-to-many, maven-created artefacts. The approach is exclusive rather than collective; it's not about engaging as many people as possible in playing with data. What's the aim of creating 'just so' visualizations? Who benefits?

The real work happens when more people can explore data and learn through play together.

When they pick up their data and play.

Fast and loud and loose.

Bobbies beat the self-service BI conundrum


This is a guest blog by Michael Corcoran, SVP & CMO at Information Builders

Having recently attended the Gartner BI and Master Data Management Summits in London, it is clear that now is an exciting yet confusing time for data discovery and analytics. Of course, they have always been exciting areas for analysts, but now the industry as a whole is turning to self-service business intelligence (BI) to deliver critical information and data analytics to a much wider audience. This can only be done, however, by "operationalising" insights to bring together employees, the supply chain, and customers.

What do I mean by this? The issue at the moment is the market need to deliver self-service information and analytics to a broader audience. Businesses need to look at simpler ways of doing this than using complex dashboard tools. Visualisation and data discovery tools are, at their heart, still an analyst's best friend, but they leave the average employee scratching their head.

Businesses need to stop deterring staff from using BI and analytics, and instead offer ease of use and high functionality. This requires an app-based approach to easily and quickly view corporate data, as a next step in truly bringing big data to the masses. The average person does not have the time or inclination for formal training, and would much rather download an app that delivers analysis directly to their mobile device. Advanced analytics tools provide a mechanism for analysts to build sophisticated predictive and statistical models, but the ultimate value will come when we embed these models and their outcomes into consumable apps for operational decision-making.

Law enforcement is a great example of where self-service apps make a big impact, with analytics available at the tap of a mobile device to help the police force work more efficiently. The amount of available data on crime grows day by day, and harnessing it to gain useful insights is extremely powerful. Significant value can be derived from historical crime data, which helps predict and prevent crimes based on variables. It's not about individuals, but about populations and environmental factors: weather, traffic, events, seasons, and so on. It sounds a bit like sci-fi, but it's actually very accurate. Think of it this way: how much crime would you expect at a London football derby, which happens twice yearly? Or in the rough part of town on payday? Or at a packed annual summer festival on a particularly humid day? By offering data on how likely a crime is to happen, these insights can help with prevention and, more importantly, help police forces accurately plan resourcing for such variables and events.
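The examples above - derbies, paydays, humid festival days - boil down to combining a baseline rate with weighted environmental factors. The weights below are invented for illustration; a real predictive-policing model would learn them from historical crime data rather than have them hand-set:

```python
# Invented weights for illustration only - real models are trained on
# historical data, not hand-tuned like this.
WEIGHTS = {"football_derby": 0.30, "payday": 0.15,
           "summer_festival": 0.25, "humid_day": 0.10}
BASELINE = 0.05

def crime_likelihood(factors):
    """Combine a baseline rate with weighted environmental factors,
    capped at 1.0 so the result stays a valid likelihood."""
    return min(BASELINE + sum(WEIGHTS.get(f, 0.0) for f in factors), 1.0)

quiet_tuesday = crime_likelihood([])
festival_heat = crime_likelihood(["summer_festival", "humid_day"])
print(festival_heat > quiet_tuesday)  # True
```

A resourcing planner consuming scores like these through an app never sees the arithmetic - which is precisely the point of pushing the model behind a consumable interface.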

Self-service apps can make this predictive model easily accessible to 'bobbies on the beat'. Even a cop on their first day can access the same level of insightful knowledge as a veteran officer through their mobile device to make smarter decisions in real time. An app to find vehicle licence plate numbers, for example, is just one way to speed up police procedures, saving police time and resources and ultimately making them more efficient.  

However, this isn't the only place where an analytic app could add value. Real-time data in an easy-to-use format will have a massive impact on all customer service professions. Providing customer-facing staff with access to key data about customers allows them to deliver a more personalised service. Staff would be able to better understand complaints because they could quickly access a customer's previous experiences or purchase history, in real time.

The added benefit of using an app-based approach is that you can gather data from many different sources and combine it. For example, you can combine various types of enterprise data with other data available in public and private clouds, such as weather services, to pull in variables. This comprehensive combination provides an accurate, collective view which delivers self-service for daily and operational decisions - in real time. This approach is the future of self-service for the masses, via an easy-to-consume app for the hungry user.
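Behind the scenes, that combination is a join between enterprise records and an external feed on a shared key. The records below are hypothetical, but they show the shape of the enrichment an app-based view performs before data reaches the user:

```python
# Hypothetical records: a CRM row enriched with a public weather feed keyed
# by city, before being served to a customer-facing app.
crm_record = {"customer_id": 42, "city": "London", "open_complaints": 1}
weather_feed = {"London": {"forecast": "rain", "temp_c": 11}}

def enrich(record, weather):
    """Combine enterprise data with an external source on a shared key."""
    combined = dict(record)  # copy so the original record is not mutated
    combined["weather"] = weather.get(record["city"], {})
    return combined

view = enrich(crm_record, weather_feed)
print(view["weather"]["forecast"])  # rain
```

The empty-dict fallback keeps the app usable when the external feed has no entry for a city - the enterprise data still renders, just without the extra variable.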
