Is Microsoft openness a wolf in sheep's clothing?

bridgwatera | No Comments
| More

Higher profile members of the key software application developer press were invited to the Microsoft Build conference this year to listen to the company's vision for where its ecosystem will develop over the next decade.

Listening from afar, it has been fascinating to see how Microsoft's conceptualising under the new grand fromage Satya Nadella has been analysed by the press and the programmer community.

Reporting for Time Magazine, Harry McCracken said that it felt like a 21st-century take on "Windows Everywhere" ...

Microsoft Build attendees were able to learn how the firm is aiming to offer Windows development channels for not just desktops but tablets, phones and the Xbox One, extending onwards into Internet of Things technologies.

McCracken's analysis is positively sceptical (as, arguably, most Microsoft analysis should surely be today), but more cutting perhaps was Jim Lynch this month on IT World.

Lynch highlights the fact that Nadella is doing what the bulky showboater ex-CEO Steve Ballmer could never do i.e. let go of Windows as the centre of Microsoft's universe.


"Instead, Microsoft will be satisfied if people are using services like Office 365, Skype, OneDrive and Bing, whether they're on an iPhone, Android device or Windows PC," writes Lynch.

Is Microsoft really that willing to change?

Is Microsoft (as a huge aircraft carrier of a business) really that nimble and agile?

Is Microsoft really that open in the face of a long pedigree and product line hinged around proprietary technologies?

Is Microsoft really clear in the messages it delivers to the software application developer press?

Is Microsoft strategising to toe the openness party line and talk about wider product and platform development streams while all the while consolidating upon its core business model all the way to the bank?

We could not possibly say.

Open Interconnect Consortium: can your fridge talk to your toaster yet?


Apologies for the deliberately tabloid headline, but here's the point: a group of industry vendors has formed the Open Interconnect Consortium with the aim of advancing interoperability for the Internet of Things (IoT).


So... that would be Smeg for fridges, Dualit for toasters, Bosch for fridges and boilers and perhaps Fitbit for wearables, would it?

Nothing quite so consumer-tangible at this stage we're afraid.

No matter though, the companies involved here are Intel, Atmel, Broadcom, Dell, Samsung and Wind River.

The new consortium will seek to define connectivity requirements to ensure the interoperability of billions of devices projected to come online by 2020 -- from PCs, smartphones and tablets to home and industrial appliances and new wearable form factors.

The Open Interconnect Consortium (OIC) intends to deliver:
• a specification,
• an open source implementation,
• and a certification program for wirelessly connecting devices.

The first open source code will target the specific requirements for smart home and office solutions, with more use case scenarios to follow.


"The rise and ultimate success of the Internet of Things depends on the ability for devices and systems to securely and reliably interconnect and share information," said Doug Fisher, Intel corporate vice president and general manager of the software and services group.

"This requires common frameworks, based on truly open, industry standards. Our goal in founding this new consortium is to solve the challenge of interoperable connectivity for the Internet of Things without tying the ecosystem to one company's solution."

What do the non-partners think?

On the news, Steve Nice, CTO of Reconnix, an open source software specialist, made the following comment:

"The Internet of Things is the next frontier for the technology industry, but there is still a lot of uncertainty around how it will work in practice. Two rival groups working on a set of standards now would suggest that we are still some years away from mass adoption. History has taught us that consumers rarely win if they are forced to back a side early in a 'format war,' just ask anyone with a Betamax recorder in the attic.

"It's important that both the AllSeen Alliance and Open Interconnect Consortium projects are open source. The Internet was built on free and open principles, and trying to build a proprietary framework would simply inhibit innovation. The future of the Internet and the cloud is open source, you only have to look at Cisco's recent massive investment in open cloud infrastructure for evidence of that."

DevOps is inherently open source, discuss


DevOps has (arguably) a lot of guff, fluff and puff attached to it right now.

We're still not sure if this portmanteau-propelled "coming together" of two core technology disciplines is really one new perfectly formed beast.

Is it Ops that have gotten good at Dev... and so progressed onwards (Ed - that never happens surely?) or Devs that can handle a bit of Ops?

Is it really one person?

Or is DevOps actually a movement, a cultural approach and a method?

So DevOps is actually 2, 4, 8, 16, 32 or 64 people and so on.

While we're ranting... shouldn't we also argue that DevOps has true open source roots?

The argument goes as follows....

DevOps (developer-operations) was born out of the FOSS (Free and Open Source Software) movement by its very nature, because it aims to address the "incongruous nature of integrating traditional LOB applications" with other applications.


This is the view of Paul Greer, chief architect and co-founder at RedPixie -- a British technology firm, which specialises in transforming IT environments.

"DevOps has a lot to do with automating and repeating, a practice that grew significantly with the widespread adoption of free tooling and frameworks that were built by the open source community," said Greer.

"This became popular in the early 2000s with the automation of software builds but now encompasses platform provisioning as well as software deployment," he added.
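The "automating and repeating" idea Greer describes can be sketched in a few lines: an idempotent provisioning step that converges on the same result no matter how many times it runs. The file path and config contents below are purely hypothetical, for illustration only.

```python
import os
import tempfile

def ensure_config(path, contents):
    """Idempotent provisioning step: write the desired config only if it
    is missing or differs, so repeated runs converge on a single state."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == contents:
                return "unchanged"  # already in the desired state
    with open(path, "w") as f:
        f.write(contents)
    return "written"

# Running the step twice illustrates the repeatability DevOps relies on:
# the first run converges the state, the second run is a no-op.
path = os.path.join(tempfile.mkdtemp(), "app.conf")
print(ensure_config(path, "port=8080\n"))
print(ensure_config(path, "port=8080\n"))
```

Open source configuration tools are built around exactly this converge-on-desired-state pattern, applied to packages, services and whole platforms rather than a single file.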

Greer goes on to argue that proprietary tooling may provide some benefits in organisations that have standardised on a single vendor stack, but even these tools would be short-lived if they prevented DevOps from controlling other vendors' stacks.

"Line of Business application vendors' products are changing through open interfaces and cloud-based hosting, which negates the need for customers to concern themselves with platform provisioning," he concludes.

DevOps is open source, or the pure bits are at least -- the debate continues.

No silver bullets in virtualisation & containerisation (even with Docker)


Docker isn't actually everywhere, but the open source software designed to allow a Linux application (and its dependencies) to be packaged as a container has enjoyed massive success recently.

Search AWS recently reported that, "AWS Elastic Beanstalk has updated its support for a Linux container that experts say could grow into a new standard for application portability among Linux servers."

That Linux container is, logically then, Docker.


Software application developers can package applications using Docker version 1.0 "on their own" so-to-speak -- or, equally, they have the option to provide a text-based Docker file with instructions on how to create an image.
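Such a Docker file is simply a text file of image-build instructions. The base image, file names and command in the sketch below are hypothetical, just to show the shape of one:

```dockerfile
# Start from an official base image (a hypothetical choice for illustration).
FROM python:3.10-slim
# Copy the application and its dependency list into the image.
COPY requirements.txt app.py /app/
WORKDIR /app
# Install dependencies at build time so the image is self-contained.
RUN pip install -r requirements.txt
# The command the container runs when started.
CMD ["python", "app.py"]
```

Building with `docker build` and running with `docker run` would then reproduce the same packaged environment on any Linux host with Docker installed.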

Container-based virtualisation techniques employed with Docker work to isolate software applications from each other on a shared operating system.

Containers are portable across different Linux distributions and so, logically then, the software applications themselves are able to run in any Linux environment.

What does Docker compete with?

So you would naturally expect Docker to compete with some pre-existing technologies and it does -- it lines up against proprietary application containers such as VMware vApp technology and infrastructure abstraction tools like Chef.

Principal consultant at Cigital is Paco Hope.

Hope reminds us that Docker is cross-platform, allowing developers to target Mac, Windows, and Linux easily.

He asserts that this allows developers to package up all the various libraries, bits and pieces that are necessary (without requiring a user to download and install them all) and -- it also "should have" the security benefit of being a sandbox that can't be escaped.

But an exploit was released recently that allows code that is supposed to be contained inside a Docker container to access files in the operating system where it is running.

A developer who sends you a Docker-based application could actually get files off your PC, even though that's supposed to be prevented by Docker's technology.

Hope explains the situation in full below:

"Much in the way that mobile devices can be jailbroken, virtualisation containers of all kinds can be susceptible to malicious code. Multi-tenant computer systems resemble multi-tenant buildings in real life: Often the defences that protect one tenant from another are much weaker than the defences that protect all tenants from random outsiders. Any part of the application that is virtualised this way is immediately less trustworthy than it would be running on a company's own servers. Software designers must consider malicious containers when designing security controls, despite the fact that virtualisation might be improving security in many other ways. This is a bug we should expect to be fixed quickly, it's not a flaw in virtualisation. Virtualisation and containerisation are generally good things. No technology is a security silver bullet, however."

Transactional in-memory analytics & grilled cheese sandwiches


There's a lot of Spark around this week.

Well, it is the Spark Summit 2014 after all -- Apache Spark is a Hadoop-compatible computing system for big data analysis through in-memory computation with "simple coding through easy APIs" in Java, Scala and Python.
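As a rough flavour of that "simple coding" claim, Spark's Python API chains transformations such as map and reduceByKey over an in-memory dataset. The plain-Python analogue below mimics the classic word-count pattern without Spark itself, so it runs anywhere but without the distribution -- it is a sketch of the style, not actual Spark code.

```python
from collections import Counter
from functools import reduce

# In Spark this would be roughly:
#   sc.textFile(...).flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(add)
lines = ["spark makes big data simple", "big data big insight"]

# flatMap: split every line into words
words = [w for line in lines for w in line.split()]

# map + reduceByKey: accumulate a count per word
counts = reduce(lambda acc, w: acc + Counter([w]), words, Counter())

print(counts["big"])  # -> 3
```

In real Spark the same chained transformations are distributed across a cluster and kept in memory between stages, which is where the speed claims come from.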

Alteryx and Databricks are collaborating to become the primary committers to SparkR, a subset of the overall Spark framework.


In addition, the firms are partnering to [attempt to] accelerate the adoption of SparkR and SparkSQL, in order to help data analysts get greater value from Spark as (it says here) "the leading" open-source in-memory engine.

Apache Spark, an open source data analytics framework, has quickly been gaining traction for its fast and scalable in-memory analytic processing capabilities inside and independent of Hadoop.

SparkR is an R package that enables the R programming language to run inside of the Spark framework in order to manipulate the data for analytics.

"The collaboration between Alteryx and Databricks will foster faster delivery of a market leading in-memory engine for R-based analytics within Hadoop that is available for the Spark community," said the companies, in a joint press statement.

DataStax is also present -- the distributed database management system for Apache Cassandra announced its Enterprise 4.5 edition.

"Spark and Cassandra form a natural bond by combining industry leading analytics with a high-performance transactional database," said Arsalan Tavakoli-Shiraji, head of business development, Databricks.

Tavakoli-Shiraji (Ed - was double-barrelled a good idea?) insists that today we need a unified platform for in-memory transactional and analytical tasks with:

• enterprise search,
• security,
• grilled cheese sandwiches,
• in-memory and,
• analytics.

NON-TECHNICAL NOTE: Please do not mix grilled cheese recipes with transactional or analytical workloads, we just threw that in to see if you were listening.

DataStax Enterprise 4.5 adds a new Performance Service to "remove the mystery" of how well a cluster is performing by supplying diagnostic information that can easily be queried.

Also of interest here is the integration of Cassandra data alongside Hadoop - so developers can run queries across both transactional data that has just been created and historical data based on Hadoop.

Plus also ... there are more visual management tools for developers, particularly around the diagnostics side of things - this opens up Cassandra for more testing and understanding of app performance, rather than being a "black box".

BYO-LHC: Bring Your Own Large Hadron Collider


Rackspace's involvement with OpenStack and CERN at the Large Hadron Collider surfaced again late last month when the cloud hosting provider staged a London-based gathering to discuss what, when and where its cloud hosting intelligence is being deployed.


Computer Weekly has already detailed the following case study explaining the work that has been undertaken here: CERN adopts OpenStack private cloud to solve big data challenges.

Embarrassingly parallel

Group leader of the OIS group within the IT department at CERN is Tim Bell -- he's basically the guy that looks after the tech infrastructure for CERN.

Talking about the software application development work that is carried out at CERN, Bell explains that the systems today "suffer" from being embarrassingly parallel.

In parallel computing, an embarrassingly parallel workload, or embarrassingly parallel problem, is one for which little or no effort is required to separate the problem into a number of parallel tasks.

"This is essentially where High Throughput Computing (HTC) comes into play as opposed to High Performance Computing (HPC)," explains Bell.

"That is to say, a lot of compute tasks have to be carried out, but they are all executed independently," he added.
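In code terms, an embarrassingly parallel workload is just a pool of tasks with no shared state, which can be farmed out with no coordination beyond collecting the results. The Python sketch below uses a hypothetical stand-in workload (not CERN's code) to show the pattern.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_event(seed):
    """Stand-in for one independent compute task -- a hypothetical
    deterministic calculation, not a real physics simulation."""
    return sum((seed * i) % 7 for i in range(1000))

# No task depends on any other, so they can all run concurrently --
# the defining property of an embarrassingly parallel (HTC) workload.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(simulate_event, range(8)))

# Same answers regardless of execution order, as with HTC batch jobs.
assert results == [simulate_event(s) for s in range(8)]
```

Scaling this up is mostly a scheduling problem, which is why cloud infrastructure such as OpenStack suits it so well.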

So is CERN worth learning from?

From a developer perspective, CERN uses OpenStack and codes some of the open IP into its own IT stack ... the end result is that some of the resultant IP is contributed back to the community but some of it isn't.

"Everything [code wise] that we identify that is of interest to the community we contribute back -- that which we deem to not be of interest we make available, but do not actually contribute back," explains Bell.

The code that runs the Large Hadron Collider is therefore available and this gives rise to our acronym of the day...

BYO-LHC: Bring Your Own Large Hadron Collider

The Organisation européenne pour la recherche nucléaire is known as CERN.

Microsoft gets it so right... and so wrong


The programming-specific press is in something of a maelstrom over the highs and lows of what Microsoft does so (arguably) well... and what the company still gets (arguably, arguably) just a bit wrong.

The firm's "oh alright then we like open source after all if everyone else does" stance was perhaps embarrassingly slipshod to start with.

But initial cheesiness has arguably been all but eradicated by:

a) The leadership of the (arguably) very excellent Jean Paoli as president of the Microsoft Open Technologies initiative.
b) Microsoft's open embrace of open cloud.
c) Microsoft's serious approach to Hadoop, Drupal, Python and Node.js.
d) Microsoft open sourcing more of its .Net developer framework and a wider open sourcing across its programming languages overall.

On point d) in particular -- the crème de la crème of the planet's software application development journalist community were invited to Redmond in April for the Microsoft Build 2014 conference to hear news of the company partnering with Xamarin, a move set to create a new .Net Foundation with a more open source outlook overall.

Yay for Redmond

Dr Dobb's Journal meanwhile was full of plaudits for Microsoft this month with an editorial leader entitled Redmond's Remarkable Reversal.

Editor Andrew Binstock writes, "Many factors have contributed to Redmond's surprising success, but two in particular stand out: Microsoft embraced the cloud early and vocally, and it began delivering new software releases much more quickly."

He continues, still positive and upbeat, "In Visual Studio for the cloud, Microsoft is putting itself on the cutting edge of development by inviting programmers to explore a completely new way of coding and product delivery."

... and yet so wrong at the same time

So Microsoft is wonderful after all then?

Even Windows 8 is going to get a start button back (another treat the lucky Build press got to hear about), a process that Tim Anderson called part of a "painful transition" ...

... although this (as Anderson points out) will still not fix the drought and famine the world currently experiences for full 'Metro'-style applications.

Boo hiss, nasty Redmond

Mike James on i-programmer isn't happy either.

James bemoans the reticence, caginess and, OK then, downright stubbornness Microsoft has exhibited over its refusal to open source VB6.

VB6 (Visual Basic 6) is a programming language and IDE (Integrated Development Environment) whose lineage dates back to the heady CD-ROM-centric days of 1991.

Today the Visual Basic 6.0 Resource Center is more focused on selling migration and "upgrades from" paths than championing that which was once much loved.

James bemoans the fact that Microsoft "killed" VB6 but now refuses to open source the language despite the firm's "warmth" for open source.

"Now that they no longer have any interest in it one way or another, and with a new commitment to open source, why not let the community have VB6?"

He continues, "You could say that it occupied the position that JavaScript does now - misunderstood, misused and commonly thought to be ugly and inadequate. However, used correctly it could be simple, clean and elegant. After all it was the driving force behind VB .NET which took the language in a different direction while trying to maintain its easy-to-use aspects."

Do programmers still really love Microsoft then?

It's hard to say -- your erstwhile reporter last attended a Microsoft developer convention in 2005 and the crowd went wild for Vista.

Who knows?

Maybe they were still whooping over the Bill Gates & Napoleon Dynamite video that was shown on the day.

Bill Gates Goes to School with Napoleon Dynamite from Angela Marie Baxley Glass on Vimeo.

How websites are smarter in the background than you thought


Basle-based open source web content management system (WCMS) company Magnolia International has released the 5.3 version of its core product with functionality now delivered through a series of task-focused apps.

For software developers, this latest version opens up the firm's overall Magnolia App framework to integrate third-party software and devices and enterprise data sources.

More targeted web experiences


Magnolia Co-founder Boris Kraft claims that Magnolia 5.3 was inspired by the changing needs of customers to provide more targeted web experiences that use existing repositories of enterprise data.

"The innovation in this release has been driven by our customers and their need to track, enhance and organise every online user interaction through a simple-to-use content management system," he said.

Personalised and targeted

The personalisation tools included with Magnolia CMS 5.3 allow users to segment their online audience so that content can be personalised and targeted to the needs of (and driven by the behaviour of) each individual site visitor.

Personalised experiences are created and managed with a suite of apps -- there are individual apps to create content variations and for marketing segmentation of visitor groups, developing personas and previewing content for different personas.

For developers, the modular system simplifies integration of external software and services, allowing them to hook into different stages of the personalisation process.
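The idea of hooking into stages of a personalisation process can be sketched generically. The stage names, hook mechanism and content variants below are entirely hypothetical -- this is not Magnolia's actual API, just an illustration of the pattern.

```python
# Hypothetical staged personalisation pipeline: external code registers
# hooks at named stages (segmentation, then variant selection).
hooks = {"segment": [], "select_variant": []}

def register(stage):
    def wrap(fn):
        hooks[stage].append(fn)
        return fn
    return wrap

@register("segment")
def by_locale(visitor):
    # Stage 1: classify the visitor into an audience segment.
    return "domestic" if visitor.get("country") == "GB" else "international"

@register("select_variant")
def variant_for(segment):
    # Stage 2: map the segment to a content variant.
    return {"domestic": "uk-homepage", "international": "global-homepage"}[segment]

def personalise(visitor):
    segment = hooks["segment"][0](visitor)
    return hooks["select_variant"][0](segment)

print(personalise({"country": "GB"}))  # -> uk-homepage
```

A third-party integration would then add its own hook at whichever stage it cares about, without touching the rest of the pipeline.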

Social listening tools


Magnolia Tag Manager allows marketing teams to add, manage and remove tags for web analytics, marketing automation and social listening tools within a single interface.

Magnolia 5.3 also introduces an improved DAM API making it possible to plug in external asset providers such as Flickr, YouTube or a file system.

Magnolia now provides a centralised repository for Magnolia approved applications, modules, source code and partner contributions -- and the Magnolia AppFinder is available to all customers and community contributors.

Smartphone simplicity

Magnolia CEO Pascal Mangold explains that Magnolia is an open Java CMS that delivers (what he calls) "smartphone simplicity" on an enterprise-scale.

"Magnolia CMS allows organisations to orchestrate online services, sales and marketing across all digital channels, maximising the impact of every touchpoint," he said.

Defence-grade fingerprint security on KNOX for Android mobiles


Samsung Electronics and Google have teamed up to confirm that part of the Samsung KNOX technology will be integrated into the next version of Android.

The firm's KNOX Workspace aims to provide "hardware and software integrated" security for mobile devices.


KNOX concentrates on multi-layered protection (that means from the device down to the kernel) with two-factor biometric authentication (that means numeric passwords plus fingerprint detection) for device access.

An enhanced element of the KNOX framework and Microsoft Workplace now join to provide users with a secure channel to corporate resources from mobile devices.

IT administrators will be able to use a separate container to manage and secure business data.

Samsung says that developers can extend their potential target market to a broader Android community with minimal implementation effort.

"Samsung has been pioneering to bring Android to the enterprise. We are grateful for their contribution to the Android open source project," said Hiroshi Lockheimer, VP of engineering, Android. "Jointly we are bringing enterprise-grade security and management capabilities to all manufacturers participating in the Android ecosystem."

Samsung KNOX is currently the only Android offering to provide defence-grade, government-certified mobile security complying with key US Government and Department of Defense (DoD) initiatives and other standards for mobile device security.

Samsung also offers a comprehensive KNOX management and application store service. In addition to the Samsung KNOX components found in this next generation Android platform, Samsung will keep developing specialised proprietary services such as KNOX EMM and KNOX Marketplace.

Linus Torvalds' open truths for developers (video)


Linus Torvalds gives an interview to the IEEE Computer Society to explain where he sits today in terms of his thoughts on Linux.

Torvalds is as humble and genuine as you might expect.

He explains that Linux "just did it differently" and explains how happy he is about "leaving something behind" that could change computing for everyone forever.

"Linux made it clear how well open source works, not just from a technical standpoint, but also from a business, commercial, and community standpoint," says Torvalds.

He also goes on to explain how happy he is about the success of the Git source control system.

Torvalds has some ideas for what is going to happen in the future and he is a quietly inspirational man.

It's 8:47 long and worth the investment.

Cisco open sources cloud-centric block ciphers


Cisco is open sourcing block cipher technology to, the company hopes, better protect and control traffic privacy in cloud computing systems.

What is block cipher technology?


A block cipher is a method of encrypting text (to produce ciphertext) in which a cryptographic key and algorithm are applied to a block of data (for example, 64 contiguous bits) at once as a group rather than to one bit at a time.

Flexible Naor & Reingold

Cisco is creating the Flexible Naor and Reingold (FNR) encryption scheme, which will exist under the open source LGPLv2 licence.

Cisco software engineer Sashank Dara has said that FNR is an experimental small domain block cipher for encrypting objects (< 128 bits) like IPv4 addresses, MAC addresses, arbitrary strings, etc. while preserving their input lengths.

"The demo application written is for encryption of IPv4 addresses (the cipher preserves their formats as well if needed). When FNR is used in ECB mode, it realizes a deterministic encryption scheme. Like all deterministic encryption methods, this does not provide semantic security, but determinism is needed in situations where anonymizing telemetry and log data (especially in cloud based network monitoring scenarios) is necessary," he said, in a Cisco blog post.

Importantly, this is still an experimental block cipher and is not yet ready for production use.
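Dara's point about ECB-mode determinism is easy to demonstrate with a toy cipher. The sketch below is a deliberately insecure 32-bit Feistel-style construction of our own invention (not FNR itself): identical plaintext blocks always encrypt to identical ciphertext blocks, which is exactly what makes deterministic anonymisation of telemetry possible -- and semantic security impossible.

```python
def toy_encrypt_block(block, key, rounds=4):
    """Toy 32-bit Feistel-style block cipher -- illustration only, NOT secure."""
    left, right = block >> 16, block & 0xFFFF
    for r in range(rounds):
        left, right = right, left ^ (((right * key) + r) & 0xFFFF)
    return (left << 16) | right

def ecb_encrypt(blocks, key):
    # ECB mode: each block is encrypted independently and deterministically.
    return [toy_encrypt_block(b, key) for b in blocks]

# Encrypt the same 32-bit value twice (192.168.0.1 as an integer).
a = ecb_encrypt([0xC0A80001, 0xC0A80001], key=0x1337)

# Equal plaintext blocks yield equal ciphertext blocks: equality leaks
# through ECB, which is the trade-off deterministic encryption accepts.
assert a[0] == a[1]
```

Note that the output is still 32 bits wide, echoing the length-preserving property FNR aims for with small domains such as IPv4 addresses.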

Google open sources PDF rendering


Google has taken its PDFium software library forward into open source project status.


PDFium is an open-source PDF "rendering engine" that will be folded into the Chrome browser.

"For contributing code, we will follow Chromium's process as much as possible," says Google.

Chromium is...

The Chromium projects include Chromium and Chromium OS, the open-source projects behind the Google Chrome browser and Google Chrome OS, respectively.

PDF rendering is the term used to describe the parsing and drawing of PDF content directly on-screen and (usually) for onward use as a saved file, for despatch to a mobile device or for printing.


Foxit Software

Google has drawn on the PDF rendering expertise at Foxit Software to develop its own rendering engine in this case.

Chromium project evangelist Francois Beaufort has said that, "By open-sourcing Foxit's PDF technology, the Chromium team gives to developers a robust and reliable PDF library to view, search, print, and form fill PDF files."

The code for this PDF tooling was (in parts) previously closed source and proprietary, so this open sourcing brings the total pedigree of the Chrome project into cleaner space.

There is no known monetary benefit to Google for taking this action.

Chrome senior software engineer Peter Kasting has explained that Google has long tried to ensure as much of Chrome as possible is available openly as Chromium.

Flash -- ahhhh!

A number of elements (like the Flash and PDF plugins) stood out here as Google did not have a license to release them.

But now, with PDFium, says Kasting, one of those major moving parts is open as well.

This is great for a lot of reasons in Google's view.

"It reduces the number of closed pieces of Chrome, and thus the surface area for which people can be suspicious that we're doing something shady. It makes a high-quality PDF plugin available to users who only want an open-source product and were using Chromium as a result. It is almost certainly the highest-quality PDF engine available in the open-source world, and can now serve as a reference for other projects, or be included in other browsers based on Chromium or other open-source projects entirely," said Kasting.

The open cloud revolution is crawling like a snail


A walk around London's Cloud World Forum exhibition in Olympia this week provides the casual observer with a number of things:

1. unlimited free 'corporate candy store' sweets
2. opportunities to win a 'GoPro', always the latest giveaway favourite
3. a view of the most proactive cloud players today (Amazon doesn't bother turning up)
4. a realisation that Olympia is tiny and just not Las Vegas or Barcelona etc.
5. difficult-to-find toilets
6. a slightly uncomfortable encounter with the models from the 'Dream Agency' when you want to get a serious product guru interview
7. some insight into cloud standards and open source cloud penetration

Numbers 1 and 7 are clearly the most important items on the above list if you need some guidance.

So what of open source?

Your erstwhile blogger cum journalist cum analyst (that's me) spent an afternoon asking somewhere around seven interviewees where they stood on cloud standards.


I spoke to IBM, Progress Software, Telstra, AppGyver & raw engineering, HP, ServiceSource and anyone that would provide me with cold water.

The overall feeling among exhibitors is that standards are not as important as Line of Business objectives and core business-centric operational objectives.

Despite the prevalence and growth of OpenStack, there is a deeper and more entrenched legacy IT infrastructure that pervades the business landscape.

The big truth here is...

The relationship between incumbent vendors' installed data management products and established bureaucratic business streams will not allow new open cloud structures to assume power overnight.

Rather, it will be a case of open cloud "supplementing rather than supplanting" i.e. open cloud will (relatively speaking) move at a snail's pace.

This is a view backed up by the more realistic cloud vendors at whatever level of the industry and IT stack they choose to ply their wares.

Red Hat isn't doing everything right, but CEO Jim Whitehurst's comprehension of this space is better than some -- and he's a guy that used to run an airline business (Delta) so he knows a thing or two about shifting consumer buying habits.

This is merely a blog to express gut look and feel; this subject is wide open.

Red Hat 7, the opposite of Microsoft & OpenStack


Red Hat has deliberately slowed the pace of its flagship OS release schedule in a bid to lower operational costs and stop “driving IT guys crazy” with the need to update deployments.


At a time when Microsoft’s developer release cycle is markedly rapid (and arguably quite impressive), Red Hat Enterprise Linux (RHEL) 7 development is now strategically targeted to be a “consumable OS” full of simplicity.

“It’s the opposite of OpenStack, where new releases come to market every six months,” said Brian Stevens, CTO of Red Hat Inc., open-source OS developer. “Only hypercritical changes merit new version numbers. Otherwise, you’d drive IT guys crazy updating their deployments.”

The company’s spin machine says that RHEL 7 “pushes the operating system” beyond today’s position as a commodity platform.

How big is the datacentre spectrum?

What this suggests is that Red Hat wants us to think about RHEL as a power source for the “whole spectrum” of enterprise IT:

• Bare metal servers,
• Cloud Services,
• Application Containers,
• Virtual Machines,
• Infrastructure-as-a-Service (IaaS) and,
• Platform-as-a-Service (PaaS).

These all converge in the modern heterogeneous datacentre environment to meet constantly changing business needs.

RHEL features enhanced application development and isolation through Linux Containers, including Docker, across physical, virtual, and cloud deployments as well as development, test and production environments.

Cross-realm trust

The firm also talks about “cross-realm trust” to easily enable secure access for Microsoft Active Directory users across Microsoft Windows and Red Hat Enterprise Linux domains, providing the flexibility for Red Hat Enterprise Linux to co-exist within heterogeneous datacenters.

Also a key feature here, Red Hat has included secure application runtimes and development and troubleshooting tools, all integrated into the platform and container-ready.


According to Jay Lyman, senior analyst, 451 Research, “Red Hat Enterprise Linux 7 helps to introduce newer technology, such as Linux Containers and related Docker software, to large enterprise environments along with the stability and certifications that enterprises demand. This is critical given the growing number of organizations mixing new technology and methodology - such as cloud, agile and DevOps approaches - with their existing infrastructure, processes and governance.”

What do firms really mean by Hadoop leverage?

bridgwatera | No Comments
| More

Does a day go past without a Hadoop update right now? -- clearly not.

But why should this be so?

Popular wisdom points to the problems associated with "complexity of deployment and management" of environments on this open source framework for big data storage and large-scale dataset processing.

Teradata is the latest Hadoop soup of the day.

The firm's eponymously named Teradata Portfolio for Hadoop 2 product purports to help firms (and we quote) with "leveraging diverse data stored in Hadoop".

As many times as we try to suggest that the industry refrain from using (sorry, leveraging) this arguably now worn and hackneyed generic term, vendors still insist that "leverage" is the best way to express what their technology does.

What Teradata is too shy to say is that bringing Hadoop in is hard and so the firm has produced software with components engineered to simplify the operation of Hadoop and accelerate time-to-production in environments with multiple disconnected technologies.

Still too many words?

OK, fine: look at the product specs. This is what the software actually does when it comes face-to-face with Hadoop:

• High Availability and Disaster Recovery
• Performance and Scalability
• Data Transformation and Integration
• Data Security
• Setup and Installation
• Monitoring and Manageability

Company president Scott Gnau has said, "The Teradata Portfolio for Hadoop 2 supports the fastest path to business value by leveraging the 'store-everything approach' of the data lake."

... and he used One Leverage Per Sentence (OLPS) there.


What he wanted to say (and did so ultimately) is that the firm aims to take the complexity and risk out of a Hadoop technology deployment allowing organisations to focus on high-value activities.

Teradata Open Distribution for Hadoop leverages (ouch!) core Apache Hadoop 2 components built by Hortonworks, including Apache Hadoop YARN, a next-generation framework for Hadoop data processing.

"Teradata Appliance for Hadoop - The enhanced Teradata Appliance for Hadoop, with Teradata Open Distribution for Hadoop, is the first to run on the Hortonworks Data Platform 2.1. The appliance is delivered ready-to-run and optimised for enterprise-class data storage and management. The appliance can scale from 144 terabytes to over 98 petabytes of data to meet the customers' growth needs. The Teradata Appliance for Hadoop offers fast performance with the latest generation of Intel technology, and the combination of InfiniBand fabric-based hardware and Teradata BYNET V5 software with scaling and failover capability," said the company, in a prepared press statement.

Teradata offers additional consulting services to back up its total technology proposition with an "identify and advise" plus "architect and implement" approach.

The Teradata Portfolio for Hadoop 2 will be available in the third quarter of 2014 with partner support -- customers can leverage it at that time.

Using Sqoop to ploop to Hadoop

bridgwatera

Syncsort has enhanced its DMX-h Hadoop ETL software.

So what?

Extract, Transform, Load (ETL) refers to three separate functions combined into a single programming tool.

Getting data from enterprise data warehouses and legacy systems (including mainframes) into Hadoop is clearly a key implementation today for big data ETL jobs.
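As a toy illustration of those three stages (not Syncsort's implementation; the field names and data are invented), an ETL pass from a CSV-style legacy extract to Hadoop-friendly JSON lines might look like:

```python
import csv
import io
import json

# An in-memory CSV stands in for the "legacy" source system.
legacy_csv = "id,amount\n1,10.50\n2,99.00\n"

def extract(source):
    """Extract: read rows out of the source system."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(rows):
    """Transform: normalise types and derive fields."""
    return [{"id": int(r["id"]),
             "amount_cents": int(float(r["amount"]) * 100)}
            for r in rows]

def load(rows):
    """Load: serialise to JSON lines, ready to land in HDFS."""
    return "\n".join(json.dumps(r) for r in rows)

output = load(transform(extract(legacy_csv)))
```

Real ETL tooling earns its keep on scale, scheduling and error handling rather than on this happy path, but the shape of the job is the same.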


Syncsort is also addressing growing offload demand by supporting the Sqoop Apache Hadoop initiative.

Apache Sqoop is a tool designed for transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
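A typical Sqoop transfer is driven from the command line. The sketch below assembles a plausible `sqoop import` invocation in Python rather than running it; the JDBC URL, table and HDFS target directory are hypothetical:

```python
# Sketch: a `sqoop import` command for pulling a relational table into
# HDFS, built as an argument list. Connection details are invented.
def sqoop_import_cmd(jdbc_url, table, target_dir, mappers=4):
    return [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC connection string
        "--table", table,               # source relational table
        "--target-dir", target_dir,     # destination directory in HDFS
        "--num-mappers", str(mappers),  # parallel map tasks for the transfer
    ]

cmd = sqoop_import_cmd("jdbc:mysql://db.example.com/sales",
                       "orders", "/data/raw/orders")
```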

"Many of our customers are looking to free up data warehouse capacity and reduce legacy system costs by offloading expensive data workloads and data to Apache Hadoop," said Lonne Jaffe, CEO of Syncsort.

The company also contributes to the Project SILQ Technology Preview.


This is a "data warehouse offload technology" for analysing SQL scripts and providing a detailed, graphical visualisation of the entire data flow and best practices on how to develop the corresponding DMX-h jobs in Hadoop.

A final note, the company is also focused on Tableau Integration: Syncsort's Hadoop ETL now allows users to create Tableau data extracts that blend data from a wide variety of sources including data warehouses, mainframe and other legacy systems, facilitating advanced analytics and visualisation in Tableau.

NOTE: our story title here should really be: Using Sqoop to perform ETL data warehouse offload technology functions to Hadoop, but we wanted to coin ploop as a new shorthand, so go figure.

"In addition to the product enhancements, Syncsort continues to actively invest in the Apache Hadoop open source community including new open source initiatives that help simplify and accelerate offload, and enhance performance and efficiency of the workloads in Hadoop. Syncsort's new initiative extends Sqoop, a framework to move data between relational databases and Hadoop," said the company, in a prepared press statement.

Syncsort is open sourcing to the Sqoop project the ability to move multiple mainframe data sets in parallel to Hadoop and store them in Sqoop supported file formats.

The open source contribution also opens the interface to allow anyone to extend the support for more complex mainframe data files. The upcoming release of DMX-h uses this interface, providing a plug-in to move all mainframe data formats, including binary sequential data with COBOL copybook metadata and VSAM, to Hadoop.
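To give a flavour of what "COBOL copybook metadata" means in practice, here is a minimal hypothetical sketch (not Syncsort's code): a two-field copybook used to slice a fixed-width mainframe record, including the EBCDIC decode step that tools in this space automate:

```python
# A tiny invented "copybook": field name and byte width, in record order,
# standing in for PIC X(6) and PIC 9(4) declarations.
COPYBOOK = [("name", 6), ("amount", 4)]

def parse_record(raw: bytes, codec="cp037"):
    """Decode one fixed-width record (cp037 is a common EBCDIC code page)."""
    text, offset, out = raw.decode(codec), 0, {}
    for field, width in COPYBOOK:
        out[field] = text[offset:offset + width].strip()
        offset += width
    out["amount"] = int(out["amount"])  # zoned numeric, simplified here
    return out

# An invented record, EBCDIC-encoded for the example.
record = "ACME  0042".encode("cp037")
parsed = parse_record(record)
```

Real copybooks bring packed decimals, REDEFINES and OCCURS clauses to the party, which is precisely why vendors sell tooling for this.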

Pentaho: don't get blinded with (data) science

bridgwatera

At the Hadoop Summit in San Jose this week... open source Business Intelligence company Pentaho is announcing what it calls 'Data Science Packs' for developers and data scientists.

The Data Science Pack aims to boost productivity by executing advanced descriptive statistics and machine learning algorithms (at scale) inside data flow transformations.

What are descriptive statistics?

According to a University of Leicester paper, "Descriptive statistics are used simply to describe the sample you are concerned with -- they are used in the first instance to get a feel for the data, in the second for use in the statistical tests themselves... and in the third to indicate the error associated with results and graphical output."
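In code terms, that "first instance" feel for the data is a few lines of Python's standard library; the sample values below are invented:

```python
import statistics

# The Leicester definition in miniature: describe a small sample before
# any modelling happens.
sample = [12.1, 9.8, 11.4, 10.2, 13.0, 10.9]

described = {
    "n": len(sample),
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "stdev": statistics.stdev(sample),  # sample standard deviation
}
```

The standard deviation is the "error associated with results" part of the definition: it tells you how far a typical observation sits from the mean.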

Pentaho says that the Packs streamline the hard, time-consuming process of using R and Weka to prepare, clean and orchestrate data in Pentaho Data Integration (PDI) for analysis.

So what? - and anyway, what are R and Weka?

The "R" programming language and "Weka" machine learning algorithms for predictive analytics are two of the most popular data science tools out there (source: O'Reilly Data Scientist Salary Survey).

Unfortunately, they require specialist technical skills that many companies outside Silicon Valley don't have in-house, and they are time-consuming to use.

Ventana Research just estimated that a whopping 60-80 percent of time spent on a big data analytics project is spent on preparing data using tools like R and Weka.


According to Pentaho, "By slashing that time, those responsible for data analysis can devote more time to the 'value added' stuff and less time on boring (but important) administrative hygiene tasks and just get things done a lot faster."

But why do this anyway?

(1.) Mainly because Pentaho tells us that its customers have been asking for this.
(2.) The company says it isn't focused on 'eye candy', so delivering these kinds of tools is part of Pentaho's strategy to make the hardest, least sexy and most important aspects of data analytics fast and easy.

The Data Science Pack is, then, essentially, a toolkit to operationalise the commonly used R and Weka technologies.

According to the Ventana Research Big Data Analytics Benchmark Research, the top two time-consuming big data tasks are solving data quality and consistency issues (46%) and preparing data for integration (52%).
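Those two chores are mundane to describe and mundane in code. A minimal sketch, with invented records, of fixing consistency (one canonical form) and quality (duplicates) before integration:

```python
# Invented customer records with the usual sins: stray whitespace,
# inconsistent case and a duplicate.
raw = [
    {"email": " Alice@Example.COM ", "region": "emea"},
    {"email": "alice@example.com",   "region": "EMEA"},
    {"email": "bob@example.com",     "region": "apac"},
]

def clean(records):
    seen, out = set(), []
    for r in records:
        email = r["email"].strip().lower()   # consistency: canonical form
        if email in seen:                    # quality: drop duplicates
            continue
        seen.add(email)
        out.append({"email": email, "region": r["region"].upper()})
    return out

cleaned = clean(raw)
```

Multiply that by a few hundred columns and sources and the 46% and 52% figures stop looking surprising.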


"Having built blueprints for the four most popular big data use cases, we know advanced and predictive analytics are core ingredients for success," said Christopher Dziekan, EVP and chief product officer at Pentaho.

"The highest value of insight comes from having foresight blended with hindsight to drive insight and action. The Pentaho Data Science Pack allows organizations to apply their deep domain expertise and improve their customer analytics and predictions," he added.

Ubuntu beats Microsoft & Red Hat in OpenStack OS race

bridgwatera

OpenStack has been in the news a lot... well, we have just had the OpenStack summit in Atlanta after all.

Many say that the "problem" with OpenStack is that it is still regarded as a "moving target" and work in progress, augmenting and updating as it does twice a year.

So what has bedded in and established itself then?

The OpenStack Foundation recently carried out a survey of OpenStack users and found that Ubuntu is the most widely used operating system with OpenStack.

Red Hat Enterprise Linux is NOT in first place; it comes third, behind CentOS.

Fourth is Windows, then Debian, then SUSE Linux Enterprise.

OpenStack collected 1,780 survey responses, representing 506 deployments of OpenStack across 512 companies.


Do you know your OpenStack history?

bridgwatera

True open source cloud aficionados know their subject matter back to front; they know the history of cloud, the history of open cloud standards and (most of all) they know the history of OpenStack.


For those that need reminding, OpenStack is an open source Infrastructure-as-a-Service (IaaS) collaborative project dedicated to supporting interoperability between cloud services.

In historical terms, NASA worked with Rackspace to develop OpenStack back in 2010.

Rackspace subsequently embraced the need to drive the open nature of cloud to such a degree that the firm now refers to itself as "the open cloud company" in its tagline.

OpenStack officially became an independent non-profit organisation in September 2012.

For more history, you will want to follow the example set at the recent OpenStack Summit in Atlanta, where attendees were asked to name all 10 releases of OpenStack itself.

The following answers are quoted almost directly from the site's feature entitled Can you name all ten OpenStack releases?

• Austin (Oct 2010): Named after the location of the first design summit, in Texas.
• Bexar (Feb 2011): The county containing San Antonio, Texas.
• Cactus (Apr 2011): Cactus is another city in Texas.
• Diablo (Sept 2011): Diablo is located near Santa Clara, CA.
• Essex (Apr 2012): Essex is located near Boston, MA.
• Folsom (Sept 2012): Folsom is located near San Francisco, CA.
• Grizzly (Apr 2013): Grizzly is a symbol of California, where the summit was held.
• Havana (Oct 2013): Havana is located near Portland, Oregon.
• Icehouse (Apr 2014): There is a street in Hong Kong named Ice House.
• Juno (Sept 2014): Juno is located near Atlanta, Georgia.

OpenStack is committed to an open design and development process.

The community operates around a six-month, time-based release cycle with frequent development milestones. During the planning phase of each release, the community gathers for a Design Summit to facilitate live developer working sessions and assemble the roadmap.

How to make Hadoop logfile analysis fun & interesting

bridgwatera

The heady world of server log analysis kicks up another gear this week with new products proffered forth from XpoLog Ltd, the company that says it "invented" augmented search for IT log analysis.

What's the augmentation and why is it interesting?

We're glad you asked -- this is a process by which DevOps professionals can discover hidden:

• errors,
• anomalies,
• problems and,
• patterns...

... that may be hiding inside IT log data.

What is IT log data or a logfile?

This is a file that records "events" occurring throughout an operating system or software application (in the case of XpoLog this software is focused on open source Hadoop) and it can also incorporate message information between users in a network -- a list of other log types can be found here if you need bedtime reading.
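Turning such a file into countable "events" is the first step of any log analysis. A minimal sketch, with invented log lines and an invented format:

```python
import re
from collections import Counter

# Four invented log lines in a simple "timestamp level message" format.
LOG = """\
2014-06-24 12:01:03 INFO  job started
2014-06-24 12:01:09 ERROR task failed on node-7
2014-06-24 12:01:12 INFO  retrying
2014-06-24 12:01:15 ERROR task failed on node-7
"""

# Capture timestamp, severity level and message from each line.
LINE = re.compile(r"^(\S+ \S+) (\w+)\s+(.*)$")

events = [LINE.match(line).groups() for line in LOG.splitlines()]
severity_counts = Counter(level for _, level, _ in events)
```

Everything a log analysis product does, augmented or otherwise, is built on top of this kind of parse.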

Augmented auto-detected intelligence


XpoLog is aiming to bring more to the party than plain old log searches and now offers what it calls search with "augmented auto-detected intelligence" based on user search context.

We digress, back to the news... XpoLog Augmented Search 5.0 brings XpoLog's troubleshooting capabilities to the Hadoop platform.

The company states the obvious but still pertinent factor at play here, i.e. that testing applications on Hadoop (a large-scale, distributed data processing platform) isn't a trivial task.

Pattern and anomaly detection

"XpoLog adds intelligence to log file search context with semantic analysis, and pattern and anomaly detection (to uncover insights and trends into application problems, systems, and user behaviour)," said XpoLog VP Solutions Omry Koschitzky.

"This helps users analyze problems within the Hadoop infrastructure and applications that run on the platform. It offers visibility into the distributed architecture, automatically triaging issues and errors for severity, and presenting results in a dashboard interface," he added.
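One simple flavour of the pattern detection being described (a sketch, not XpoLog's algorithm; the messages are invented) is to mask the variable parts of log messages into templates and surface the templates that dominate:

```python
import re
from collections import Counter

# Invented log messages: three instances of one underlying problem,
# differing only in their variable parts, plus one unrelated message.
messages = [
    "task failed on node-7",
    "task failed on node-12",
    "task failed on node-7",
    "connection reset by 10.0.0.5",
]

def template(msg):
    """Collapse numeric variables (IDs, addresses) into a placeholder."""
    return re.sub(r"\d+(\.\d+)*", "<N>", msg)

patterns = Counter(template(m) for m in messages)
top_pattern, count = patterns.most_common(1)[0]
```

Grouping thousands of superficially distinct lines into a handful of templates like this is what lets a dashboard triage "one problem seen 3,000 times" rather than 3,000 problems.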

The company has made the XpoLog platform free for processing up to 1 gigabyte of log data per day.

More fun and interesting than you thought?
