Case Study: US Xpress deploys hybrid big data with Informatica
US Xpress is using a suite of data tools and integration products from Informatica to provide the business with greater insight into its fleet of lorries and drivers. Cliff Saran speaks to chief technology officer, Tim Leonard, about the role of big data.
The company is using several Informatica tools including IDQ for data quality; PowerExchange, which talks to its AS/400 systems; and CEP for real-time data monitoring, among others. These are part of a project to put information management at the heart of the company's IT, which began three years ago.
Tim Leonard, chief technology officer and vice-president at US Xpress, explained: “We replaced 90 mainframe screens with three operation detail screens, so when a fleet manager comes in, he has one location to go through and has all the data at his fingertips.”
Leonard spent a number of years at Dell. He says there are analogies between logistics and PC manufacturing, but US Xpress uses deeper data analytics. "Dell ships 1,000s of PCs, while we have 1,000s of packages to deliver."
In this podcast, Tim Leonard, CTO at US Xpress, explains how the company processes and analyses big data to optimise fleet usage, reduce idle time and fuel consumption and save millions a year as a result.
When an order is dispatched, it is tracked using an in-cab system installed on a DriverTech tablet. US Xpress constantly connects to the devices to monitor the progress of the lorry. Leonard said the video camera on the device could be used to check if the driver is nodding off. There is also a speech recognition capability, which is more intuitive and easier to use for the driver, compared to the tablet's touch UI. "We never lose track of freight, and we can see how much time is left on the driver's shift."
All the data collected from the DriverTech system is analysed in real time at the US Xpress operations centre. Leonard said: "The key asset for us is the truck, and we want to keep it running." This is achieved by using geospatial data, integrated with driver data and truck telematics. By using this information, he said operations can minimise delays and ensure trucks are not left waiting when they arrive at a depot for maintenance.
US Xpress required several key data elements to monitor the amount of idle time when the truck is not on the road making a delivery. Leonard said around a terabyte of data was needed to bring down the amount of idle time. There were 20 data sets, comprising hundreds of billions of records. "Large data and small data are fused together to enable us to design algorithms that allow us to manage idle time," he explained.
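The kind of data fusion Leonard describes can be sketched in a few lines: a large telematics feed joined against a small dispatch dataset to flag engine-on, wheels-stopped time. The record layouts, sampling interval and field names below are assumptions for illustration, not US Xpress's actual code.

```python
from datetime import datetime, timedelta

# Engine telematics samples: (timestamp, engine_on, speed_mph) -- the "large" data
telematics = [
    (datetime(2012, 1, 9, 8, 0),  True, 0.0),
    (datetime(2012, 1, 9, 8, 15), True, 0.0),
    (datetime(2012, 1, 9, 8, 30), True, 55.0),
    (datetime(2012, 1, 9, 9, 0),  True, 60.0),
]

# Dispatch windows when the truck should be moving freight -- the "small" data
dispatch_windows = [(datetime(2012, 1, 9, 8, 0), datetime(2012, 1, 9, 9, 0))]

def idle_minutes(samples, windows, interval=timedelta(minutes=15)):
    """Count minutes where the engine runs but the truck is stationary
    inside a dispatch window -- fuel burned without freight moving."""
    total = timedelta()
    for ts, engine_on, speed in samples:
        in_window = any(start <= ts < end for start, end in windows)
        if in_window and engine_on and speed == 0.0:
            total += interval  # each sample stands for one sampling interval
    return total.total_seconds() / 60

print(idle_minutes(telematics, dispatch_windows))  # → 30.0
```

Two 15-minute stationary samples fall inside the dispatch window, so 30 idle minutes are flagged; at fleet scale the same join runs over billions of records.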
US Xpress built the software that runs on the in-cab DriverTech device. Along with feeding data back to operations, he said it also enables the driver to feel part of the team.
He added: "We collect a lot of data. When we bring out a new [version of the in-cab software], we monitor feedback from the drivers on social networking sites, which boosts user acceptance testing. We can get 1,000s of responses from drivers quickly." This happened recently when US Xpress updated the screen on the DriverTech system and the drivers complained that the touchscreen buttons were too small. Leonard said US Xpress was able to change the software quickly to improve the user interface for the drivers. Using a traditional approach to user acceptance testing would have taken weeks.
The company implemented a single data analytics user interface that pulls in information from multiple sources in real time. The company is using the concept of an information management flow, based on a stream of data that is analysed using complex event processing and routed simultaneously to multiple data stores. He said: “Data is constantly streaming, but we pluck out pieces of data based on a particular business event. At any given moment in time we can ask a question of the data, which is then tied to a certain event.”
For instance, a new order needs an order ID and a creation date. When the order completes, the status of the business process for that order changes. The data may then be linked to the maintenance data for a particular vehicle that was responsible for the delivery of the order.
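The order lifecycle described above can be sketched as a simple event-processing loop: business events are plucked out of the stream, open orders live in an active store, and completed orders move to history linked to the vehicle that delivered them. The event names and fields here are illustrative assumptions, not US Xpress's actual schema.

```python
# A toy event stream mixing business events with other telemetry
stream = [
    {"event": "order_created",   "order_id": "ORD-1", "ts": "2012-01-09T08:00"},
    {"event": "position_update", "truck_id": "T-42",  "ts": "2012-01-09T08:05"},
    {"event": "order_completed", "order_id": "ORD-1", "truck_id": "T-42",
     "ts": "2012-01-09T11:30"},
]

orders = {}    # active (OLTP-style) order state
history = []   # completed orders, kept for retrospective analysis

for msg in stream:
    if msg["event"] == "order_created":
        # A new order needs an order ID and a creation date
        orders[msg["order_id"]] = {"created": msg["ts"], "status": "open"}
    elif msg["event"] == "order_completed":
        # Status changes; the record is linked to the vehicle's data
        order = orders.pop(msg["order_id"])
        order.update(status="completed", completed=msg["ts"],
                     truck_id=msg["truck_id"])
        history.append(order)
    # other events (e.g. position updates) flow past this query untouched

print(history[0]["status"])  # → completed
```

The point of the pattern is that the same query interface can answer against either store, so the user sees "one-stop shopping" regardless of whether the order is active or historical.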
In this way, OLTP is used for the operational detail store for the order. But when the order has been completed, it is stored as historical content, which can be analysed retrospectively. “To a user, it does not make a difference whether the data from an OLTP transactional database is historical or active content in nature. You have one-stop shopping of information.” US Xpress has separated out the complexity of the various back-end data stores, said Leonard.
The stream of data is based on 900 data elements from tens of thousands of trucking systems - sensor data for tyre and petrol usage, engine operation, geospatial data for fleet tracking, as well as driver feedback from social media sites.
All of this data is streamed in real time and collected for historical analysis. Information is fed to the appropriate online transaction processing systems, Hadoop and data warehouses. Informatica provides a dynamic cache, which enables US Xpress to read and process data as it is streamed. This is something Leonard is keen to investigate further with products like SAP's in-memory Hana database and the forthcoming SQL Server 2012 release of Microsoft's relational database.
The middleware engine
US Xpress uses a highly heterogeneous environment with mainframe systems, AS/400s and Intel-based servers, all of which run different databases. The company has built an information management stack to tie together the different databases, based on a service oriented architecture.
Data is passed from the various data sources through an enterprise application integration layer, where it is cleansed using Informatica IDQ, which runs data quality rules. The data is then stored in three main data stores: a persistent data store, an operational detail store and an active data warehouse.
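A data quality rule of the kind a tool like IDQ applies might look like the following sketch: each record is normalised and validated before it reaches any of the three stores. The rules and record layout are invented for illustration and are not US Xpress's or Informatica's actual rules.

```python
def cleanse(record):
    """Return a cleaned copy of a record, or None if it fails validation."""
    rec = dict(record)
    # Standardisation rule: trim whitespace, normalise name casing
    rec["customer"] = rec.get("customer", "").strip().title()
    if not rec["customer"]:
        return None                 # reject: missing customer name
    if rec.get("miles", 0) < 0:
        return None                 # reject: impossible mileage
    return rec

raw = [{"customer": "  acme haulage ", "miles": 412},
       {"customer": "", "miles": 88}]
clean = [r for r in (cleanse(x) for x in raw) if r is not None]
print(clean)  # → [{'customer': 'Acme Haulage', 'miles': 412}]
```

Running the rules at the integration layer means every downstream store - persistent, operational and warehouse - receives the same cleansed view of a record.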
“All information, whether it is a customer or an order, is put onto a messaging bus, which then disseminates it to the rest of the architecture within a sub-second response time,” Leonard explained.