
Moving from the mainframe to a distributed SOA is hotel
group Starwood’s overall IT strategy. Helen Beckett
reports on how installing a middle-tier cache is playing a key
role.
It is hard to put a mainframe out to pasture, however dated the
technology, when it continues to provide reliability and high
performance. In today’s hard-nosed business climate, this has
nothing to do with sentimentality and everything to do with finding a
successor that can do the job as well.
Companies such as hotel group Starwood, which owns brands
including Le Meridien and Sheraton, have discovered that
implementing a middle-tier cache makes a distributed platform and
high performance a possibility. By putting the data closer to the
application, a cache can boost the number of transactions achieved
on smaller servers.
Starwood realised it had reached crunch point with its IBM MVS
mainframe when it became difficult to source skills. “It is hard
sometimes to find experienced programmers who know Cobol,” says
Song Park, director of pricing and availability technologies at
Starwood.
Being sufficiently agile to scale to meet new business demand
and to develop applications quickly that could bolt onto the
mainframe were also key drivers. The idea behind using smaller
servers was to gain the capability to scale in smaller increments
to cope with different spikes of traffic.
For these reasons, Starwood decided that its long-term strategy
had to be to “get off the mainframe”. The central reservation
system was the biggest and most commercially important mainframe
application, and Starwood decided to tackle this first. Park was
given the task of developing the pricing and availability function
of the legacy central reservation system as a service.
The project was part of an overall strategy to replace the
mainframe platform with a distributed, service-oriented
architecture. Migrating to this model would enable greater
flexibility to respond to business changes. “If the pricing model
changes, for example, we can respond to that dynamically in a way
you could not with Cobol,” says Vanessa Lapins, senior director of
emerging systems development at Starwood.
Starwood considered two options: migrating data off the
mainframe DB2 database onto a distributed platform, or keeping data
on the mainframe and putting a layer of middleware in front to
handle the more complex enquiries. “Long term, we wanted to be off
the mainframe and so we decided to bite the bullet,” says Park.
The firm installed an Oracle database and built an application
around it to serve its sales channel partners. Starwood takes
business from multiple channels, including business-to-consumer
sites such as Travelocity and Expedia, Starwood’s own site, and a
host of non-affiliated agents.
The lion’s share of business comes from the global distribution
system (GDS), powered by GDS suppliers Amadeus and Sabre.
Connection to the hub requires suppliers to meet stringent service
level agreements, and to process hundreds of transactions a
second.
A mid-tier web service was designed to enable different channels
to interrogate the booking data in any way required. “Everyone
wants to come in via web services and XML and to book using their
own particular format. This might be to satisfy a particular
clientele or to calculate prices differently to inform internal
revenue teams,” says Park.
The pilot was not an unqualified success, however. An ability to
scale is crucial to cope with the so-called “tsunami” effect, where
a marketing promotion may bring in a huge wave of traffic. But when
Starwood simulated loads in a test environment, the Oracle database
only reached 300 transactions per minute on two servers.
“The only way the Oracle database could have achieved the
required throughput and performance would have been to throw vast
amounts of hardware at it,” says Park. Additional hardware costs
and licensing fees were not an option, but gaining a throughput at
least equal to the IBM predecessor remained a prerequisite.
One part of the pilot that had worked unexpectedly well was the
caching component, and this revelation provided the way forward.
“We had custom-built a cache that could serve second- and
third-time requests directly out of the Oracle database. We found
that the response times were really incredible – less than 100
milliseconds serving hundreds of requests concurrently per second
on a four-CPU machine.”
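The behaviour Park describes, where repeat requests are answered from memory instead of going back to the database each time, resembles what is commonly called a read-through cache. The following is a minimal sketch in Python, not Starwood's code: `ReadThroughCache`, `load_from_database`, the key shape and the sample values are all hypothetical, and the loader simply stands in for a slow relational query.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (hotel code, stay date): a hypothetical key shape


class ReadThroughCache:
    """Answer repeat requests from memory; call the loader only on a miss."""

    def __init__(self, loader: Callable[[Key], dict]) -> None:
        self._loader = loader
        self._entries: Dict[Key, dict] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Key) -> dict:
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        # First request for this key: fetch it once and remember it.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = value
        return value


database_calls = []


def load_from_database(key: Key) -> dict:
    """Hypothetical stand-in for a slow query against the relational database."""
    database_calls.append(key)
    hotel, date = key
    return {"hotel": hotel, "date": date, "rooms_available": 12, "rate": 189.0}


cache = ReadThroughCache(load_from_database)
first = cache.get(("HOTEL-A", "2024-06-01"))   # miss: goes to the database
second = cache.get(("HOTEL-A", "2024-06-01"))  # hit: served from memory
third = cache.get(("HOTEL-A", "2024-06-01"))   # hit: served from memory
```

In this sketch only the first request reaches the loader; the second and third are served from the in-memory dictionary.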
Starwood decided to build a cache as an integral part of the new
distributed architecture and looked at a variety of
technologies.
Objectstore from Progress Software was chosen because of its
ability to persist cache – keep data available in cache memory to
speed accessibility. Most other caches achieve persistence by
spilling data over onto disc. “I am sure some other
cache technologies also give you the options to offload the cache
onto disc on demand. Objectstore does this in real time, yet lets
us maintain in-memory performance numbers.”
While Starwood was busy building its data architecture, it took
comfort from the feedback of consultants that the same design
pattern had been adopted by financial traders. “Like stock traders
with the need to compute masses of finely-tuned transactions at
very high speeds, we have said, ‘Let’s bring the data as close as
possible to the application’,” says Park.
Achieving both objectives was accomplished by creating a
specialised middle tier. “We still have Oracle at the back end, but
it serves a different purpose. It is a database of records,” says
Park. Nor is Starwood planning to replace its relational database.
“We want to retain our investment in SQL people and tools.”
The Objectstore database middle tier has brought another
benefit. With its close allegiance to Java and web services
architecture, Park and his team had the opportunity to build a pure
object model with close dovetailing between query and cache.
“It means we could circumvent entirely the layer of
object-relational mapping,” says Park. Such mapping is normally needed when
an application’s built-in objects have to query a database that
stores data in relational tables. This bridge between the two
worlds still has to happen at some place in the Starwood
configuration, but crucially, not in real time.
If the mapping is done in real time it increases the complexity
tenfold, says Park. Developers need to know not only how to pull
SQL from three tables, but from a slew of other technologies too,
such as database drivers and Java Entity Beans, in order to
represent data elements from the relational database. Additional
third-party tools can help with the mapping, but this requires yet
another skillset.
Instead, Starwood’s development team built an object model in
Java and plugged it into Objectstore. The mapping piece of the data
architecture, where the object model talks to the Oracle database,
is now a relatively minor task that is done offline.
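One way to picture this split is a batch job that turns relational rows into objects ahead of time, so that the request path only ever touches objects. The sketch below is in Python for brevity (Starwood's model was built in Java) and rests on assumptions throughout: the table layout, the `Hotel` and `RoomType` classes and the sample rows are invented for illustration, and an in-memory SQLite database stands in for the Oracle back end.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RoomType:
    code: str
    rate: float
    available: int


@dataclass
class Hotel:
    code: str
    rooms: List[RoomType] = field(default_factory=list)

    def cheapest_available(self) -> RoomType:
        """Query the object model directly; no SQL on the request path."""
        candidates = [room for room in self.rooms if room.available > 0]
        return min(candidates, key=lambda room: room.rate)


def build_object_model(conn: sqlite3.Connection) -> Dict[str, Hotel]:
    """Offline step: join the relational tables once and build the objects."""
    hotels: Dict[str, Hotel] = {}
    rows = conn.execute(
        "SELECT h.code, r.code, r.rate, r.available "
        "FROM hotel h JOIN room_type r ON r.hotel_code = h.code "
        "ORDER BY h.code, r.code"
    )
    for hotel_code, room_code, rate, available in rows:
        hotel = hotels.setdefault(hotel_code, Hotel(hotel_code))
        hotel.rooms.append(RoomType(room_code, rate, available))
    return hotels


# Hypothetical relational data; SQLite stands in for the database of record.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hotel (code TEXT PRIMARY KEY);
    CREATE TABLE room_type (hotel_code TEXT, code TEXT, rate REAL, available INTEGER);
    INSERT INTO hotel VALUES ('HOTEL-A');
    INSERT INTO room_type VALUES ('HOTEL-A', 'KING', 220.0, 3);
    INSERT INTO room_type VALUES ('HOTEL-A', 'TWIN', 180.0, 0);
    INSERT INTO room_type VALUES ('HOTEL-A', 'SUITE', 410.0, 1);
    """
)
model = build_object_model(conn)  # the mapping happens here, ahead of time
conn.close()

best = model["HOTEL-A"].cheapest_available()  # request path: objects only
```

The design point is that the join runs once in `build_object_model`, while the lookup a sales channel would trigger is a plain method call on objects already held in memory.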
The biggest challenge was getting used to a different mindset.
“There has not been an enormous amount of pain, other than getting
the object model right, which took about three iterations. The big
plus is that it lets us focus on the business problem and logic.”
The main benchmarks were how naturally the cache performed, how
easy it was to query, and how easy the model was to support and
maintain.
The availability of the system has been cracked, but it is just
one piece of the puzzle, and the replacement of the mainframe
system with a distributed service-oriented architecture remains to
be implemented. However, the ambition is to join up all channels to
the new, distributed and cache-enabled reservation system by the
end of the year.
The fruits of the IT team’s efforts will be realised later this
year when Starwood hopes to capitalise on the extensibility, or
greater granularity, of the system. Multiple room details can be
held as objects within the object model and queried and booked by
the user, says Lapins. “Users will be able to compose a reservation
using a host of details beyond whether a room is non-smoking or
not.”
Why use middle-tier cache?
Explaining the purpose of middle-tier cache memory, Clive
Longbottom, service director at analyst firm Quocirca, says
Windows-based systems require screen refreshes that mainframe
“green screen” applications do not. Legacy applications are able to
survive on minimal bandwidth and mainframes can serve data rapidly
in this environment.
In a distributed configuration, it is a problem getting the data
from the disc to the CPU, and a lot can be gained from in-memory
processing. Effectively this provides a memory in silicon rather
than magnetic memory. However, this does not solve bandwidth
latency issues, and in-memory databases are not cheap.
Rather than looking for a one-off solution, companies can tune
several different technologies and achieve many small performance
improvements. The idea should be to “virtualise” – whether storage,
database or network – and to bring data closer to the memory.
Adding cache as a middle tier is a good option because it
decentralises the data and this tackles the latency problem.
The downside is that cache poses problems of data integrity. For
example, a booking transaction can be processed only to discover
that the main database sold that vacancy. And the “heartbeat”
response of requesting data confirmation from cache to database
requires additional CPU power.
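The integrity risk Longbottom describes can be illustrated with a small sketch in which reads are served from cache but every booking is confirmed against the main database. This is an assumption-laden illustration in Python rather than a description of any real system: `SystemOfRecord`, `AvailabilityCache` and the room identifier are hypothetical names.

```python
class SystemOfRecord:
    """Hypothetical stand-in for the main database holding true availability."""

    def __init__(self, rooms):
        self._rooms = dict(rooms)

    def available(self, room_id):
        return self._rooms[room_id]

    def try_book(self, room_id):
        """Decrement availability only if a room is really left."""
        if self._rooms[room_id] > 0:
            self._rooms[room_id] -= 1
            return True
        return False


class AvailabilityCache:
    """Serve lookups locally, but confirm every booking with the database."""

    def __init__(self, record):
        self._record = record
        self._local = {}

    def lookup(self, room_id):
        if room_id not in self._local:
            self._local[room_id] = self._record.available(room_id)
        return self._local[room_id]

    def book(self, room_id):
        # The confirmation round trip is the extra cost the text mentions.
        confirmed = self._record.try_book(room_id)
        # Refresh the local copy so later lookups see the true figure.
        self._local[room_id] = self._record.available(room_id)
        return confirmed


record = SystemOfRecord({"KING-ROOM": 1})
cache = AvailabilityCache(record)

seen_before = cache.lookup("KING-ROOM")         # cache believes one room is left
sold_elsewhere = record.try_book("KING-ROOM")   # another channel takes it
stale_view = cache.lookup("KING-ROOM")          # cache is now out of date
booking_confirmed = cache.book("KING-ROOM")     # database rejects the booking
seen_after = cache.lookup("KING-ROOM")          # cache has been corrected
```

Here the cached figure goes stale as soon as another channel sells the room, and only the confirmation step in `book` prevents a double booking.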
Cache, therefore, needs lots of intelligence to make it work,
for example knowing which data to store locally. You do not want to
replicate the entire database – it is self-defeating – but too many
requests back to the database create an overhead.
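The trade-off between replicating everything and going back to the database too often is often handled with a bounded cache and an eviction policy. The sketch below uses least-recently-used (LRU) eviction, which is one common policy and not something the article attributes to any particular product; `BoundedCache`, its capacity and the keys are hypothetical.

```python
from collections import OrderedDict


class BoundedCache:
    """Keep only the most recently used entries; evict the oldest when full."""

    def __init__(self, capacity, loader):
        self._capacity = capacity
        self._loader = loader
        self._entries = OrderedDict()
        self.database_requests = 0

    def get(self, key):
        if key in self._entries:
            # Mark the entry as recently used and serve it locally.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: one request back to the database.
        self.database_requests += 1
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Drop the least recently used entry to stay within capacity.
            self._entries.popitem(last=False)
        return value

    def keys(self):
        return list(self._entries)


cache = BoundedCache(capacity=2, loader=lambda key: f"availability for {key}")
for key in ["A", "B", "A", "C", "A", "B"]:
    cache.get(key)
```

With a capacity of two, the frequently requested key stays local while the others are evicted and reloaded, which shows both sides of the trade-off: a smaller footprint in exchange for some extra requests back to the database.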
The object model underpins everything, but it is still important
to get the relational model right too, otherwise cache will
struggle to get data out of it on time. The business model, object
model, cache and relational database are all interdependent. You
need a lot of skill to get it right, says Longbottom.