Developer Case Study: Elasticsearch at LateRooms.com

This is a guest post for the Computer Weekly Developer Network blog written by Andy Lowry in his capacity as development team lead at LateRooms.com — Lowry has been writing software professionally for more than 15 years, in industries including defence, scientific instrument control and travel.

At LateRooms.com we need to get search right. The search bar is the centrepiece of our site and it is the main interface for our customers to find the hotel they want. We use the Elasticsearch distributed full-text search server to solve a number of problems — while we regularly talk about how we use it for logging, we don’t often discuss how we use it for search, so here goes.

Autocomplete on Elasticsearch

Last year the autocomplete feature at LateRooms was completely rewritten. The old system was slow and the results were not great. Part of the new implementation was Elasticsearch's Completion Suggester feature, which gave LateRooms the performance it needed.

Laterooms.com, for rooms. Late ones, get it?

This allowed us to load our destination data and all our hotels into a single index. Each document in the index has three parts: a suggest field, which we use to match the input text; display text, which is what you see in the drop-down; and some metadata about the entry.
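As a rough illustration of that document layout, the sketch below builds a mapping, one autocomplete entry and a suggester request as plain Python dictionaries. The field names `display_text` and `metadata`, the suggestion name `destinations_and_hotels` and the helper functions are assumptions made for this example, and the request shape follows recent Elasticsearch releases, which may differ from the version we were running at the time.

```python
# Hypothetical mapping: "suggest" uses Elasticsearch's completion field type.
MAPPING = {
    "properties": {
        "suggest": {"type": "completion"},
        "display_text": {"type": "keyword"},
        # Metadata is only stored and returned, so it is not indexed here.
        "metadata": {"type": "object", "enabled": False},
    }
}


def build_entry(inputs, display_text, metadata):
    """Build one autocomplete document (field names are illustrative)."""
    return {
        "suggest": {"input": list(inputs)},
        "display_text": display_text,
        "metadata": dict(metadata),
    }


def build_suggest_request(typed_text, size=10):
    """Build a completion-suggester request body for the text typed so far."""
    return {
        "suggest": {
            "destinations_and_hotels": {
                "prefix": typed_text,
                "completion": {"field": "suggest", "size": size},
            }
        }
    }
```

Keeping hotels and destinations in the same index means a single request like this can return both kinds of entry, with the metadata telling the front end which is which.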

Matching is done on the suggest field. We index every permutation of the first five words of the search text. We needed to do this to allow matches where the words are out of order, so “Manchester City Centre” and “City Centre Manchester” would both match the same results. We also apply stop words for words that are common to hotel names and destinations.
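The permutation step can be sketched in a few lines of Python. This is a minimal illustration rather than our production code: the stop word list is invented for the example, and it assumes stop words are removed before the first five words are taken.

```python
from itertools import permutations

# Illustrative stop words only; the real list is not given in this post.
STOP_WORDS = {"hotel", "the", "and"}
MAX_WORDS = 5


def suggest_inputs(text, stop_words=STOP_WORDS, max_words=MAX_WORDS):
    """Return every ordering of the first `max_words` non-stop words."""
    words = [w for w in text.lower().split() if w not in stop_words]
    words = words[:max_words]
    if not words:
        return []
    # A set removes duplicate orderings when a word is repeated.
    return sorted({" ".join(p) for p in permutations(words)})
```

Three words produce six inputs, while five distinct words produce 120, which is why capping the number of words matters for index size.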

This solution resulted in an index of approximately 1GB. On our standard cluster of three machines, each with 24 cores and 80GB RAM, response times average around 15ms.

Search

Our search API powers the data on the Search Results pages of the website, our apps and a few other internal tools. The existing implementation was based around an SQL Server that had worked well in the past. However, we needed something more flexible and better targeted to address our problems. Elasticsearch provided that flexibility.

It is still a work in progress, with the current state being a hybrid of the two systems. The focus is on moving functionality to Elasticsearch, starting with whatever provides the most value.

The implementation has two main indexes, one for destinations and the other one for hotels. The hotels index includes relatively simple information about the hotel including name, address, facilities and its geo location.

Destinations

Destinations are places such as cities, towns, counties, points of interest, train stations, airports – basically anywhere someone might be looking for a hotel. Each destination includes a name, a geoshape and some metadata.

The index contains 1.7 million destinations, most of which are UK postcodes. Some are indexed as circles and others as polygons.

Sourcing the polygons is one of the biggest challenges. Freely available data sources such as OpenStreetMap and Ordnance Survey have an incredible level of detail that is not needed for our searches. Although accuracy is one of our main focal points, strange as it sounds, most of this data is too accurate for us. To minimise index size and indexing time, each polygon is simplified before it is indexed.
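One common way to reduce a boundary is the Ramer-Douglas-Peucker algorithm, sketched below. This post does not say which method we use, so treat the algorithm choice, the function names and the tolerance (expressed in the same units as the coordinates) as assumptions for illustration.

```python
import math


def _point_to_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring): distance to the point.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop points that lie within `tolerance` of the simplified outline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_to_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest point and simplify each half independently.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right
```

A larger tolerance removes more points, trading boundary detail for a smaller index and faster indexing.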

While official boundaries and borders are great for administrations they aren’t great for finding hotels. Polygons are often needed to extend well beyond the official borders. LateRooms’ home city of Manchester is a great example:

This map shows the official boundaries of Manchester and the neighbouring city of Salford.

Many of the hotels (above) near the city centre of Manchester are actually in Salford. If a user is searching for Manchester hotels they expect to see those hotels listed even though they are not technically in Manchester. This turns out to be a very common situation. We could address this with a team of cartographers and a lot of effort but this would be very expensive. So we resolve this issue by having multiple shapes for each destination and A/B testing them until we find one which works best for our customers.

Our running A/B experiments are also stored in Elasticsearch. When we want to test a new shape we add it to our experiments index, including the information about what proportion of users are included in the experiment.
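To show how a stored proportion might decide which shape a user sees, here is a minimal sketch using a hash of the user and experiment name. The experiment fields (`name`, `proportion`, `shape_id`) and the hashing approach are assumptions for this example, not a description of our actual experiments index.

```python
import hashlib


def bucket(user_id, experiment_name):
    """Map a user to a stable value in [0, 1) for one experiment."""
    key = f"{experiment_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Four bytes give an integer below 2**32, so the result is always < 1.
    return int.from_bytes(digest[:4], "big") / 2**32


def choose_shape(user_id, experiment, default_shape_id):
    """Return the experimental shape for users inside the proportion."""
    if bucket(user_id, experiment["name"]) < experiment["proportion"]:
        return experiment["shape_id"]
    return default_shape_id
```

Because the bucket depends only on the user and the experiment name, the same user keeps seeing the same shape for the lifetime of the experiment.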

Text search

Text search is a little more complicated. LateRooms tries to find an appropriate destination by matching its name against the supplied text. If one is found, its geoshape is used to query the hotel index.

If there is no matching destination, we fall back to matching the text directly against hotel names and addresses. This allows customers to find hotels by place and by name using the same search box.
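The two-step flow described above can be sketched as follows. The index names, field names (`name`, `address`, `location`, `geoshape`) and the `run_query` helper are hypothetical, and the query bodies follow the current Elasticsearch query DSL rather than whatever our production code sends.

```python
def destination_query(text):
    """Match destinations by name and keep only the best hit."""
    return {"query": {"match": {"name": text}}, "size": 1}


def hotels_in_shape_query(geoshape):
    """Find hotels whose location falls inside the destination's shape."""
    return {
        "query": {
            "bool": {
                "filter": {"geo_shape": {"location": {"shape": geoshape}}}
            }
        }
    }


def hotels_by_name_or_address_query(text):
    """Fallback: match the text against hotel names and addresses."""
    return {
        "query": {"multi_match": {"query": text, "fields": ["name", "address"]}}
    }


def search_hotels(text, run_query):
    """Search by destination first, then fall back to a direct hotel match.

    `run_query(index, body)` is a hypothetical helper that sends the body
    to the named index and returns a list of matching documents.
    """
    destinations = run_query("destinations", destination_query(text))
    if destinations:
        shape = destinations[0]["geoshape"]
        return run_query("hotels", hotels_in_shape_query(shape))
    return run_query("hotels", hotels_by_name_or_address_query(text))
```

Passing `run_query` in as a parameter keeps the sketch independent of any particular client library and makes the flow easy to test with a stub.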

The Future

Now that we have the data we need in Elasticsearch, we see a lot of other features we can develop, such as guaranteeing results even when hotels are full, allowing users to draw their own search areas and suggesting popular areas for major cities.

 
