
Natural History Museum deploys sensor network to decode urban biodiversity with AWS

The Natural History Museum has deployed a network of sensors across its newly revamped gardens, which is on course to make it one of the most intensively monitored urban spaces in the world

The grounds of the Natural History Museum (NHM) in London have undergone a complete transformation in recent years that has served to ensure visitors are fully immersed in the wonders of the natural world as soon as they step onto the site.

As visitors enter the grounds via the London Underground tunnels, which also link the NHM to the other museums and local attractions dotted along Exhibition Road, they now do so along a rock-lined walkway, populated with ferns, cycads, and dinosaur models, as part of a walkthrough of the Earth’s evolutionary story.

To the left of the building, just a stone’s throw from the hustle and bustle of South Kensington, is the NHM Nature Discovery Garden, which opened to the public in the summer of 2024.

The space is teeming with flora and fauna, which provides a “living laboratory” for the museum’s 400 scientists, who are tracking the biodiversity of the space with the help of a network of sensors dotted around the site.

“The ambition with the museum garden sensors is the creation of this ‘always-on’ outdoor laboratory for collecting and ingesting urban environmental data,” NHM technology product manager Rachel Wiles tells Computer Weekly. “At the moment, we have a corpus of somewhere around 58,000 visual observations from the museum garden, but these sensors will do things beyond that.”

The sensors will capture environmental data by tracking the temperature and humidity of the garden’s soil and waterways, for example, and record extremely detailed acoustic data.

“You might hear bird song in the garden prominently, but the sensors are in tune enough to hear the flapping of an insect wing and pieces like that,” adds Wiles. “Our goal is to combine all the sensor data that we’ll be getting with the huge corpus of visual observation data that we’ve already got to produce this research platform, which our scientists can access and use.”

Connecting the dots

The Nature Discovery Garden’s data is collected from multiple sensors that include 25 Raspberry Pi devices that are linked together by a complex data network.

“There are sensors and there are the devices, and device boxes can have sensors in, which is also where the Raspberry Pi and all the computational power for the setup is,” Ed Baker, an acoustic biology researcher at the NHM, tells Computer Weekly. “Each of the Raspberry Pi device boxes has some form of audio sensor in them, and can have a multitude of other environmental sensors [linked to] them.”

As an example, Baker cites one of the devices that is used to record pond audio, which is also linked to a series of thermometers that are at different depths of the pond to assess how differences in water temperatures might affect what grows or lives there. The acoustic sensors are also equipped with multiple microphones so that sound can be picked up from various directions.
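The device-box layout Baker describes can be pictured as a small data structure: one box, one Raspberry Pi, one or more directional microphones, plus optional environmental probes such as the pond thermometers at different depths. A minimal sketch, with all names and readings hypothetical rather than the NHM’s actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical model of one sensor device box, as Baker describes them:
# a Raspberry Pi with an audio sensor and optional environmental probes.
@dataclass
class DeviceBox:
    box_id: str
    microphones: list            # directions the microphones face
    probes: dict = field(default_factory=dict)  # probe name -> latest reading

    def reading_summary(self):
        """Collect the latest reading from every attached probe."""
        return dict(self.probes)

# The pond box: multiple mics plus thermometers at different depths,
# used to relate water temperature to what grows or lives there.
pond_box = DeviceBox(
    box_id="pond-01",
    microphones=["north", "south", "submerged"],
    probes={"temp_10cm": 14.2, "temp_50cm": 12.8, "temp_100cm": 11.5},
)

print(pond_box.reading_summary())
```

The point of the structure is that audio is the one constant per box, while environmental probes vary by location, mirroring Baker’s description.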

To help Baker and his team identify the source of the sounds being picked up, they are making use of machine learning models from the Cornell Lab of Ornithology, including bird identification app Merlin, he says.

Acoustic biology researcher Ed Baker stands in the Natural History Museum's pond. He is wearing pond waders and is holding up a cable that has been submerged in the pond to capture audio data.
Acoustic biology researcher Ed Baker works with a Raspberry Pi device placed in the NHM’s pond to gain wildlife insights through audio data

“It has various lists of [bird] species, and there is one that can tell the difference between human voices, cars and everything else going on in the environment,” he says. “Most of the time, you are recording to avoid noise, but [we want to understand] how birds interact with the noise of urban environments, so we want to keep the noise. It’s very different to working in a nice, pristine forest or an acoustics lab, but we want to decompose the soundscape and then get some meaning from it.”
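The “decompose the soundscape” step Baker describes can be sketched as running each recording segment through a classifier and tallying the labels, keeping urban noise such as traffic and voices rather than discarding it. The classifier below is a stand-in stub that reads pre-annotated labels, not the Cornell Lab models the team actually uses:

```python
from collections import Counter

def classify_segment(segment):
    """Stand-in for a real acoustic model (such as the Cornell Lab
    models the NHM uses); here we just read a pre-annotated label."""
    return segment["label"]

def decompose_soundscape(segments):
    """Tally every sound source in a recording, including urban noise,
    since the team wants to study how birds interact with it."""
    return Counter(classify_segment(s) for s in segments)

# Hypothetical one-minute recording split into labelled segments.
segments = [
    {"label": "robin"}, {"label": "traffic"}, {"label": "robin"},
    {"label": "human_voice"}, {"label": "wren"},
]
counts = decompose_soundscape(segments)
print(counts)
```

In a real pipeline, the classifier would score raw audio windows; the tally structure is the part that carries over, giving per-source counts that can be compared against noise levels.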

Presently, only a modest amount of the Raspberry Pis’ compute power in the device boxes is used, but there is potential for them to provide edge computing capabilities to aid in the processing of data collected by the sensors in the future.

“The gardens were really extensively remodelled for the first time since 1881 before they reopened last summer, and it’s probably going to be that long again before we get a chance to change anything,” says Baker. “So, we went for slight overkill on the computational power in the gardens for now, and on the infrastructure underpinning it all, so we do have some level of future proofing.

“For where we are now, the data transfer back [to where the data is processed] is really easy, so doing that right now on the edge isn’t important. As we expand out to some of our other sites, where we can’t dig up their gardens, we’ll revisit that.”

Sensor install and deployment

While the Nature Discovery Garden has been open to the public for more than a year, the sensors themselves were switched on in mid-September 2025, after a period of sensor installation and testing.

“There have been some challenges with the sensor install when it comes to managing timelines across the estate,” says Wiles.

For example, the terracotta brickwork on the exterior of the museum has been under restoration for some time, and the sensor install project has had to fit in with the timelines for that work, she says.

The data collected by the sensors will be fed into an Amazon Web Services (AWS) product stack known as the Data Ecosystem Platform, which Wiles is the product lead for, having joined the project eight months ago.

A landscape shot of the curving pathway and pond of the Natural History Museum’s Discovery Garden in London. It shows a lush, green space, with a bench for visitors. In the background is both old and new architecture, made of brick and glass respectively.
The NHM Nature Discovery Garden allows the museum’s scientists to track the biodiversity of the flora and fauna

She describes her role as being the “bridge between the engineering and science teams”, as it is her job to ensure the huge amounts of data being collected are processed efficiently.

“We had a testing week of the sensors recently,” she says, in response to a question about the scale of the data being collected for the project, “and we’ve had about 36,000 recordings per day over the whole sensor network, which are all fed into the Data Ecosystem. We expect that in just audio data alone, we’re going to generate around 20 terabytes of data a year, so there are big considerations to take in terms of the tech we use to process that and feed it into the system.”
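Wiles’s two figures imply a rough per-recording size, which is a useful sanity check when sizing storage and ingest. A back-of-the-envelope calculation, where the 36,000 recordings per day and 20 TB per year come from the interview and the rest is simple arithmetic:

```python
RECORDINGS_PER_DAY = 36_000     # across the whole sensor network (from the interview)
AUDIO_TB_PER_YEAR = 20          # expected annual audio volume (from the interview)

recordings_per_year = RECORDINGS_PER_DAY * 365
bytes_per_year = AUDIO_TB_PER_YEAR * 10**12      # using decimal terabytes
avg_bytes_per_recording = bytes_per_year / recordings_per_year

print(f"{recordings_per_year:,} recordings per year")
print(f"~{avg_bytes_per_recording / 10**6:.2f} MB per recording on average")
```

Roughly 13 million recordings a year at about 1.5 MB each, which is consistent with short compressed audio clips and explains why batch-versus-streaming ingest is a live design question for the team.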

As well as information gleaned from the garden’s sensors, there is also data collected during the museum’s BioBlitz events, whereby people are invited to record and share details of the flora and fauna they have spotted in the gardens during a defined period.

“Most of the [BioBlitz] data is coming in through an iNaturalist pipeline into the Data Ecosystem, and – in terms of the sensor data – what we’ll be doing is batch uploading that into the Data Ecosystem so we can check and monitor everything’s coming in as expected.”

Working through the challenges

Now that the sensors are switched on, however, the hope is that uploads to the Data Ecosystem will happen far more regularly.

“We’re moving on to an automation phase, to have that ‘always-on’ element and live views of what the data looks like,” says Wiles. “At the moment, it might be that we only have [the data] once a day, but we want to have this continuous flow of data coming in once we verify that this first set of sensor data is working as expected.”

There have also been “lots of technical considerations” to work through to ensure the information being fed into the Data Ecosystem Platform is being inputted consistently, given there is “all this data coming in to one place from many, many sources”, says Wiles.

“In the natural world, we have about two million observed species, but we have about 20 million different ways of naming those two million species, so we have had to build a taxonomic equivalence engine, which is an internal process that we use to make sure that the namings across data sets are equivalent.

“And that means our researchers have a very standardised, correct and robust way of making sure that they’re seeing all the records for a specific taxonomy, which will really help bring that data together, create connections and speed up research.”
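The equivalence engine Wiles describes, with many published names resolving to one canonical taxon, can be sketched as a synonym lookup that normalises every incoming record before it lands in the platform. The mapping and names below are illustrative, not the NHM’s actual implementation:

```python
# Hypothetical synonym table: many published names -> one canonical name.
# The European robin really has carried all three names below.
SYNONYMS = {
    "Erithacus rubecula": "Erithacus rubecula",   # accepted scientific name
    "Motacilla rubecula": "Erithacus rubecula",   # historical synonym (Linnaeus)
    "European robin": "Erithacus rubecula",       # vernacular name
}

def canonical_taxon(name: str) -> str:
    """Resolve any known name to its canonical form; normalise case
    and whitespace first so near-duplicate spellings still match."""
    key = " ".join(name.strip().split()).lower()
    for known, canonical in SYNONYMS.items():
        if known.lower() == key:
            return canonical
    raise KeyError(f"unknown taxon name: {name!r}")

print(canonical_taxon("  european  ROBIN "))
```

A production version would sit on a full taxonomic backbone rather than a dictionary, but the contract is the same: every record is rewritten to its canonical name on the way in, so researchers querying one taxon see all of its records.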

Digging into the Data Ecosystem

As previously mentioned, the Data Ecosystem Platform is built exclusively from AWS technologies, including Amazon DocumentDB and Amazon S3 buckets for data storage, and the AWS Glue serverless integration offering to ingest the sensor data.

“We’re now using Amazon QuickSight to [assist with our] visual observations [data], which is fantastic because you can use Amazon Q to pivot and access that data to see new viewpoints that you haven’t particularly seen before, and do things like visualise it on a map, visualise it as a graph and unlock new insights,” says Wiles.

With the garden sensors now switched on, there will be additional data coming into the platform, which the NHM also plans to query using Amazon Q generative artificial intelligence (AI), and there will be a role for Amazon SageMaker in the setup. The latter technology is billed by AWS as combining machine learning with analytics capabilities to help users make sense of their data.

“We’re aiming to turn on SageMaker as our primary analysis tool later this year,” says Wiles. “And once that’s done, we can have our team of researchers access all of the data that’s in the underlying Data Ecosystem, so they can do their research natively within SageMaker.”

For Baker, however, the focus is now on using the sensors to get a year’s worth of data to build a baseline picture of what impact seasonality has on temperature patterns across the site, for example.

“I want to get a year’s baseline [of data] because the UK is very seasonal, so [we] have to be sure that any differences you’re finding are not just down to seasonality,” he says. “We need solid data that needs solid interpretation so that we can draw up solid policies that mean we can introduce mitigations that make urban places like this nice for everyone.”

As an example, he cites the amount of noise pollution the gardens are subject to, on account of their location in the middle of South Kensington, and how that could be mitigated.

“There’s not much vegetation between us and the road, and in winter the noise pollution is a lot louder. And you could introduce interventions to tackle that and boost biodiversity, such as planting trees and shrubs, but which ones? Because we want to make urban spaces nicer for people and give nature a bit of a fighting chance.”
