Carmen study unites computer science with neuroscience

The University of Newcastle is one of 20 research teams working on a project to bring together neuroscience and computer science.

The four-year, £5m study, called Carmen, is looking at how computer science can help neuroscientists manage the vast quantity of data they use in their research.

Paul Watson, director of research at the University of Newcastle, says, "The main problem is the data deluge. Neuroscientists will create 100 Tbytes of data from experiments over the next few years."

This data is generated by 100,000 neuroscientists, and comprises molecular, neurophysiological, anatomical and behavioural information. The data is expensive to collect but rarely shared: it is stored in proprietary formats and described only locally. "Each laboratory collects its own data, which is then stored in its own data format. So there is no way to share research," Watson says.

The result has been a shortage of analysis techniques that can be applied across neuronal systems, limiting interaction between research centres with complementary expertise.

Clearly this limits the usefulness of all the research data of individual laboratories. Watson says, "We needed to put in place a system to store data and allow collaboration."

This is the idea behind Carmen. The aim of the project is to enable sharing and collaborative exploitation of data, analysis code and expertise that are not physically located in the same place. Watson says, "The architecture we've used is based on cloud computing. Rather than store data on your own computer, we store it on the internet."

Amazon and IBM already offer commercial cloud services providing basic computing across the internet, but e-science needs a set of higher-level services to support user needs.

"We want scientists to be able to analyse the data that is shared in the Carmen cloud," says Watson. From experience of previous e-science projects, Watson has seen the benefits of encouraging users to create and share their own computer programs, which can be combined to run data analysis.

For example, a user may want to extract some data taken from human brain tissue, identify where spikes of activity occur, find interesting patterns of spikes and then produce a movie to visualise the results. In Carmen, each of these steps would be provided as a separate piece of code which users could select and combine as they require - for example there may be a range of spike identification algorithms, and they may wish to pick the one that is best for the type of data they are working with, he says.
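As an illustration of one such pipeline stage, the sketch below shows a threshold-based spike detector in Python. The function name, data and threshold rule are hypothetical, chosen only to show the idea of a self-contained, swappable analysis step; Carmen's actual spike identification algorithms are not described in this article.

```python
# Hypothetical sketch of one pipeline stage: threshold-based spike
# detection on a sampled voltage trace. Names and the threshold rule
# are illustrative, not Carmen's actual algorithms.

def detect_spikes(samples, threshold):
    """Return indices where the signal first crosses above the threshold."""
    spikes = []
    above = False  # track crossings so each spike is counted once
    for i, v in enumerate(samples):
        if v >= threshold and not above:
            spikes.append(i)
            above = True
        elif v < threshold:
            above = False
    return spikes

trace = [0.1, 0.2, 1.5, 1.7, 0.3, 0.1, 2.0, 0.2]
print(detect_spikes(trace, threshold=1.0))  # → [2, 6]
```

A user could swap this stage for a different detector with the same input and output shape without changing the rest of the workflow.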

For this to work, there needs to be a standard way to package the pieces of code so that they can be combined and run in the same way, irrespective of what they do internally. This is achieved using Web Services. Within Carmen, the workflow systems provide a graphical interface that allows users to treat Web Services as building blocks, chaining them together to make larger data analysis computations.
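The chaining idea can be sketched in a few lines: if every building block accepts and returns data in a uniform shape, a workflow is just an ordered list of steps. The stage names below are toy examples, and plain Python functions stand in for the Web Services that Carmen actually uses.

```python
# Minimal sketch of the "building block" idea: each analysis step is
# a function with a uniform signature (data in, data out), so a
# workflow is an ordered list of steps. Toy stages stand in for the
# Web Services Carmen's workflow system would chain together.

def run_workflow(data, stages):
    """Apply each stage to the output of the previous one."""
    for stage in stages:
        data = stage(data)
    return data

def scale(xs):
    return [x * 2 for x in xs]

def keep_positive(xs):
    return [x for x in xs if x > 0]

result = run_workflow([-1, 2, 3], [scale, keep_positive])
print(result)  # → [4, 6]
```

Because every stage has the same interface, the graphical workflow editor can let users reorder or substitute blocks without knowing what each one does internally.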

Watson says, "Over time, we will build up a library not just of data, but also of these building blocks, so encouraging sharing and re-use."

When users upload new Web Services they have written, these services are stored in a repository, which is controlled by a web services management and deployment tool called Dynasoar. "When a user wishes to execute a Web Service, Dynasoar moves it from the repository, deploys it on an available computing node (if it isn't already deployed) and runs it," Watson says.
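The deploy-on-demand pattern the article describes can be sketched as follows. All names here are hypothetical: Dynasoar itself manages real Web Services across physical nodes, whereas this toy dispatcher just copies a callable out of an in-memory repository the first time it is invoked.

```python
# Illustrative sketch of deploy-on-demand dispatch, in the style the
# article describes for Dynasoar: a service is fetched from the
# repository and installed on a node only on first invocation.
# Class and method names are hypothetical, not Dynasoar's API.

class Node:
    def __init__(self):
        self.deployed = {}  # service name -> callable

    def invoke(self, name, repository, payload):
        if name not in self.deployed:       # deploy only on first use
            self.deployed[name] = repository[name]
        return self.deployed[name](payload)

repository = {"upper": str.upper}
node = Node()
print(node.invoke("upper", repository, "carmen"))  # → CARMEN
```

Subsequent calls to the same service on that node skip the deployment step, which is the efficiency the repository-plus-deployment design is after.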

Watson's team has been working with the Royal Infirmary in Newcastle to pilot Carmen. It is being used as part of a procedure to help patients with severe epilepsy. The procedure involves analysing brain tissue. Data from the analysis of the tissue is loaded into Carmen, and can now be accessed by experts.

Data analysis is used to guide the surgeon removing brain tissue. The system also allows online analysis by experts located in different parts of the country which, in the future, will enable experiments to be refined while data is still being collected.

The project has been running for just over a year using a prototype cluster running at the universities of York and Newcastle.
