Fujitsu and Japan's National Institute of Genetics are building what they expect will be the world's fastest database when it opens later this year.
A prototype of the system based on Fujitsu's Shunsaku XML database engine has already been completed and is undergoing in-house testing at the genetics institute, which is also known as Idenken in Japan.
Idenken's database is a repository for data from all genome projects conducted by Japan's government, in addition to all public-domain data from the Japan Patent Office. It currently holds 35 million records, covering DNA sequences that total 39.8 billion bases, and it is doubling in size every year.
More than 10,000 users consult the database each day, making speedy searches a top priority for Idenken. Its current system is based on a relational database and takes about 10 minutes to complete a two- or three-keyword search, while the prototype system has already slashed the search time to about 5 seconds.
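As a rough check on the figures above, the short Python sketch below works out the speed-up implied by the two quoted search times and projects the record count forward. It assumes the "about 10 minutes" and "about 5 seconds" figures are exact and that the yearly doubling simply continues; the function name `projected_records` is hypothetical and used only for illustration.

```python
# Search times quoted in the article, treated as exact for illustration.
current_search_s = 10 * 60   # "about 10 minutes" on the relational system
prototype_search_s = 5       # "about 5 seconds" on the prototype

# Speed-up implied by those two figures.
speedup = current_search_s / prototype_search_s


def projected_records(years, start=35_000_000):
    """Record count after `years`, ASSUMING the database keeps doubling annually."""
    return start * 2 ** years


if __name__ == "__main__":
    print(f"implied speed-up: about {speedup:.0f}x")
    for years in range(4):
        print(f"after {years} year(s): {projected_records(years):,} records")
```

On those assumptions the prototype is roughly 120 times faster than the current system, and the 35 million records would pass a quarter of a billion within three years.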
The secret to Shunsaku's speed is a search algorithm that does not require an index. Each search is done in real time, and new documents can begin appearing in search results as soon as they are added to the database, said Nick Hayashi, a spokesman for Fujitsu.
Given a database with static contents, a relational database and Shunsaku would be able to complete a search in about the same amount of time. However, the Idenken database is constantly growing and that means the relational database index always needs to be updated, said Hayashi.
Shunsaku is always working on the database in real time so such problems do not affect it, he said.
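The trade-off Hayashi describes can be illustrated with a toy example. The Python sketch below is not Fujitsu's algorithm, which the company has not detailed here; it simply contrasts a store that searches through an index refreshed in batches with one that scans every document on each query. The class names `IndexedStore` and `ScanStore` are hypothetical, and the batch refresh is an assumption made to show how an index can lag behind newly added data.

```python
from collections import defaultdict


class IndexedStore:
    """Keyword search through an inverted index that is refreshed in batches."""

    def __init__(self):
        self.docs = []
        self.index = defaultdict(set)  # word -> ids of documents already indexed
        self.indexed_upto = 0          # how many documents the index covers

    def add(self, text):
        # The document is stored, but it is not searchable until reindex() runs.
        self.docs.append(text)

    def reindex(self):
        # Index only the documents added since the last refresh.
        for doc_id in range(self.indexed_upto, len(self.docs)):
            for word in self.docs[doc_id].lower().split():
                self.index[word].add(doc_id)
        self.indexed_upto = len(self.docs)

    def search(self, *keywords):
        hits = [self.index.get(k.lower(), set()) for k in keywords]
        return sorted(set.intersection(*hits)) if hits else []


class ScanStore:
    """Index-free search: every query scans every stored document."""

    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append(text)

    def search(self, *keywords):
        wanted = {k.lower() for k in keywords}
        if not wanted:
            return []
        return [i for i, doc in enumerate(self.docs)
                if wanted <= set(doc.lower().split())]


# Both stores receive the same two records; the index is refreshed only once,
# before the second record arrives.
indexed, scan = IndexedStore(), ScanStore()
for store in (indexed, scan):
    store.add("human chromosome 21 sequence")
indexed.reindex()
for store in (indexed, scan):
    store.add("mouse chromosome 11 sequence")

stale = indexed.search("mouse", "chromosome")      # index has not seen the new record
fresh = scan.search("mouse", "chromosome")         # the scan sees it immediately
indexed.reindex()
caught_up = indexed.search("mouse", "chromosome")  # found once the index is refreshed
```

In this sketch the scanning store returns the new record at once, while the indexed store misses it until the index has been brought up to date. The price is that each scan touches every document, which is why a plain scan like this one would be far too slow at Idenken's scale without the kind of optimised engine Fujitsu is describing.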
Part of the ongoing work between Fujitsu and Idenken will cover optimising Shunsaku, which was originally designed for high-speed processing of text searches, to better handle complex data such as that found in the biotechnology field.
"We created the prototype to copy the functions of the existing database and are adding functions to it," said Hayashi. "We are going to enhance it further and it may become faster, maybe 200 times faster than the current relational database."
Shunsaku is already available in Japan under the name Interstage Shunsaku Data Manager Enterprise Edition, and Fujitsu plans to put it on sale in the US later this year, said Hayashi.
Martyn Williams writes for IDG News Service