Engineers at Mitsubishi's main research and development centre have
put parallel computing to use in developing an information
retrieval system that promises faster searches through large
amounts of data.
The system, unveiled on 6 February, is based on 16 PC servers
connected through a gigabit switch and controlled via a host PC. It
can run free-keyword, full-text searches at a rate of 100 billion
characters per second, according to developer Atsushi Murata.
Traditional information retrieval systems are faltering as the
amount of data increases, because of a bottleneck in getting the
data out of huge databases, said Murata. Some systems cache heavily
used data in memory to get around this, but response gets slower
again as soon as the memory cache is full. In contrast,
Mitsubishi's new system splits the data among the PC servers, which
share the load and so increase search speed.
"We put the CPU near the storage so we can get data fast from
storage," said Murata.
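The approach Murata describes, in which data is split across servers
and a host gathers the results, can be illustrated with a minimal
sketch. This is not Mitsubishi's software: the function names
(partition_documents, search_shard, parallel_search) are hypothetical,
worker threads stand in for the PC servers, and a plain substring scan
stands in for whatever search method the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_documents(documents, num_shards):
    """Split documents round-robin into num_shards lists, one per 'server'."""
    shards = [[] for _ in range(num_shards)]
    for doc_id, text in enumerate(documents):
        shards[doc_id % num_shards].append((doc_id, text))
    return shards


def search_shard(shard, keyword):
    """Scan only this shard's local data; return ids of matching documents."""
    return [doc_id for doc_id, text in shard if keyword in text]


def parallel_search(shards, keyword):
    """Host role: send the query to every shard and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = list(pool.map(search_shard, shards, [keyword] * len(shards)))
    return sorted(doc_id for hits in partial for doc_id in hits)


# Illustrative data only (assumed, not from the article).
docs = [
    "parallel computing on PC servers",
    "full-text search",
    "gigabit switch",
    "parallel search of text",
]
shards = partition_documents(docs, 2)
print(parallel_search(shards, "parallel"))  # [0, 3]
```

Each worker scans only its own share of the data, so in this sketch
adding shards shortens the scan each one has to perform, while the host
only merges short lists of matches.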
In the demonstration system, the PC servers, which were
off-the-shelf 1U-height 1GHz Pentium III-based models produced by
Mitsubishi, each had three 36Gbyte hard disk drives attached, for a
total of 108Gbytes of storage space per server and 1.7Tbytes for the
whole system. The system can be scaled up to 256 PC servers and
27Tbytes of data, and extra servers can be added without the need to
update the application software, the company said.
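The storage figures quoted above follow from simple multiplication,
which the short check below reproduces. It assumes decimal units
(1Tbyte = 1,000Gbytes) and that every server carries the same three
drives as in the demonstration system; the constant and function names
are illustrative.

```python
DRIVES_PER_SERVER = 3   # from the demonstration system
GB_PER_DRIVE = 36       # from the demonstration system


def total_capacity_gb(num_servers):
    """Raw disk capacity in Gbytes for a given number of identical servers."""
    return num_servers * DRIVES_PER_SERVER * GB_PER_DRIVE


print(total_capacity_gb(1))    # 108   (per server)
print(total_capacity_gb(16))   # 1728  (about 1.7Tbytes)
print(total_capacity_gb(256))  # 27648 (about 27Tbytes)
```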
The servers run Linux while the host computer, which handles the
queries and delivers the results, is based on Windows 2000, said
Murata. The company plans to put the system on sale in Japan later
this year. Pricing was not disclosed.