Microsoft and Google join forces on Covid-19 dataset

Data scientists and researchers are being encouraged to use data mining, artificial intelligence and machine learning to gain insights into the coronavirus pandemic

Microsoft Research and Google Cloud have joined forces as part of an initiative to open up datasets that help researchers combat the Covid-19 novel coronavirus.

Through the initiative, Kaggle (the data science community acquired by Google Cloud three years ago), the Allen Institute for AI, the Chan Zuckerberg Initiative (CZI), Georgetown University’s Center for Security and Emerging Technology (CSET), Microsoft and the National Library of Medicine (NLM) at the National Institutes of Health have collaborated to release the Covid-19 Open Research Dataset (Cord-19).

Michael Kratsios, US chief technology officer at the White House, called on the research community to collaborate to combat coronavirus. “Decisive action from America’s science and technology enterprise is critical to prevent, detect, treat and develop solutions to Covid-19,” he said.

“The White House will continue to be a strong partner in this all-hands-on-deck approach. We thank each institution for voluntarily lending its expertise and innovation to this collaborative effort, and call on the US research community to put artificial intelligence [AI] technologies to work in answering key scientific questions about the novel coronavirus.”

The collaborative effort includes a challenge aimed at encouraging researchers to provide new insights based on the reams of data being produced about coronavirus.

“We are issuing a call to action to the world’s artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high-priority scientific questions. Cord-19 represents the most extensive machine-readable coronavirus literature collection available for data mining to date,” wrote Kaggle’s head of marketing, Anna Montoya, in a forum post.

“This allows the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing Covid-19 response efforts worldwide. There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up.”

According to Kaggle’s post on Twitter, the Covid-19 Open Research Dataset will give the worldwide AI research community the opportunity to use text and data mining approaches and natural language processing techniques to find answers to questions that support the ongoing coronavirus response efforts worldwide.

“There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up,” wrote Kaggle.

Microsoft’s chief scientific officer, Eric Horvitz, added: “We need to come together as companies, governments and scientists, and work to bring our best technologies to bear across biomedicine, epidemiology, AI and other sciences. The Cord-19 literature resource and challenge will stimulate efforts that can accelerate the path to solutions on Covid-19.”
