Big BRAIN: Finding Connections in the Literature Flood with Euretos BRAIN
July 1, 2014
By Allison Proffitt
Euretos is certainly not the first company to recognize the problems researchers have processing and keeping up to date with the latest research. While there’s ever more data pouring from researchers’ lab instruments, there’s also an ongoing flow to manage from journals and other publications; in the biomedical sciences, we average one and a half new publications per minute. That’s where Euretos hopes to help.
“We are basically transforming big data in the life sciences into what’s called ‘actionable knowledge’. Data itself doesn’t say anything,” explains Marco Wanders, Euretos’ head of sales. “We are comparing ourselves with an oil refinery. In the end, crude oil doesn’t mean anything… it is the end product that counts.”
The refinery is a cloud platform called BRAIN: the Bio Relations and Intelligence Network. The “crude oil” is data coming in from public data sources such as PubMed, ChEMBL, OMIM, GWAS Central, Google Scholar, ChemSpider, GenBank, and UniProt, with sources like the European Patent Office and the US Patent and Trademark Office soon to be added. Private datasets can be added as well, so that companies can use BRAIN’s capabilities in house on proprietary data. The company is in the process of launching an API.
The various datasets are joined in a way that creates a knowledge universe that is “completely clean and de-duped,” Wanders says, so that “you are always finding the right things.”
To make sure that BRAIN always has the latest data, and that it’s accessible, Euretos has created a technology it calls interrelated data streets: on-ramps for data sources to merge quickly onto the BRAIN superhighway. Data sources can be added in a matter of days, or possibly weeks if they are highly complex, Wanders says. Existing data is refreshed in near real time, he adds, so users never need to wait long for updates.
BRAIN “gives you an enormously powerful tool to understand whether something could be even distantly related, and give you a research area or research direction you can explore,” Wanders says. “We are ensuring that formerly disconnected data sources are being accessible as a single [source].” BRAIN works to help researchers find connections between bits of data.
To start exploring, a researcher begins in the “left brain,” or search mode. Single search terms can be anything from a disease to an enzyme, from a chemical compound to an author’s name, and are backed by a thesaurus of over 40 million life sciences terms. Researchers can also search by “nanopublications”, a concept promoted by the Concept Web Alliance.
“A nanopublication, in essence, is the smallest possible statement of knowledge. So basically A says something about B, or A does something with B,” explains Wanders. For instance, a researcher can query BRAIN for any connection between chronic immune activation and HIV pathogenesis, he says. “BRAIN will then uncover any known and hidden relation between the two concepts even if never published before.”
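To make the idea concrete, here is a minimal sketch in Python of a nanopublication as a subject, predicate, object statement, along with one simple way a "hidden" relation could surface: two concepts that are never linked directly but share an intermediate concept. This is an illustration only, not Euretos' implementation. The `Nanopublication` class, the `indirect_links` function, and the placeholder concepts A through D are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Nanopublication:
    """One minimal assertion: subject --predicate--> object (hypothetical model)."""
    subject: str
    predicate: str
    obj: str


def indirect_links(statements, start, end):
    """Return the intermediate concepts X for which both
    start -> X and X -> end are asserted somewhere in the statements."""
    outgoing = defaultdict(set)
    for s in statements:
        outgoing[s.subject].add(s.obj)
    return sorted(x for x in outgoing[start] if end in outgoing[x])


# Placeholder statements, not real biomedical findings.
statements = [
    Nanopublication("A", "activates", "C"),
    Nanopublication("C", "is associated with", "B"),
    Nanopublication("A", "binds", "D"),
]

# A and B are never linked directly, but both connect through C.
print(indirect_links(statements, "A", "B"))
```

A production system would presumably follow longer chains and weigh the evidence behind each statement; this sketch only checks for a single shared intermediate.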
Results that a researcher finds interesting can be stored in a workspace called the ‘knowledge lab’ to be investigated further and analyzed.
Analysis happens in the “right brain”, and there are several analysis options. BRAIN will find any known links between subjects, compare terms, or analyze trends found in the published research over time. BRAIN can also make predictions based on findings.
“There are specific prediction algorithms that are being developed in conjunction with Leiden University Medical Centre (LUMC) in the Netherlands,” Wanders says. When BRAIN has gathered and analyzed all of the data linking HIV and chronic immune activation, “You can, for instance, see that there’s a 79% chance that the two are somehow related,” he says. “That could be enough for a researcher to say, ‘Whoa… I had a hunch there, but now let’s further investigate that.’”
Learning to Ask New Questions
In this way, BRAIN helps identify research areas to explore or new research questions to ask, and that is one of the strengths of the system, says David Webb, an adjunct professor in the Department of Integrative and Computational Biology at The Scripps Research Institute.
Webb is a beta tester of Euretos’ program and has been using BRAIN for several months.
“If you use a traditional and typical search engine, [you are] essentially inventing your own algorithm as you go,” Webb says. “Using the advanced semantic algorithm, you will find relationships that you had no idea even existed, even within a field of which you’re an expert. And I found that just irresistible.”
After the novelty wore off, though, Webb found that the program actually challenged his thinking more than anything else. “The product itself is just remarkable. I mean, it will find for you relationships of all types that you weren’t even looking for. The problem you then encounter is to actually re-calibrate your own thinking about how you do searches for information to… get out what you’re really interested in.”
Webb has used other knowledge mining tools in the past (he mentions tools from Thomson Reuters, as well as Papers for Mac for managing references), but he believes Euretos’ BRAIN tool is more comprehensive.
Webb has been testing the product for almost six months, and has incorporated it into his daily work. Over that time, Euretos has continued to update and improve the user interface, Webb says. After several months, “I now fancy myself pretty good at getting it to do what it is advertised to do and the more I use it the more I like it. In fact, as I think of research topics that I know relatively little about, my first thought is to go to Euretos-BRAIN software.”
Flood Management
If you want to keep up to date on a particular concept, BRAIN can do the updating for you, creating personalized alerts so you can watch a topic or a list of proteins. Or if even that seems like too much, Euretos is in the trial phase of launching a reports service. By entering a protein, gene, chemical compound, etc., users can generate a report summarizing BRAIN’s findings on the topic with one click.
It’s another way to stem the tide of data, something Euretos has been working on for a while.
Euretos was founded in 2012, after more than five years of collaboration between academic and private parties in the Concept Web Alliance. The Concept Web Alliance is committed to using a Semantic Web approach to organize the massive amounts of information flooding the biological sciences, addressing the storage, interoperability, and analysis of these large and disparate data sets. The Concept Web partners included LUMC, University of Amsterdam, Stanford, Yale, MIT, EBI, Harvard, Thomson Reuters, Nature, PLoS, and the Rockefeller Foundation.
The company has broad experience in bioinformatics, semantics, IT supply, IT integration, IT services, data networking, and big data decision support applications, explains Wanders. He says the BRAIN platform would be a good fit for “anything that is R&D intensive,” though the company is starting in the life sciences because most of the founders and scientific advisory board members come from a life sciences background.
Euretos has seed funding and three external investors, says Wanders, and the company is actively looking for other sources of investment.
The business model is based on unlimited user site licenses. “You can access it basically with as many people as you can,” Wanders says. The caveat is that users can all see the shared workspace. “In some companies that’s undesirable, then you can purchase an extra license and you can be completely separated from the other [users].”
The shared workspace, though, actually contributes to collaboration, Wanders believes. Researchers working in the same content areas can collaborate in one workspace, sharing their findings and both viewing the system’s analysis.
The collaborative nature can speed time to knowledge, the company believes. In one example Wanders shares, a group of four researchers had spent several weeks reading 221 articles. Using BRAIN, they were able to reproduce the same findings in 10 minutes.
That kind of productivity might get addictive, says David Webb. “I imagine that this will be the kind of tool that—once you get good with it—you’re going to find it hard to put down.”