Catch of the day: A net full of trees

Easier way to create phylogenetic networks

NORWICH, ENGLAND - Jul 25, 2017

SPECTRE, a new open-source software package, simplifies the complex business of creating phylogenetic networks and trees. It has been written by bioinformaticians at the Earlham Institute.

Visual representations of datasets are valuable for analysing the interrelatedness of species and for presenting findings for publication. However, networks in particular demand complex bioinformatics that challenges software developers and evolutionary biologists alike.

Popular tools for visualising non-treelike evolution use algorithms and data structures to create networks. However, high-quality open-source software has so far been lacking, which makes it harder to reuse and adapt code for new tools and projects.

Now SPECTRE makes the source code of popular programs openly available, enabling researchers to adapt code to their own needs. Software developers can use parts of the code as building blocks for creating new methods, and bioinformaticians can run the tools in high-performance computing environments.

"With SPECTRE, we hope to help speed up innovation by developers in phylogenetic methods and make it easy for biologists to visualise and analyse their datasets themselves," says Sarah Bastkowski from the Earlham Institute.

The package makes it possible to identify and understand conflicts in data caused by events such as horizontal gene transfer and hybridisation.

"When I presented my work at conferences, delegates often told me how useful the algorithms and data structures I created could be if I was able to make them easily accessible. I decided to package them with methods that were already in use and make the source code openly available for the first time," says Bastkowski.

During her PhD, Bastkowski had to create her own algorithms, data structures and file parsers for working with split networks. With SPECTRE, researchers and software developers no longer have to start from scratch, saving time and avoiding some of the challenges traditionally associated with creating phylogenetic networks and trees.

Mapleson used his industry background as a software developer to ensure the software provides easy-to-use building blocks for developers and a user-friendly interface for biologists.

Not only can users create networks and trees using the tools, but they can also view, manipulate and save them as high-quality graphics for use in publications.

"The comprehensive library of algorithms, source code and programs in SPECTRE makes it a valuable new resource for further developments in the field, including by students," says Professor Vincent Moulton from the School of Computing Sciences at the University of East Anglia.

"As people use the software, they will have their own ideas for how to improve it. These ideas can be incorporated into it, helping to ensure that the library will continue to grow and flourish," he says.

Some of the tools use distances between objects of study to measure relationships and could therefore be used in a number of other fields, for example to study the evolution of languages or patterns in geographical landscapes.
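As a rough illustration of what "distances between objects of study" can mean, the Python sketch below builds a pairwise distance matrix from aligned strings. It is not SPECTRE code and does not use SPECTRE's interfaces. The function names, the choice of a normalised Hamming distance and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(items: dict[str, str]) -> dict[tuple[str, str], float]:
    """Symmetric pairwise distances, keyed by (name, name) pairs."""
    names = sorted(items)
    matrix = {(name, name): 0.0 for name in names}
    for p, q in combinations(names, 2):
        d = hamming_distance(items[p], items[q])
        matrix[(p, q)] = d
        matrix[(q, p)] = d
    return matrix


# Hypothetical toy data: three short aligned sequences.
toy = {
    "taxon_a": "ACGTACGT",
    "taxon_b": "ACGTACGA",
    "taxon_c": "TCGTACGA",
}
matrix = distance_matrix(toy)
```

Because the function only compares aligned strings position by position, the same sketch would apply to any objects that can be encoded that way, such as coded linguistic features, which is the sense in which distance-based methods can carry over to other fields.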

SPECTRE website - http://www.earlham.ac.uk/spectre

Tools included in the software: NeighborNet, NetMake, QNet, SuperQ, FlatNJ, NetME.

About Earlham Institute

The Earlham Institute (EI) is a leading research institute focusing on the development of genomics and computational biology. EI is based within the Norwich Research Park and is one of eight institutes that receive strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) - £6.45M in 2015/2016 - as well as support from other research funders. EI operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.

EI offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms to provide access to computational tools and processing capacity for multiple academic and industrial users, and in promoting applications of computational bioscience. Additionally, the Institute offers a training programme through courses and workshops, and an outreach programme targeting key stakeholders and wider public audiences through dialogue and science communication activities.