2016 Bio-IT World Best of Show People's Choice Award Contenders
March 29, 2016
Update: Voting is closed.
March 29, 2016 | Bio-IT World is pleased to announce the 2016 Best of Show competition with the Bio-IT World People’s Choice award.
The Best of Show Awards offer exhibitors at the Bio-IT World Conference and Expo an opportunity to showcase their new products. A team of expert judges views entries on site and chooses winners in four categories based on the product’s technical merit, functionality, innovation, and in-person presentations.
In addition to the four judges’ prizes, Bio-IT World presents a People’s Choice Award as well, which is chosen by votes from the Bio-IT World community. All of the Best of Show entries will be eligible for the People’s Choice Award. Voting will open at 5:00 pm ET on Tuesday, April 5, and will remain open until 1:00 pm ET on Wednesday, April 6.
The four awards named by the judges and the People’s Choice Award will be announced at a live event on the Bio-IT World Expo floor at 5:30 pm on Wednesday, April 6.
We are excited to have the community’s input again this year on the best new products on display at Bio-IT World. Watch the Bio-IT World Twitter account @BioITWorld and #BestofShow16 for the voting link next Tuesday at 5:00 pm ET.
-- The Editors
2016 Bio-IT World Best of Show Contenders
Finalist: Aspera, an IBM Company, Booth 261
Product Name: Aspera Files
asperasoft.com/software/aspera-files-saas/
Aspera Files is a single software-as-a-service (SaaS) offering that enables biomedical research organizations to quickly, easily, and securely exchange large datasets like DNA sequences between geographically dispersed research scientists. It enables high-speed delivery of large datasets over commodity networks and reliable high-speed transport directly into cloud storage on multiple platforms, making new analytical and diagnostic workflows possible.
Sharing is as easy as drag-and-drop regardless of location and workspace, and data remains private with easy peering with trusted third parties. There are no file size or speed limits, and organizations may use any combination of storage types.
Game-changing capabilities include:
1. Fast and convenient data sharing and exchange regardless of file size, combining multiple on-premise and cloud storage systems. Users manage data by dragging files and directories between panes in the browser interface. Shared data appears automatically in the workspaces of target users with the appropriate permissions.
2. An easy-to-manage, intuitive experience for non-technical users, available on-demand. Organizations can establish a web presence and be operational immediately, with branded workspaces for project-based file sharing.
3. Direct transfer and storage of even the largest files in on-premise and new cloud object storage systems, including Softlayer SWIFT, Amazon S3, Azure BLoB, Google Storage, on-premise Aspera servers and SAN/NAS/local storage.
4. Easy, fast spin-up/spin-down of additional transfer bandwidth.
5. Complete data security within the organization, and with third-party partners and peered organizations.
6. A platform approach with APIs allows organizations to leverage native file sharing, transfer and scale-out properties within their own systems.
Avere Systems, Booth 536
Product Name: Avere Virtual Edge filer
http://www.averesystems.com/products/vfxt
The Virtual FXT Edge filer is a software-only product that runs in the public or private compute cloud alongside the applications, providing low latency access to the active data and enabling applications to run at maximum performance. The vFXT is simple to install and manage, provides best-in-class NAS functionality (including NFS and SMB), and clusters to deliver high availability, scalable performance and capacity. It enables organizations to take advantage of the flexibility and enormous scale of cloud computing with no radical changes to applications or storage infrastructure.
The vFXT works to:
• Connect on-premises storage (NFS and SMB) to cloud compute resources (cloud bursting), or
• Build a NAS for the cloud, optimizing the use of public cloud storage with its cloud compute counterpart (Cloud NAS), or
• Build both cloud bursting and Cloud NAS into a flexible enterprise hybrid infrastructure
Bina Technologies, part of Roche Sequencing, Booth 237
Product Name: Bina Genomic Management System
Bina.com
The Bina GMS is a platform for management of large-scale genomics projects and analysis pipelines that transform fastq files to annotated variants, all accessible from a user interface that meets the needs of multiple user types, including IT, bioinformaticians and bench scientists.
At its core, the Bina GMS offers an extensive suite of genomics tools within the Bina Read Alignment, Variant Calling and Expression (RAVE) software for germline DNA-Seq, RNA-Seq, and somatic mutation analyses. Resulting variant output can be further qualified and filtered using the Bina Annotation & Analytics Intelligence Module (AAiM) software for tertiary analysis, which brings together an extensive set of annotations.
Encapsulating the analytical layer are fit-for-purpose user interfaces that present genomic insights in a consumable manner. Bench scientists can run standard analyses through a user-friendly interface, while bioinformaticians can customize pipelines through command line modules.
Finally, data management tools allow organizations to track data and processes to ensure reproducibility while providing a platform that meets security and compliance requirements. Hybrid deployment options provide further flexibility, enabling data processing that scales to thousands of samples through the cloud and allowing for collaboration among geographically dispersed project team members.
Since 2015, we've made significant improvements to the UI and added several benchmarked and Bina-authored analyses to the platform.
BioTeam, Booth 361
Product Name: BioTeam Appliance Galaxy Edition
http://bioteam.net/bioteam-appliance/galaxy-edition/
The BioTeam Appliance Galaxy Edition is a push-button solution that lets researchers get up and running quickly with Galaxy. The Galaxy Appliance comes preinstalled with a production instance of Galaxy and some of the most commonly used bioinformatics tools and reference datasets. This powerful system is specifically configured with BioTeam's best practices for computationally intensive scientific workloads. Most importantly, the Galaxy Appliance is an open system, so researchers can use the Appliance as their own high-performance informatics server, independent of Galaxy. It provides the best of both worlds to support both technically savvy informaticians and scientific users who need a graphical interface to do data analysis and manage their system. BioTeam provides ongoing support for the Galaxy Appliance, enabling researchers to minimize their IT burden and focus on their research. The Galaxy Appliance is used by researchers around the world for metagenomic, ChIP-Seq, RNA-Seq analysis and more. The Galaxy Appliance is a cost-effective data analysis solution that combines flexibility, so researchers can tailor their system, and simplicity, with push-button management of the Appliance and the Galaxy instance.
Bluebee, Booth 426 & 428
Product Name: Genomics Platform 1.3.2
Bluebee.com
Bluebee addresses these genome analysis challenges by providing a highly scalable private cloud platform for accelerated processing of mass volumes of NGS data. That alone does not make us unique, but the innovative combination of HPC in the cloud, enhanced security features & local data processing, flexibility & user convenience alongside facilitating sharing & collaboration does.
By using advanced HPC techniques, Bluebee significantly accelerates data processing while at the same time solving the data throughput issue. Unlimited up-scaling of data analysis throughput is achieved through on-the-fly provisioning of computational and data storage capacity in the cloud. It is Bluebee’s belief that the need for faster and cheaper processing should not be addressed by using short-cut algorithms. Good science remains essential. With Bluebee it is perfectly feasible to continue using gold-standard algorithms, while still achieving the unparalleled throughput objectives. Bluebee has a highly secured platform with fast, affordable processing and provides full control for configuration, data-sharing and inter-institute collaboration.
To meet the most stringent regulations in terms of data residency and country-specific requirements, Bluebee operates in geographically distributed high performance computing centers. This enables customers to process and store their data in the region where they operate their business, and even allows international players such as sequencing service providers and diagnostic test providers to roll out data analysis and storage internationally over different local datacenters whilst keeping full control through one powerful user interface.
Cambridge Semantics, Booth 333
Product Name: Anzo Smart Data Lake
http://www.cambridgesemantics.com
New smart data tools are rapidly overcoming the common challenges presented by the newly emerging data lake, such as harmonizing the data and making it available to business users. This process has previously been labor intensive, requiring many dedicated hours of skilled data scientists and IT staff.
With Cambridge Semantics’ award-winning Anzo Smart Data Lake (SDL) solution, customers are enjoying the benefits of immediate insights from big data analytics. The solution makes it easy to semantically link, analyze and manage diverse data, structured and unstructured, at big data scale, and makes it available for self-service consumption by business users. The graph models in Anzo SDL provide users with self-service data discovery, analytics and visualization capability across all entities and relationships in the data lake.
The Anzo SDL software is built to fit within a company’s existing IT ecosystem. Though deployable as an end-to-end solution, the architecture also affords integration with existing Hadoop or other data lake environments.
Innovative life science companies are leveraging the benefits of Cambridge Semantics’ smart data solutions across the entire R&D lifecycle to reduce time-to-market, lower their regulatory risks, and optimize their pipeline investments. These solutions are transforming the R&D landscape through better competitive intelligence, site intelligence and selection, clinical trial data integration and discovery, scientific data integration and collaboration, pharmacovigilance and safety surveillance, and real-world, evidence-driven clinical trial design. Cambridge Semantics is putting big data analytics into the hands of R&D teams for immediate data insights and business value.
Finalist: Cleversafe, Booth 554
Product Name: IBM Cloud Object Storage
http://www.cleversafe.com
Cleversafe, an IBM company, is the object storage market share leader, providing on-premise, public cloud and hybrid cloud solutions that solve petabyte, exabyte-and-beyond storage challenges, with unprecedented choice, control and efficiency. Relied upon by the world’s largest data repositories, Cleversafe provides enterprise-grade security, reliability and simplified storage management.
IBM’s Cloud Object Storage system combines technology from the recently acquired Cleversafe with IBM Cloud to deliver healthcare providers and life science organizations a fast, flexible, hybrid cloud storage solution. As a dedicated storage system, the solution provides a single-tenant system running on dedicated servers in the IBM Cloud. Available as an IBM managed service or as a self-managed cloud solution, this approach gives clients access to object storage with no need for extra hardware or data center space.
Additional IBM Cloud Object Storage services will be available in the second quarter, including:
• Nearline will provide a cloud infrastructure for infrequently accessed data at a lower cost than most off-premise options. Nearline is ideal for archive, back-up, and other workloads delivered across select IBM Cloud data centers.
• Standard will provide a higher performance public cloud offering based on proven Cleversafe technology with new S3 API interfaces. This service is ideal for a wide range of high performance applications written to use the S3 object storage API.
This solution delivers proven interoperability and unified management between storage software and certified hardware platforms from a single interface. This seamless integration enables higher system availability and simplified system management.
Copyright Clearance Center, Booth 453
Product Name: RightFind XML for Mining
Copyright.com/xmlformining
RightFind XML for Mining is a cloud-based solution that gives researchers direct access to full-text scientific, technical, and medical content in XML format. The solution enables users to identify and download full-text article collections from multiple publishers – including publications to which they subscribe and articles that fall outside company subscriptions – through a single source.
Users can retrieve a corpus of XML articles using a Boolean search query, by submitting existing articles for similarity comparison, or by providing article DOIs or PMIDs obtained via the user’s preferred search platform. Users can either download metadata and abstracts at no cost for unsubscribed articles, or acquire the full-text XML articles using a budget decision engine that enables efficient content purchasing at scale for text mining. All articles are text-mining-ready and can be imported into the user’s software of choice (e.g., Linguamatics I2E, IBM Watson).
XML for Mining employs a simple Web interface and can also be accessed directly through a RESTful API for integration with other programs and interfaces.
Finalist: Core Informatics, Booth 437
Product Name: Platform for Science Marketplace
coreinformatics.com/platform-for-science/
The Platform for Science Marketplace is a first-of-its-kind catalog of pre-configured scientific applications designed to flexibly support the changing needs of today's laboratories, integrate data sources, and speed informatics deployments. The PFS Marketplace apps are built on Core Informatics' stable, extensible and user-friendly Platform for Science database, and are compatible with version 5.1 and above. The apps in the Platform for Science Marketplace integrate seamlessly with one another and with Core Informatics’ products, including Core LIMS, ELN, SDMS and Core Collaboration, for customers engaged in all phases of scientific product development and innovation.
The Core SDK (Software Development Kit), which includes Core Informatics' RESTful APIs (Application Programming Interfaces), and Platform for Science training materials enable scientists and software developers to build individually configured applications on top of the platform. End users can configure their own applications, or purchase applications created by Core Informatics, instrument vendors, Independent Software Vendors (ISVs), or other users.
Existing applications include configurable solution sets for Biopharmaceutical Drug Discovery (small molecule and biologics), Personalized Medicine and Genomics (including Next-Gen Sequencing - NGS) and Biobanking laboratories. The breadth and depth of available solutions continues to expand, with new applications being added for customers across industries and scientific capabilities such as Analytical Testing, Animal Studies, Bioprocessing, High Throughput Screening, Protein Engineering and more. Apps are also being co-developed with partners, such as Affymetrix and Biomatters, to create consistent plug-and-play integrations for tools and workflows.
Finalist: Cycle Computing, Booth 461
Product Name: CycleCloud 5.5
http://cyclecomputing.com/
The CycleCloud orchestration suite manages the provisioning of cloud infrastructure, orchestration of workflow execution and job queue management, automated and efficient data placement, full process monitoring and logging, all within a fully secure process flow. CycleCloud easily leverages multi-cloud environments moving seamlessly between internal clusters, Amazon Web Services, Google Cloud Platform, Microsoft Azure and other cloud environments.
The solution provides a web-based GUI, a command line interface, and a set of APIs to define cloud-based clusters. Once defined according to policies set by system administrators, CycleCloud can auto-scale clusters by instance types, maximum cluster size, and costing parameters. It rapidly deploys everything from modest-sized systems of 64-6,400 cores to systems that rank as some of the fastest computers in the world (156,000+ cores), while validating each piece of the infrastructure to ensure a complete and robust environment. Additionally, it syncs in-house data repositories with cloud locations in a policy/job-driven fashion. This enables data-driven batch submissions where compute infrastructure is only provisioned when data is in place, saving costs and improving efficiency.
CycleCloud capabilities include:
• Provision, manage, and orchestrate cloud infrastructure from multiple providers
• Dynamic scaling of large computation, Big Data, Big Compute and HPC workloads
• Utilization reporting, logging, and auditing capabilities
CycleCloud can automatically scale to meet the needs of applications by elastically provisioning infrastructure based on user-defined templates. These templates offer complete control of the operational characteristics allowing dynamic optimization of workloads over multiple dimensions such as cost, scale, resiliency, and performance.
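The data-driven submission idea described above, where compute is provisioned only once a job's input data is in place and scaling is capped by a maximum cluster size, can be illustrated with a minimal Python sketch. This is not CycleCloud's API: the `Job` class, `ready_jobs`, `cores_to_provision`, and the file names are all hypothetical and exist only to show the policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical batch job: a name, a core count, and required input paths."""
    name: str
    cores: int
    inputs: frozenset


def ready_jobs(jobs, synced):
    """Return the jobs whose inputs have all been synced to the cloud."""
    return [job for job in jobs if job.inputs <= synced]


def cores_to_provision(jobs, synced, max_cores):
    """Cores needed by ready jobs, capped at the administrator-set maximum."""
    wanted = sum(job.cores for job in ready_jobs(jobs, synced))
    return min(wanted, max_cores)


jobs = [
    Job("align", 64, frozenset({"sample.fastq"})),
    Job("call", 128, frozenset({"sample.bam"})),
]

# Only the alignment input has arrived, so only its cores are requested.
print(cores_to_provision(jobs, {"sample.fastq"}, max_cores=6400))
```

In this sketch a job with missing inputs contributes no cores, which is the cost-saving behavior the description attributes to data-driven submission.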
Dassault Systems, BIOVIA, Booth 416
Product Name: ScienceCloud Hybrid Cloud Platform 2016
https://www.sciencecloud.com/
BIOVIA ScienceCloud is an ISO27001-certified, cloud-based information management/collaboration environment supporting globally-networked drug discovery R&D. It hosts applications for registering and managing chemical/biological entities, assay data management, inventory handling and IP capture. In 2015 BIOVIA released a cloud development environment based on BIOVIA Pipeline Pilot, a set of associated component collections and a deployment environment for desktop and mobile devices supporting “hybrid cloud.” This environment includes the following tools:
• ScienceCloud Authoring: A cloud-based version of the industry-standard Pipeline Pilot scientific workflow application for authoring and managing scientific services. Teams can implement automated services to validate, clean and synchronize cloud data with on-premises data. Pipeline Pilot supports a visual programming paradigm for constructing complex data flows and application extensions.
• ScienceCloud Project Data Collection: Enables creation of services for data upload, business rules standardization, scientific data validation and data synchronization with on-premises databases.
• ScienceCloud Project Documents Collection: Enables creation of services for uploading, downloading and searching a cloud-based document management system.
• ScienceCloud Notifications Collection: Enables creation of services for cloud and on-premises applications to generate notifications to bolster project communication.
• ScienceCloud Publication: Tools for testing, validating and publishing services in a public cloud environment.
• ScienceCloud Web Tasks: An environment for deploying data transfer tasks to project scientists, both in desktop environments and mobile devices.
• ScienceCloud Tasks: A mobile application for executing project-related tasks on smart phones and tablets.
• ScienceCloud Protocol Exchange: A website where scientific developers can share services with the external user community to drive rapid innovation.
Finalist: Dassault Systems, BIOVIA, Booth 416
Product Name: The Living Heart Model
http://www.3ds.com/heart
Advanced imaging modalities 3DCTA and DT-MRI were used to define the physical attributes of the heart structure, while the conservation laws of continuum mechanics capture the physical response. The Living Heart Model (LHM) contains well-defined anatomic details including internal structures (e.g., heart valves, chordae tendineae, coronary arteries and veins) and proximal vasculature (e.g., aortic arch, pulmonary trunk, and SVC). Muscle fiber orientations, which vary across the surface and thickness of the heart, are included, as are anatomically accurate representations of special cardiac electrical channels (bundle of His and Purkinje network). Cardiac contraction is driven by waves of electrical excitation traveling across the heart to generate physiologically observed wave propagation patterns. The mechanical behavior of heart tissue uses an anisotropic hyperelastic formulation for passive behavior and a time-varying elastance model for the active response. The LHM benefits from SIMULIA’s extensive library of nonlinear material behaviors and experience with biological materials. A closed system of fluid cavities and fluid links models blood flow. The fluid and solid models are directly coupled, and the systemic and pulmonary circuits are endowed with vascular compliances and flow resistances that can be modified to simulate exercise, hypertension, and other physiological states. Optionally, 3D blood flow modeling is available using smoothed particle hydrodynamics (SPH) for computational efficiency or by coupling the LHM with traditional CFD solvers if needed. To allow users the optimal balance between accuracy and efficiency, the LHM is available with three mesh variants, and computation times range from 4 to 24 hours on a 64-CPU workstation.
Dell in conjunction with Appistry, Booth 224
Product Name: GenomePilot v3.7.1
http://www.appistry.com/genomepilot
GenomePilot simplifies the analysis and processing of NGS data by integrating easy-to-use analytics with a patented high-performance computing infrastructure into a turnkey, single solution that can be executed and managed by anyone working with NGS data. GenomePilot is a client/server application configured as a single-user workstation on Dell PowerEdge T630 or as a higher-capacity processing platform with eight PowerEdge R730xd servers.
GenomePilot integrates three key components: analytics, compliance, and infrastructure. The analytics component simplifies how an organization works with NGS data, making the informatics and datasets more approachable. Through the guided, point-and-click interface, experienced and non-experienced users can build, configure, and run an NGS pipeline in minutes, choosing from industry-standard processes and tools: Sequencing Data Preparation (FastQC, Cutadapt, RevertToFastQ), Alignment (BWA mem, BWA aln), Sample Preparation (Qualimap, GATK Depth of Coverage, MarkDuplicates, GATK IndelRealignment, GATK BaseQualityScoreRecalibration), Variant Discovery (GATK HaplotypeCaller, SAMtools Mpileup, FreeBayes, Pindel, ControlFREEC, Contest+MuTect, SomaticIndelDetector, SomaticSniper), Phasing (GATK PhaseByTransmission), Merge Variants (vcf-merge + Appistry scripts), and Variant Analysis (SnpEff, SnpSift). Users can run analyses on a FASTQ or BAM input file and perform analyses for gene panel, exome, whole genome, tumor/normal, tumor/unmatched normal, and trio. For laboratories working with clinical data, GenomePilot delivers fully compliant functionality, providing version control for repeatable and compliant pipeline execution. Full auditing capabilities are available through the GenomePilot interface, ensuring compliance. GenomePilot sits on top of a unique and patented high-performance computing architecture delivering automated scheduling, work management, and dedicated processing with the ability to easily scale as sample volumes grow.
Finalist: DEXSTR, Booth 220
Product Name: Inquiro 2.2
http://www.dexstr.io/
Inquiro is a unique Scientific Knowledge Management Solution that provides an innovative approach based on the systematic use and exploitation of your scientific metadata. Our solution collects, manages, integrates and shares unstructured R&D data as part of a translational approach. Inquiro delivers greater insights from research to pre-clinical stage, simply through the benefits of being able to find and reuse data. It ensures that data is not only contextualized but also of much higher quality, and it enables traceability too.
Our solution is based on five main axes:
• Store and organize your unstructured data: Inquiro uses a Big Data storage engine that is scalable, resilient and secure, and centralizes the storage of all types and sizes of files.
• Capture the scientific context by integrating tools for manual and automatic curation of your data. Contextualizing your data will decrease the risk of data loss and promote reuse.
• Expand large scale collaboration thanks to Inquiro which goes beyond the “folder/file” paradigm by using your metadata to dynamically reorganize your files according to scientific criteria. This mechanism makes it possible to construct a 360° view of the scientific knowledge of your organization.
• Identify your data and connect them thanks to the use of a search engine that is specifically designed for scientific data, to reveal correlations between your data.
• Integrate your instrumentation and applications to Inquiro thanks to an API and existing connectors that offer semantic interoperability of data sets in order to facilitate the integration and analysis of your data.
Finalist: Eagle Genomics Ltd, Booth 548
Product Name: eaglediscover
http://www.eaglegenomics.com
At BioIT, Eagle Genomics is announcing the release of a new solution, eaglediscover, to directly enable pharmaceutical and biotech R&D executives and scientists to most effectively exploit their scientific data (internal, collaborator, and public data). eaglediscover builds upon the data sharing and collaboration capabilities of eaglecore, which was brought to market in 2014. eaglediscover brings to the industry a unique capability to attribute economic and scientific value to data through a statistical and probabilistic measurement framework and a learning-based biocuration engine. This approach effectively yields “smart data”.
eaglediscover is an expert-guided learning system that solves the industry problem by enabling the exploration of the meta-data rather than the raw data itself. Industry standard and proprietary ontologies are used to bootstrap the system. In this way, a contextually-relevant meta-data catalog is created, with the data being statistically scored for relevance based on the scientific and business questions being posed. eaglecore may then be employed to manage the data, e.g. enabling the generation of study-specific data marts.
eaglediscover can be deployed as a public, private, or hybrid cloud-based solution. Users interact with the system through an advanced web-based conversational interface that has been designed to enable rapid exploration of the data to identify the most valuable data sets.
At BioIT, Eagle will be demonstrating eaglediscover, operating on the ICGC data set.
EMC, Booth 249
Product Name: EMC DSSD D5
http://www.emc.com/dssd
DSSD D5 delivers ultra-dense, high-performance, highly available, and very low latency shared flash to up to 48 rack servers. It connects to the servers through redundant, active-active I/O modules. Each I/O module contains 48 PCIe Gen3 x4 lane ports. Client cards installed in standard PCIe Gen3 x8 server slots connect to the I/O modules via dual hot-pluggable PCIe Gen3 x4 cables, an industry first.
NVMe over PCIe is employed to enable parallel access to thousands of flash die across 36 flash modules (FMs). Each FM is connected to the world’s largest PCIe mesh via two separate PCIe Gen3 x4 lane connections, providing up to 8 GB/s of throughput to each FM. Applications use direct memory access (DMA) to move data to and from the FMs through DSSD’s Flood Client, which supports block, object, and third-party integrated plugins. DSSD’s I/O stack is a revolutionary cut-through design, which replaces the traditional CPU store-and-forward method.
DSSD D5 can be half populated (18 FMs) or fully populated (36 FMs) with either 2 TB or 4 TB FMs, and provides up to 144 TB of raw flash storage. Each D5 also includes dual redundant, field-replaceable I/O and Control Modules, as well as redundant and field-replaceable fans and power supplies.
In summary, DSSD D5 provides up to 144 TB of raw flash in a 5U chassis with up to 10 million-plus IOPS, up to 100 GB/s of throughput, and approximately 100 microseconds of latency.
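The capacity and per-module throughput figures quoted for the D5 can be sanity-checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not vendor data. It assumes the standard PCIe Gen3 signaling rate of 8 GT/s per lane with 128b/130b encoding, which comes from the PCIe specification rather than from the product description, and the function names are illustrative only.

```python
# Assumption (not from the product description): PCIe Gen3 signals at
# 8 GT/s per lane and uses 128b/130b encoding.
GT_PER_S_PER_LANE = 8.0
ENCODING_EFFICIENCY = 128 / 130

def pcie_gen3_gbytes_per_s(lanes: int) -> float:
    """Raw one-direction bandwidth of a PCIe Gen3 link, in GB/s."""
    gbits = GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes
    return gbits / 8  # convert gigabits to gigabytes

def raw_capacity_tb(flash_modules: int, tb_per_module: int) -> int:
    """Raw flash capacity for a given population of flash modules."""
    return flash_modules * tb_per_module

# Each flash module has two separate Gen3 x4 connections.
per_fm_gbs = 2 * pcie_gen3_gbytes_per_s(lanes=4)
max_capacity = raw_capacity_tb(36, 4)   # fully populated, 4 TB modules
min_capacity = raw_capacity_tb(18, 2)   # half populated, 2 TB modules

print(f"Per-FM link bandwidth: {per_fm_gbs:.2f} GB/s")
print(f"Raw capacity range: {min_capacity} to {max_capacity} TB")
```

Under these assumptions the two x4 links give just under 8 GB/s per flash module, consistent with the "up to 8 GB/s" figure, and 36 modules of 4 TB each give the stated 144 TB maximum.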
Finalist: EMC, Booth 249
Product Name: EMC MetaLnx v1.0
http://www.emc.com
EMC MetaLnx is an open-source web application developed for IT administrators, data engineers, and research investigators to capture, manage and apply metadata to research data collections. It is designed to work alongside iRODS (The Integrated Rule-Oriented Data System). It provides an intuitive graphical interface that supports iRODS administrative actions, collection management, and metadata management without requiring users to memorize individual iCommands. MetaLnx Templates give users the power to define, capture, edit and apply metadata to data, users and workflows. Users can search metadata and have granular control of permissions to data. Users can assign permissions on data collections, objects, groups and other users. The application allows administrators to monitor the system health and manage users, storage resources, and content. Data engineers can use the metadata tools to automate the extraction of existing metadata from various genomics data types (e.g., BAM, VCF, etc.) and additional annotation via structured sources. MetaLnx empowers research investigators to self-manage research data collections. Anyone with a basic understanding of a file browser can manage data collections, perform searches, and append additional metadata annotations either manually or via MetaLnx Templates. Users can perform large and bulk file uploads that eliminate traditional web browser file size limitations. MetaLnx requires Apache Tomcat, the iRODS 4.0.3 or later runtime API, MySQL or PostgreSQL, and Java. MetaLnx runs on CentOS 7 and Debian 7. Python 2.6 or later is required for the RMD service. The MetaLnx web application runs on Safari, Firefox and Chrome. MetaLnx will be freely available from EMC Code (http://emccode.github.io).
Exostar, Booth 448
Product Name: SecureShare v3.3
http://www.exostar.com/SecureShare/Secure_Collaboration_Life_Sciences/
Exostar’s SecureShare is a cloud-based, Software-as-a-Service, multi-tenant solution. It promotes information sharing in a highly-secure collaborative operating environment. SecureShare is built on a Microsoft SharePoint 2013 foundation tuned to meet the functional/security needs of the life science industry.
SecureShare’s architectural/deployment approach means the solution scales to perform as the number of organizations/individuals working together grows from 5 to 5,000 and beyond. A single instance of the solution is hosted at Exostar, allowing parties to be onboarded in as little as two days, not the weeks or months typically experienced by pharmas like Merck. End-users only need Internet access and a Web browser to use SecureShare. They receive single sign-on access to the solution and any back-end applications and services connected to it in a portal-style configuration.
SecureShare offers all the functionality organizations need to collaborate both internally and with external partners. Individuals can create team sites to serve as hubs of activity. The solution’s integration with Microsoft Outlook and Office makes it easy to build calendars, schedule meetings, and establish document libraries. SecureShare supports additional content creation via wikis, blogs, and newsfeeds, as well as integration to other Microsoft products such as Designer, InfoPath, and Visio.
SecureShare includes pre-configured and customizable workflows and document version control/check-in/check-out for information exchange that meets business process requirements. For information protection, the solution offers federated, claims-aware authentication to control access, along with encryption at-rest and in-transit functionality and audit logs to track compliance with corporate and regulatory standards.
Finalist: Genestack, Booth 137
Product Name: Genestack Platform
https://genestack.com/
Genestack is a universal enterprise-level genomics applications platform. It is a next generation operating system for big data problems, designed to run on heterogeneous compute architectures (cloud, cluster, PC, custom hardware), with bioinformatics-specific features. It helps build interactive applications and flexible computational pipelines within a secure collaborative ecosystem. Our “smart file” based virtual file system makes tools compatible across data format limitations and simplifies computations.
Genestack includes numerous computational pipelines for common workflows: whole genome and exome analysis, quality control, variant calling and annotation, transcriptomics for differential gene/isoform expression, transcriptome assembly, isoform discovery, methylation analysis and others. Interactive apps built and available on Genestack include a novel genome browser with computable tracks, interactive variation explorer for on-the-fly filtering and analysis of variants across populations, visual tools for quality control assessment and outlier detection, and others.
The platform has a powerful metadata system, making use of controlled vocabularies to help users annotate and harmonise data. We index public data from repositories worldwide, and map it to major ontologies. Our data browser can search data across private and public domains; the format-independent architecture makes it easy to combine and compute on data from diverse sources.
An SDK and powerful APIs exist for building Genestack applications. We collaborated with hospitals to put our platform on local servers and deliver patient reports, and with companies on interactive applications for exploring complex datasets. It is easy to get started and build and distribute amazing apps.
HGST, Booth 560
Product Name: HGST Active Archive System
HGST.com
The HGST Active Archive System is a modular object storage system that transforms silos of data storage into cloud-scale active archives. For data that requires long-term retention with easy and fast retrieval, the HGST Active Archive System provides unprecedented levels of accessibility, scalability, simplicity, and affordability.
Limitless Scalability with Linear Performance Scaling
The fully-integrated modular architecture starts at 1.5PB with capacity on demand, up to 4.7PB of raw storage in a single rack—with limitless scale-out. Just add additional storage racks to increase capacity and performance. Each rack delivers up to 3.5GB per-second throughput to clients, and the aggregate available performance scales with additional capacity. A single system can scale from one data center to many geographically dispersed locations, each one added as you need it with the same simplicity and scalability.
Highest Availability with Unbreakable Durability
Guaranteed data availability and integrity are essential. Patented HGST technology provides unmatched performance at greater than 15 nines data durability. The unique three-geo design capability ensures strong data consistency, so multi-location data centers can survive an entire data center outage yet still guarantee continuous availability of customer data.
Simple to Install and Use
This fully integrated rack-level system is up and running in minutes. Each unit is vertically integrated with object storage software, networking, servers, and storage in a standard 42U rack. Roll it into place, connect the power, configure the network connections, and the system is online, presenting an S3-compliant object interface that easily integrates with existing S3-aware applications.
IDBS, Booth 561
Product Name: E-WorkBook Connect
IDBS.com/en/platform-products/e-workbook/e-workbook-connect/
E-WorkBook Connect is delivered via the Cloud as Software-as-a-Service. Although it is a companion application for the E-WorkBook Platform, it has been designed from the ground up with an optimized multi-tenanted architecture hosted on Amazon Web Services. As a multi-tenanted application, it has been designed to be both highly performant and scalable. Because it has a ‘zero’ locally installed footprint, its mobile friendly web interface can be accessed by end-users anywhere with internet access on either traditional PCs or mobile devices using the most popular browsers (Safari, Chrome, Internet Explorer, Firefox).
E-WorkBook Connect provides a flexible licensing model that does not penalize you for adding new projects or repurposing user licenses for new accounts. B2B Collaboration demands that businesses be nimble and agile, being able to quickly establish new projects with minimal delay. To this end, we have designed E-WorkBook Connect so that it can be managed entirely by business users, without the need for intervention from corporate IT or us (IDBS), to allow teams to quickly set up new projects, invite new users and start collaborating within minutes.
Although E-WorkBook Connect can be used on its own, it is best deployed as a companion Portal for your existing corporate instance of E-WorkBook. Together, E-WorkBook + Connect provides a complete ecosystem for managing the lifecycle of both internal and external R&D data. Externally generated content can be retrieved back into E-WorkBook for analysis, review, reporting and long term IP retention.
Finalist: Illumina, Booth 161
Product Name: BaseSpace Cohort Analyzer
https://www.nextbio.com
BaseSpace Cohort Analyzer allows users with little experience or no access to bioinformatics resources to aggregate and analyze cohorts of patients by integrating complete clinical records with genomics data.
The platform is designed to integrate omics data with multi-dimensional and high variance phenotypic data and allow clinicians and researchers to launch population-level data analytics. We provide access to advanced analytical methods across thousands of patients with clinical and molecular data in real time. Cohorts of patients can be stratified and compared at the molecular and clinical level, in order to discover biomarkers, evaluate patients' response to therapies and inform recruitment of patients for clinical trials.
BaseSpace Cohort Analyzer has three key components:
• Optimized pipeline for incorporating patient records and molecular data in a secure private cloud. Clinical and molecular data is standardized and normalized to allow integration with thousands of patients from disparate public and private studies.
• Intuitive workflows to select patient cohorts based on clinical, molecular, project-based, and study-based criteria.
• Analytical and visualization tools within a simple interface that enables one-click generation of reports for one to thousands of patients.
Illumina, Booth 161
Product Name: BaseSpace Suite
http://illumina.com
To address these issues, Illumina developed BaseSpace Suite, a comprehensive, streamlined and fully integrated informatics solution to support end-to-end genomic sequencing. BaseSpace Suite consists of four key components built on an integrated software platform, and the suite as a whole is tightly integrated with Illumina sequencing instruments.
The first component of BaseSpace Suite is Clarity LIMS. Clarity is used to manage and track samples throughout the entire workflow, reducing errors and ensuring the traceability necessary in regulated environments.
Once generated, sequence data is automatically uploaded into Sequence Hub (previously BaseSpace Cloud), a safe and secure cloud environment to hold the increasing volume of sequence data. Sequence Hub has over 70 apps that can be pipelined together to perform secondary analysis, including customer-developed pipelines. Sequence Hub is also a collaboration environment allowing researchers to share data and analyses to advance genomic knowledge.
Once secondary analysis has been done, variant calling, based on rules defined by users, identifies variants of interest in the sample. BaseSpace Suite provides Variant Interpreter (currently shipping in Beta) for this task.
The final key stage of genomic data analysis is to move from the individual patient to a cohort view. BaseSpace Cohort Analyzer allows users without advanced bioinformatics experience, or without access to bioinformatics resources, to aggregate and analyze cohorts of patients by integrating complete clinical records with genomics data.
BaseSpace Suite provides the streamlined, comprehensive acquisition and analysis capabilities necessary to derive greater and higher quality answers from genomic data.
InterpretOmics, Booth 521
Product Name: iOMICS Research 4.0
http://interpretomics.co/
iOMICS Research is an end-to-end biological big data analytical platform that provides researchers with easy-to-use applications for comprehensive genomic knowledge discovery, from bacteria and crops to cancer. It enables users to analyze raw data from different types of NGS and Microarray experiments, and to conduct advanced functional and integrative analysis for mapping molecules to their functions. All these functionalities are packaged within 16 apps, available through the App Store. The apps interact with the Data Store, where users store and manage their data and results. A real-time dashboard provides status and statistics of runs. All iOMICS Research apps are automated to enable researchers to conduct end-to-end data analysis in three simple steps, namely Create, Analyze and Visualize, in which they create the project, input analysis parameters and finally visualize results. Users can repeat and rerun their analysis in part or whole at any point within the workflow, enhancing the repeatability, testability, and accuracy of results. The flexible design also allows easy integration and customization of proprietary external tools and databases. Applications are powered by algorithms optimized for genomics big-data analysis, applying complex mathematical and statistical models to produce robust results. Downstream and functional analyses are designed according to the complexities of the biological system to provide relevant actionable results. iOMICS Research is available in both cloud pay-per-use and on-premise versions. The on-premise version of iOMICS Research requires a minimum of 8 CPU cores, 16 GB RAM, and 2 TB of secondary storage. The cloud version provides a more affordable solution for collaborative research.
Lab7 Systems, Booth 145
Product Name: Lab7 Enterprise Science Platform
http://www.lab7.io
The flagship product from Lab7 Systems is the Enterprise Science Platform (ESP), a horizontally-integrated comprehensive software solution to manage data-intensive laboratories. The Lab7 ESP is a centralized lab data management platform that helps organizations and individual researchers track samples, process data, produce reports, and manage workflows and analysis pipelines – overall, increasing sample throughput and freeing space for new opportunity in the lab. The goal of the software is to enable all of the functional processes to occur under a single umbrella, allowing everything to be tracked from sample submission through results reporting.
The ESP is agnostic to scientific workflow, data-generating technology, and computing infrastructure. The platform can scale from laptop to clusters to the cloud, depending on the user’s requirements. As a Web-based application, the Lab7 ESP requires nothing more than a recent, HTML5-compliant Web browser such as Safari, Chrome, or Internet Explorer 9. It leverages existing job schedulers, file systems, and user models to integrate seamlessly into existing compute environments. The Lab7 ESP supports all major job schedulers, including PBS, SLURM, SGE/OGE, and Platform LSF.
Monocl Software, Booth 144
Product Name: Monocl EGO
http://monocl.com
We have spent the past three years (in a cave) developing a groundbreaking new software analytics platform from scratch. The first product based on this platform, Monocl EGO, was launched at the end of 2015 following 6+ months of beta testing with major pharmaceutical companies, instrument manufacturers and device companies.
Monocl EGO is a highly sophisticated SaaS expert analytics platform specifically designed for Life Science professionals. We have gathered essential information about millions of experts in one place and adapted it to your work flow, so that you don’t have to. Monocl EGO enables you to understand scientific experts, their capabilities and relationships in a completely new way. It enables you to identify new opportunities, prioritize activities and develop business relationships in a much smarter and more cohesive manner.
Within seconds of logging in, you can find relevant experts in any research area of interest. Carefully designed work spaces enable you to systematically prioritize among relevant experts, and the software even proposes experts that are similar to important stakeholders that you are already working with. This lets you expand your professional network beyond people you and your organization may already know.
Monocl EGO currently contains 6 million expert profiles based on 20+ million publications and 200,000+ clinical trials. We are currently implementing social impact analytics to enable users to understand the influencing capacity and social fingerprint of individual experts based on their activity and presence in social media, news, conferences and a wide range of other online sources.
ONTOFORCE NV, Booth 551
Product Name: DISQOVER
https://app.disqover.com/#login/
DISQOVER has an extremely scalable, fast cloud architecture enabling semantic search with unlimited user replication and data sharding. Our script and images server hosts most common cloud infrastructures like AWS or on premise RedHat. Installation is done in minutes to hours if you need a private setup for internal data integration.
Linking your private data to external or third-party data no longer requires copying data into data lakes or warehouses. DISQOVER federated search unifies and links disparate internal, third-party and external data very quickly and makes it scalable. Our public DISQOVER contains more than 100 data sources and links with your internal data in seconds.
Ease of use is key and you can add new data in hours and move to production in days. Our admin console for user management and our data integration console make all of that really simple.
Optimal user experience is a must. More than 2 years of research in design and human interface techniques make search over vast amounts of various data possible in one uniform and easy manner. Everyone becomes a data scientist with DISQOVER and minimal 10-15 YouTube video training is needed to get users started.
Other functionality includes saving searches and rerunning them later, alerts for data updates, reporting, sharing, and collaborating. We have also started opening and documenting our API and mining unstructured data.
DISQOVER has passed architectural reviews and security assessments in medium size to multinational companies. We will share testimonials at our booth.
PerkinElmer Informatics, Inc, Booth 216
Product Name: PerkinElmer Signals
https://www.perkinelmer.com/informatics
PerkinElmer Signals is a cloud-based data integration platform that consumes information from many sources and relates those data to scientifically meaningful concepts. Just as there are patterns within scientific workflows, there are patterns within biological data. Specifically the data can be categorized into three distinct types: Raw Data, Measurement Data, and Scientific Entities. The design of PerkinElmer Signals utilizes distinct technologies to handle each of those fundamental data types. By recognizing the different requirements of each data type, we are freed to use the most appropriate informatics technologies and integrate a polyglot database back-end into an integrated whole.
PerkinElmer Signals is hosted in Amazon Web Services (AWS) and is provided as a SaaS product. The platform enables translational scientists to connect to their data sources, identify cohorts of interest, and analyze that data using TIBCO Spotfire. The API layer connects users to public data sets such as GEO or tranSMART and adds data from instruments and other sources in the Amazon Cloud. The data is mapped, normalized and searched in the cloud before it is seamlessly imported into TIBCO Spotfire for analysis.
PetaGene Ltd, Booth 552
Product Name: BayesQual 1.0
http://www.petagene.com
We are proud to launch BayesQual at Bio-IT World 2016. BayesQual is a drop-in replacement for the BQSR stage of genomics pipelines. It is a command-line tool that is easy to deploy into existing pipelines.
BayesQual improves upon GATK's BQSR by adjusting quality scores using a Bayesian model of sequencing error. The resultant files actually have better genotyping accuracy than with GATK's BQSR, and the BAM files are also 3-4x smaller. When stored as CRAM files they are over 8x smaller than the BAM file from GATK’s BQSR, and 5-6x smaller than the CRAM equivalent. By replacing GATK BQSR with BayesQual, users benefit from improved genotyping accuracy, as well as much smaller storage footprints.
BayesQual is available for RedHat, CentOS, Fedora and Ubuntu-based distributions. It operates on SAM, BAM and CRAM file formats. It requires 24GB of memory, and processes raw NGS data at 20-40MB/sec on a quad-core i7.
Finalist: PetaGene Ltd, Booth 552
Product Name: PetaSuite 1.0
http://www.petagene.com
PetaSuite is a set of complementary software tools that significantly reduce the size and cost of NGS data for storage and transfer. PetaSuite lets researchers and clinicians continue using their FASTQ, BAM, and CRAM files in their existing tools and pipelines, but benefit from a reduced backend storage footprint. It can integrate into most existing storage infrastructures to provide transparent compression.
Unlike generic storage software, PetaSuite understands the internals of genomics files. For lossless storage, PetaSuite offers cost reductions of up to 4:1 compared to BAM or gzipped FASTQ files. When used with our revolutionary Bayesian approach to genomic quality score compression, genotype accuracy is preserved or even improved while reducing storage size by 5:1.
For example, on Illumina Hi-Seq-X 30x WGS human sample NA12878, the original FASTQ.GZ files are 73.73GiB in size, whereas with PetaSuite this is reduced to 13.69GiB (5.3x smaller). Moreover, it is still accessible in FASTQ format for pipelines to use.
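The reduction quoted for NA12878 can be reproduced directly from the two file sizes. This is a minimal sketch using only the figures given above; the helper names are illustrative and are not part of PetaSuite.

```python
# File sizes quoted for the NA12878 30x WGS sample, in GiB.
ORIGINAL_GIB = 73.73    # gzipped FASTQ
COMPRESSED_GIB = 13.69  # after PetaSuite compression

def compression_ratio(original: float, compressed: float) -> float:
    """How many times smaller the compressed data is."""
    return original / compressed

def space_saved(original: float, compressed: float) -> float:
    """Fraction of the original storage footprint that is freed."""
    return 1 - compressed / original

ratio = compression_ratio(ORIGINAL_GIB, COMPRESSED_GIB)
saved = space_saved(ORIGINAL_GIB, COMPRESSED_GIB)

print(f"Compression ratio: {ratio:.2f}x")
print(f"Storage saved: {saved:.1%}")
```

The ratio works out to roughly 5.39, which the text reports as 5.3x, and corresponds to a little over 81% less storage for this sample.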
PetaSuite consists of several complementary software tools:
• FasterQ: FASTQ compression at 140MB/sec (4-core i7), smaller than CRAM, uses 4GB of RAM. Streaming compression/decompression for file transfer acceleration.
• BayesCal: revolutionary approach to quality score refinement for BAM/CRAM/FASTQ, calculates a more complete Bayesian estimation of sequencer error. Improves compressibility by 2-3x while preserving/improving genotyping accuracy. It requires 24GB of memory.
• PetaVFS: virtual file system that provides high performance random access BAM/FASTQ virtual files representing CRAM/FasterQ compressed data. It also can split out internals of NGS data across storage tiers for lossy and lossless access.
Precision for Medicine, Booth 344
Product Name: PATH Analytics Platform
http://www.precisionformedicine.com
The PATH Analytics Platform provides a unique combination of novel, proprietary algorithms for genomic analysis and knowledge generation housed within a secure, web-based application. The platform has three key components: PATH Select, PATH Stratify and PATH Explore. PATH Select is an ensemble based machine learning and artificial intelligence algorithm that has been specifically engineered to leverage the hierarchical nature of genomic data and perform high-throughput feature selection. PATH Stratify is a hierarchical Bayesian framework with latent variable estimation that is geared toward further refining the feature space while simultaneously estimating complex genomic signatures for use in directly testing for a subgroup of patients with enhanced treatment response. PATH Explore is the point-and-click web-based user interface and knowledge generation tool, providing access to interactive data visualization, customizable and downloadable tables, listings and figures (TLFs), gene- and variant-level annotation, and insights from a variety of integrated databases and bioinformatics tools.
The platform is a highly scalable, HIPAA-compliant, cloud-based solution with high-performance computing powered by Amazon Web Services’ EC2 and S3 infrastructure. The hosted environment has global security certifications and compliance verifications for SOC 2 Type II and SOC 3, and complies with the ISO 27001 standard, SSAE 16 / ISAE 3402, NIH Approved System Security Plan, OMB Circular A-130, NIST IT System Security and FIPS 140-2 Level 3 Certified Data Encryption. The PATH framework also provides the necessary agility to integrate other service options such as secondary NGS pipelines, access to public or proprietary databases, and advanced annotation and logical interpretation workflows.
Prysm, Booth 533
Product Name: Prysm Enterprise
http://www.prysm.com/
With Prysm Enterprise, the newest offering in Prysm’s Visual Workplace portfolio, applications, content, video conferencing and the web are all combined into cloud-based visual workspaces where anyone can create, edit, share and store work, and then go back and re-access the saved workspaces later from any location via the cloud. The offering includes:
• Prysm Cloud: application server(s), dedicated or multi-tenant option
• Prysm Application Suite: Software that enables users to collaborate and store content in cloud-based workspaces
• Prysm Displays: with standard sizes of 65”, 85”, 98”, 117” and 190”, as well as custom sizes to fit any size meeting room(s).
• Prysm Mobile: Web browser access for any mobile device including Apple iOS, Android, and Windows
• Services & support: A single point of contact for service and support, and a single service supplier
• A new SaaS-based pricing model for the software
Psyche Systems Corporation, Booth 513
Product Name: NucleoLIS 2.0
http://www.psychesystems.com
Molecular testing is an incredibly complex process, and NucleoLIS is built to streamline its diagnostically diverse workflows. NucleoLIS is a fully automated solution for PCR, immunology, FISH, karyotyping and DNA sequencing, designed to support the workflow, deviations, reflexing and complex reporting requirements of the molecular lab. It is a Windows-based client/server application built in .NET for ease of integration, standardization, and scalability.
Qumulo, Booth 349
Product Name: QC208
http://qumulo.com/products/specifications/#tabnav
The Qumulo QC208 hybrid storage appliance is the second hardware product in Qumulo’s Q-series portfolio. Qumulo Core, the world’s first data-aware scale-out network-attached storage (NAS), is available on QC208 for capacity-optimized large-scale deployments.
Qumulo QC208 is a 4U commodity hardware appliance that provides 208TB of raw HDD capacity and 6.2TB of raw SSD capacity per node to give users a higher density, capacity-optimized, lowest cost per GB data-aware, scale-out NAS solution. A minimum four-node cluster provides 832TB of raw storage capacity at a low cost per terabyte and can be scaled out simply by adding additional QC208 nodes.
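The cluster sizing described above is simple multiplication of the per-node figures. The short Python sketch below restates it; the function name `raw_capacity_tb` is an assumption made for illustration, and the numbers are raw capacity as quoted, not usable capacity.

```python
HDD_TB_PER_NODE = 208.0  # raw HDD capacity per QC208 node, from the text
SSD_TB_PER_NODE = 6.2    # raw SSD capacity per QC208 node, from the text
MIN_NODES = 4            # minimum cluster size, from the text


def raw_capacity_tb(nodes: int) -> tuple[float, float]:
    """Return (raw HDD TB, raw SSD TB) for a cluster of the given size."""
    if nodes < MIN_NODES:
        raise ValueError(f"a cluster needs at least {MIN_NODES} nodes")
    return nodes * HDD_TB_PER_NODE, nodes * SSD_TB_PER_NODE


hdd, ssd = raw_capacity_tb(4)
print(f"{hdd:.0f} TB raw HDD, {ssd:.1f} TB raw SSD")
```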
Finalist: SAP, Booth 116 & 118
Product Name: SAP Medical Research Insights 2.0
https://icn.sap.com/projects/sap-medical-research-insights.hl
SAP Medical Research Insights, powered by SAP HANA, is a browser-based application designed for use in medical and clinical research. This application combines structured and unstructured clinical information from various sources, such as clinical information systems, tumor registries, biobank systems, and even text documents like physicians’ notes. With this application, users can filter and group patients according to different attributes, which can be customized for different research purposes. In addition, users can perform Kaplan-Meier estimations on the fly to compare survival rates between different cohorts or individuals. Users can also explore patient data of an individual or a cohort in a variant browser that presents a circular view of the whole genome data set, with the ability to interactively zoom in to regions of the genome that are of interest. In addition, this application offers a comprehensive overview of each patient’s medical history in a chronological timeline, making it easy to access information on any level of detail.
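As a rough illustration of the Kaplan-Meier estimation mentioned above, the following Python sketch computes a survival curve from follow-up times and event flags. It is a generic textbook version of the estimator, not SAP's implementation; the function name `kaplan_meier` and the sample cohort are assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time for each patient.
    events: 1 if the event (e.g. death) was observed, 0 if the patient was censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical cohort: follow-up in months, with two censored patients.
cohort = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for time, prob in cohort:
    print(f"t={time}: S(t)={prob:.2f}")
```

Comparing two cohorts then amounts to calling the function once per cohort and plotting or tabulating the two curves side by side.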
SciBite, Booth 129
Product Name: TERMite 5.9
SciBite.com
SciBite offer a complete semantic services platform that can be used as a data analytics solution by end users and also as a ‘pluggable’ component to transform existing IT infrastructures into more scientifically aware systems.
The core system is an API that scans scientific text and rapidly identifies the key concepts stated, such as drugs, proteins, companies, targets, outcomes, measures. In doing this, the unstructured text is transformed into ontology-based indexed data.
Built in Java, it will happily run on a personal laptop, on real and virtual servers, and in the cloud. It’s incredibly simple to install as a standalone RESTful service that can be accessed by a multitude of clients and applications.
Whilst the platform is first and foremost a powerful API, it can easily be expanded, and SciBite offer a series of end-user applications for specific use cases. These include relationship extraction (drugs causing side effects, phenotypes seen in disease states, regulatory authority decisions etc.), tools that transform the way project teams share documents, and browser-based tools that perform data integration “on the fly” and connect social networks of colleagues with shared interests.
Supporting the platform is a hand-curated reference library of over 80 industry-focused vocabularies containing over 20 million synonyms. The vocabularies use public identifiers, which aids data integration, and cover a wide range of life science topics including drugs, pathology, pharmaceutical sciences, and commercial and business activities. Many-fold enriched over any publicly available alternatives, these form a crucial part of managing the synonymous and ambiguous language found in unstructured scientific text.
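The text-scanning step described above, in which synonyms found in free text are mapped to vocabulary identifiers, can be sketched with a simple dictionary lookup. The Python example below is a toy illustration of that general idea and not TERMite's API; the vocabulary, the identifiers `D001` and `P001`, and the function name `tag_entities` are all hypothetical.

```python
import re

# Hypothetical mini-vocabulary: entity type -> identifier -> synonyms.
VOCAB = {
    "DRUG": {"D001": ["aspirin", "acetylsalicylic acid"]},
    "PROTEIN": {"P001": ["cyclooxygenase-1", "COX-1"]},
}


def tag_entities(text):
    """Return synonym matches in text, each mapped to its vocabulary identifier."""
    hits = []
    for entity_type, entries in VOCAB.items():
        for identifier, synonyms in entries.items():
            for synonym in synonyms:
                # Match whole terms only, ignoring case.
                pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
                for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                    hits.append({
                        "type": entity_type,
                        "id": identifier,
                        "match": m.group(0),
                        "start": m.start(),
                    })
    return sorted(hits, key=lambda h: h["start"])


hits = tag_entities("Aspirin irreversibly inhibits COX-1.")
for hit in hits:
    print(hit)
```

Because different synonyms resolve to the same identifier, documents that mention "aspirin" and "acetylsalicylic acid" end up indexed under one concept, which is the point of a curated synonym library.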
Finalist: Seven Bridges, Booth 454
Product Name: The Cancer Genomics Cloud
http://www.cancergenomicscloud.org
The CGC allows researchers to immediately and securely access the complete TCGA dataset on the cloud, perform analyses, visualize the results, and do so collaboratively. The dataset includes raw and processed data from whole genome, whole exome, RNA, microRNA and bisulfite sequencing studies, as well as array-based studies. TCGA data has two tiers: Controlled Access data, in which patients are potentially identifiable, and Open Access data, in which they are deidentified. Users’ dbGaP approvals are automatically associated with their CGC account using eRA Commons credentials, and data can be easily found and used. Private data can also be imported to the CGC using one of several rapid mechanisms.
Researchers using the CGC can interactively query TCGA data based on cases, file types, and their associated clinical metadata with the data browser. A visual case explorer allows users to browse the mutation status and expression levels of a gene in all patients with a particular disease. Users can privately analyze their own data alongside TCGA using pre-built bioinformatics workflows. Moreover, the CGC software development kit allows users to port their own tools to easily run in a cloud environment. Every analysis is fully reproducible, as the CGC captures the data, parameters, and specific version of a tool for every execution. Project management tools allow teams to collaborate simply, securely, and transparently.
Signet Accel, Booth 340
Product Name: Avec
http://www.signetaccel.com
Avec is a proven, ready-to-deploy commercial federated data integration platform purpose-built to bring true interoperability to the healthcare ecosystem and address the fundamental issues surrounding data sharing. Initiated at The Ohio State University, it was developed and perfected with an investment of more than 13 years and $20 million.
Avec connects disparate data as it is and where it is, regardless of its origin. It enables analysis of complex, distributed healthcare data in a highly secure manner and protects ownership and control of data at each site. It doesn’t require changes to the process of collecting data, how it’s stored, where it’s stored, how it’s structured or what language it speaks.
1. Lightweight/easy to deploy: Requires minimal infrastructure investment, and can be deployed on premise and/or in the Cloud;
2. Scalable: Scales and evolves gracefully to support changing biomedical big data standards, types, and models, independent of any and all data generating technologies;
3. Compatible: Offers proprietary structural mapping and is compatible with all standard vocabularies;
4. Secure: Security mechanisms enable data stewards/owners to maintain uncompromising control over their data and determine how it is shared and for what purpose, addressing ownership, stewardship, and valuation concerns;
5. Spans boundaries: Able to create dynamic data-sharing and collaborative analytics ecosystems that span traditional geographic, temporal, and organizational boundaries;
6. Maximizes current investments: Cost distribution ensures each participant only supports expenses directly aligned with their participation, and costs and values therein.
7. Allows investigators to ask more questions and receive more complete answers.
Simulations Plus, Inc., Booth 51
Product Name: ADMET Predictor 8.0
http://www.simulations-plus.com/Default.aspx
ADMET Predictor 8.0 is a sophisticated, Windows-based software program that is used by medicinal and computational chemists, toxicologists, and DMPK (drug metabolism and pharmacokinetic) scientists. It includes tools to analyze high-throughput screening data, cluster compounds by scaffold, explore structure-activity relationships, visualize data, create quantitative structure-activity relationship (QSAR) models, generate structural analogs, propose novel scaffolds, predict ADMET properties (over 140), generate metabolites, and predict cytochrome P450 kinetic properties. Our models have been consistently ranked number one in peer-reviewed journal articles on head-to-head comparison with our competitors. Our pKa training sets have been greatly expanded through collaboration with Bayer HealthCare, resulting in significant improvements to model accuracy. Our cytochrome P450 models include predictions for Michaelis-Menten kinetic parameters that enable prediction of the percent yield of each metabolite.
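To show how Michaelis-Menten parameters could translate into metabolite yields, the Python sketch below applies the standard rate equation v = Vmax·[S] / (Km + [S]) to each metabolic pathway and reports each pathway's share of the total rate. Treating percent yield as a share of total rate at one substrate concentration is a simplifying assumption made here, not a description of ADMET Predictor's method. The function names and parameter values are hypothetical.

```python
def mm_rate(vmax: float, km: float, s: float) -> float:
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def metabolite_yields(pathways: dict, s: float) -> dict:
    """Return each pathway's percent share of the total metabolic rate.

    pathways maps a metabolite name to its (Vmax, Km) pair.
    """
    if s <= 0:
        raise ValueError("substrate concentration must be positive")
    rates = {name: mm_rate(vmax, km, s) for name, (vmax, km) in pathways.items()}
    total = sum(rates.values())
    return {name: 100.0 * rate / total for name, rate in rates.items()}


# Hypothetical parameters: two metabolites with equal Vmax but different Km.
yields = metabolite_yields({"M1": (10.0, 5.0), "M2": (10.0, 15.0)}, s=5.0)
for name, pct in yields.items():
    print(f"{name}: {pct:.1f}%")
```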
Starfish Storage, Booth 238
Product Name: Starfish
http://www.starfishstorage.com
Starfish is a suite of software modules that interact with one another to create a holistic, managed storage environment.
At the heart of Starfish is the Core Catalog Server which is a database that tracks the contents of conventional file storage devices and cloud-style object stores. Users and applications are able to associate metadata with files and directories, thus building awareness of the business and scientific value of the files. Starfish employs state of the art techniques to synchronize its database with file systems scaling into the billions of objects and multiple Petabytes of capacity.
The metadata system enables the other major Starfish modules, which include:
• Rules Manager / Jobs Engine -- Starfish enforces rules by running scheduled batch jobs based on metadata values in the catalog. Jobs are run in parallel across a multitude of agents. Agents have built in functionality such as data migration and hash calculation, but they can also run custom code.
• Report Engine -- Starfish can generate much more meaningful reports than traditional tools because metadata values are used to define the results set. Starfish enables very specific chargeback or show-back reports as well as detailed utilization and trending reports.
• Namespace Solutions -- Starfish metadata is the foundation for a global namespace that provides unique identifiers for all files and directories.
All of these modules interact through a RESTful API. Thus, Starfish serves as a middleware for both home-grown solutions and for any data management solution that references files stored on conventional file systems and object stores.
Finalist: The iRODS Consortium, Booth 148
Product Name: iRODS 4.1 with Cloud Browser 1.1
https://irods.org
iRODS is open source data management software for storing, searching, organizing, and sharing files and datasets that are large, important, and complex. Thousands of businesses, research centers, and government agencies worldwide use iRODS for flexible, policy-based management of files and metadata that span storage devices and locations.
iRODS is based on four main data management concepts: data virtualization, which provides a global namespace for accessing data stored across different technologies and systems; data discovery based on user- and system-generated metadata; workflow automation that processes, moves, and controls access to data based on any relevant event trigger; and secure collaboration, allowing data to be shared in place from remote namespaces, with the consistency of the locally-provided interface.
Titian Software, Booth 223
Product Name: Mosaic SampleBank
http://www.titian.co.uk/mosaic-products/mosaic-samplebank/
Titian's Mosaic SampleBank is a fully optimised, pre-configured software package for small molecule and biological inventory management. SampleBank enables a seamless start-up for full inventory tracking, coupled with sample ordering and workflow management, along with the capability to integrate with an extensive range of instrumentation and automated stores.
Mosaic SampleBank helps you to:
• Revolutionise your workflows with error free supply chain management
• Organise your samples as a corporate resource rather than existing in silos
• Free your scientists from tracking down samples so they can focus on research
• Get the most efficient use from your automation and other capital equipment
• Provide a scalable solution able to grow to meet your future demands
• Encapsulate best practice for your sample management requirements, based on over 15 years of industry knowledge
• Hit the ground running – it’s fast to deploy and simple to implement
Mosaic SampleBank functionality includes:
• Inventory tracking to see exactly which samples are where at any time
• Comprehensive audit trail to add integrity to your process
• Powerful, intuitive ordering interfaces for a guaranteed and seamless sample supply chain
• Automation integration to reduce human errors in sample preparation and associated data handling
• In-built sample workflow management for optimised manual or automated sample preparation
• Sample properties for multiple sample types e.g. tissue, cells, antibodies, can be recorded in the same system
• Web-based access means SampleBank is easy to deploy.
Univa Corporation, Booth 235
Product Name: Univa Grid Engine Container Edition
http://www.univa.com/
Univa Grid Engine Container Edition software accelerates the processing of massive amounts of data and sophisticated analyses to increase productivity in biomedical research and drug development processes. Its ability to scale to large clusters and high-throughput workloads, coupled with its ease of application integration, makes it the solution of choice for hundreds of companies in the Bio-IT World community, including life sciences, pharmaceutical, clinical, and healthcare organizations.
It provides container orchestration and maximum utilization of IT resources through:
• Workload Scheduling
• Enterprise-Grade Management
• Container Orchestration
• Complete Job Control
• Docker Support
• Pre-Emption Capability
• Proven Scale
Wiley, Booth 534
Product Name: Wiley Spectra Lab 1.0
http://www.wileyspectralab.com
Launching March 7th, Wiley Spectra Lab is an expert spectral data system that uses empirical spectral data and advanced software to help chemists, toxicologists, and life scientists confidently identify chemical substances. Customize your spectral search to meet your needs with combinations of over 175 spectral databases sourced from Wiley, Bio-Rad Sadtler, and others to provide the focus required by technique and analyte.
Finalist: Wiley, Booth 534
Product Name: ChemPlanner 1.0.4
http://www.chemplanner.com/
Wiley ChemPlanner can make creating routes faster and easier. Using a combination of novel reactions and curated information, ChemPlanner delivers computer-aided synthesis design backed up by millions of empirical reactions. Wiley ChemPlanner builds the route for you! Simply plug in your target compound and your starting material and ChemPlanner delivers a wide variety of diverse and viable routes in a matter of minutes.
The tool, launched in September 2015, is currently sold as a Software as a Service (SaaS) solution hosted on secure servers, with a local installation version coming this year.
Supported browsers are Firefox 30+, Chrome 35+, Internet Explorer 9+, Safari 6+, and Opera 23+. Note: Java is not required.
Finalist: WuXi NextCODE, Booth 528
Product Name: WuXi NextCODE Exchange/Simons Simplex Collection portal (v1)
http://www.wuxinextcode.com
The SSC comprises nearly 10,000 whole exomes, at raw read resolution, and thousands of phenotypic variables in some 2600 families. Each family includes one child with an autism spectrum disorder and their unaffected parents and siblings.
Through this portal, all of the SSC data (in full resolution, not just variant sets), as well as standard and bespoke reference data, can be used by researchers without specialist informatics expertise over ordinary internet connections. Users interact with the data not by choosing from a menu of unconnected tools, but directly through WuXi NextCODE’s interfaces. These include the Clinical Sequence Analyzer, for interpreting individual genomes or families, and the Sequence Miner interface, which runs case-control queries of any scope using either built-in menus or more advanced query tools to define both genomic and phenotypic parameters.
Among the built-in capabilities of these interfaces are de novo and paralog detection; carrier analysis; filters for allele frequency and variant impact prediction; and variant aggregators to increase the statistical power for identifying rare variants. The results of queries are viewed in the context of all major public reference data, SFARIGene and other ASD gene lists, and any variant can be instantly visualized in raw sequence. Import and merge functionality using the same tools enables researchers to incorporate their own data and interrogate it simultaneously with the SSC.
The data is stored in WuXi NextCODE’s elastically scalable, HIPAA-compliant cloud powered by DNAnexus.