To the many attendees and speakers: we can’t thank you enough for spending time with us in Cambridge, Mass.! For those who could not join us in person, below is a short summary about how OmicSoft’s content and software products are being used by pharma and biotech scientists to develop the next round of drug candidates and therapies. And in case you missed it, don’t forget to check out our day 1 highlights as well.
We were delighted to have Paul Jung, senior biology data scientist from AbbVie, give a presentation about how his group has utilized OmicSoft technology to build an ’omics data hub for his company. He laid out the challenge facing many IT teams in pharma and biotech: maximizing the value of disparate data sets, gathered with different platforms, and represented in diverse formats. His group is focused on enabling cross-platform analysis for microarray, proteomics, NGS, and chemical biology data. To start with, they addressed the need for centralized, secure storage; a controlled, common vocabulary; and flexible access control. They chose Array Studio in the cloud, developed a controlled vocabulary, and made it possible for AbbVie scientists to use custom annotations and to import external analysis files. Jung demonstrated some impressive applications of Array Studio, such as plotting compound potency versus ’omics data to power early discovery efforts. He said that, to date, this approach has been used to process more than 26,000 samples from AbbVie projects, and that his team has helped establish 14 internal Lands and seven internal clinical Lands data sets along the way.
QIAGEN’s Matt Newman offered a look at the custom curation services available for OmicSoft users. We have found that many customers would like additional content added to Lands for greater coverage of a certain disease or subject area, but they simply don’t have the internal resources to perform the strict and methodical curation necessary. It’s a tedious process; we can attest to that, since we do it every day! To help users get past this phase and into the more interesting science it facilitates, we can take on custom curation projects either for specific data sets or for a particular subject area. We’ll select and retrieve the most useful and relevant data, perform statistical modeling, extract metadata, establish fields, assign the data to curators, edit the results, and then handle data merging and QC before publishing it to Lands. This way, users can have confidence that every bit of data they encounter in OmicSoft tools has been through the same rigorous curation, quality control, and review process for the best scientific results. Newman also noted that in 2018 we’ll be expanding these services to include analysis as well, a frequent request from customers at companies without much bioinformatics expertise.
Also, our own Nirav Amin presented several step-by-step case studies showing how to use OncoLand. Those can be viewed on our wiki.
We’ll be releasing videos and other materials to provide more details about the OmicSoft User Group Meeting. Stay tuned!
In case you couldn’t join us in Cambridge, Mass., for the event, we’ve rounded up some highlights from day 1 to give you a sense of how OmicSoft’s products are being used by pharma and biotech researchers in drug discovery every day, and where the future is taking us as we integrate analytic software and curated content with the capabilities of QIAGEN’s molecular biology product offerings. We’d like to thank all the speakers and attendees who have made the meeting a success so far!
The agenda for day 1 included three great customer and collaborator presentations. Principal scientist Xiang (Sean) Yao from Janssen, a division of Johnson & Johnson, has been using OmicSoft tools for 10 years. “I’m a true believer in data integration,” he told attendees, recalling his team’s realization that they didn’t have the capacity internally to fully process and integrate their gene expression data in an efficient and cost-effective manner. He was an early adopter of OmicSoft’s Array Server, Land tools, and cloud computing services for comprehensive target profiling, systematic gene annotation, and data mining. These tools have been key enablers for curating hundreds of public and internal studies in relevant drug discovery areas, such as neuroscience and cardiovascular disease, and allowed his team to supplement public data sources with samples covering underrepresented tissue types in particular. Having access to so many studies gives Yao and his team greater confidence in results, because they can see at a glance whether intriguing results occur consistently in many projects, or are outliers from a single experiment. He said that OmicSoft tools have streamlined data analysis and made it possible to focus internal efforts on late-stage analysis.
From Merck, Associate Principal Scientist Jianchao Yao spoke about using OmicSoft tools for RNA-seq data pipelines. He described a robust workflow that runs thousands of RNA-seq projects, estimating that OmicSoft technology has automated about 80 percent of that process and is used for nearly all of these projects. He also offered advice to anyone dealing with a massive amount of data, such as carefully controlling metadata and establishing a mindset of good data integration.
OmicSoft collaborator Jun Ye, CEO of Sentieon, talked about the lightning-fast algorithms his team has built to accelerate the GATK and related variant calling pipelines used for precision medicine. By re-engineering these pipelines, Sentieon was able to run its version of GATK as much as 50 times faster — a dramatic savings not only in time but also in computational cost. This approach, which also offers consistent results with higher accuracy, is now powering alignment and variant calling for NGS data and enabling large-scale pairwise correlations in OmicSoft. Users can start a job in the OmicSoft interface, automatically process the data with Sentieon, and have results deposited directly into a Land tool for further analysis.
We also had several great talks from people on the QIAGEN Bioinformatics team. OmicSoft founder Jack Liu spoke about the synergies of becoming part of the QIAGEN family and introduced LandPortal, a new web client currently in early access designed to expand OmicSoft tool use to non-bioinformaticians and to facilitate easy sharing of results. He also mentioned recent improvements to existing tools: more than 53,000 samples have been added this year, and rat has been added as a model. Looking ahead, Liu said Array Studio will be harmonized with the new web portal, the Array Suite platform will see a major upgrade, and deeper integration with other QIAGEN tools such as Ingenuity Pathway Analysis® (IPA) will continue to advance.
An early example of that integration is the already powerful connection between OmicSoft and IPA, described in a pair of talks from Joseph Pearson and Jean-Noel Billaud. Using a study of anti-PD1 resistance in melanoma for illustration, they showed how RNA-seq data processing could be performed in Array Studio followed immediately by interpretation in IPA. Once in IPA, users can explore key signaling pathway activities, biological processes, splice variants, and more. For this data, Billaud presented results indicating upregulation of the epithelial-to-mesenchymal transition in melanoma non-responders, suggesting likely tumor progression to metastasis.
Finally, QIAGEN’s Frank Schacherer shared how the OmicSoft acquisition fits into the big-picture view of the company’s long-term strategy for bioinformatics. The addition of OmicSoft’s deep drug discovery informatics capabilities to the portfolio of solutions from Ingenuity Systems, CLC bio, and BIOBASE has allowed QIAGEN to help users get from the very first steps of sample prep all the way to extracting new knowledge from an experiment. That Sample to Insight approach inspired what is now the largest bioinformatics portfolio in the world, helping users publish more than 30,000 papers and adding 3,500 new data elements or findings every day. Schacherer said QIAGEN is committed to the idea that the goal isn’t algorithms, it’s knowledge — and helping users accelerate their journey to new hypotheses.
We’ve still got another day of great talks to go, and will update the blog with more highlights soon!