deRSE19 - Conference for Research Software Engineers in Germany

»Common Workflow Language (CWL)-based software pipeline for de novo genome assembly from long- and short-read data«
2019-06-05, 14:30–15:00, Building A56: Conference Hall

Reference-quality genome assemblies are essential for reliable gene prediction and post-genomic analyses. We created an automated pipeline, written in the Common Workflow Language (CWL), for the de novo assembly of genomes from PacBio long-read and Illumina short-read data. The pipeline automates the installation and execution of a host of software tools, overcomes the challenges of achieving repeatable and reproducible assembly results, and offers a platform for re-using the workflow and integrating diverse data sets. The resulting assemblies meet the high standards set by the National Human Genome Research Institute (NHGRI) and can underpin accurate gene predictions and expanded genomic analyses.