Latest improvements for QIAGEN CLC Genomics Server
QIAGEN CLC Genomics Server 23.0.5
Release date: 2023-09-20
Shared with workbenches
Improvements
- Detect and Refine Fusion Genes has a new option allowing fusions of overlapping genes on opposite strands to be reported.
- Previously, when annotation tracks were exported to BED format files, the Score column in the exported file contained only 0 values. Now, if the annotation track contains a Score column, those values are reported in the Score column of the exported file. (This does not affect expression tracks, where the expression value is exported as the score.)
- VCF Import can import VCF files with an unexpected number of values in CLCAD2 or AD. This includes VCF files produced by VarScan2.
- Various minor improvements
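To illustrate the BED export change above, here is a minimal sketch of carrying an annotation's score into column 5 of a BED line, falling back to 0 when no score is available. The `Annotation` class and `to_bed_line` function are hypothetical names invented for this example and are not part of the CLC software. Rounding and clamping the score to the 0-1000 range is an assumption based on the common BED convention, not documented CLC behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical stand-in for one row of an annotation track."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)
    name: str
    score: Optional[float] = None  # None models a track without a Score column
    strand: str = "."


def to_bed_line(a: Annotation) -> str:
    """Format one annotation as a 6-column BED line."""
    if a.score is None:
        score = 0  # fallback when the track has no score
    else:
        # Assumption: round and clamp to the conventional 0-1000 BED range.
        score = max(0, min(1000, int(round(a.score))))
    return "\t".join(
        [a.chrom, str(a.start), str(a.end), a.name, str(score), a.strand]
    )


if __name__ == "__main__":
    print(to_bed_line(Annotation("chr1", 100, 200, "geneA", 742.0, "+")))
    print(to_bed_line(Annotation("chr1", 300, 400, "geneB")))
```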
Bug fixes
- Fixed an issue that could cause Copy Number Variant Detection (CNVs) to give wrong results when targets were overlapping and coverage tables were used as control mappings, see https://digitalinsights.qiagen.com/technical-support/faq/important-clc-notifications/copy-number-variant-detection-cnvs-can-give-wrong-results-when-targets-overlap-and-coverage-tables-are-used-as-controls/.
- Fixed an issue causing Annotate with Nearby Gene Information to report incorrect nearest-gene information for the last gene (3') on a given chromosome.
- Fixed an issue causing Detect and Refine Fusion Genes to fail if the provided mRNA track contained transcripts annotated with priorities and the track was imported using the GFF3 importer.
- Fixed an issue causing Demultiplex Reads to fail with an error if the Edit/Up/Down buttons in the wizard were used when no tag was selected and the Reset button had earlier been pressed.
- Fixed an issue causing SAM/BAM import to fail when the provided reference element contained one or more circular sequences, but these sequences were not marked as circular in the SAM/BAM file and one or more reads mapped with unaligned ends at the beginning of the read.
- Fixed an issue causing Standard Import of GenBank format to import qualifiers' values as annotations surrounded by quotes. The surrounding quotes are now removed.
- Fixed an issue causing GFF3 export to fail when sequence annotations included features with incorrectly formatted frame qualifiers. Now, such frame qualifiers are ignored.
- Fixed an issue that could cause QC for Sequencing Reads to fail when provided with more than one sequence list, and one or more of those sequence lists contained very few sequences.
- Fixed an issue causing Create Sample Report and Combine Reports to fail if an input report was named “report”.
Data related updates
From September 19, 2023, Download Pfam Database downloads Pfam 36.0. This update also affects download using earlier versions of the CLC Genomics Server.
Plugin notes
Import Immune Reference Segments, delivered by Biomedical Genomics Analysis Server Plugin and CLC Single Cell Analysis Server Extension, can now import V segments in IMGT format that end in the conserved amino acid. Previously, these segments were silently ignored.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.5.
- QIAGEN CLC Genomics Workbench 23.0.5
- QIAGEN CLC Main Workbench 23.0.5
- QIAGEN CLC Command Line Tools 23.0.5
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, 23.0.2, 23.0.3 and 23.0.4, QIAGEN CLC Main Workbench 23.0.1, 23.0.2, 23.0.3 and 23.0.4, and QIAGEN CLC Command Line Tools 23.0.1, 23.0.2, 23.0.3 and 23.0.4, can also connect to QIAGEN CLC Genomics Server 23.0.5.
CLC Server Command Line Tools
Compatibility
CLC Server Command Line Tools 23.0.5 is the corresponding client for QIAGEN CLC Genomics Server 23.0.5.
CLC Server Command Line Tools 23.0.5 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1, 23.0.2, 23.0.3 and 23.0.4.
QIAGEN CLC Genomics Server 23.0.4
Release date: 2023-05-22
Shared with workbenches
Improvements
- Download BLAST Databases is more resilient to interrupted connections and similar issues when downloading large databases.
Bug fixes
- Fixed an issue where workflows containing a BAM export element could not be launched from CLC Genomics Workbench 23.0.3 to run on a CLC Genomics Server due to an error reported after selecting an export destination in the launch wizard ("The parameter 'Export destination' File not found.")
- Fixed an issue causing workflows to fail if they contained multiple Filter on Custom Criteria elements connected to a single downstream element, and one or more of the Filter on Custom Criteria outputs was empty.
- Fixed an issue causing QC for Read Mapping to report the number of unaligned ends instead of the number of reads with unaligned ends. This could cause “read count” and “% of all mapped reads” to be too high.
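The difference between the two counts in the QC for Read Mapping fix can be sketched as follows. This is an illustrative example only: the `MappedRead` class and both functions are hypothetical and do not reflect how the tool is implemented. A read with unaligned bases at both ends contributes two unaligned ends but is only one read, which is why counting ends inflates the figures.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MappedRead:
    """Hypothetical mapped read with unaligned base counts at each end."""
    name: str
    unaligned_start: int  # unaligned bases at the start of the read
    unaligned_end: int    # unaligned bases at the end of the read


def count_unaligned_ends(reads: Iterable[MappedRead]) -> int:
    """Count every unaligned end (a read can contribute 0, 1 or 2)."""
    return sum((r.unaligned_start > 0) + (r.unaligned_end > 0) for r in reads)


def count_reads_with_unaligned_ends(reads: Iterable[MappedRead]) -> int:
    """Count reads that have at least one unaligned end."""
    return sum(1 for r in reads if r.unaligned_start > 0 or r.unaligned_end > 0)


if __name__ == "__main__":
    reads = [
        MappedRead("r1", 5, 3),  # unaligned at both ends
        MappedRead("r2", 0, 4),  # unaligned at one end
        MappedRead("r3", 0, 0),  # fully aligned
    ]
    print("unaligned ends:", count_unaligned_ends(reads))
    print("reads with unaligned ends:", count_reads_with_unaligned_ends(reads))
```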
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.4.
- QIAGEN CLC Genomics Workbench 23.0.4
- QIAGEN CLC Main Workbench 23.0.4
- QIAGEN CLC Command Line Tools 23.0.4
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, 23.0.2 and 23.0.3, QIAGEN CLC Main Workbench 23.0.1, 23.0.2 and 23.0.3, and QIAGEN CLC Command Line Tools 23.0.1, 23.0.2 and 23.0.3, can also connect to QIAGEN CLC Genomics Server 23.0.4.
CLC Server Command Line Tools
CLC Server Command Line Tools 23.0.4 is the corresponding client for QIAGEN CLC Genomics Server 23.0.4.
Compatibility
CLC Command Line Tools 23.0.4 is the corresponding client for QIAGEN CLC Genomics Server 23.0.4.
CLC Command Line Tools 23.0.4 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1, 23.0.2 and 23.0.3.
QIAGEN CLC Genomics Server 23.0.3
Release date: 2023-04-19
Server specific
Bug fixes
- Fixed an issue affecting external application configurations, where an unlocked parameter in an exporter could not be mapped to a linked post-processing or high-throughput importer parameter.
Shared with workbenches
Improvements
- The SAM and BAM exporters have a new option relevant where there are one or more circular reference sequences. The new option, "Export reads spanning the origin of circular chromosomes as unmapped", is checked by default, making the default behavior of these exporters match that of CLC Genomics Server 22.x and earlier. This update changes the default behavior of these exporters relative to CLC Genomics Server 23.0.1 and 23.0.2. In those versions, reads that span the origin are exported as extending beyond the end of the reference. That behavior corresponds to unchecking the new option.
- PacBio SAM/BAM files with Platform Model (PM) set to HIFI are now imported as HiFi reads without the "Mark as HiFi reads" option having to be checked.
- Producing an Amino Acid Track is now optional in Amino Acid Changes.
Bug fixes
- Fixed an issue affecting the homopolymer trimming options of Trim Reads. When enabled, homopolymers that started with 9 identical bases followed by a different base were not trimmed. Other homopolymers were trimmed as expected. This update may affect the number of reads trimmed in a given dataset, and thus could lead to differences in results from downstream analyses, relative to earlier software versions.
- Fixed an issue causing Detect and Refine Fusion Genes to fail on certain data sets.
- Fixed an issue causing RNA-Seq Analysis to fail when reads mapped to a gene located close to the origin of a circular chromosome.
- Fixed an issue causing SAM/BAM export to fail when reference sequence names contained commas, brackets or other characters not in the set of allowed characters according to the SAM format specification. These characters are now replaced by an underscore in the exported file.
- Fixed an issue causing import of SAM/BAM files to fail when they contained a Platform (PL) but no Platform Model (PM) in the header. This affected the PacBio importer, the Ion Torrent importer and Standard Import of reads from SAM/BAM files.
- Fixed an issue where lines in PDFs containing history information were not wrapped, resulting in the ends of long lines not being present in the exported document.
- Fixed an issue that caused VCF Export to fail when exporting fusions that had two or more filter criteria listed in the Filter column.
- Fixed an issue that caused Low Frequency Variant Detection, Fixed Ploidy Variant Detection, and Basic Variant Detection to fail when the end of a mapped read supported a deletion, and there was support in other reads for a variant at the subsequent position. This issue has only been observed for RNA-Seq data where splicing combined with primer trimming could lead to this situation.
- Fixed an issue causing Extract Reads to not correctly extract reads overlapping annotated regions that cross the origin of circular chromosomes when the type of overlap was set to "Span region" or "No overlap".
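The SAM/BAM export fix above replaces characters that are not permitted in reference sequence names with an underscore. A minimal sketch of that kind of substitution is shown below. The function name `sanitize_reference_name` is hypothetical, and the character classes are taken from the reference sequence name pattern in the SAM format specification as understood here; the exporter's actual implementation may differ.

```python
import re

# Allowed characters per the SAM specification's reference name pattern.
# The first character may not be '*' or '='; later characters may.
_ALLOWED_FIRST = r"0-9A-Za-z!#$%&+./:;?@^_|~\-"
_ALLOWED_REST = r"0-9A-Za-z!#$%&*+./:;=?@^_|~\-"

_BAD_FIRST = re.compile(f"[^{_ALLOWED_FIRST}]")
_BAD_REST = re.compile(f"[^{_ALLOWED_REST}]")


def sanitize_reference_name(name: str) -> str:
    """Replace each disallowed character with an underscore."""
    if not name:
        return name
    first = _BAD_FIRST.sub("_", name[0])
    rest = _BAD_REST.sub("_", name[1:])
    return first + rest


if __name__ == "__main__":
    for raw in ["chr1", "chr1,alt[1]", "plasmid (circular)", "*odd"]:
        print(repr(raw), "->", repr(sanitize_reference_name(raw)))
```

Note that two different input names can map to the same sanitized name, so a real exporter would also need to consider name collisions.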
Plugin notes
Fixed an issue affecting Immune Repertoire Analysis, delivered by Biomedical Genomics Analysis Server Plugin, and Single Cell V(D)J-Seq Analysis, delivered by CLC Single Cell Analysis Server Extension. The tools failed if there were reads where the region that aligned to a C segment was contained within the region that aligned to a J segment.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.3.
- QIAGEN CLC Genomics Workbench 23.0.3
- QIAGEN CLC Main Workbench 23.0.3
- QIAGEN CLC Command Line Tools 23.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1 and 23.0.2, QIAGEN CLC Main Workbench 23.0.1 and 23.0.2, and QIAGEN CLC Command Line Tools 23.0.1 and 23.0.2, can also connect to QIAGEN CLC Genomics Server 23.0.3.
CLC Server Command Line Tools
CLC Server Command Line Tools 23.0.3 is the corresponding client for QIAGEN CLC Genomics Server 23.0.3.
Compatibility
CLC Command Line Tools 23.0.3 is the corresponding client for QIAGEN CLC Genomics Server 23.0.3.
CLC Command Line Tools 23.0.3 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1 and 23.0.2.
QIAGEN CLC Genomics Server 23.0.2
Release date: 2023-02-13
Shared with workbenches
Improvements and bug fixes
- The runtime of Amino Acid Changes has been significantly improved.
- Fixed an issue in the Trim Reads report where the number of “Trimmed (broken pairs)” was not reported per sequence list provided as input. Instead, the numbers were added together incrementally. The number of reported “Trimmed reads” decreased correspondingly. The issue would occur when paired reads from more than one sequence list were trimmed and broken read pairs were produced.
- Fixed a rare issue that could cause Trim Reads to retain a wrong part of a read if the read was both trimmed based on quality scores and adapter read-through.
- Fixed an issue causing the Demultiplex Reads tool to always demultiplex based on a sequence structure of "barcode, sequence". Adjustments to the tag list, such as adding a linker or placing the barcode at the end, were ignored. This issue did not affect the tool when run in a workflow context.
- Fixed an issue that could cause Detect and Refine Fusion Genes to fail on Windows when either the dataset was large or fusion genes with many possible transcripts were detected.
- Fixed an issue that could cause VCF Export to fail when exporting filtered annotation tracks that were empty.
- Fixed an issue causing download of the QIAseq xHYB Viral Panels reference data set to fail on Windows.
- Fixed a rare issue where Rebuild Index could not repair a corrupt search index.
- Various minor bug fixes
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.2.
- QIAGEN CLC Genomics Workbench 23.0.2
- QIAGEN CLC Main Workbench 23.0.2
- QIAGEN CLC Command Line Tools 23.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, QIAGEN CLC Main Workbench 23.0.1 and QIAGEN CLC Command Line Tools 23.0.1 can also connect to QIAGEN CLC Genomics Server 23.0.2.
CLC Server Command Line Tools
CLC Command Line Tools 23.0.2 is the corresponding client for QIAGEN CLC Genomics Server 23.0.2.
QIAGEN CLC Genomics Server 23.0.1
Release date: 2023-01-17
Shared with workbenches
Improvements and bug fixes
- Fixed an issue affecting Trim Reads, where the wrong part of a read was retained if the read was both trimmed to a fixed length and also trimmed by another method from the opposite end of the read.
- Fixed an issue affecting Trim Reads when both adapter trimming using a trim adapter list and fixed length trimming were selected. This issue could cause the resulting trimmed reads to be shorter than expected.
- Fixed an issue where fusion plots created by Detect and Refine Fusion Genes were omitted in the report and were not accessible via the fusion track table.
- Fixed an issue where workflows containing a Branch on Coverage element would fail for read mappings with no zero coverage regions when using reports output by QC for Read Mapping.
- Fixed an issue causing Annotate with GFF/GTF/GVF file to fail when the option "Ignore duplicate annotation" was checked.
- Fixed an issue causing Standard Import of GenBank format to stall if qualifier names spanned more than one line.
- Various minor improvements
Please see the release notes for CLC Genomics Workbench 23.0, below, for a full list of changes since the last general release of this software.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.1.
- QIAGEN CLC Genomics Workbench 23.0.1
- QIAGEN CLC Main Workbench 23.0.1
- QIAGEN CLC Command Line Tools 23.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server.
CLC Server Command Line Tools
CLC Command Line Tools 23.0.1 is the corresponding client for QIAGEN CLC Genomics Server 23.0.1.
Compatibility
CLC Command Line Tools 23.0.1 is the corresponding client for QIAGEN CLC Genomics Server 23.0.1.
CLC Command Line Tools 23.0.1 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1.
QIAGEN CLC Genomics Server 23.0
Release date: 2023-01-17
Server specific
New features and improvements
- Contents of import/export directories can be browsed from the web interface in the Browse server import/export directories tab under Element info.
- Contents of AWS S3 buckets accessible using AWS Connections configured in the CLC Server can be browsed from the web interface in the Browse S3 locations tab under Element info. Data can be uploaded to S3, downloaded from S3 and deleted in S3 from this area.
- In External Applications, a static script can be specified using the new parameter type: Included script. A script provided using this option becomes accessible to the external process at runtime. This enables integration scripts or extensive parameter files to be included in the External Application and injected into the execution context, rather than being an external dependency. For containerized External Applications, this may be the injected integration that enables the direct use of a publicly available container.
- Files from AWS S3 can now be selected for the External file parameter type of External Applications.
Other improvements
- Search functionality has been substantially improved. Please see the "Important change to search indexes - action needed" section below about changes to search indexes related to this improvement. Indexes for all CLC Locations can be rebuilt using the "Rebuild all indexes" button under Configuration | Main Configuration | File system locations.
- Admin level access to the audit log can be granted to specified groups. The ability to broaden access beyond admin users to installing and configuring workflows, configuring and enabling external applications, and viewing the CLC Server queue, was introduced with version 22.0.
- The message returned upon successful login to the CLC Server now includes information about the connection (username, the CLC Server description, and encryption status). Previously the return message was "Login successful".
- The full names of graphics exporters are listed when configuring External Applications. Previously, the name "Graphics" was used for each of these.
- A search box has been added to several locations in the web client where long lists are presented, for example, in the Algorithms section under Global Permissions.
- Active CLC File System Locations are listed in alphanumeric order in the web administrative interface. Previously they were listed in the order they were added.
- Apache Tomcat has been updated to version 9.0.65.
- Various minor improvements
Bug fixes
- Fixed an issue where CLC Workbenches could not interact with elements stored on a CLC Server if those elements were created using tools provided by a plugin that was no longer installed on the CLC Server.
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
- Various minor bug fixes
Changes
- AWS account details are now entered into AWS Connections. This term replaces the earlier term: "S3 locations". An AWS region can be specified. When upgrading from an earlier version with AWS account information already configured, the region will be set to the default for the specified AWS partition. For AWS Standard, this is us-east-1. The region can be updated by editing the connection. The region setting is primarily relevant if you plan to submit analyses from a CLC Server with the Cloud Server Plugin installed to run on a CLC Genomics Cloud setup.
- The Core tasks area under the Global Permissions tab has been removed. Standard Data Import is now listed under the Algorithms section with the name "Import Standard Data". The Data Export setting under Core tasks was legacy functionality, only relevant to External Applications with exporters configured in CLC Genomics Server 9.x and earlier. Permissions previously set for both standard import and legacy exporters are retained.
- The Java version bundled with CLC Genomics Server 23.0 is Java 17.0.4, where we use the JRE from the Azul OpenJDK builds.
Important change to search indexes - action needed
- Search functionality has been substantially improved. Associated with this, indexes for all CLC Server data locations must be rebuilt after upgrading to 23.0. If they are not, searches for elements in these locations will not find any results, and data associations to CLC Metadata Tables will not be registered. Indexes built using version 23.0 are placed in a folder called "searchindex2" in the installation area of the CLC Server.
- Old search indexes are not automatically deleted. They can be left in place without detrimental effect, or deleted manually. They are found in the folder "searchindex" in the installation area of the CLC Server.
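After upgrading, an administrator may want to see which index folders exist in the installation area. The following is a minimal sketch, assuming only the folder names given above ("searchindex" for old indexes and "searchindex2" for indexes built with version 23.0). The function name and the installation path passed to it are hypothetical. The presence of "searchindex2" does not by itself confirm that indexes for every location have been rebuilt.

```python
from pathlib import Path
from typing import Dict, Union


def search_index_status(install_dir: Union[str, Path]) -> Dict[str, bool]:
    """Report whether the old and new search index folders exist."""
    root = Path(install_dir)
    return {
        # Folder used by versions before 23.0; safe to keep or delete manually.
        "old_index_present": (root / "searchindex").is_dir(),
        # Folder used by indexes built with version 23.0.
        "new_index_present": (root / "searchindex2").is_dir(),
    }


if __name__ == "__main__":
    # Hypothetical path; replace with the actual CLC Server installation area.
    print(search_index_status("/opt/CLCServer"))
```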
Functionality retirement
- Boolean compound parameters in External Applications. These were made legacy with version 21.0 and are no longer supported in External Application configurations.
Shared with CLC Workbenches
New tools
- Create K-medoids Clustering for RNA-Seq finds clusters of features (e.g., genes, transcripts or miRNAs) whose expression behaves similarly, for example first increasing over time and then decreasing. The tool produces a Clustering Collection which contains a Sankey plot showing how these features move between clusters under different conditions, for example different treatments. A line graph representation of features from individual clusters or pairs of clusters is present as well.
New tools coming from plugins
- Detect and Refine Fusion Genes - Find fusion genes in RNA-Seq data by identifying potential fusions and then refining that list by evaluation of the evidence for each fusion. This is an updated version of the tool formerly distributed in the Biomedical Genomics Analysis Server Plugin. The updates made are listed in an Improvements section below.
- Target Region Coverage Analysis - Analyze and compare coverage from multiple samples. This tool was formerly distributed in the Biomedical Genomics Analysis Server Plugin.
- Create Consensus Sequences from Variants - Create consensus sequences from a variant track and a reference sequence. This tool was formerly distributed in the Biomedical Genomics Analysis Server Plugin.
- Annotate with GFF/GVF/GTF file - Add annotations from a GFF, GVF or GTF format file onto sequences, individual or in sequence lists. This tool was formerly distributed in the Annotate with GFF file server plugin.
Other new functionality and improvements
RNA-Seq Analysis and miRNA analysis tools
- Substantial speed improvements to RNA-Seq Analysis. Reads that map to multiple transcripts or genes will be distributed differently than earlier due to different choices of random seed in the new implementation. The algorithm is still deterministic.
- Transcripts are no longer renamed in Transcript Expression (TE) output unless renaming is necessary to avoid duplicate names. Previously, transcripts were renamed to the gene name plus a number e.g. "BRCA_1". This change means that TE tracks in this version of the software cannot typically be used together with TE tracks generated using older versions to produce Heat Maps, PCA plots, Expression, etc.
- Reports UMI fragment counts when relevant. UMI counts are included in the Fragment statistics section of the report if the input reads are annotated with UMIs by tools from the Biomedical Genomics Analysis plugin, and if the library type is set to 3' sequencing for RNA-Seq Analysis.
- Venn diagrams support four and five groups. Previously up to 3 were supported. Tooltips indicate which groups are part of a specific intersection.
- Quantify miRNA:
- Handles custom databases containing duplicated names.
- Does not allow custom databases containing sequences longer than 60bp. This avoids misallocation of reads to sequences that are similar to small RNAs.
- When adding multiple inputs to Extract IsomiR Counts, the extracted expression tables contain an entry for the combined set of IsomiRs identified among the samples, making them compatible for analysis in Differential Expression in Two Groups and Differential Expression for RNA-Seq.
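Before supplying a custom database to Quantify miRNA, it can be useful to check it for sequences longer than 60 bp and for duplicated names. The following is a minimal sketch of such a check on FASTA-formatted text. The function names are hypothetical, taking the sequence name to be the first whitespace-delimited token of the header is an assumption, and the check is independent of how the tool itself validates its input.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the name is the first token of the header line.
            name, chunks = (line[1:].split() or [""])[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def check_custom_mirna_db(
    lines: Iterable[str], max_length: int = 60
) -> Tuple[List[str], List[str]]:
    """Return (names of too-long sequences, names that occur more than once)."""
    records = list(read_fasta(lines))
    too_long = [n for n, seq in records if len(seq) > max_length]
    counts = Counter(n for n, _ in records)
    duplicated = sorted(n for n, c in counts.items() if c > 1)
    return too_long, duplicated


if __name__ == "__main__":
    fasta = [">miR-1 example", "ACGU" * 5, ">miR-2", "A" * 61, ">miR-1", "UUUU"]
    print(check_custom_mirna_db(fasta))
```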
Differential Expression for RNA-Seq and Differential Expression in Two Groups
- A new option for creating a subset has been added to the miRNA Statistical Comparison Table produced by Differential Expression for RNA-Seq and Differential Expression in Two Groups.
- It is possible to downweight outliers. This option is disabled by default and recommended only when the results seem enriched for genes that are expressed at anomalously high levels in a small proportion of samples.
- The Max Group Means column of Statistical Comparison Tracks and Tables now shows TPM instead of RPKM. Note that this column is used for filtering data in tools such as Create Heat Map for RNA-Seq and the Pathway Analysis tool of the Ingenuity Pathway Analysis plugin.
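Because the Max Group Means column is used for filtering, thresholds chosen for RPKM values may need revisiting now that the column shows TPM. The sketch below uses the commonly cited textbook definitions of RPKM and TPM to show how the two differ for the same counts. It is not taken from the CLC implementation, the function names are hypothetical, and the exact calculation used by the software may differ in detail.

```python
from typing import List, Sequence


def rpkm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Reads per kilobase of feature per million mapped reads."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Transcripts per million: length-normalized rates scaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]


if __name__ == "__main__":
    counts = [100, 200, 700]
    lengths = [1000, 2000, 1000]
    print("RPKM:", rpkm(counts, lengths))
    print("TPM: ", tpm(counts, lengths))
```

TPM values always sum to one million within a sample, whereas the sum of RPKM values varies between samples, so the two are not interchangeable as filter cutoffs.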
Detect and Refine Fusion Genes
This is an updated version of Detect and Refine Fusion Genes, formerly distributed in the Biomedical Genomics Analysis Server Plugin. The updates listed here are relative to the version distributed with Biomedical Genomics Analysis Server Plugin 22.2.
- Fusions will not be called for overlapping genes.
- Novel exon boundary improvements:
- Options have been expanded to allow for detecting fusions with a single fusion partner ("Detect with novel exon boundaries") as well as detecting those with 2 fusion partners ("Allow fusions with novel exon boundaries in both genes")
- The "Detect exon skippings" option supports detection of fusions with novel exon boundaries.
- An option has been added to omit non-significant breakpoints from the report.
- A minimum Z-score can now be specified for use when evaluating evidence for a fusion.
- Speed improvements
- The option "Allow fusions with novel exon boundaries in both genes" now defaults to false to reduce the number of false positive fusions. Setting it to true is useful for exhaustive searches of novel fusions.
- Changes to the maximum number of equivalent matches to the reference allowed for a single read to be retained:
- When remapping reads to a fusion chromosome, the maximum number is now 30. Previously it was 10.
- When searching for unaligned ends, the maximum number remains unchanged at 10.
- The option "Maximum number of hits for a read" has been removed. Its value was ignored in previous versions.
- Fusions from mRNA transcripts without an associated gene in the Gene track are not used when detecting fusions. mRNA transcript features must have a gene id in one of the following columns to be matched with the associated gene: "Parent", "gene_id" or "gene_name".
- Fixed an issue where paired end reads were treated as single end reads when the option to "Only use fusion primer reads" was enabled.
- Fixed an issue where unaligned ends could be too long or too short for reads containing insertions and deletions. This change may lead to small differences in results compared to earlier versions, expected to be due to a decrease in the number of false positives and false negatives reported.
Bisulfite mapping
- Map Bisulfite Reads to Reference speed improvement. This is data dependent, with about a 50% improvement likely for most data sets. This speedup might change the details of results very slightly.
- Call Methylation Level speed improvement. This speedup might, in some cases, change results very slightly.
- Import of read mappings from SAM/BAM now uses methylation information from the optional SAM tags XR for read conversion and XG for reference conversion. The recognized values are "CT" and "GA". Support for these tags is added so that information is not lost if a bisulfite mapping is exported and then re-imported.
- Export of read mappings to SAM/BAM format now includes details on bisulfite conversion. These are specified using the SAM tags XR for read conversion and XG for reference conversion. The possible values of these tags are "CT" and "GA". This is provided for increased compatibility with third party tools.
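For anyone consuming the exported files with their own scripts, the following minimal sketch reads the XR and XG tags from the optional fields of a single SAM alignment line. The function name is hypothetical, and the assumption that the tags use the string type (written as `XR:Z:CT`) is based on common usage in third-party bisulfite tools rather than on this document.

```python
from typing import Dict

ALLOWED_CONVERSIONS = ("CT", "GA")


def bisulfite_tags(sam_line: str) -> Dict[str, str]:
    """Return the XR and XG tag values found on one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags: Dict[str, str] = {}
    # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed optional fields
        tag, _type, value = parts
        if tag in ("XR", "XG"):
            if value not in ALLOWED_CONVERSIONS:
                raise ValueError(f"Unexpected {tag} value: {value!r}")
            tags[tag] = value
    return tags


if __name__ == "__main__":
    line = "\t".join(
        ["r1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
         "ACGTACGTAC", "IIIIIIIIII", "NM:i:0", "XR:Z:CT", "XG:Z:GA"]
    )
    print(bisulfite_tags(line))
```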
Import and export
- VCF Import:
- Supports symbolic alleles for inversions (<INV>), insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). Symbolic alleles that do not contain sequence information or are longer than 100,000 base pairs are imported to annotation tracks instead of variant tracks. Previously symbolic alleles were not imported.
- Improved handling of variants with multiple loci encoded in the same VCF record.
- VCF Export supports symbolic allele representation for insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). (Inversions (<INV>) were already supported.) With the exception of deletions, variants in annotation tracks are always exported as symbolic alleles. Deletions in annotation tracks and variants in variant tracks above a specified size are also exported as symbolic alleles. The default size is 1000 bp, which corresponds with the QCI Interpret requirement that InDels > 1000 bp must be represented as symbolic alleles.
- The PacBio importer supports HiFi reads.
- The read length when exporting to FASTQ format files has been increased from 524,288 bp to 16,777,216 bp.
- SAM/BAM Mapping Files importer:
- Performance improvements
- The circular flag of references is now retained.
- Import Tracks from File has been updated to show a warning if the file is not imported.
- GFF3 Export retains the case of attribute headers. Previously, all headers were adjusted to lower case during export.
- The history information of elements imported using Standard Import includes the specific importer used (e.g. "CSV table importer", "Fasta Importer", etc).
- Standard Import can be used to import files from AWS S3 locations.
- When exporting images to bitmap-based formats, the Screen resolution and High resolution options are now bounded so the maximum supported number of pixels will not be exceeded.
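The VCF Export rule described above, for deciding when a variant is written as a symbolic allele, can be restated as a small decision function. This is a sketch of the rule as worded in these notes, not the exporter's code. The function name, the string labels for track kinds and variant types, and the reading of "above a specified size" as strictly greater than the threshold are all assumptions.

```python
def exported_as_symbolic(
    track_kind: str,
    variant_type: str,
    length_bp: int,
    size_threshold_bp: int = 1000,
) -> bool:
    """Decide whether a variant would be exported as a symbolic allele.

    track_kind: "annotation" or "variant" (hypothetical labels).
    variant_type: e.g. "insertion", "deletion", "inversion", "tandem_duplication".
    """
    if track_kind not in ("annotation", "variant"):
        raise ValueError(f"Unknown track kind: {track_kind!r}")
    # Annotation tracks: everything except deletions is always symbolic.
    if track_kind == "annotation" and variant_type != "deletion":
        return True
    # Deletions in annotation tracks and all variants in variant tracks
    # are symbolic only above the size threshold (assumed strictly greater).
    return length_bp > size_threshold_bp


if __name__ == "__main__":
    print(exported_as_symbolic("annotation", "insertion", 50))
    print(exported_as_symbolic("annotation", "deletion", 50))
    print(exported_as_symbolic("variant", "deletion", 1500))
```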
Various
- Read mapping speed on Apple Silicon processors has been improved. Read mapping results are not affected by this. Tools benefiting from this change include Map Reads to Reference, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Branch on Coverage - a new workflow control flow element where the downstream processing of read mappings can be controlled based on coverage values within reports.
- Barcodes can be preconfigured in Demultiplex Reads elements in workflows.
- Demultiplex Reads has been updated to:
- Report barcodes without any matched reads.
- Show the barcode names in the history.
- Workflow Export elements can be preconfigured to export to locations on AWS S3.
- When Low Frequency Variant Detection, Fixed Ploidy Variant Detection or Basic Variant Detection was used with a mapping realigned using Local Realignment with a guidance variant track, it was possible for partial insertions to be called. Now, the full insertion must be present within at least one, individual read for it to be reported.
- QC for Targeted Sequencing:
- Can report coverage statistics per gene.
- Supports analysis of read mappings generated by RNA-Seq Analysis.
- Annotate with Exon Numbers:
- Can add exon numbers to elements in annotation, expression and statistical comparison tracks. Previously only variant tracks could be annotated with exon numbers.
- Adds exon numbers when input elements start outside an exon but still overlap the exon.
- Adds all exons when multiple exons overlap a single input element.
- Allows annotation with exons from only one transcript or CDS.
- Filter on Custom Criteria can be used to filter Statistical Comparison Tracks, Statistical Comparison Tables, IsomiR tables, and miRNA Seed Tables.
- Reports from Create Sample Report and Combine Reports generated using RNA-Seq reports now include the percentage of reads mapped to exons in the Fragment counting statistics table.
- In Create Sample Report, the percentage of target region positions with coverage above a set threshold can be used as a QC metric.
- QC for Sequencing Reads processes only the first 100,000 base pairs in long reads. Previously, the tool would fail when provided with very long reads.
- When Annotate with Overlap Information is included more than once in the same workflow, columns with overlap information are now always added in the same order. Previously, concurrency issues could cause column order to be different between different runs.
- Local Realignment no longer realigns reads into regions with no coverage, such as introns in RNA-Seq read mappings.
- Remove Duplicate Mapped Reads uses an improved method to identify duplicate reads when handling paired end reads. In general, this improvement results in slightly more reads being considered duplicates.
- The options for extracting reads according to their location relative to features in an overlap track have been expanded in Extract Reads. Previously reads had to lie fully within an annotated region to be extracted. Now, in addition to that condition, options are provided for extracting any overlapping reads, extracting only reads that fully span annotated regions or extracting all reads except those that overlap with annotations in the overlap track.
- Assemble Sequences to Reference supports alignment of reads that span the origin of a circular reference.
- Secondary Peak Calling has a new option "Peak detection stringency".
- The report from Copy Number Variant Detection (CNVs):
- Includes a table showing the number of genes affected by CNV calls.
- Contains new coverage plots at genome and chromosome levels.
- The Trim Reads report now includes statistics for the number of reads in intact pairs and in broken pairs.
- Updated restriction site database to REBASE 2022-06-30.
- The output channel names of Identify Known Mutations from Mappings, as shown when the tool is used in a workflow, have been improved. The elements produced by the tool have not been changed.
- While viewing data, in most situations, tooltips can be suppressed by holding down the Ctrl key. Similarly, tooltips can be displayed immediately, instead of a moment after the mouse cursor stops moving, by holding down the Shift key.
- Various minor improvements
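The expanded Extract Reads options described above can be pictured as four interval tests between a read and the annotated regions. The following is a minimal sketch, assuming half-open (start, end) coordinates on a single chromosome. The names OverlapMode, keep_read and extract_reads are hypothetical and chosen for illustration; they are not the tool's implementation, and the labels are not the values accepted by --overlap-type.

```python
from enum import Enum
from typing import List, Tuple

# Half-open [start, end) coordinates; an assumption of this sketch.
Interval = Tuple[int, int]


class OverlapMode(Enum):
    """Hypothetical labels; not the actual values accepted by --overlap-type."""
    WITHIN = "within"          # read lies fully inside an annotated region
    ANY = "any"                # read shares at least one position with a region
    SPANNING = "spanning"      # read fully covers an annotated region
    NON_OVERLAPPING = "none"   # read shares no position with any region


def _overlaps(read: Interval, region: Interval) -> bool:
    return read[0] < region[1] and region[0] < read[1]


def _within(read: Interval, region: Interval) -> bool:
    return region[0] <= read[0] and read[1] <= region[1]


def _spans(read: Interval, region: Interval) -> bool:
    return read[0] <= region[0] and region[1] <= read[1]


def keep_read(read: Interval, regions: List[Interval], mode: OverlapMode) -> bool:
    """Return True if the read should be extracted under the given mode."""
    if mode is OverlapMode.WITHIN:
        return any(_within(read, r) for r in regions)
    if mode is OverlapMode.ANY:
        return any(_overlaps(read, r) for r in regions)
    if mode is OverlapMode.SPANNING:
        return any(_spans(read, r) for r in regions)
    # NON_OVERLAPPING: keep only reads that touch no annotated region.
    return not any(_overlaps(read, r) for r in regions)


def extract_reads(reads: List[Interval], regions: List[Interval],
                  mode: OverlapMode) -> List[Interval]:
    """Filter reads, preserving their input order."""
    return [read for read in reads if keep_read(read, regions, mode)]
```

In this sketch, a read whose coordinates exactly match an annotated region satisfies both the "within" and the "spanning" test. How the tool itself treats that boundary case is not stated in these notes.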
Bug fixes
- Low Frequency Variant Detection, Fixed Ploidy Variant Detection and Basic Variant Detection:
- Fixed an issue that in very rare cases caused insertions to be called twice. Now, the same insertion is only ever included once in the variant track.
- Fixed an issue in the remove pyro-error variants filter. Previously, the frequency threshold for removing pyro-error variants was ignored and more variants than intended were removed. The filter is generally only used for Ion Torrent data. This fix may result in a small improvement to the precision of variant detection.
- Fixed a rare issue affecting variant calling in very low coverage regions, where a variant could be reported that was not present in any single read in the mapping.
- Fixed an issue causing Map Reads to Reference to fail if a masking track covering a whole chromosome was provided as input.
- RNA-Seq Analysis
- Fixed an issue where reads were not counted as unique for a transcript in the GE track table, if the read could map in multiple ways to the same transcript, but only to that transcript.
- Fixed an issue that could lead to an IndexOutOfBounds error when the option "Calculate expression for genes without transcripts" was selected, two or more genes had the same name, at least one of these genes had no transcripts, and the Region column of the table view of the gene track contained the text "join", ">", or "<" (i.e., the genes had splice structure or uncertain end positions).
- Fixed an issue where the gene identifier would be removed from the statistical comparison track and tables produced by the Differential Expression for RNA-Seq tool when it was not recognized to be an Ensembl gene identifier.
- Fixed an issue in Differential Expression in Two Groups and Differential Expression for RNA-Seq that affected dispersion estimates that include information from nearby genes. This leads to slightly different p-values being produced by these two tools.
- Fixed an issue affecting Extract Consensus Sequence where annotations transferred from the reference sequence to the consensus sequence could be wrongly positioned if the read mapping had an insertion in a region that was removed due to low coverage.
- Fixed an issue where, if two genes had the same name and overlapped, their transcripts might become assigned to only one of the genes. The fix only applies when the gene and transcript annotations are imported from GFF3.
- Fixed an issue affecting the naming of outputs from Local Realignment when the tool was provided with multiple read mappings as input and not run in batch mode. Each resulting realigned read mapping is now named after the corresponding input. Previously all the realigned read mappings were named after the first read mapping in the set of inputs.
- QC for Sequencing Reads
- Fixed an issue in the report where the graph for R1 nucleotide contributions would be truncated to only show the same number of nucleotides as the R2 plot.
- Fixed an issue where the median read length in the supplementary report could be incorrect when the number of reads was very low. The median reported in the graphical report was correct.
- Amino Acid Changes
- Fixed an issue causing the output from Amino Acid Changes to be named after the reference data instead of the input data.
- Fixed an issue that caused the transcripts and proteins listed in the Coding region change and Amino acid change columns in the annotated variant track output to be inconsistently ordered.
- Fixed an issue in the Trim Reads report, where the number of reads under "No trim" could be incorrect when "Remove fixed number of bases" was enabled.
- Fixed an issue causing Show Enzymes Cutting Inside/Outside Selection to give wrong results when the selection crossed the junction of a circular sequence and a desired number of cut sites outside the selection was not specified.
- Fixed an issue in VCF Export, where specified minimum ploidy was not always enforced for complex variants. The issue would only occur when an allele had first been removed from a locus to adhere to the specified maximum ploidy.
- Fixed an issue where the wrong entry in a trim adapter list would be opened for editing if the list had been sorted or filtered.
- Fixed a rare issue in K-means/medoids clustering where a gene could be output in multiple clusters. This would occur when genes with identical expressions were chosen to be medoids, and so would only happen when K was comparable to the number of genes with unique expressions across samples.
- Fixed issues with Quantify miRNA where:
- It would fail on paired reads if using spike-ins.
- Opening a sequence list to view it would cause this tool to fail if that same sequence list had been used as input.
- In the report from Create Sample Report the value column in the summary table is coloured green or yellow according to whether the threshold is met. Previously, the threshold column was coloured.
- Workflow related
- Fixed an issue affecting the location of outputs generated from a workflow element that was also linked to a Collect and Distribute element. In cases where the output folder name was defined using the {input} or {2} placeholder, these outputs were sometimes all saved to the first folder created, instead of to different folders as intended.
- Fixed an issue where default names were applied to outputs from Output elements attached directly to an Iterate element in workflows, even when naming placeholders had been configured.
- Fixed an issue affecting workflows with nested Iterate elements where results from the outer level of iteration flowed into a Collect and Distribute element. Any output elements generated in the inner iteration, which should have been saved, were lost.
- Fixed an issue where unlocked options for on-the-fly importers in a workflow would be locked if the Input element was re-opened for editing.
- Fixed an issue affecting hyperlinked table entries, where html tags were sometimes included as text in the information exported to Excel or CSV formats.
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
- Various other minor bug fixes
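The K-means/medoids clustering fix listed above concerns genes with identical expression values being chosen as medoids. The following is a minimal sketch of an assignment step that places every gene in exactly one cluster even when medoids are tied. It is an illustration only, not the software's implementation; the function names and the tie-breaking rule (first medoid in the list wins) are assumptions of this sketch.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_medoids(profiles: Dict[str, Sequence[float]],
                      medoids: List[str]) -> Dict[str, List[str]]:
    """Assign each gene to exactly one medoid.

    A medoid always belongs to its own cluster. For all other genes,
    min() returns the first medoid with the smallest distance, so a gene
    that is equally close to several medoids is still assigned only once.
    """
    clusters: Dict[str, List[str]] = {m: [] for m in medoids}
    for gene, profile in profiles.items():
        if gene in clusters:
            nearest = gene
        else:
            nearest = min(
                medoids,
                key=lambda m: squared_distance(profile, profiles[m]),
            )
        clusters[nearest].append(gene)
    return clusters
```

With two medoids that have identical profiles, a third gene with the same profile is at distance zero from both. Because each gene is appended to a single cluster, the total number of cluster members always equals the number of genes.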
Legacy tools
The following tools are now legacy tools and will be retired in a future version of the software:
- QIAGEN GeneReader importer (Legacy)
Functionality retirement
The following tools have been retired:
- Compare Sample Variant Tracks (Legacy)
- Empirical Analysis of DGE (Legacy)
Plugin notes
Plugin retirements
- Annotate with GFF file server plugin. The tool Annotate with GFF/GVF/GTF file is now available directly in the server.
- Haplotype Calling Server Plugin (beta). Functionality from this plugin is now in the Biomedical Genomics Analysis Server Plugin.
Compatibility
The following are the corresponding client applications for CLC Genomics Server 23.0:
- CLC Genomics Workbench 23.0
- CLC Main Workbench 23.0
- CLC Command Line Tools 23.0
CLC Server Command Line Tools
Please see the CLC Genomics Server 23.0 listings above for the details about the new tools and features listed here.
New tools
New tools and functionality
- create_kmedoids_for_rnaseq
New tools previously included in plugins
- annotate_with_gff (previously distributed in the Annotate with GFF file plugin)
- consensus_from_variants (previously distributed in the Biomedical Genomics Analysis plugin)
- detect_and_refine_fusion_genes (previously distributed in the Biomedical Genomics Analysis plugin)
- target_region_coverage_analysis (previously distributed in the Biomedical Genomics Analysis plugin)
New and updated options for existing tools
- differential_expression_rna_seq
- option added: --downweight-outliers
- differential_expression_two_groups
- option added: --downweight-outliers
- download_sra
- option removed: --aspera-limit
- option removed: --enable-aspera
- extract_overlapping_reads
- option removed: --in-interval
- option added: --overlap-type
- process_tagged_sequences
- option added: --barcode-values
Barcode structure and barcode values are now provided in separate parameters:
--barcode-1 "linker type#MULTIPLEX_BARCODE#fixedLength#3;linker type#MULTIPLEX_SEQUENCE#maxLength#1000" --barcode-2 "linker type#MULTIPLEX_BARCODE#fixedLength#3;linker type#MULTIPLEX_SEQUENCE#maxLength#1000" --barcode-values "a/b#AAA/GGG#ATA/ATA"
Previously, structure and values were provided in the same parameter.
- secondary_peak_calling
- option added: --peak-slope-stringency
- statistics_target_regions
- option added: --create-gene-coverage-track
- option added: --genes
Other updates
Importers
- ngs_import_pacbio
- option added: --hifi-reads
- option removed: --only-sequencing-zmws
- option removed: --read-hq-regions
Improvements
- Basic data operations, such as copying, can be carried out on data elements created using plugins.
Commands retired
- compare_sample_variant_tracks
- empirical_analysis_dge
Bug fixes
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.