MobiVision V(D)J Outputs Overview

Output result files

By default, mobivision vdj produces 17 output files, listed below. The SAMPLEID_outs directory is created automatically by the software and does not need to be specified by the user; it holds the 15 result files that matter most for downstream analysis. The prefix SAMPLEID stands for the ID of the sample being analyzed. The files are as follows:

  • 1. _flagdone is the flag file indicating that the task ran successfully; it is written automatically once the mobivision vdj task completes;

  • 2. _log is the log file generated while the task runs;

  • 3. SAMPLEID_airr_arrangement.csv contains the contig rearrangement results in AIRR format;

  • 4. SAMPLEID_all_contig_annotations.bed contains the annotation results of all contigs in BED format;

  • 5. SAMPLEID_all_contig_annotations.csv contains the annotation results of all contigs in CSV format;

  • 6. SAMPLEID_all_contig_annotations.json contains the annotation results of all contigs in JSON format;

  • 7. SAMPLEID_all_contig.fasta is the FASTA sequence file of all contigs;

  • 8. SAMPLEID_all_contig.fasta.fai is the index file for the FASTA sequences of all contigs;

  • 9. SAMPLEID_all_contig.fastq is the FASTQ file of all contigs, which includes sequence quality information;

  • 10. SAMPLEID_clonotypes.csv is the clonotype result file;

  • 11. SAMPLEID_filtered_contig_annotations.csv contains the annotation results of the filtered contigs;

  • 12. SAMPLEID_filtered_contig.fasta is the FASTA sequence file of the filtered contigs;

  • 13. SAMPLEID_filtered_contig.fasta.fai is the index file for the FASTA sequences of the filtered contigs;

  • 14. SAMPLEID_filtered_contig.fastq is the FASTQ sequence file of the filtered contigs;

  • 15. SAMPLEID_metrics_summary.csv is the analysis summary in CSV format;

  • 16. SAMPLEID_Report.html is the quality control report in html format, which visualizes the data quality results so that users can judge the quality of the library at a glance;

  • 17. SAMPLEID_Report.json is the quality control report in JSON format.
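The file list above can be turned into a quick completeness check after a run. The following is a minimal sketch, not part of MobiVision: the function names `expected_outputs` and `missing_outputs` are hypothetical, and the search is recursive because the exact placement of each file relative to the SAMPLEID_outs directory is an assumption here.

```python
from pathlib import Path

# Suffixes copied from the 17-item list; each is prefixed with "<SAMPLEID>_".
SAMPLE_SUFFIXES = [
    "airr_arrangement.csv",
    "all_contig_annotations.bed",
    "all_contig_annotations.csv",
    "all_contig_annotations.json",
    "all_contig.fasta",
    "all_contig.fasta.fai",
    "all_contig.fastq",
    "clonotypes.csv",
    "filtered_contig_annotations.csv",
    "filtered_contig.fasta",
    "filtered_contig.fasta.fai",
    "filtered_contig.fastq",
    "metrics_summary.csv",
    "Report.html",
    "Report.json",
]
# The two entries that do not carry the sample prefix.
UNPREFIXED = ["_flagdone", "_log"]


def expected_outputs(sample_id):
    """Return the 17 expected output names for one sample."""
    return UNPREFIXED + [f"{sample_id}_{suffix}" for suffix in SAMPLE_SUFFIXES]


def missing_outputs(run_dir, sample_id):
    """Return the expected names that are not found anywhere under run_dir."""
    # Assumption: matching by name anywhere below run_dir is good enough.
    present = {path.name for path in Path(run_dir).rglob("*")}
    return [name for name in expected_outputs(sample_id) if name not in present]
```

An empty list from `missing_outputs` means every name in the list was found; a missing `_flagdone` in particular suggests the task did not finish successfully.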

Quality control report

After the mobivision vdj analysis completes, an html quality control report is generated. BCR sequencing, TCR sequencing, and hybrid library sequencing each produce a corresponding html quality control report, and the contents of these reports are broadly the same. This section introduces the TCR html report first and then the BCR html report. Both reports consist of six parts: Overview, Sample, Cells, Sequencing & Enrichment, VDJ Annotation, and Clonetype.

1. TCR html report

01 Overview

In the TCR html report, the first row shows three headline indicators: the number of T cells estimated from the TCR data, the average number of reads per cell, and the number of cells containing a valid V-J pair. Users can use these three indicators to judge the complexity and sequencing depth of the library, and so evaluate whether the constructed library meets expectations.

02 Sample

The Sample column contains the following information:

  • Sample name
  • Reference genome name
  • Library construction kit name
  • Pipeline version of the analysis software

03 Cells

In the TCR html report, the left side of the Cells column shows the Barcode Rank Plot, and the right side shows the cell-related indicators. The plot describes the relationship between barcodes and UMI counts: the horizontal axis is the rank of each barcode when barcodes are sorted by UMI count from high to low, and the vertical axis is the number of UMIs for the corresponding barcode. In addition to the indicators already shown in the Overview at the top of the report, this column reports the TRA and TRB UMI counts per cell and the effective reads per cell. For a detailed explanation of each indicator, users can click the question mark in the upper right corner of the column; the other columns provide help information in the same way.
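The axes of the Barcode Rank Plot can be illustrated with a few lines of Python. This is a minimal sketch rather than the report's own code: the function name `barcode_rank_points` and the input format (a mapping from barcode to UMI count) are assumptions made for the example.

```python
def barcode_rank_points(umi_counts):
    """Return (rank, umi_count) pairs for a barcode rank plot.

    umi_counts maps each barcode to its UMI count (hypothetical input format).
    Rank 1 is the barcode with the most UMIs.
    """
    ordered = sorted(umi_counts.values(), reverse=True)
    return [(rank, count) for rank, count in enumerate(ordered, start=1)]


if __name__ == "__main__":
    example = {"AAAC": 5, "CCGT": 100, "GGTA": 20}
    for rank, count in barcode_rank_points(example):
        print(rank, count)
```

Each pair gives one point on the plot: the rank goes on the horizontal axis and the UMI count on the vertical axis.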

04 Sequencing & Enrichment

The left side of the Sequencing & Enrichment column shows three read alignment indicators: the percentage of all reads aligned to a V(D)J gene, the percentage aligned to TRA, and the percentage aligned to TRB. The right side shows the sequencing quality indicators, from top to bottom: the number of reads sequenced, the percentage of Q30 bases in the barcodes, the percentage of effective barcodes, the percentage of Q30 bases in RNA Read1, the percentage of Q30 bases in RNA Read2, and the percentage of Q30 bases in the UMIs.
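A Q30 percentage is the share of bases whose Phred quality score is 30 or higher. The sketch below shows that arithmetic on FASTQ quality strings; it assumes the common Phred+33 encoding, the function name `q30_percent` is hypothetical, and the report may compute its own figures differently.

```python
def q30_percent(quality_strings, offset=33):
    """Percentage of bases with Phred quality >= 30.

    quality_strings: iterable of FASTQ quality lines.
    offset: ASCII offset of the encoding (33 is assumed here).
    """
    total = 0
    q30 = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Decode the ASCII character back to a Phred score.
            if ord(ch) - offset >= 30:
                q30 += 1
    return 100.0 * q30 / total if total else 0.0
```

With Phred+33, the character `?` encodes a score of exactly 30, so `?` and everything above it in ASCII order counts toward Q30.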

05 VDJ Annotation

The VDJ Annotation column contains 11 annotation indicators:

  • Number of paired clonotypes
  • Cells containing a TRA contig
  • Cells containing a TRB contig
  • Cells containing a TRA contig spanning the V-J region
  • Cells containing a TRB contig spanning the V-J region
  • Cells containing a viable V-J spanning pair
  • Cells containing a viable V-J spanning (TRA, TRB) pair
  • Cells containing a TRA contig with an annotated CDR3
  • Cells containing a TRB contig with an annotated CDR3
  • Cells containing a viable TRA contig
  • Cells containing a viable TRB contig

The cell-level indicators are reported as a percentage of cells. As with the other columns, users who have questions about these indicators can click the question mark in the upper right corner to see more detailed help for the VDJ Annotation section.
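To make the cell-level indicators concrete, the sketch below counts cells with a viable TRA contig, a viable TRB contig, and both. It is an illustration under stated assumptions only: the input is a hypothetical list of `(barcode, chain, viable)` tuples, the denominator `n_cells` is assumed to be the estimated cell count, and MobiVision's own definitions may differ.

```python
def chain_cell_percentages(contigs, n_cells):
    """Percentage of cells with a viable TRA contig, TRB contig, or both.

    contigs: iterable of (barcode, chain, viable) tuples (hypothetical format).
    n_cells: number of cells used as the denominator (assumed).
    """
    cells_by_chain = {"TRA": set(), "TRB": set()}
    for barcode, chain, viable in contigs:
        if viable and chain in cells_by_chain:
            cells_by_chain[chain].add(barcode)

    # A cell is "paired" here when it has a viable contig for both chains.
    paired = cells_by_chain["TRA"] & cells_by_chain["TRB"]

    def pct(cells):
        return 100.0 * len(cells) / n_cells if n_cells else 0.0

    return {
        "TRA": pct(cells_by_chain["TRA"]),
        "TRB": pct(cells_by_chain["TRB"]),
        "paired": pct(paired),
    }
```

Collecting barcodes in sets means a cell with several contigs for the same chain is still counted once.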


06 Clonetype

The Clonetype section has two parts. The first is a histogram showing the proportion of cells in each of the top 10 most abundant clonotypes. The second is a table listing, for each of those top 10 clonotypes, its ID, CDR3 amino acid sequence, frequency, and proportion of the total.
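The frequency and proportion columns can be reproduced from a list of per-cell clonotype assignments. This is a minimal sketch assuming each cell carries exactly one clonotype ID; the function name `top_clonotypes` and the input format are hypothetical.

```python
from collections import Counter


def top_clonotypes(cell_clonotypes, n=10):
    """Return (clonotype_id, frequency, proportion) for the n largest clonotypes.

    cell_clonotypes: one clonotype ID per cell (hypothetical input format).
    """
    counts = Counter(cell_clonotypes)
    total = sum(counts.values())
    # most_common sorts by frequency, highest first.
    return [(cid, freq, freq / total) for cid, freq in counts.most_common(n)]
```

The frequency is the number of cells assigned to the clonotype, and the proportion is that number divided by the total number of cells with a clonotype.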

2. BCR html report

The BCR html report follows roughly the same framework as the TCR html report. However, because the BCR indicators target IGH, IGK, and IGL while the TCR indicators target TRA and TRB, the two reports differ in some specific evaluation indicators. These differences are concentrated in the Cells, Sequencing & Enrichment, VDJ Annotation, and Clonetype sections, as described below:

03 Cells

Compared with the TCR report, the indicators on the right side of the Cells column in the BCR report replace the median TRA UMIs per cell and median TRB UMIs per cell with the median IGH UMIs, IGK UMIs, and IGL UMIs per cell.

04 Sequencing & Enrichment

Compared with the TCR report, the indicators on the left side of the Sequencing & Enrichment column in the BCR report replace the reads aligned to TRA and the reads aligned to TRB with the reads aligned to IGH, the reads aligned to IGK, and the reads aligned to IGL.

05 VDJ Annotation

Compared with the TCR report, the VDJ Annotation indicators in the BCR report replace the annotation indicators for TRA and TRB with the corresponding indicators for IGH, IGK, and IGL.

06 Clonetype

The clonotype table in the BCR report has the same columns as the one in the TCR report: the ID of the top 10 most abundant clonotypes, the CDR3 amino acid sequence, the frequency, and the proportion of the total. The main difference lies in the chains of the CDR3 amino acid sequences: in the BCR report they come from IGH, IGK, and IGL, whereas in the TCR report they come from TRA and TRB.