

Parsing T- and B-cell receptor segments from the IMGT database to a flexible plain-text format

Getting raw sequences

Instructions for downloading raw IMGT files:

Running the software

Get the compiled binaries and run the software as java -jar segmentparser.jar [options] imgt_raw_file output_prefix.

The following options can be selected:

Output files include:

SegmentParser generates a tab-delimited table with the species name, gene, segment id, nucleotide sequence and reference point position. The reference point is a 0-based coordinate: for Variable segments it is the first nucleotide after the conserved Cys, and for Joining segments it is the first nucleotide before the conserved Phe/Trp. The metadata table provided with the results lists all species and genes and indicates whether any V/D/J segments are associated with each of them (0 or 1 in the corresponding row).
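The reference point convention can be checked with a short Python sketch. It is only an illustration: the column names and order, the segment type labels ("Variable", "Joining") and the two toy sequences are assumptions made for this example, not the documented output layout, so adjust them to match the header of the actual file. The sketch reads the table and extracts the conserved codon next to the reference point, which should encode Cys for V segments and Phe/Trp for J segments.

```python
import csv
import io

# Toy table standing in for SegmentParser output.
# ASSUMPTION: column names/order and the sequences are hypothetical.
EXAMPLE_TABLE = (
    "species\tgene\tsegment\tid\treference_point\tsequence\n"
    "ExampleSpecies\tTRB\tVariable\tV-example\t9\tGCCAGCTGTGCCAGC\n"
    "ExampleSpecies\tTRB\tJoining\tJ-example\t5\tACTGAATTTGGC\n"
)


def conserved_codon(segment_type, sequence, reference_point):
    """Return the codon adjacent to the 0-based reference point."""
    if segment_type == "Variable":
        # Reference point is the first nucleotide AFTER the conserved Cys,
        # so the Cys codon is the three nucleotides just before it.
        return sequence[reference_point - 3:reference_point]
    if segment_type == "Joining":
        # Reference point is the first nucleotide BEFORE the conserved
        # Phe/Trp, so the codon starts one position after it.
        return sequence[reference_point + 1:reference_point + 4]
    raise ValueError(f"no reference point defined for {segment_type!r}")


# Parse the tab-delimited table and collect the conserved codon per segment.
reader = csv.DictReader(io.StringIO(EXAMPLE_TABLE), delimiter="\t")
codons = {
    row["id"]: conserved_codon(
        row["segment"], row["sequence"], int(row["reference_point"])
    )
    for row in reader
}
```

With the toy rows above, `codons` maps `V-example` to `TGT` (a Cys codon) and `J-example` to `TTT` (a Phe codon). To run the same check on real output, replace `io.StringIO(EXAMPLE_TABLE)` with an open file handle for the generated table.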

Note that CDR2.5 is a putative MHC-binding region of the TCR V segment, defined in recent work from the Paul Thomas lab (Dash et al., Nature 2017).