Exploring TCR TiRP Scores for Treg Identification and Population Analysis

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

This thesis examines whether the TiRP score, a likelihood score for regulatory T cells (Tregs), and its underlying features can be used to predict T cell phenotype (Treg or Tconv) and to describe T cell population dynamics. The features were investigated with a random forest classifier, hierarchical clustering, and the Bray-Curtis statistic. Non-overlapping TCR sequences from a reference dataset were used to train and validate the random forest model, but its predictions were not significantly better than random, indicating that the TiRP score and its underlying features are insufficient as a definitive classifier. Hierarchical clustering was then used to look for Tregness patterns in the Emerson dataset with respect to donor characteristics such as age, gender, CMV status, and race, but no clear patterns emerged. Additionally, Bray-Curtis (BC) scores between the Emerson dataset and the reference datasets were calculated. Donor repertoires were equally and highly dissimilar to the Treg and Tconv reference populations, so the BC scores were uninformative about the nature of the donor T cell repertoire. The BC scores also did not change significantly with age or gender. Overall, the TiRP score proved inadequate for population dynamics and classifier models because of overlapping TCRs, low diversity, and the inherent noise in the scoring.
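
For readers unfamiliar with the Bray-Curtis statistic mentioned above, the following is a minimal sketch of the standard dissimilarity between two count tables, where 0 means identical composition and 1 means no shared items. It is not the thesis's implementation: the function name `bray_curtis`, the choice of clone counts per CDR3 sequence as the abundance measure, and the toy sequences and counts are all assumptions made for illustration. The thesis may have computed the statistic on other features.

```python
from collections import Counter


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two count tables.

    Returns 0.0 for identical compositions and 1.0 when nothing is shared.
    """
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    keys = set(sample_a) | set(sample_b)
    # Sum of the smaller count for every item present in either sample.
    shared = sum(min(sample_a.get(k, 0), sample_b.get(k, 0)) for k in keys)
    return 1.0 - 2.0 * shared / total


# Hypothetical toy repertoires: invented CDR3 strings with invented counts.
donor = Counter({"CASSLGQAYEQYF": 3, "CASSPGTGGYEQYF": 1, "CASSQDRGNTEAFF": 2})
treg_ref = Counter({"CASSLGQAYEQYF": 1, "CASSIRSSYEQYF": 5})
tconv_ref = Counter({"CASSQDRGNTEAFF": 1, "CASSLEETQYF": 5})

bc_treg = bray_curtis(donor, treg_ref)
bc_tconv = bray_curtis(donor, tconv_ref)
```

In this toy setup the donor shares exactly one clone copy with each reference, so both dissimilarities come out equal and high. That is the kind of result the abstract describes as uninformative: the score cannot say whether the donor repertoire resembles the Treg or the Tconv reference more closely.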
