Deep Learning Architectures For the Prediction of YY1-Mediated Chromatin Loops
Ahtisham Fazeel Abbasi; Muhammad Nabeel Asim; Johan Trygg; Andreas Dengel; Sheraz Ahmed
In: Xuan Guo; Serghei Mangul; Murray Patterson; Alexander Zelikovsky (Eds.). Bioinformatics Research and Applications, 19th International Symposium, ISBRA 2023, Proceedings. International Symposium on Bioinformatics Research and Applications (ISBRA-2023), October 9-12, Wroclaw, Poland, Pages 72-84, Lecture Notes in Computer Science (LNCS), Vol. 14248, ISBN 978-981-99-7073-5, Springer, 2023.
YY1-mediated chromatin loops play substantial roles in basic biological processes such as gene regulation, cell differentiation, and DNA replication. Predicting YY1-mediated chromatin loops is important for understanding diverse biological processes, which may lead to the development of new therapeutics for neurological disorders and cancers. Existing deep learning predictors can predict YY1-mediated chromatin loops in two different cell lines; however, they show limited performance for the prediction of YY1-mediated loops within the same cell line and suffer significant performance deterioration in the cross-cell-line setting. To provide computational predictors capable of performing large-scale analyses of YY1-mediated loop prediction across multiple cell lines, this paper presents two novel deep learning predictors. The two proposed predictors use Word2vec and one-hot encoding for sequence representation, together with long short-term memory and a convolutional neural network with a gradient flow strategy similar to that of DenseNet architectures. Both predictors are evaluated on two benchmark datasets from two cell lines, HCT116 and K562. Overall, the proposed predictors outperform the existing DEEPYY1 predictor by an average maximum margin of 4.65% in AUROC and 7.45% in accuracy across both datasets on the independent test sets, and by 5.1% and 3.2%, respectively, under 5-fold validation. In the cross-cell evaluation, the proposed predictors achieve maximum AUROC improvements of up to 9.5% and 27.1% on the HCT116 and K562 datasets, respectively.
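
The abstract names two sequence representations, one-hot encoding and Word2vec, without giving details. The following is a minimal sketch of how DNA sequences are commonly prepared for each, not the authors' implementation. The nucleotide order `ACGT`, the all-zero row for unknown symbols, the k-mer size, and the function names `one_hot_encode` and `kmer_tokens` are all assumptions made for illustration.

```python
from typing import List

# Assumed nucleotide order; the abstract does not specify one.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> List[List[int]]:
    """Map each nucleotide to a 4-dimensional indicator vector.

    Symbols outside ALPHABET (for example N) become an all-zero row,
    which is one common convention and an assumption here.
    """
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def kmer_tokens(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a sequence into overlapping k-mers.

    These k-mers play the role of "words" when an embedding model such
    as Word2vec is trained on DNA; k=3 is a hypothetical default.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]
```

In such a setup, the one-hot matrix would typically feed a convolutional network directly, while the k-mer lists would be passed to an embedding model whose vectors then feed a recurrent network such as an LSTM. Which representation is paired with which network in the paper is not stated in the abstract.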