Prediction of Liver Toxicity using Machine Learning to aid Drug Discovery

D. Brunnsåker

Master's Thesis, Chalmers University of Technology, Feb 2020.

This thesis proposes a method for predicting drug-induced liver injury from transcriptomic data in the toxicogenomic databases TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System) and DrugMatrix, using various machine learning algorithms. The possibility of using the toxicological database CMap together with the NCI60 human tumor cell line screen to build prediction models for in vitro cytotoxicity with the same methodology was also investigated. It was found that transcriptomic data can indeed be used to predict liver injury in rats with very high accuracy. The in silico models developed in this project also outperform similar existing solutions on completely external test sets, successfully predicting four different histopathologies: liver necrosis, fibrosis, hyperplasia and mitotic alterations. In vitro cytotoxicity was also predicted by the models with relatively high accuracy, more specifically on the cancer cell line A-549. The model was also evaluated on primary human hepatocytes exposed to hepatotoxic agents, where dose-response relationships were found. Additional findings included the importance of selecting appropriate feature sets when predicting specific adverse effects, and the applicability of synthetic oversampling techniques in combination with transfer learning on transcriptomic data.

A reprint is available as PDF.