Hepatotoxicity is a leading cause of attrition in drug development. Traditional preclinical and clinical studies for evaluating hepatotoxicity liabilities are expensive and time-consuming. Advances in high-throughput screening have led to a rapid accumulation of in vitro toxicity data that can inform the risk assessment of new pharmaceuticals and chemicals. To make use of these data, we curated and merged all available in vivo hepatotoxicity data from the literature and public resources, yielding a comprehensive database of 4089 compounds with hepatotoxicity classifications. After the database was divided into modeling and test sets, PubChem assay data were automatically extracted using an in-house data mining tool and clustered based on relationships between structural fragments and cellular responses in in vitro assays. The resulting PubChem assay clusters were then evaluated by cross-validation: the biological data from several assay clusters were highly predictive of hepatotoxicity, and the assays in these clusters were selected to evaluate the test set compounds. The read-across results indicated that if a new compound contained specific identified chemical fragments (ie, Molecular Initiating Event) and showed active responses in the relevant selected PubChem assays, the compound was potentially hepatotoxic in vivo. Furthermore, several mechanisms that might contribute to toxicity were derived from the modeling results, including alterations in nuclear receptor signaling and inhibition of DNA repair. This modeling strategy can be applied to the investigation of other complex chemical toxicity phenomena (eg, developmental and reproductive toxicities) as well as drug efficacy.
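The read-across rule described above (fragment present and active response in the associated assays) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Compound` and `AssayCluster` types, the `flag_hepatotoxic` function, the `min_active` threshold, and the fragment and assay identifiers are hypothetical placeholders, not the authors' in-house tool or data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Compound:
    """Hypothetical record: fragments found in the structure and assay outcomes."""
    name: str
    fragments: set[str] = field(default_factory=set)
    # Maps an assay identifier to True (active) or False (inactive).
    assay_active: dict[str, bool] = field(default_factory=dict)


@dataclass
class AssayCluster:
    """Hypothetical cluster linking one chemical fragment to a group of assays."""
    fragment: str
    assay_ids: list[str]
    min_active: int = 1  # assumed threshold; not taken from the source


def flag_hepatotoxic(compound: Compound, clusters: list[AssayCluster]) -> bool:
    """Flag a compound only if it has the fragment AND enough active assays."""
    for cluster in clusters:
        if cluster.fragment not in compound.fragments:
            continue  # fragment absent: this cluster cannot support a flag
        n_active = sum(
            1 for aid in cluster.assay_ids if compound.assay_active.get(aid, False)
        )
        if n_active >= cluster.min_active:
            return True
    return False


# Placeholder identifiers, not real fragments or PubChem AIDs.
clusters = [AssayCluster(fragment="frag_1", assay_ids=["assay_A", "assay_B"])]
both = Compound("cmpd_1", {"frag_1"}, {"assay_A": True})
fragment_only = Compound("cmpd_2", {"frag_1"}, {"assay_A": False})
assay_only = Compound("cmpd_3", {"frag_2"}, {"assay_A": True})
```

In this sketch only `both` would be flagged, since neither the fragment alone nor assay activity alone satisfies the rule; a missing assay result is treated as inactive, which is a simplifying choice rather than something stated in the source.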