TY - JOUR
T1 - Analysis of publically available skin sensitization data from REACH registrations 2008-2014
AU - Luechtefeld, Thomas
AU - Maertens, Alexandra
AU - Russo, Daniel P.
AU - Rovida, Costanza
AU - Zhu, Hao
AU - Hartung, Thomas
N1 - Funding Information:
T. L. was supported by an NIEHS training grant (T32 ES007141). Support from the EU Horizon 2020 project EUToxRisk is gratefully appreciated.
PY - 2016
Y1 - 2016
N2 - The public data on skin sensitization from REACH registrations already included 19,111 studies on skin sensitization in December 2014, making it the largest repository of such data so far (1,470 substances with mouse LLNA, 2,787 with GPMT, 762 with both in vivo and in vitro and 139 with only in vitro data). 21% were classified as sensitizers. The extracted skin sensitization data was analyzed to identify relationships in skin sensitization guidelines, visualize structural relationships of sensitizers, and build models to predict sensitization. A chemical with molecular weight > 500 Da is generally considered non-sensitizing owing to low bioavailability, but 49 sensitizing chemicals with a molecular weight > 500 Da were found. A chemical similarity map was produced using PubChem's 2D Tanimoto similarity metric and Gephi force layout visualization. Nine clusters of chemicals were identified by Blondel's module recognition algorithm revealing wide module-dependent variation. Approximately 31% of mapped chemicals are Michael acceptors, but this alone does not imply skin sensitization. A simple sensitization model using molecular weight and five ToxTree structural alerts showed a balanced accuracy of 65.8% (specificity 80.4%, sensitivity 51.4%), demonstrating that structural alerts have information value. A simple variant of k-nearest neighbors outperformed the ToxTree approach even at a 75% similarity threshold (82% balanced accuracy at 0.95 threshold). At higher thresholds, the balanced accuracy increased. Lower similarity thresholds decrease sensitivity faster than specificity. This analysis scopes the landscape of chemical skin sensitization, demonstrating the value of large public datasets for health hazard prediction.
AB - The public data on skin sensitization from REACH registrations already included 19,111 studies on skin sensitization in December 2014, making it the largest repository of such data so far (1,470 substances with mouse LLNA, 2,787 with GPMT, 762 with both in vivo and in vitro and 139 with only in vitro data). 21% were classified as sensitizers. The extracted skin sensitization data was analyzed to identify relationships in skin sensitization guidelines, visualize structural relationships of sensitizers, and build models to predict sensitization. A chemical with molecular weight > 500 Da is generally considered non-sensitizing owing to low bioavailability, but 49 sensitizing chemicals with a molecular weight > 500 Da were found. A chemical similarity map was produced using PubChem's 2D Tanimoto similarity metric and Gephi force layout visualization. Nine clusters of chemicals were identified by Blondel's module recognition algorithm revealing wide module-dependent variation. Approximately 31% of mapped chemicals are Michael acceptors, but this alone does not imply skin sensitization. A simple sensitization model using molecular weight and five ToxTree structural alerts showed a balanced accuracy of 65.8% (specificity 80.4%, sensitivity 51.4%), demonstrating that structural alerts have information value. A simple variant of k-nearest neighbors outperformed the ToxTree approach even at a 75% similarity threshold (82% balanced accuracy at 0.95 threshold). At higher thresholds, the balanced accuracy increased. Lower similarity thresholds decrease sensitivity faster than specificity. This analysis scopes the landscape of chemical skin sensitization, demonstrating the value of large public datasets for health hazard prediction.
UR - http://www.scopus.com/inward/record.url?scp=84962821185&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962821185&partnerID=8YFLogxK
U2 - 10.14573/altex.1510055
DO - 10.14573/altex.1510055
M3 - Article
C2 - 26863411
AN - SCOPUS:84962821185
SN - 1868-596X
VL - 33
SP - 135
EP - 148
JO - Altex
JF - Altex
IS - 2
ER -