Volume 13, Issue 1 p. 139-154
Health & Ecological Risk Assessment
Open Access

A comparative evaluation of five hazard screening tools

JM Panko (Corresponding Author), Cardno ChemRisk, Pittsburgh, Pennsylvania, USA. Address correspondence to julie.panko@cardno.com

K Hitchcock, Cardno ChemRisk, Pittsburgh, Pennsylvania, USA

M Fung, Cardno ChemRisk, San Francisco, California, USA

PJ Spencer, Dow Chemical Company, Toxicology & Environmental Research and Consulting, Midland, Michigan, USA

T Kingsbury, TKingsbury Consulting, San Ramon, California, USA

AM Mason, American Chemistry Council, Washington, DC, USA
First published: 18 January 2016

ABSTRACT

An increasing number of hazard assessment tools and approaches are being used in the marketplace as a means to differentiate products and ingredients with lower versus higher hazards or to certify what some call greener chemical ingredients in consumer products. Some leading retailers have established policies for product manufacturers and their suppliers to disclose chemical ingredients and their related hazard characteristics, often specifying which tools to use. To date, no data exist that show a tool's reliability to provide consistent, credible screening-level hazard scores that can inform greener product selection. We conducted a small pilot study to understand and compare the hazard scoring of several hazard screening tools to determine if hazard and toxicity profiles for chemicals differ. Seven chemicals were selected that represent both natural and man-made chemistries as well as a range of toxicological activity. We conducted the assessments according to each tool provider's guidelines, which included factors such as endpoints, weighting preferences, sources of information, and treatment of data gaps. The results indicate that the tools varied in the level of discrimination seen in the scores for these 7 chemicals and that classifications of the same chemical varied widely between the tools, ranging from little or no hazard or toxicity to very high hazard or toxicity. The results also highlight the need for transparency in describing the basis for the tool's hazard scores and suggest possible enhancements. Based on this pilot study, tools should not be generalized to fit all situations because their evaluations are context-specific. Before choosing a tool or approach, it is critical that the assessment rationale be clearly defined and match the selected tool or approach. Integr Environ Assess Manag 2017;13:139–154. © 2016 The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals, Inc. on behalf of SETAC

INTRODUCTION

The growing movement for transparent disclosure of the toxicological hazard of any chemical ingredient in consumer products emphasizes list-based hazard identification and disregards considerations of exposure, product use, and life cycle impacts. From food and personal care products to building and construction materials, this movement now touches virtually every product category (Scruggs and Ortolano 2011; Goldsmith et al. 2014). Some leading retailers are establishing policies and processes for suppliers, including product manufacturers, to disclose chemical ingredients and their related toxicological hazard characteristics (referred to as “hazard” throughout the present article) in consumer products (Target 2013; Walmart 2014). To meet this demand, companies have developed software tools, lists, and frameworks (henceforward called “tools”) to score the “greenness” of a chemical within a product, where the chemical's toxicological hazard drives the score.

Organizations, including retailers and various business groups, often identify a suite of tools for evaluating chemicals in products, implying that any tool is equally informative for assessing hazards. For example, in its Implementation Guide for its Policy on Sustainable Chemistry, Walmart identifies 11 different tools in Appendix 3 – Guides for Alternatives Assessments that suppliers may use to advance the safer formulation of products using informed substitution (Walmart 2014). Similarly, the Green Chemistry and Commerce Council identifies 8 different tools in its Retailer Portal Database (GC3 2016), and the Organisation for Economic Co-operation and Development (OECD) Substitution and Alternatives Assessment Toolbox (SAAT) contains 36 tools identified for alternative assessment, along with 26 frameworks (OECD 2015). The tools, however, are not created equal and differ in several important ways, including the sources of information used to judge the hazards, whether the hazard characteristics of a single chemical ingredient are applied to the whole consumer product or the chemical's concentration in the product is considered, and whether the chemical's functional role or possible human exposure is considered. Consequently, chemical suppliers, product formulators, and brand owners face a challenging task: They must incorporate their respective data streams into a proprietary system specified by each retailer, and these systems often use different hazard tools and metrics. Furthermore, some rating systems, such as LEED v4 (LEED 2014), require manufacturers to use particular tools. Confusion inevitably arises because a chemical ingredient's or product's hazard classification score can vary from tool to tool.

At a fundamental level, any tool should provide a credible hazard score consistent with the underlying science and available data. If 2 screening-level hazard tools share a common objective and are based on similar science and data, can similar results be expected? In this article, we present a case study that compares the scores for 7 chemicals in consumer products using 5 tools, 4 of which were identified in an earlier study (Gauthier et al. 2015). Two of these tools, however, contain separate modules, leading to a total of 8 assessments for each chemical. We also describe how each tool's settings and assumptions can affect the results.

MATERIALS AND METHODS

Chemicals evaluated

We identified a set of chemicals intended to cover a broad range of hazard characteristics. This set included man-made chemicals (ethylene glycol, dibutyl phthalate [DBP]), natural compounds (caffeine, citric acid, glycolic acid), a degradation metabolite (glycolic acid), an antimicrobial (benzisothiazolinone [BIT]), and a persistent and bioaccumulative flame retardant (1,2,5,6,9,10-hexabromocyclo-dodecane [HBCD]) (Table 1). As a group, these 7 chemicals, some of which have been referred to as “of concern” in the public dialogue, share the following features: data availability, known and varied toxicity profiles, and presence in consumer products.

Table 1. Chemicals included in hazard assessment
CAS nr | Chemical name | Rationale
58-08-2 | Caffeine | Natural chemical
77-92-9 | Citric acid | Naturally derived preservative on DfE Safer Chemicals List
107-21-1 | Ethylene glycol | Synthetic chemical
79-14-1 | Glycolic acid | A natural metabolite, a degradation product of ethylene glycol
84-74-2 | Dibutyl phthalate (DBP) | Well characterized, on various ban lists
2634-33-5 | Benzisothiazolinone (BIT) | Antimicrobial
3194-55-6 | 1,2,5,6,9,10-Hexabromocyclo-dodecane (HBCD) | Flame retardant, end-of-life issues (PBT)
  • CAS = Chemical Abstracts Service; DfE = Design for the Environment; PBT = persistent, bioaccumulative, toxic.

Hazard tool identification

In an earlier effort, we reviewed 32 chemical assessment tools (including automated software and frameworks or guidance) and evaluated them against a series of criteria (i.e., number of endpoints, capacity to evaluate large numbers of chemicals, intuitive user interface, transparency, and peer review) in 5 categories: screening and prioritization, database use, hazard assessment, exposure and risk assessment, and certification and labeling (see Table 3 in Gauthier et al. 2015). Four hazard analysis tools were selected from that evaluation on the basis of their high scores and their popularity among users (Gauthier et al. 2015). We also selected a 5th tool, SciVera Lens, on the basis of its growing use in the marketplace (Table 2). Notably, in a separate effort, SciVera requested an evaluation using the criteria in Gauthier et al. (2015), and the tool scored very highly.

Table 2. Overview of tools assessed
Tool | Approach | Creator | Objective | Data source | Nr endpoints | Scoring designation | Process transparency | Data gaps
DfE AA | Framework | USEPA | Identify safer alternatives | Toxicological data | 16 | Low; Moderate; High; Very high | Transparent | Negative effect
GreenScreen full assessment | Framework | CPA | Comparative hazard assessment and identification of chemicals of high concern | Lists, toxicological data | 20 | BM4 best; BM3 use but can improve; BM2 use but search for alternatives; BM1 avoid | Transparent | Negative effect
GreenWERCS GreenScreen List Translator | List | CPA license to GreenWERCS | Quick assessment of known chemicals of concern | Lists | 20 | LT-U = Undetermined due to insufficient data; LT-P1 = avoid, possible BM1; LT-1 = BM1 avoid | Transparent | Negative effect
GreenWERCS Walmart Scoring Model | List | The Wercs Ltd/Walmart | Incorporates Walmart policy indicators for evaluating chemicals in products | Lists | 9 | 0–2500 (preferable); 2500–6000 (acceptable); 6000–8500 (avoid) | Not transparent | No effect
GreenWERCS GreenScreen Scoring Model | List | GreenWERCS licensed from CPA | Incorporates the GreenScreen List Translator by CPA | Lists | 21 | 0–5000 (preferable); 5000–15 000 (acceptable); 15 000–20 000 (avoid) | Transparent | No effect or positive effect
GreenWERCS ChemRisk Model | List | GreenWERCS | Customization of authoritative lists based on user preferences to identify chemicals of concern | Lists | 23 | 0–5000 (preferable); 5000–15 000 (acceptable); 15 000–20 000 (avoid) | Transparent | No effect or positive effect
GreenSuite adjusted model | Analysis | Chemical Compliance Systems | Chemical database is one of a set of modules for risk-based chemical-use decision making | Lists, toxicological data | 28 used in this pilot | 90%–100% (Grade A, green); 80%–89% (Grade B, yellow); 70%–79% (Grade C, orange); 0%–69% (Grade D, red) | Semitransparent | Negative effect
SciVera Lens Chemical Safety Assessment | Analysis | SciVera LLC | Assess hazards of chemical ingredients in a product with risk assessment option | Lists, toxicological data | 22 | Very high hazard (vh); High hazard (h); Insufficient data to determine hazard (nd); Unassessed (u); Moderate hazard (m); Low hazard (l). Each designation may also be followed by “e,” indicating that expert judgment was applied. | Transparent | Negative effect
  • CPA = Clean Production Action; DfE AA = Design for the Environment Alternative Assessment Criteria for Hazard Evaluation; USEPA = US Environmental Protection Agency.

The 5 tools differ in their approach to scoring hazard (lists, framework guidance, or expert analysis), method transparency, information sources, approach to addressing data gaps, and fundamental assumptions in the evaluation process (Table 2). To understand the similarities and differences among the tools, we assessed each of the 7 chemicals in a manner consistent with each tool provider's operational procedures.

Table 3. Results of chemical screening tools comparison
CAS Nr Chemical name List-based Tools Framework Expert analysis
GreenWERCS Walmart Scoring Model GreenWERCS GreenScreen Scoring Model GreenScreen List Translator GreenWERCS ChemRisk Scoring Model GreenScreen full assessment DfE AA GreenSuite adjusted weighting SciVera Lens
58-08-2 Caffeine 0 6100 LT-1 3200 H (BM2) High – reproductive and developmental toxicity 68% H
Carcinogenicity; Reproductive toxicity; Endocrine activity; Acute mammalian toxicity Reproductive toxicity Oral LD50; Reproductive effects Moderate Group I Human Water persistence; Water long-term effects; Soil persistence; Oral LD50; Reproductive effects; Neurotoxicity Acute oral toxicity
77-92-9 Citric acid 0 2200 LT-U 0 H (BM2) 71% H
Acute mammalian toxicity; Multiple endpoints Insufficient evidence to apply algorithm High Group II Human High – eye irritation Air persistence; Soil persistence; Eye irritation; Genotoxicity; Neurotoxicity Acute ocular toxicity; sensory irritation
107-21-1 Ethylene glycol 0 10 700 LT-P1 5300 vH (BM1) High - acute mammalian and reproductive and developmental toxicity 66% Me
Reproductive toxicity; Developmental toxicity; Endocrine activity; Acute mammalian toxicity; Systemic toxicity; Neurotoxicity; Flammability; Multiple endpoints Endocrine activity; Neurotoxicity Oral LD50; Eye irritation; RfD; Neurotoxicity; TLV High Group I Human Water exposure; Air persistence; Soil exposure; IDLH; Skin irritation; Eye irritation; Reproductive effects; Neurotoxicity; TLV Dermal irritation; endocrine disruption potential; neurotoxicity
79-14-1 Glycolic Acid 0 2200 LT-U 1200 vH (BM1) Very high – eye and skin irritation 79% vH
Acute mammalian toxicity; Multiple endpoints Insufficient evidence to apply algorithm Inhalation LC50 H (BM2) Moderate to High – Reproductive and developmental toxicity Water exposure; Air persistence; Soil exposure; Skin irritation; Eye irritation Dermal irritation
High Group I Human
84-74-2 DBP 1700 12 700 LT-1 12 500 vH (BM1) High – reproductive and developmental toxicity 64% H
Reproductive and developmental hazards; Endocrine disruptors; Hazardous waste; vPvB Carcinogenicity; Mutagenicity; Reproductive toxicity; Developmental toxicity; Endocrine activity; Acute mammalian toxicity; Systemic toxicity; Neurotoxicity; Skin sensitization; Acute and chronic aquatic toxicity; Multiple endpoints Reproductive toxicity; Developmental and neurotoxicity; Systemic toxicity; Skin sensitization; Acute aquatic toxicity Oral & Dermal LD50; Oral RfD; IDLH; inhalation LC50; Eye irritation; Reproductive effects; Carcinogenicity; RfD; Neurotoxicity High Group I Human/SVHC List Very High – acute and chronic aquatic toxicity Water toxicity; Air persistence; Soil exposure; IDLH; STEL or ceiling Developmental-reproductive toxicity; endocrine disruption potential; acute aquatic toxicity
2634-33-5 BIT 0 3200 LT-U 3200 U High – repeat dose toxicity 69% vH
Acute mammalian toxicity; Skin and eye irritation; Skin sensitization; Acute and chronic aquatic toxicity Insufficient evidence to apply algorithm Oral LD50; Skin irritation Carcinogenicity Very High – eye irritation and aquatic toxicity Water persistence; Water long-term effects; Water toxicity; Soil persistence; Soil exposure; Skin irritation; Eye irritation; Sensitizer Acute ocular toxicity
3194-55-6 HBCD (mixed isomers) 0 2500 LT-1 5000 vBT (BM1) High – developmental toxicity and persistence 83% vH
Persistence; Bioaccumulation; Multiple endpoints Persistence; Bioaccumulation Persistence Very High – acute and chronic aquatic toxicity and bioaccumulation Water persistence; Water toxicity; Air persistence; Soil persistence; Soil exposure Acute aquatic toxicity; bioaccumulation; chronic aquatic toxicity
  • BIT = benzisothiazolinone; BM1 = Benchmark 1 (avoid); DBP = dibutyl phthalate; DfE = Design for the Environment Alternative Assessment Criteria for Hazard Evaluation; H = one or more endpoints scored a high hazard; H (BM2) = endpoints with a high hazard scored Benchmark 2; HBCD = 1,2,5,6,9,10-Hexabromocyclo-dodecane; IDLH = Immediately Dangerous to Life or Health; LT-1 = BM1 avoid; LT-P1 = avoid, possible BM1; LT-U = undetermined due to insufficient data; RfD = Reference Dose; STEL = Short-term Exposure Limit; SVHC = substance of very high concern; TLV = Threshold Limit Value; U = Undetermined due to insufficient data; vBT (BM1) = one or more endpoints scored very bioaccumulative and toxic, leading to a Benchmark 1 (avoid) score; vH = one or more endpoints scored a very high hazard; vH (BM1) = one or more endpoints scored a very high hazard, leading to a Benchmark 1 (avoid) score; vPvB = very persistent, very bioaccumulative.

Tool overview

All the tools in the present study performed their chemical hazard assessments using 1 of 3 methods—lists, framework, or expert analysis—and each was created for differing objectives, varying from a quick screen to a more in-depth hazard assessment. Knowing how a tool analyzes chemical hazard and the tool's objective is key to understanding why different tools may have different results for the same chemical.

Lists

Lists are the simplest chemical assessment mechanism. Each list-based tool specifies a number of chemical lists from several regulatory and/or nonregulatory sources. Scoring a consumer product with a list-based tool is a straightforward process; if a chemical ingredient is present on 1 or more of the tool's hazard lists, a score is applied. Some tools score just the chemical ingredients; others infer or apply the score for the most hazardous chemical to the product.
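The list-based logic described above can be illustrated with a minimal Python sketch. It is not taken from any of the tools in this pilot: the list names, the point values, and the function names are hypothetical, and each real tool defines its own lists and weights. The two list memberships shown (DBP as a reproductive toxicant, HBCD as a PBT substance) follow the classifications cited elsewhere in this article.

```python
from typing import Dict, List, Set

# Hypothetical hazard lists keyed by name; members are CAS numbers.
HAZARD_LISTS: Dict[str, Set[str]] = {
    "reproductive_toxicants": {"84-74-2"},   # DBP
    "pbt_substances": {"3194-55-6"},         # HBCD
}

# Hypothetical points applied when a chemical appears on a list.
LIST_POINTS: Dict[str, int] = {
    "reproductive_toxicants": 5000,
    "pbt_substances": 7500,
}


def chemical_score(cas: str) -> int:
    """Sum the points of every hazard list on which the CAS number appears."""
    return sum(
        LIST_POINTS[name]
        for name, members in HAZARD_LISTS.items()
        if cas in members
    )


def product_score(ingredient_cas_numbers: List[str]) -> int:
    """Apply the score of the most hazardous ingredient to the whole product."""
    return max((chemical_score(cas) for cas in ingredient_cas_numbers), default=0)
```

A chemical that appears on none of the lists scores 0 in this sketch, which is one reason a list-based score of 0 cannot by itself be read as evidence of low hazard.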

There were 2 list-based tools in this pilot: GreenScreen List Translator and the 3 versions of the GreenWERCS tool (WERCS 2016), including 2 “off the shelf” models (the GreenWERCS GreenScreen Scoring Model and the GreenWERCS Walmart Scoring Model) and 1 custom model (the GreenWERCS ChemRisk model), which was developed as a list-based counterpart for comparison with the data-driven GreenSuite (CCS 2016) “adjusted” model. The authors worked directly with the GreenWERCS representatives to establish access to Web-based software and to design the custom model. Chemical-specific results reports were downloaded upon completion of each analysis.

The GreenWERCS GreenScreen List Translator automates the list translation process of Clean Production Action (CPA) as described by GreenScreen methods (CPA 2013), using input from a series of lists identified by CPA, which vary in quality from “authoritative” to “screening.” The GreenWERCS GreenScreen Scoring Model is built around the GreenScreen methods, but GreenWERCS uses a different scoring system than the GreenScreen full assessment. Walmart and GreenWERCS developed the GreenWERCS Walmart Scoring Model, which is available by subscription. Each chemical score originates from lists that Walmart identified according to its Policy on Sustainable Chemistry in Consumables (Walmart 2014), and scores are broken into numerical ranges (Tables 2 and 5). The GreenWERCS Walmart Scoring Model contains a confidential “grey list,” defined by this retailer; suppliers entering a product formulation containing a chemical on this list will receive a “Walmart High Priority Chemical” warning noting that the chemical is subject to reduction using substitution principles informed by Design for the Environment (DfE) Alternative Assessment for the Evaluation of Chemicals (USEPA 2011). The GreenWERCS ChemRisk Scoring Model was developed for the present pilot study as a list-based surrogate to compare to the data-driven analysis used in the GreenSuite adjusted model. For each endpoint from the GreenSuite adjusted model entered into the GreenWERCS ChemRisk Scoring Model, a comparable list from the GreenWERCS database was chosen with assistance from GreenWERCS programmers, and the weighting for the categories and endpoints was set to mirror the GreenSuite adjusted model as closely as possible (Table 6). Some endpoints did not overlap completely. For example, whereas the GreenSuite adjusted model included 3 persistence endpoints (air, water, soil), 1 overall persistence category was included in the GreenWERCS ChemRisk Scoring Model (Table 4).

Table 4. Comparison of endpoints used in the hazard tools by category
Category GreenScreen full assessment DfE AA GreenSuite adjusted model GreenWERCS GreenScreen List Translator GreenWERCS GreenScreen Scoring Model GreenWERCS ChemRisk Model GreenWERCS Walmart Model
Other Walmart Grey List
Acute mammalian toxicity Acute mammalian toxicity Acute mammalian toxicity Oral LD50 Acute mammalian toxicity Acute mammalian toxicity Oral LD50
Dermal LD50 Dermal LD50
IDLH IDLH
STEL or ceiling STEL or ceiling
Inhalation LC50 Inhalation LC50
Aquatic toxicity Acute aquatic toxicity Environmental toxicity and fate—aquatic toxicity Water toxicity Acute aquatic toxicity Acute aquatic toxicity
Chronic aquatic toxicity Water long-term effects Chronic aquatic toxicity Chronic aquatic toxicity
Bioaccumulation Bioaccumulation Environmental toxicity and fate—bioaccumulation Bioaccumulation Bioaccumulation
Persistence Persistence Environmental toxicity and fate—persistence Water persistence Persistence Persistence Persistence PBT chemicals
Water partitioning vPvB chemicals
Air persistence
Air partitioning
Soil persistence
Soil partitioning
Carcinogenicity Carcinogenicity Carcinogenicity Carcinogenicity Carcinogenicity Carcinogenicity IRIS carcinogenicity Known carcinogens
Suspected carcinogens
Mutagenicity Mutagenicity or genotoxicity Mutagenicity or genotoxicity Genotoxicity Mutagenicity or genotoxicity Mutagenicity or genotoxicity HA G&L neurotoxicants Mutagenicity
GHS-germ cell mutagenicity
Neurotoxicity Neurotoxicity single dose Neurotoxicity Neurotoxicity Neurotoxicity single dose Neurotoxicity single dose
Neurotoxicity repeat dose Neurotoxicity repeat dose Neurotoxicity repeat dose
Reproductive toxicity Reproductive toxicity Reproductive toxicity Reproductive toxicity Reproductive toxicity Reproductive toxicity GHS-toxic to reproduction Reproductive toxicity
Developmental toxicity Developmental toxicity Developmental toxicity Developmental toxicity Developmental toxicity
Endocrine activity Endocrine activity Endocrine activity Endocrine activity Endocrine activity
Systemic toxicity Systemic toxicity and organ effects – single dose RfC Systemic toxicity and organ effects – single dose Systemic toxicity and organ effects – single dose IRIS RfC
RfD IRIS RfD
Systemic toxicity and organ effects – repeat dose Systemic toxicity/organ effects – repeat dose TLV Systemic toxicity and organ effects – repeat dose Systemic toxicity and organ effects – repeat dose ACGIH TLV-C
Eye irritation Eye irritation Eye irritation Eye irritation Eye irritation Eye irritation Eye irritation
Skin irritation Skin irritation Skin irritation Skin irritation Skin irritation Skin irritation EU H315 causes skin irritation
Sensitization Skin sensitization Skin sensitization Sensitizer Skin sensitization Skin sensitization AIHA WEEL skin sensitizer
Respiratory sensitization Respiratory sensitization Respiratory sensitization Respiratory sensitization
Safety Reactivity Additional endpoints Water reactive Reactivity Reactivity EU R14 reacts violently with water Hazardous waste
Flammability Flammability Flammability GHS flammable gases, liquids, solids
Corrosive GHS corrosive to metal
Oxidizer GHS EU oxidizing gases category 1 GHS EU oxidizing liquids and solids Category 1–3
Explosivity EU GHS
Multiple Multiple endpoints Multiple endpoints
  • ACGIH TLV-C = American Conference of Governmental Industrial Hygienists Threshold Limit Value-Ceiling limit; AIHA WEEL = American Industrial Hygiene Association Workplace Environmental Exposure Level; DfE = Design for the Environment Alternative Assessment Criteria for Hazard Evaluation; EU H315 = European Union Hazard Statement 315; EU R14 = European Union Risk Phrase 14; GHS = Globally Harmonized System; HA G&L = Hazard Grandjean & Landrigan; IDLH = Immediately Dangerous to Life or Health; IRIS = Integrated Risk Information System; RfC = Reference Concentration; RfD = Reference Dose; STEL = Short-term Exposure Limit; TLV = Threshold Limit Value; vPvB = very persistent, very bioaccumulative.

Frameworks

Frameworks apply a documented procedure for technical evaluation and systematic assessment, usually performed by experts in toxicology or trained analysts. Frameworks identify the endpoints for evaluation and use both lists and toxicological data. When data are not available, frameworks specify the use of models and read-across information. In the hands of a professional expert, frameworks increase the scientific rigor behind the hazard analysis. This pilot included 2 framework tools: GreenScreen full assessment (CPA 2013) and DfE (USEPA 2011). Both tools rely heavily on the Globally Harmonized System of Classification and Labeling of Chemicals (GHS) criteria (UN 2005) and related OECD testing methodologies identified in GHS (Supplemental Data Table S2).

Our analysis grouped GreenScreen and DfE together for 2 reasons. First, they use similar assessment methods, and second, in this pilot the GreenScreen full assessment report served as the basis for both GreenScreen full assessment and DfE analyses (unless otherwise noted). When available, we relied on published, verified GreenScreen full assessments (i.e., ethylene glycol; ToxServices 2013), or we used an existing in-house assessment from a previous project (i.e., benzisothiazolinone [BIT]; Janice Hammond, Dow Consulting Group, personal communication). When existing assessments were unavailable, we completed a GreenScreen full assessment report for each chemical using primary data sources as well as estimated values, analogs, and structure–activity relationships (SARs) when needed. Similarly, we used US Environmental Protection Agency (USEPA) full DfE evaluations for DBP (USEPA 2014a) and HBCD (USEPA 2014b), and we performed the GreenScreen full assessment for these 2 chemicals as part of this project.

GreenScreen and DfE methods and documentation are publicly available and provide detailed information on data gap considerations, data hierarchy, and acceptable study guidelines for each program (CPA 2013 and USEPA 2011, respectively). The data used in the GreenScreen full assessment were documented using the GreenScreen report template specified in its methods document. Both GreenScreen full assessment and DfE rely on the professional judgment of an expert (usually a toxicologist) to classify the hazard for each endpoint. The GreenScreen methodology approaches endpoint evaluation sequentially: it first scores persistence, bioaccumulation, carcinogenicity, mutagenicity, and reproductive toxicity, followed by other endpoints, for each chemical ingredient and for known manufacturing process by-products and residuals (CPA 2013). If the chemical scores high or very high in any of these characteristics or if there are data gaps for any of these endpoints, the assessment stops, and the lowest, least preferred score (BM1) is assigned without further consideration of the remaining endpoints. For example, the GreenScreen full assessments for DBP and HBCD were stopped at an early stage in the evaluation because each chemical met the GreenScreen criteria for BM1. HBCD is on the European Chemicals Agency (ECHA) Candidate List of Substances of Very High Concern as a persistent, bioaccumulative, toxic (PBT) substance, and DBP is on the European Union (EU) carcinogen, mutagen, reproductive toxicant (CMR) list as a substance toxic to reproduction (Repr. Cat. 2: R61, may cause harm to the unborn child). The present pilot study did not evaluate manufacturing processes, residuals, or by-products, because this was a hypothesis-testing exercise and not a review of a specific manufacturer's chemical ingredient or product.
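The sequential stop rule summarized in the preceding paragraph can be sketched in Python as follows. This is a simplification of that summary only, not the published GreenScreen benchmark criteria (CPA 2013), which are more detailed. The endpoint keys, the level strings, and the function name are assumptions made for illustration, as is the choice to treat a missing entry as a data gap.

```python
from typing import Dict, Optional

# Endpoints that the text says are scored first (key names are assumed).
PRIORITY_ENDPOINTS = (
    "persistence",
    "bioaccumulation",
    "carcinogenicity",
    "mutagenicity",
    "reproductive_toxicity",
)

# Hazard levels that stop the assessment, per the summary in the text.
STOP_LEVELS = {"high", "very high"}


def early_benchmark(scores: Dict[str, Optional[str]]) -> Optional[str]:
    """Return "BM1" if the assessment stops at the priority endpoints.

    A missing key or a None value is treated as a data gap (an assumption).
    Returns None when the remaining endpoints still need to be evaluated.
    """
    for endpoint in PRIORITY_ENDPOINTS:
        level = scores.get(endpoint)
        if level is None or level in STOP_LEVELS:
            return "BM1"
    return None
```

In this sketch, a chemical with a very high persistence score receives BM1 before any of the remaining endpoints are examined.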

Expert analyses

The expert analyses category included tools that rely on toxicology professionals to conduct the hazard assessment for each chemical, either manually or through a software system tied to a database of scientific information, to yield a hazard analysis based on scientific data. Similar to the framework approach, analysis tools use a standard protocol for endpoint identification and weighting, data quality, and data sources, guided by a professional analyst. This pilot study included 2 expert analysis tools: GreenSuite and SciVera Lens (SciVera 2016). The authors contracted SciVera to perform the chemical hazard assessments for the 7 chemicals through its hazard analysis and assessment process and to provide detailed substance data reports on each chemical for our use in this analysis. Similarly, the authors worked directly with the GreenSuite analysts to establish access to Web-based software and to design the screening model. Chemical-specific results reports were downloaded upon completion of each analysis.

GreenSuite comprises a set of modules, including a proprietary database of approximately 28 000 chemicals and their associated hazard and physical–chemical characteristics. The 44 endpoints GreenSuite tracks are based on the 44 endpoints included in National Science Foundation–Green Chemistry Institute–American National Standards Institute (NSF/GCI/ANSI) standard 355, which is based, in part, on the GHS hazard endpoints (NSF/GCI/ANSI 2011). The 44 endpoints include chronic and acute human health, ecological toxicity, and persistence and bioaccumulation for each environmental compartment as well as physical properties, safety parameters, and life cycle thinking metrics. In this pilot, we used an adjusted set of 28 endpoints and weighting designed to mimic the endpoints in DfE and GreenScreen full assessment (Table 4). We did not use the remaining 16 endpoints, including the life cycle endpoints, because they are not relevant for this comparison project (Supplemental Data Table S1). This article presents the results of our user-adjusted model. If the database lacks information for an endpoint, GreenSuite analysts search the primary literature to locate appropriate data for the chemical. GreenSuite also categorizes the data quality and “missing data” (Supplemental Data Table S4). GreenSuite uses a proprietary normalization process, which is available to clients, in which raw data for each endpoint are assigned values from 0 (most hazardous) to 100 (least hazardous). Normalization allows the scores for each chemical in a consumer product to be added, affording an overall product hazard score once the assessment is complete. By default, GreenSuite sets a score of 65% to indicate high concern (Supplemental Data Table S3). That score can be adjusted to a different “default fail” score whenever a client uses GreenSuite. For this pilot study, we adjusted the default fail score to 69%. Thus, this pilot project used the GreenSuite adjusted model, which reflected the modified number of endpoints, the “concern” score, and the endpoint weightings, to better allow a comparison to the other tools in this pilot.
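To make the role of normalized endpoint scores, weights, and the “default fail” score concrete, the following Python sketch combines them under stated assumptions. GreenSuite's normalization and aggregation are proprietary and are not described in this article, so the weighted mean used here is a stand-in chosen for illustration, and the endpoint names and weights in the usage note are hypothetical. Only the 0 to 100 scale, the 65% default, and the 69% pilot setting come from the text.

```python
from typing import Dict

PILOT_FAIL_SCORE = 69.0     # "default fail" score used in this pilot
DEFAULT_FAIL_SCORE = 65.0   # GreenSuite default reported in the text


def weighted_score(normalized: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of endpoint scores on the 0 (most hazardous) to 100 scale.

    Assumption: a weighted mean stands in for the proprietary aggregation.
    """
    total_weight = sum(weights[endpoint] for endpoint in normalized)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(normalized[e] * weights[e] for e in normalized)
    return weighted_sum / total_weight


def is_high_concern(score: float, fail_score: float = PILOT_FAIL_SCORE) -> bool:
    """Assumption: a score at or below the fail score indicates high concern."""
    return score <= fail_score
```

For example, `weighted_score({"oral_ld50": 100.0, "water_persistence": 40.0}, {"oral_ld50": 0.25, "water_persistence": 0.75})` gives 55.0, which `is_high_concern` flags under either fail score. A chemical scoring 68% is flagged at the pilot setting of 69% but not at the default of 65%, which illustrates how the adjustable threshold alone can change a classification.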

The SciVera Lens Chemical Safety Assessment tool is used by its clients to assess the hazard of chemicals within their products, but it also can be used to characterize risk. This tool generates a chemical profile report for each chemical and uses qualitative scores (low hazard, low hazard based on expert judgment, high hazard, etc.), which are also denoted by color-coded circles. Chemical hazards are assessed across 22 endpoints, with 3 designated as core endpoints (carcinogenicity, mutagenicity, and reproductive toxicity). Authoritative lists, the toxicological literature, or both are used. As with the framework and other analysis tools, missing data are sought in the toxicology literature and through read-across, structure–activity relationships, and other methods before they are declared a data gap. The endpoints with the most hazard drive the overall chemical score. For chemicals that score very high or high hazard, clients can request an optional risk characterization using a hazard quotient approach. For this pilot project, SciVera performed the chemical hazard assessments under contract and provided detailed reports on each chemical for our use.
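The statement that the most hazardous endpoints drive the overall chemical score can be sketched as a simple "worst endpoint wins" rule. The Python below is an illustration, not SciVera's implementation: the designation codes follow Table 2, while the function names, the endpoint keys, and the decision to ignore "nd" and "u" entries when at least one endpoint has been assessed are assumptions.

```python
from typing import Dict

# Ordinal rank of the qualitative designations, least to most hazardous.
RANK: Dict[str, int] = {"l": 0, "m": 1, "h": 2, "vh": 3}


def base_designation(designation: str) -> str:
    """Strip the trailing "e" that marks expert judgment (for example, "he" -> "h")."""
    if designation.endswith("e") and designation[:-1] in RANK:
        return designation[:-1]
    return designation


def overall_designation(endpoints: Dict[str, str]) -> str:
    """Return the most hazardous designation among the assessed endpoints.

    Assumption: "nd" and "u" entries are ignored unless nothing was assessed,
    in which case "nd" is returned.
    """
    assessed = [d for d in endpoints.values() if base_designation(d) in RANK]
    if not assessed:
        return "nd"
    return max(assessed, key=lambda d: RANK[base_designation(d)])
```

Under this rule, a chemical with low hazard on most endpoints and very high hazard on a single endpoint, such as acute aquatic toxicity, receives an overall designation of "vh".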

Tool scoring

Each tool used either a qualitative or quantitative endpoint scoring system based on the tool's standard method or protocol, weighting of endpoints, and handling of missing data. Most tools use a numerical score, typically a percentage (GreenSuite) or an integer (GreenWERCS, both GreenScreen tools); depending on the tool, either high or low numbers represent less hazardous chemicals. Some tools did not provide a score for the chemical; rather, they provided a ranking of hazard for a specific toxicological endpoint (DfE, SciVera). Several tools gave value-based terms to the scores, for example, preferred or avoid (GreenWERCS, GreenScreen List Translator, and GreenScreen full assessment), usually based on the most hazardous endpoint score for the chemical. In this pilot project, we translated each tool's scoring system into a common scale of low, moderate, high, and very high hazard (Table 5).

Table 5. Comparison of scoring systems

| Tool | Tools' scoring system | Score conversion used in pilot |
|---|---|---|
| GreenWERCS Walmart Scoring Model | 0–2500 (preferable) | Low |
| | 2500–6000 (acceptable) | Moderate |
| | 6000–8500 (avoid) | Very high |
| GreenWERCS GreenScreen Scoring Model | 0–5000 (preferable) | Low |
| | 5000–15 000 (acceptable) | Moderate |
| | 15 000–20 000 (avoid) | Very high |
| GreenWERCS GreenScreen List Translator | LT-U = BM Undetermined due to insufficient data | U |
| | LT-P1 = avoid, possible BM1 | High |
| | LT-1 = BM1 avoid | Very high |
| GreenWERCS ChemRisk Scoring Model | 0–5000 (preferable) | Low |
| | 5000–15 000 (acceptable) | Moderate |
| | 15 000–20 000 (avoid) | Very high |
| GreenScreen full assessment | BM4 best | Low |
| | BM3 use but seek improvement | Moderate |
| | BM2 use but search for alternatives | High |
| | BM1 avoid | Very high |
| | U Undetermined due to insufficient data | U |
| DfE Alternative Assessment Criteria for Hazard Evaluation | Based upon score for endpoints of concern: Low | Low |
| | Moderate | Moderate |
| | High | High |
| | Very high | Very high |
| GreenSuite (adjusted weighting) | 90%–100% (Grade A, green) | Low |
| | 80%–89% (Grade B, yellow) | Moderate |
| | 70%–79% (Grade C, orange) | High |
| | 0%–69% (Grade D, red) | Very high |
| SciVera Lens | Low hazard (le) or (l) | Low |
| | Moderate hazard (me) or (m) | Moderate |
| | High hazard (h) or (he) | High |
| | Very high hazard (vh) or (vhe) | Very high |
| | Insufficient data (nd) | |
| | Unassessed (u) | |
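
The conversion in Table 5 can be illustrated with a short Python sketch. It is not code from any of the tools; the function names are hypothetical, and two details are assumptions because Table 5 does not specify them: a score that falls on a shared boundary (e.g., 2500) is placed in the lower band, and noninteger GreenSuite percentages are binned by the lower limit of each grade.

```python
from bisect import bisect_left

# Upper bounds and common-scale labels copied from Table 5
# (GreenWERCS Walmart Scoring Model; lower numbers = less hazardous).
GREENWERCS_WALMART = ((2500, "Low"), (6000, "Moderate"), (8500, "Very high"))

# GreenScreen full assessment benchmarks mapped to the common scale (Table 5).
GREENSCREEN_BENCHMARKS = {
    "BM4": "Low", "BM3": "Moderate", "BM2": "High", "BM1": "Very high", "U": "U",
}


def convert_greenwercs_walmart(score: float) -> str:
    """Map a GreenWERCS Walmart score to the common scale.

    Assumption: a score on a shared boundary takes the lower band.
    """
    if not 0 <= score <= 8500:
        raise ValueError("score outside the 0-8500 range given in Table 5")
    bounds = [upper for upper, _ in GREENWERCS_WALMART]
    return GREENWERCS_WALMART[bisect_left(bounds, score)][1]


def convert_greensuite(percent: float) -> str:
    """Map a GreenSuite percentage (higher = less hazardous) to the common scale.

    Assumption: grades are binned by their lower limits (90, 80, 70).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 90:
        return "Low"        # Grade A
    if percent >= 80:
        return "Moderate"   # Grade B
    if percent >= 70:
        return "High"       # Grade C
    return "Very high"      # Grade D


def convert_greenscreen(benchmark: str) -> str:
    """Map a GreenScreen benchmark (BM1-BM4 or U) to the common scale."""
    return GREENSCREEN_BENCHMARKS[benchmark]
```

Note that the two numeric scales run in opposite directions, which is why each tool needs its own conversion rather than a single shared threshold table.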

Endpoints

All tools in this pilot project used toxicological endpoints to score or characterize the hazard for each of the 7 chemicals. A comparison of the endpoints across tools is presented in Table 4. Common to all tools are the endpoints of CMR toxicity, developmental toxicity, and PBT.

Table 6. Comparison of endpoint weightings for a data-based tool (a) versus a list-based tool (b)

| GreenSuite default model: endpoint | Default: subscore | Default: category score | GreenSuite adjusted model: endpoint | Adjusted: subscore | Adjusted: category score | GreenWERCS ChemRisk Model |
|---|---|---|---|---|---|---|
| Water persistence 10% | Water score 33.34% | Ecological score 33.34% | Water persistence 25% | Water score 50% | Ecological score 25% | DSL persistence 50% |
| Water exposure 10% | | | Water exposure 25% | | | |
| Water toxicity 50% | | | Water toxicity 25% | | | |
| Water long-term effects 30% | | | Water long-term effects 25% | | | |
| Air persistence 10% | Air score 33.33% | | Air persistence 50% | Air score 25% | | |
| Air exposure 10% | | | Air exposure 50% | | | |
| Air toxicity 50% | | | Air toxicity 0% | | | |
| Air long-term effects 30% | | | Air long-term effects 0% | | | |
| Soil persistence 10% | Soil score 33.33% | | Soil persistence 50% | Soil score 25% | | |
| Soil exposure 10% | | | Soil exposure 50% | | | |
| Soil toxicity 50% | | | Soil toxicity 0% | | | |
| Soil long-term effects 30% | | | Soil long-term effects 0% | | | |
| Oral LD50 12.5% | Acute score 50% | Health score 33.33% | Oral LD50 12% | Acute score 40% | Health score 70% | Oral LD50 12% |
| Dermal LD50 15% | | | Dermal LD50 12% | | | Dermal LD50 12% |
| IDLH 25% | | | IDLH 12% | | | IDLH 12% |
| STEL or ceiling 20% | | | STEL or ceiling 12% | | | ACGIH OEL (STEL) 12% |
| Inhalation LC50 18% | | | Inhalation LC50 12% | | | Inhalation LC50 12% |
| Skin irritation 3% | | | Skin irritation 20% | | | EU H315 20% |
| Eye irritation 4.5% | | | Eye irritation 20% | | | Eye irritation 20% |
| Odor threshold value 2% | | | Odor threshold value 0% | | | AIHA odor threshold value 0% |
| Reproductive effects 22% | Chronic score 50% | | Reproductive effects 20% | Chronic score 60% | | GHS |
| Carcinogenicity 25% | | | Carcinogenicity 20% | | | |
| RfC 5% | | | RfC 4.444% | | | |
| RfD 5% | | | RfD 4.444% | | | |
| Sensitizer 7% | | | Sensitizer 13.333% | | | |
| Neurotoxicity 25% | | | Neurotoxicity 13.333% | | | |
| TLV 11% | | | TLV 4.444% | | | |
| | | | Genotoxicity 20% | | | |
| Flammability 100% | Fire score 33.33% | Safety score 33.33% | Flammability 100% | Fire score 33.33% | Safety score 5% | |
| Radioactivity 25% | Special score 33.34% | | Radioactivity 0% | Special score 33.34% | | |
| Oxidizer 25% | | | Oxidizer 33.33% | | | |
| Water-reactive 25% | | | Water-reactive 33.34% | | | |
| Corrosive 25% | | | Corrosive 33.33% | | | |
| Explosivity 100% | Reactivity score 33.33% | | Explosivity 100% | Reactivity score 33.33% | | EU GHS 100% |
  • ACGIH OEL = American Conference of Governmental Industrial Hygienists Occupational Exposure Limit; AIHA = American Industrial Hygiene Association; DSL = Domestic Substance List; EU H315 = European Union Hazard Statement 315; GHS = Globally Harmonized System; IDLH = Immediately Dangerous to Life or Health; RfC = Reference Concentration; RfD = Reference Dose; STEL = Short-term Exposure Limit; TLV = Threshold Limit Value.
  • a GreenSuite
  • b GreenWERCS ChemRisk Scoring Model
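
To show how the category weightings in Table 6 can change an overall score, the following Python sketch applies the default and adjusted GreenSuite category weights to one set of category subscores. The weighted-average aggregation, the function name, and the example subscores are assumptions made for illustration; the source does not describe GreenSuite's actual aggregation formula.

```python
# Category weights (percent) taken from Table 6.
DEFAULT_WEIGHTS = {"ecological": 33.34, "health": 33.33, "safety": 33.33}
ADJUSTED_WEIGHTS = {"ecological": 25.0, "health": 70.0, "safety": 5.0}


def weighted_score(subscores: dict, weights: dict) -> float:
    """Weighted average of 0-100 subscores; weights are percentages.

    Assumption: categories combine as a simple weighted average.
    """
    total_weight = sum(weights.values())
    return sum(subscores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical category subscores (0-100, higher = less hazardous).
example_subscores = {"ecological": 90.0, "health": 55.0, "safety": 100.0}

default_score = weighted_score(example_subscores, DEFAULT_WEIGHTS)
adjusted_score = weighted_score(example_subscores, ADJUSTED_WEIGHTS)

print(f"default weighting:  {default_score:.1f}%")
print(f"adjusted weighting: {adjusted_score:.1f}%")
```

With these hypothetical subscores, shifting weight toward the health category lowers the overall percentage by more than 15 points, although the underlying data are unchanged.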

Data gaps

Data gaps are endpoints for which there are no empirical data and no information can be derived from typical, well-accepted toxicological practices, including read across, modeling, and similar approaches (ICCA 2011). Each tool handled data gaps differently, according to its protocol. Some tools scored endpoints with missing data as more hazardous (GreenScreen full assessment, SciVera, GreenSuite); other tools, usually list-based tools, scored missing data as undetermined (GreenScreen List Translator) or as indifferent because the chemical was absent from the lists specified by the tool (GreenWERCS). GreenSuite's procedure used 5 criteria to differentiate the cause of the missing data and then scored the endpoint either as more hazardous (the endpoint is plausible but data or information are unavailable and estimation or modeling is not possible) or as indifferent (physical–chemical properties make the endpoint untestable) (Supplemental Data Table S4).
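
The effect of these three approaches can be sketched in Python as follows. This is an illustrative simplification, not any tool's protocol: the names, the penalty level assigned to a gap, the rule that the most hazardous endpoint sets the overall result, and the rule that a single gap makes the whole result undetermined are all assumptions.

```python
from enum import Enum
from typing import Dict, Optional

LEVELS = ("Low", "Moderate", "High", "Very high")


class GapPolicy(Enum):
    MORE_HAZARDOUS = "more hazardous"  # gap is scored as a hazard
    UNDETERMINED = "undetermined"      # gap blocks an overall score
    INDIFFERENT = "indifferent"        # gap is ignored


GAP_PENALTY = "High"  # hypothetical level assigned to a missing endpoint


def overall_hazard(endpoints: Dict[str, Optional[str]], policy: GapPolicy) -> str:
    """Return the most hazardous endpoint level; None marks a data gap."""
    scored = []
    for level in endpoints.values():
        if level is not None:
            scored.append(level)
        elif policy is GapPolicy.MORE_HAZARDOUS:
            scored.append(GAP_PENALTY)
        elif policy is GapPolicy.UNDETERMINED:
            return "U"
        # INDIFFERENT: the gap does not contribute to the score.
    if not scored:
        return "Low"  # assumption: nothing flagged means the best score
    return max(scored, key=LEVELS.index)


# Hypothetical chemical with one data gap and one scored endpoint.
profile = {"carcinogenicity": None, "eye irritation": "Moderate"}
for policy in GapPolicy:
    print(policy.value, "->", overall_hazard(profile, policy))
```

For the same hypothetical chemical, the three policies return three different overall results, which mirrors the divergence observed between data-based and list-based tools.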

RESULTS

Overall, our analysis showed a lack of concordance among scores and/or hazard characterizations for the different tools. Caffeine highlights these discrepancies. It is categorized as “preferable” by GreenWERCS Walmart and ChemRisk Scoring Models, “acceptable” by GreenWERCS GreenScreen Scoring Model, and high by GreenSuite, DfE, and SciVera (Table 3), thereby spanning the range of chemical hazard scores. For tools that scored caffeine high hazard, the driving endpoints included reproductive toxicity, acute oral toxicity, and persistence. Table 3 compares the scores for each tool and identifies the primary endpoints contributing to these scores.

Examining each tool individually further identifies their respective differences. GreenWERCS Walmart Scoring Model scored all 7 chemicals in its preferable range. A closer inspection of the scores reveals that all chemicals except DBP received the best possible score (0) because those 6 chemicals were not on the lists this retailer has specified for use in this model. In contrast, GreenWERCS GreenScreen Scoring Model scored 3 chemicals as “acceptable” and 4 chemicals in this tool's preferable range. Similarly, the GreenWERCS ChemRisk Scoring Model scored 4 chemicals in the “preferable” category and 3 chemicals as acceptable. None of the GreenWERCS models indicated any chemicals from the study should be avoided. The hazard scores for all GreenWERCS models depend on a chemical's presence or absence from the lists contained within each model.

Overall, the scores for DfE and GreenScreen full assessments are very similar. This is not surprising because GreenScreen's hazard tables are based on the DfE hazard criteria, albeit with slightly varying thresholds (Tables 2, 3, 5).

The GreenSuite adjusted model scored 4 chemicals in the upper range of the Grade D score we set for this project (69% or less), 2 chemicals scored Grade C (70%–79%), and 1 scored Grade B (80%–89%). It is important to understand the magnitude of the difference in GreenSuite scores before using the score to drive selection toward a substitute chemical. In the GreenSuite adjusted model, BIT scored 69%, and ethylene glycol scored 66%, both categorized as Grade D. According to GreenSuite analysts, clients have reported that hazard scores are sensitive to as little as a 2% difference in scoring. As a result, GreenSuite advises clients to have at least a 5% to 7% difference between hazard scores to have more confidence that there is a sufficient incremental change in hazards between chemicals when considering potential alternatives.
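
The advice on score differences can be expressed as a simple check, sketched below in Python. The function name is hypothetical, and reading the 5% to 7% guidance as a gap in percentage points between two scores is an assumption.

```python
def is_meaningful_difference(score_a: float, score_b: float,
                             min_gap: float = 5.0) -> bool:
    """True if two GreenSuite scores differ by at least min_gap points.

    Assumption: the 5% to 7% guidance refers to percentage points.
    """
    return abs(score_a - score_b) >= min_gap


# Scores reported in this study for the GreenSuite adjusted model.
bit_score = 69.0
ethylene_glycol_score = 66.0

print(is_meaningful_difference(bit_score, ethylene_glycol_score))  # 3-point gap
```

Under this reading, the 3-point gap between BIT and ethylene glycol falls below even the lower end of the recommended range, so the two scores would not support preferring one chemical over the other.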

For the GreenScreen full assessment, 4 chemicals met 1 or more of the exclusion criteria (PBT, very persistent, very bioaccumulative [vPvB], and high toxicity) and were automatically placed in BM1 (avoid) classification in the early evaluation stage. Two chemicals were marked as BM2 (use, but search for a safer alternative). One chemical was scored as U, “undetermined due to insufficient data.”

In SciVera Lens, 3 chemicals received a very high hazard designation, although for only 1 chemical was that designation related to the 3 core endpoints (CMR toxicity); 3 chemicals were designated as high hazard based on their score in 1 or more of the critical endpoints; and 1 chemical was designated as “moderate” hazard, based on expert toxicologist judgment.

Chemical-specific assessments

Caffeine

The screening tool scores for caffeine varied from a preferable score with the GreenWERCS Walmart model, to high hazard in SciVera, and LT-1 (equivalent to “avoid”) in the GreenWERCS GreenScreen List Translator. Notably, the GreenScreen List Translator score of LT-1 should predict that caffeine would score a BM1 (avoid) in the GreenScreen full assessment; however, it scored a BM2, indicating “acceptable” for use, but search for an alternative. This difference arose because the GreenScreen List Translator pulled information from a government list identified as screening-level quality, whereas the GreenScreen full assessment relied upon CPA-defined “authoritative” lists and data sources as specified by CPA in its methodology (CPA 2013).

Citric acid

Scores and characterizations varied across tools for citric acid depending on the tools' consideration of eye irritation. For example, all of the GreenWERCS scoring models indicated citric acid was preferable, indicating that eye irritation was not an endpoint flagged by the lists that were searched. Conversely, citric acid received a high hazard designation under the DfE Alternatives Assessment criteria as well as SciVera Lens, and under the GreenScreen full assessment, it was scored as BM2, in each case driven by ocular effects. GreenSuite adjusted model scored it as Grade C, indicating that eye irritation was not a primary endpoint driving the analysis.

Ethylene glycol

The screening tool results for ethylene glycol varied due to differences in the evaluation of reproductive and developmental toxicity. Ethylene glycol assessments ranged from a “preferable” score with GreenWERCS Walmart Scoring Model to a BM1 (avoid) score with the GreenScreen full assessment, a Grade D (66%) based on multiple endpoints in the GreenSuite adjusted model, and a moderate hazard in SciVera Lens. The GreenScreen full assessment scored a high rating for developmental toxicity, resulting in a BM1 (avoid) benchmark score. This score resulted from professional toxicological judgment that found ethylene glycol may adversely affect development with sufficiently high oral exposures (ToxServices 2013). Similarly, ethylene glycol was scored high hazard based on acute mammalian toxicity as well as reproductive and developmental toxicity in DfE. In contrast, 2 GreenWERCS scoring models assigned it a preferable zero score (GreenWERCS Walmart Scoring Model) or acceptable 5300 score (GreenWERCS ChemRisk Scoring Model) because the chemical was not on any reproductive toxicity lists. The scores highlight potential issues arising from list-based tools versus those based on data or expert judgment: different data sources and professional judgment can lead to different hazard classifications.

Glycolic acid

Glycolic acid is a metabolite of ethylene glycol and thought to be the agent associated with developmental effects observed after ethylene glycol exposure (Carney et al. 1996, 1999; Bruckner et al. 2013). The scoring for glycolic acid varied widely, from preferable with GreenWERCS Walmart Scoring Model because it did not appear on any lists associated with that model, to a BM1 (avoid) score in the GreenScreen full assessment based on the H360 classification used in the European Chemicals Agency's (ECHA) Community Rolling Action Plan (CoRAP) as the basis for identifying the substance as a CMR (Belgium 2014; ECHA 2015). Both DfE and SciVera Lens scored glycolic acid as very high for irritation. DfE scored moderate to high for reproductive and developmental toxicity endpoints.

Dibutyl phthalate

Similar to the examples above, the reproductive and developmental toxicity endpoints drove the scores for DBP in all tools. For the GreenWERCS Walmart Scoring Model, reproductive and developmental toxicity was weighted equally with other endpoints, and the overall impact on the score was minor, resulting in a “preferable” score. However, in the GreenScreen full assessment, the reproductive and developmental toxicity endpoint drove the scoring for an automatically assigned BM1 (avoid) rating because the chemical is on the EU's substances of very high concern (SVHC) candidate list for reproductive toxicity. SciVera Lens also scored a “high” hazard for reproductive–developmental toxicity, and also reported water toxicity. GreenSuite adjusted model scored Grade D based on persistence and water toxicity.

Benzisothiazolinone

As with other chemicals, the scoring for this antimicrobial varied widely from preferable for all GreenWERCS models to an “undetermined due to insufficient data” (U) score with GreenScreen full assessment. The other tools scored BIT as LT-U (GreenScreen List Translator), and high for repeated dose toxicity and very high for eye irritation and aquatic toxicity endpoints in the DfE evaluation. The GreenSuite adjusted model gave a Grade D score of 69% driven primarily by persistence, aquatic toxicity, and eye and skin irritation or sensitizer endpoints. SciVera Lens scored BIT “very high hazard” due to ocular toxicity or eye irritation.

1,2,5,6,9,10-Hexabromocyclododecane mixed isomers

The scores for the flame retardant HBCD varied widely but were driven primarily by bioaccumulation endpoints. The GreenScreen full assessment scored BM1 (avoid) and GreenWERCS GreenScreen List Translator scored LT-1 (avoid). Persistence and other endpoint scores drove the GreenSuite adjusted model result (Grade B), as well as the GreenWERCS GreenScreen Scoring Models, although the model ultimately gave a preferable score. In contrast, the absence on lists resulted in a preferable score in the GreenWERCS Walmart Scoring Model. Under the DfE Alternatives Assessment framework, the chemical scored high for developmental toxicity and persistence and very high for acute and chronic aquatic toxicity and bioaccumulation.

DISCUSSION

We completed a small pilot study to determine whether various chemical hazard assessment tools produce the same conclusions for chemicals regardless of which tool was used. Although limited in scope, to our knowledge, this is the first attempt at comparing tools and critically examining where they agree and differ. It was clear from our interactions with the tool developers that none believed that their tool was equivalent to another, and the notion that the results should be similar was not widely held. However, given that all of the tools are designed and marketed as means for selecting “safer” or “greener” chemicals or products, it is reasonable to expect that a consistent answer with some reasonable amount of variation might be obtained.

The OECD SAAT is a compilation of resources relevant to chemical substitution and alternatives assessments and contains a function called the “Tool Selector,” which has been designed to provide information on the various tools that can be used in conducting chemical substitutions or alternatives assessments. The Selector allows one to filter based on characteristics of the tool (applicability, chemical hazard attributes, user-friendliness, etc.) to identify tools of greatest relevance to one's substitution or alternatives assessment goals. It also allows for viewing of more in-depth information on each tool, or a side-by-side comparison of a set of tools. Although the Tool Selector will provide information about the differences in the tools (i.e., lists vs. nonlists), there is nothing in the description of data output or the discussion of limitations that alerts the user to the fact that the hazard conclusion about the chemical may be different based on which tool is used for assessment.

During the course of this pilot study, we observed that each tool incorporates value judgments with respect to hazard endpoints, including which are evaluated, weighting preferences, and sources of information. These value judgments led to differences in the scoring results for each chemical. Thus, the outcome of a hazard screening assessment depends highly on the tool selected for the screening.

A detailed look at the results of this project shows that a preferred or avoid status for a given chemical could result solely on the basis of the endpoints factored into the tool's scoring system. All of the tools considered CMR, developmental toxicity, and PBT; however, the tools differed significantly in how they analyzed repeated dose toxicity, aquatic toxicity, and safety parameters (i.e., flammability, reactivity), and treatments ranged from no consideration at all (GreenWERCS Walmart Scoring Model) to extensive weighting (GreenSuite model). Moreover, the number of endpoints incorporated into the scores varied from 9 (GreenWERCS Walmart Scoring Model) to 28 (GreenSuite adjusted model) (Table 4).

The endpoint information source significantly affected the scores for each chemical across all tools. For example, all chemicals except citric acid scored inconsistently across the reproductive toxicity category for all tools. Similar inconsistencies were observed for aquatic, neurotoxicity, and systemic toxicity endpoints. These inconsistencies were driven by the endpoint information source and to a lesser extent by the maintenance and upkeep of the corresponding data or lists. Original or compiled test study data are used in GreenScreen full assessment, DfE, GreenSuite and SciVera; however, all of the GreenWERCS models and the GreenScreen List Translator use lists which are lagging indicators dependent on value judgments made by the list creators rather than by scientific consensus. Moreover, some lists used by tools were not created for hazard identification, and others may not be universally considered as authoritative. For example, in performing the GreenScreen full assessment for glycolic acid, the authors relied upon a CoRAP document that indicated that ECHA considers this substance a CMR. This designation is inconsistent with the literature for the substance and with the information provided in the ECHA database. Nonetheless, the CoRAP documents were considered authoritative by the authors and thus were used as the basis for assigning a BM1. If the CoRAP documents were not considered authoritative, then a BM2 would have been assigned.

It should also be noted that when using tools such as GreenScreen full assessment, DfE, and SciVera, qualified toxicologists can review the same data or literature and render a different opinion as to the degree of hazard that has been demonstrated. Therefore, transparent documentation of these analyses is critical. GreenScreen provides a template report for assessors to use to document their assessment and also provides a portal for uploading the assessments into the public domain so that others may review it. Similarly, SciVera provides a report that presents the data used by its scientists to render a hazard determination. These types of reports provide the tool user a good means of communicating or documenting decisions regarding chemical selection.

Treatment of missing information also impacts the tool output. Endpoint information gaps adversely affected scores for several of the tools. For example, in the GreenScreen full assessment, data gaps automatically result in a lower benchmark score for the endpoint (e.g., BM1 when carcinogenicity data are lacking). The higher GreenScreen benchmark scores have a lower tolerance for data gaps, and in fact a designation of BM4 requires data for all 18 GreenScreen endpoints. Similarly, SciVera Lens characterizes a chemical as more hazardous if it is missing data on 1 of the tool's 3 core endpoints (carcinogenicity, mutagenicity, reproductive toxicity), and GreenSuite adjusts the scoring depending on the type of missing data and marks scores more hazardous for endpoints where data are possible but unavailable. In contrast, in the GreenWERCS Walmart and GreenWERCS GreenScreen scoring models, absence from the models' lists led to a preferable score.

Lastly, endpoint weightings directly influenced the scores. Although weightings are based on value judgments, they must have a clearly communicated rationale behind their selection. In the tools we examined, the GreenScreen full assessment model gives the highest weighting for the endpoints of PBT, vPvB, and high toxicity (CMR toxicity). This results in an automatic score of BM1 (avoid) designation for a chemical with any of these properties. In SciVera Lens, 3 core endpoints (CMR toxicity) drive the overall assessment. Conversely, models such as GreenWERCS ChemRisk Scoring Model and GreenSuite model allow clients to either choose the model's “default” weighting or adjust the weighting of the endpoints to build “user preferred adjusted” models. Users of tools with user-preference or expert-judgment capability need to decide the weightings in advance, apply these weightings as a “rule” throughout the analysis with no further adjustments, and transparently document these values in all reports and publications.

Another key consideration with respect to the various tools is proper identification of the tool when the results are communicated. For example, it is not uncommon to see a simple reference to GreenScreen or GreenWERCS without any reference to the specific version of these tools. As the present study shows, the version of the tool matters.

Many products contain active chemical ingredients whose inherent toxicity profile is consistent with their function (biocides in paint, antioxidants in polymers, etc.). Some tools allow the evaluation of an ingredient within its functional role in the overall product. For example, the DfE has been used to allow comparisons across chemistries that serve a similar function. Additionally, GreenWERCS, GreenSuite, and SciVera Lens can evaluate products containing multiple chemical ingredients and provide product-level hazard analyses that reflect the weight composition of the chemical ingredients in the product. In these models, the overall assessment or score for the product depends on the chemical concentration, and an ingredient present at low concentration may not significantly affect the product's hazard score. Conversely, in tools such as GreenScreen full assessment, the hazard score for a single chemical ingredient is applied to the entire product regardless of its concentration. Said differently, a product scores a BM1 (avoid) if it contains any amount of any chemical ingredient having a BM1 score.
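
The contrast between these two product-level approaches can be sketched in Python. The class, the function names, the numeric 1 to 4 hazard scale, and the weighted-average formula are hypothetical; the source states only that some tools reflect weight composition whereas GreenScreen full assessment applies the worst ingredient score to the whole product.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Ingredient:
    name: str
    weight_fraction: float  # fraction of product mass
    hazard: int             # hypothetical scale: 1 = low ... 4 = very high


def concentration_weighted_hazard(ingredients: Sequence[Ingredient]) -> float:
    """Product hazard as a mass-weighted average of ingredient hazards."""
    total = sum(i.weight_fraction for i in ingredients)
    return sum(i.weight_fraction * i.hazard for i in ingredients) / total


def worst_ingredient_hazard(ingredients: Sequence[Ingredient]) -> int:
    """Product hazard set by the most hazardous ingredient at any level."""
    return max(i.hazard for i in ingredients)


# Hypothetical product: 99% low-hazard carrier, 1% very-high-hazard biocide.
product = [
    Ingredient("carrier", 0.99, 1),
    Ingredient("biocide", 0.01, 4),
]

print(concentration_weighted_hazard(product))  # close to the carrier's score
print(worst_ingredient_hazard(product))        # equals the biocide's score
```

For this hypothetical product, the weighted approach yields a score near the bottom of the scale, whereas the worst-ingredient approach places the entire product in the most hazardous category.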

Although the vast majority of the chemical assessment tools on the market are focused on hazard, the inherent hazard of chemical ingredients provides only 1 consideration in making informed substitution decisions and alternatives assessment. Chemical evaluations should account for exposure to the chemical in the product, as the product is actually used. In its October 2014 report, “A Framework to Guide Selection of Chemical Alternatives,” the National Research Council of the National Academies (NRC 2014) offers a step-wise approach to addressing exposure, including 1) considering the exposure pathway within the use application of the product in the same step as a hazard analysis, 2) applying a comparative hazard and/or exposure assessment for humans and ecological organisms, 3) considering life cycle parameters, and 4) performing a fuller, quantitative exposure assessment, if needed. Additionally, the National Academy of Sciences (NAS) emphasizes the need to consider new 21st century toxicology data in comparative assessments as a means to improve the ability to select the best alternatives. This guidance is useful to tool developers wishing to include exposure and 21st century data streams in their consumer product evaluation models. These new data streams (e.g., computational toxicology, high throughput testing) can be useful, particularly in the case of evaluation of new chemicals or products where traditional toxicity data are not available (NRC 2014).

At the time of this analysis, only SciVera Lens was capable of providing estimates of chemical exposure and a risk characterization using a hazard quotient approach. However, this optional characterization proceeds only for those chemicals that rank very high or high hazard in the CMR categories. SciVera indicated that the exposure assessment portion is custom-programmed specifically for the chemical-use scenarios of interest to the user. We are aware that some tool developers included in this study (GreenSuite) are exploring or are actively working to add exposure modules to their tools.
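
For readers unfamiliar with the hazard quotient approach, a minimal Python sketch follows. A hazard quotient is conventionally the ratio of an estimated exposure to a reference value such as an RfD, with values above 1 flagging potential concern. The function name and the example numbers are hypothetical, and the sketch does not represent how SciVera Lens implements its risk characterization.

```python
def hazard_quotient(exposure_dose: float, reference_dose: float) -> float:
    """Ratio of estimated exposure to a reference value (same units)."""
    if reference_dose <= 0:
        raise ValueError("reference_dose must be positive")
    return exposure_dose / reference_dose


# Hypothetical values, both in mg/kg body weight per day.
hq = hazard_quotient(exposure_dose=0.5, reference_dose=2.0)
print(f"HQ = {hq:.2f}; exceeds 1: {hq > 1}")
```

Because the quotient depends on the exposure estimate, the same chemical can yield different results in different use scenarios, which is consistent with the exposure portion being customized for each scenario.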

No single tool is applicable to every question. The present analysis highlights the need for a better match among the purpose of the evaluation, the questions posed, how the scores will be used, and the tool selected. Tool scores can vary based on the value judgments made by a tool provider. To that end, transparency in tool basis, including the rationale for selection of endpoints as well as documenting the rationale for weighting those endpoints, can help users determine which tool best fits their objectives. If a tool provides an option for expert judgment capability, output objectivity can be strengthened by deciding and documenting the basis for weightings in advance, applying weightings as a rule consistently throughout the analysis, and transparently documenting the weightings, scoring bases, and results in a report or publication.

Limitations of the analysis

The present study was small, looking at 7 chemicals representing a diverse set of toxicological properties that ranged from natural and man-made chemicals, to degradation metabolites, to a known persistent and bioaccumulative chemical. Additional work with a larger number of chemicals is needed to evaluate whether these preliminary findings are observed more broadly.

Acknowledgment

The authors recognize the work of Alison Gauthier and Cameron Flayer of Cardno ChemRisk during the initial analysis of the tools in this project, as well as the tool developers who assisted in the execution of the analyses: Kerry McClurg (GreenWERCS), George Thompson and Kevin Kennedy (GreenSuite), and Joseph Rinkevich and Patricia Beattie (SciVera). Additionally, the authors acknowledge that qualified toxicologists were involved in the GreenScreen Full Assessments; however, none were Verified GreenScreen Assessors. GreenScreen® is a registered mark of The Tides Center Corporation and GreenWERCS® is a registered mark of The Wercs Ltd. Other marks may be pending registration or registered in the USA or internationally. In this article, we do not display trademark logos in accordance with the American Chemical Society style guidelines.

    Disclaimer

    The Cardno ChemRisk analysis was supported by the American Chemistry Council (ACC). T Kingsbury led the work effort leading to the Gauthier et al. 2015 paper as a consultant to ACC and continued consulting after leaving Cardno ChemRisk. AM Mason is employed by the ACC. ACC did not provide funding to other authors or their companies. PJ Spencer is employed by the Dow Chemical Company, a member company of ACC.

    Data availability

    Data available upon request from the corresponding author Julie Panko (julie.panko@cardno.com).

    SUPPLEMENTAL DATA

    Table S1. Endpoints in GreenSuite not used in this project

    Table S2. Globally Harmonized System of Classification and Labeling of Chemicals (GHS) endpoints for hazard screening

    Table S3. GreenSuite default model

    Table S4. GreenSuite handling missing data