# Development of a Bayesian network for probabilistic risk assessment of pesticides

## Abstract

Conventional environmental risk assessment of chemicals is based on a calculated risk quotient, representing the ratio of exposure to effects of the chemical, in combination with assessment factors to account for uncertainty. Probabilistic risk assessment approaches can offer more transparency by using probability distributions for exposure and/or effects to account for variability and uncertainty. In this study, a probabilistic approach using Bayesian network modeling is explored as an alternative to traditional risk calculation. Bayesian networks can serve as meta-models that link information from several sources and offer a transparent way of incorporating the required characterization of uncertainty for environmental risk assessment. To this end, a Bayesian network has been developed and parameterized for the pesticides azoxystrobin, metribuzin, and imidacloprid. We illustrate the development from deterministic (traditional) risk calculation, via intermediate versions, to fully probabilistic risk characterization using azoxystrobin as an example. We also demonstrate the seasonal risk calculation for the three pesticides. *Integr Environ Assess Manag* 2022;18:1072–1087. © 2021 The Authors. *Integrated Environmental Assessment and Management* published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).

## INTRODUCTION

Pesticides play an important role in food production by maintaining or enhancing crop yields and quality in arable farming. However, they can also lead to harmful effects in the environment and pose risks to human health. There is now a widespread concern about regular emissions of such substances designed to control specific target organisms and their effects on ecosystems (Boye et al., 2019; Bradley et al., 2017; Mohaupt et al., 2020; Szöcs et al., 2017; Van den Brink et al., 2018).

In spite of strict regulations of pesticide use (e.g., Directive 2009/128/EC; Regulation (EC) No 1107/2009), there are still knowledge gaps for the potential environmental impact of these pesticides and their mixtures (Bradley et al., 2017; Mohaupt et al., 2020; Szöcs et al., 2017). Current risk assessment methods use conservative assumptions to avoid underestimating the risk (F. A. M. Verdonck et al., 2003), and decision makers rely on large safety margins for protective decision making (Fairbrother et al., 2015).

In general, risk assessment of pesticides is carried out to protect human health as well as the health and biodiversity of ecosystems (Schäfer et al., 2019). The purpose is to assess the probability that adverse effects of regulatory concern occur in ecosystems due to the exposure to one or several chemicals. This can be done as a prospective assessment for the registration of substances before products enter the market, or as a retrospective assessment for potentially harmful substances that are already in use (Forbes & Calow, 2002). The environmental risk assessment process usually incorporates exposure and effect assessments as well as risk characterization (Figure 1). Exposure assessment covers the estimation of the predicted or measured environmental concentration (PEC) of the compound in the environment (van Leeuwen & Vermeire, 2007). Predicted environmental concentration is usually calculated as the maximum environmental exposure concentration (Finizio & Villa, 2002). Effect assessment is typically based on the response of species that are exposed to a chemical in toxicity tests, such as data for toxicity endpoints (e.g., mortality, reproduction, and growth) after short-term (acute) or long-term (chronic) exposure (van Leeuwen & Vermeire, 2007). Usually, a predicted no-effect concentration (PNEC) is obtained from the most sensitive no-observed-effect concentration (NOEC). Alternatively, the PNEC can be calculated from the hazardous concentration for 5% of the species (HC5) based on the species sensitivity distribution (SSD) (Bruijn et al., 2002). To account for uncertainty, the lowest NOEC (alternatively the HC5) is divided by an assessment factor (AF) to derive the PNEC, so it can be considered a safe concentration for non-target organisms (Schäfer et al., 2019).
Risk characterization includes a risk estimation by comparing effect (hazard identification and characterization) and exposure assessment; some of the metrics used are margin of exposure, hazard, or risk quotient (More et al., 2019). To ensure low risk, it is required that the PEC is lower than the PNEC (Bruijn et al., 2002; Schäfer et al., 2019), so when using a risk quotient (RQ), it is derived by the PEC/PNEC ratio. Usually, in EU frameworks, if the risk quotient exceeds 1, a risk of harmful effects to the environment is indicated (Bruijn et al., 2002). Risk is usually considered an estimation of the likelihood that an adverse effect occurs on a biological target when being exposed to a chemical (Fairbrother et al., 2015; Finizio & Villa, 2002; Moe, Carriger, et al., 2021). Nevertheless, in the commonly used framework for environmental risk assessment, the output of risk characterization tends to be a single value (the risk quotient) from which the conclusion is a “yes/no” statement (Fairbrother et al., 2015). It has been argued that such single-value estimates cannot stand alone as a scientifically defensible characterization of ecological risk (Campbell et al., 2000). The analysis and quantification of uncertainty are a vital part of risk assessment of the environmental impacts of pesticides, which is not reflected in the single-value risk estimate (Fairbrother et al., 2015; USEPA, 2014). Based on this, a concerted action was established to develop a European framework for probabilistic risk assessment of the environmental impacts of pesticides (EUFRAM). The consortium named several shortcomings of conventional ERA (EUFRAM, 2006). For example, there is no indication of the level of certainty associated with the risk assessment; no quantification of the risk is carried out; the uncertainty calculation is not transparent but hidden in assessment factors; and it is difficult to follow all steps of the risk assessment. 
Various recommendations were given for development toward probabilistic risk assessment, mainly based on the use of cumulative probability distributions (EUFRAM, 2006). Jager et al. (2001) also recommend the use of probabilistic risk assessment for the European Union (EU). In recent years, EFSA has published a guidance document on uncertainty analysis that mentions not only Bayesian inference but also Bayesian graphical models as a way to use probability distributions to analyze variability and uncertainty (EFSA et al., 2018). Nevertheless, non-probabilistic methods are still more commonly used (Fairbrother et al., 2015). During the “International conference on uncertainty in risk analysis” held in 2018 by the European Food Safety Authority (EFSA) and the German Federal Institute for Risk Assessment (BfR), three conclusions were drawn highlighting that training is important to improve the understanding of uncertainty, that there is an ethical responsibility of scientists to communicate uncertainties, and that active steps need to be taken by risk assessors to avoid undetected sources of uncertainty (EFSA & BfR, 2019).

The aim of this study was to explore Bayesian network modeling as a tool to combine probability distributions of pesticide exposure and effects, to facilitate the calculation of the risk quotient as a probability distribution instead of a single number. We aimed to align the developed model to the EU regulatory requirements and current risk assessment procedures (Figure 1). Although a Bayesian network model could also have incorporated more advanced components such as effect modeling, we chose a simpler model structure to facilitate the comparison of the Bayesian network approach with the more traditional existing approaches (Figure 2). To this end, we present the development from a deterministic toward a fully probabilistic Bayesian network approach to risk characterization for a case study representing a small agricultural catchment in Norway. The model application is demonstrated for three examples of pesticides and for different seasons of the year.

## APPROACHES TO PROBABILISTIC RISK ASSESSMENT

### Proposed methods for probabilistic risk assessment

Probabilistic risk assessment has been defined as using “probabilities or probability distributions to quantify one or more sources of variability and/or uncertainty in exposure and/or effects and the resulting risk” (EUFRAM, 2006). This allows the inclusion of estimates of uncertainty and stochastic properties (Solomon et al., 2000). There are now several probabilistic methods in use for risk characterization. The species sensitivity distribution (SSD) (Posthuma et al., 2001) is a probabilistic model for the variation in the sensitivity of biological species to a single or a set of toxicants, which is used in several frameworks (Belanger & Carr, 2020). Guidance on modeling and data requirements can be found in the “Technical Guidance for Deriving Environmental Quality Standards” (TGD) (SCHEER, 2017). Many of the probabilistic methods currently at hand also incorporate a distribution for the exposure part. Methods such as quantitative overlap and joint probability curves are relatively easy to construct (Campbell et al., 2000; F. A. M. Verdonck, 2003) and use more available data for exposure and effect compared with traditional approaches (Campbell et al., 2000). They also allow for an estimation of the likelihood of potential ecosystem impact and their magnitude (Solomon et al., 1996). Recently, an “Ecotoxicity Risk Calculator” was presented by Dreier et al. (2020) that uses joint probability curves. It is able to provide more information than a single-value risk quotient, as it depicts the relationship between cumulative probability and magnitude of effect. The use of both effect and exposure distributions enables a more powerful approach for risk assessment and communication (Dreier et al., 2020). However, most of these probabilistic methods derive a distribution that can be a challenge for decision makers to understand and interpret (F. A. M. Verdonck et al., 2003).

### From deterministic to probabilistic risk quotient

Another method more consistent with the probabilistic definition of risk is the calculation of probabilistic risk quotients. It can be useful for ranking different scenarios as well as prioritizing among alternative risk scenarios (Campbell et al., 2000). A fully probabilistic risk quotient calculation requires the quantification of a probability distribution for both exposure and effect. In cases where exposure or effect data are too limited, an alternative “intermediate” probabilistic approach could be applied using a distribution for either the exposure or effect component (Figure 1). This will allow for some variability to be taken into account when deriving a distribution for the risk quotient. For example, an intermediate approach could be applied when an effect concentration distribution can be quantified by a species sensitivity distribution, although few exposure measurements are available. An overview of the underlying concepts for the traditional deterministic approach, and the intermediate and fully probabilistic approaches is shown in Figure 2. The traditional deterministic approach (Figure 2A) uses single-value PEC and PNEC to calculate a single-value risk quotient. The second option (Figure 2B) uses an exposure distribution together with a single-value PNEC, derived the same way as in the traditional approach. However, unlike the traditional approach, here, a risk quotient distribution is derived. The third option (Figure 2C) uses the probability distribution of effects (corresponding to an SSD). Instead of using the SSD to extract a single-value HC5 as a basis for a single-value PNEC in combination with an assessment factor, in this case, a precautionary factor (PF) is applied to the calculated risk quotient distribution.
The precautionary factor plays a role similar to that of an assessment factor by adjusting the predicted risk to account for uncertainties, for example, those associated with extrapolation from laboratory toxicity tests to environmental effects. However, we chose to use the slightly different term “precautionary factor” to avoid misusing the more well-established term “assessment factor.” The principle of avoiding the use of assessment factors as a prudential measure in the calculation of the exposure/effect ratio, and instead applying a precautionary factor more transparently in the subsequent step, is inspired by the recommendations of F. Verdonck et al. (2005). The fourth option (Figure 2D) uses effect and exposure probability distributions to derive the exposure/effect ratio distribution. Again, no PNEC is derived, so after calculating the exposure/effect ratio distribution, the precautionary factor is applied to derive the risk quotient distribution.
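
The four options can be compared numerically. The following Python sketch uses Monte Carlo sampling instead of the discretized Bayesian network developed in this study, so it is an illustration under stated assumptions rather than the study's implementation. The parameter values are the azoxystrobin values reported in the Results (ln-scale mean and standard deviation for exposure and effects, mean PEC, and HC5); the function names, the sample size, and the choice of AF = 5 and PF = 10 are illustrative.

```python
import math
import random
import statistics

def risk_quotients(pec, hc5, exp_mu, exp_sd, eff_mu, eff_sd,
                   af, pf, n=20_000, seed=1):
    """Return risk quotient samples for Options A-D of Figure 2.

    Monte Carlo stand-in for the Bayesian network (an assumption of
    this sketch); mu and sd are on the natural-log scale.
    """
    rng = random.Random(seed)
    exposure = [rng.lognormvariate(exp_mu, exp_sd) for _ in range(n)]
    effect = [rng.lognormvariate(eff_mu, eff_sd) for _ in range(n)]
    pnec = hc5 / af  # single-value PNEC derived from the HC5
    return {
        "A": [pec / pnec],                                   # single PEC, single PNEC
        "B": [x / pnec for x in exposure],                   # exposure distribution, single PNEC
        "C": [pf * pec / e for e in effect],                 # single PEC, effect distribution, PF
        "D": [pf * x / e for x, e in zip(exposure, effect)], # both distributions, PF
    }

def prob_exceeding(rq_values, threshold=1.0):
    """Fraction of risk quotient samples above the threshold."""
    return sum(v > threshold for v in rq_values) / len(rq_values)

rq = risk_quotients(pec=0.129, hc5=3.87, exp_mu=-4.148, exp_sd=1.484,
                    eff_mu=2.322, eff_sd=0.56, af=5, pf=10)
for option in "ABCD":
    print(f"Option {option}: median RQ = {statistics.median(rq[option]):.4f}, "
          f"P(RQ > 1) = {prob_exceeding(rq[option]):.4f}")
```

Option A returns a single number, whereas Options B to D return a sample from which any exceedance probability can be read off. This is the kind of information the risk quotient node of the Bayesian network provides in discretized form.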

### Probabilistic risk assessment using Bayesian networks

The early efforts of probabilistic risk assessment for pesticides, which were usually visualized by cumulative distribution curves, were sometimes difficult to interpret for both advanced users and the general public (EUFRAM, 2006). As an alternative, Bayesian networks may provide a way to overcome the limitations associated with visualization of risk estimations while accounting for uncertainties when using probabilistic approaches. They have been recognized as a tool to analyze complex environmental problems and support decision making while considering uncertainty (Sperotto et al., 2017), and have recently been increasingly used for environmental risk assessments (Moe, Carriger, et al., 2021). A Bayesian network can characterize a system by showing the interactions between its variables in a network (Chen & Pollino, 2012) through a directed acyclic graph (Kanes et al., 2017). Bayesian networks are probabilistic graphical models implementing Bayes' rule for updating probability distributions based on evidence. The nodes (variables) have discrete states (e.g., intervals), quantified by discrete probability distributions. The causal links (arrows) represent conditional probability tables (CPTs), which can be based on equations, empirical frequency distributions, information from the literature, or expert opinion. The degree of belief (probability) that a node will be in a particular state given the states of its parent nodes is specified by the conditional probability table (Chen & Pollino, 2012), and by using Bayes' rule, probability distributions are updated based on new evidence (Molina et al., 2010). In this project, Bayesian network construction largely followed the guidelines provided by Marcot et al. (2006) and Pollino and Henderson (2010).

A feature that makes Bayesian networks well suited to risk estimation is that they present results as probability distributions instead of point estimates. They can accommodate different kinds of data; their sources can include both direct measurements and output from models. Also, if data are limited or non-existent, it is possible to include expert opinions instead (Pitchforth & Mengersen, 2013). The models can be updated with new information on pesticide exposure and effects whenever it becomes available. Model updates are carried out by combining prior probabilities and new data so that the posterior probabilities of the network are updated in response to the added observational information (Franco et al., 2016). Bayesian networks are especially useful for pesticide risk assessment and management tasks as these require characterization of the uncertainties (Carriger & Newman, 2012). Focusing on a terrestrial species (puma), Carriger and Barron (2020) reported a process of mapping cause–effect relations into a quantitative model. This is supported by Catenacci and Giupponi (2013), who found that the Bayesian network approach can examine different phenomena due to its flexibility for interdisciplinary integration, e.g., climatic, physical, ecological, and socio-economic. Bayesian networks can also perform predictive (forward), diagnostic (backward), and mixed (forward and backward) inferences (Carriger & Barron, 2020).
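
The difference between predictive and diagnostic inference can be illustrated with a minimal two-node network. The sketch below is hypothetical: the node names, states, and probabilities are invented for illustration and are not taken from the model in this study.

```python
# Hypothetical two-node network: Exposure -> Risk.
# All probabilities are invented for illustration only.
prior_exposure = {"low": 0.7, "high": 0.3}   # P(Exposure)
cpt_risk_high = {"low": 0.1, "high": 0.8}    # P(Risk = high | Exposure)

def predictive(prior, cpt):
    """Forward inference: marginal P(Risk = high)."""
    return sum(prior[state] * cpt[state] for state in prior)

def diagnostic(prior, cpt):
    """Backward inference via Bayes' rule: P(Exposure | Risk = high)."""
    evidence = predictive(prior, cpt)
    return {state: prior[state] * cpt[state] / evidence for state in prior}

p_risk_high = predictive(prior_exposure, cpt_risk_high)
posterior_exposure = diagnostic(prior_exposure, cpt_risk_high)
print(f"P(Risk = high) = {p_risk_high:.2f}")
print(f"P(Exposure = high | Risk = high) = {posterior_exposure['high']:.3f}")
```

Setting evidence on the child node and reading the updated distribution of a parent node is the same operation that is used later in this study to derive a precautionary factor from a given risk quotient interval.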

## METHODS

### Study area

The model was developed based on monitoring data from a catchment within the Norwegian Agricultural Environmental Monitoring Program (JOVA) located in South-East Norway (Heia, location: 59°21′29″N, 10°47′52″E). The monitoring catchment has a total area of 1.7 km², of which 62% is cropland. As the catchment is located in a coastal climate, winters are mild and the growing season starts relatively early as compared to Norwegian conditions in general. The catchment has an annual rainfall of 829 mm and a mean annual temperature of 5.6 °C (in 2016). The crop production in the catchment is mostly grain (up to 75%). Potato and vegetable production made up about 40% until 2007 and had decreased to about 25% in 2015. The catchment's use of plant protection products and exposure data are recorded in the JOVA program (Bechmann et al., 2017). Flow-proportional composite sampling of stream water at the catchment outlet was performed in the JOVA program throughout the spraying season and the analyses of concentrations of a wide range of current and previously used pesticides were included. Based on these data, exceedances of environmental safety thresholds are identified for different agricultural management practices for key agricultural production systems in various catchments in Norway (Stenrød, 2015). The JOVA monitoring data for pesticides have been collected over 25 years (1995 onward) and thus also support the retrospective assessment of ecological risk and temporal trends (Bechmann et al., 2017).

### Pesticides—exposure and effect data

The chemicals selected for analysis in this study are the most frequently occurring pesticides, and those found at the highest concentrations, in the study catchment (Table 1). Azoxystrobin and metribuzin are approved chemicals for use in the EU and Norway. Since 2013, the use and sale of imidacloprid have been prohibited in the EU (EC, 2013). Of the selected chemicals, only the fungicide azoxystrobin has low solubility in water at 20 °C (6.7 mg L⁻¹), whereas metribuzin and imidacloprid have high solubility in water. All pesticides form metabolites primarily in soil (for more information, see the Supporting Information, Chemical properties of selected pesticides). The data used in this study were obtained from the NIVA Risk Assessment database (NIVA RAdb, www.niva.no/radb), which hosts exposure and effect data from a wide variety of sources. Moreover, this database provides transparent and harmonized cumulative risk predictions according to international recommendations for harmonized approaches for human and ecological risk assessments (Tollefsen, 2021). Exposure data for the period from 11.05.2011 to 06.12.2016 from the JOVA monitoring program and effect data (NOECs) for the different compounds originating from the ECOTOXicology Knowledgebase (ECOTOX) (https://cfpub.epa.gov/ecotox/index.cfm) were extracted from the NIVA RAdb database.

Substance | CAS | Type | Mode of action | Approved use (crop) |
---|---|---|---|---|
Azoxystrobin | 131860-33-8 | Fungicide | Systemic translaminar and protectant action with additional curative and eradicant properties. Respiration inhibitor. | Wheat; fruit (grapes, citrus, strawberries, peaches); sunflowers; vegetables (onions, brassicas, cucurbits); potatoes; cotton; pecans; canola; soybeans; peanuts; turf; ornamentals |
Metribuzin | 21087-64-9 | Herbicide | Selective, systemic with contact and residual activity. Inhibits photosynthesis (photosystem II). | Soybeans; potatoes; barley; wheat; asparagus; sugarcane; tomatoes; peas; lentils |
Imidacloprid | 138261-41-3 | Insecticide, veterinary substance | Systemic with contact and stomach action. Acetylcholine receptor (nAChR) agonist. | Lawns and turf; domestic pets; rice; cereals; maize; potatoes; sugar beet |

The total number of measured environmental concentrations was 55 for azoxystrobin and 59 for metribuzin and imidacloprid. There is a large variation in the measured concentration levels during the season and between years for each of the pesticides. The detection frequencies were 47.4%, 76.3%, and 81.4% for azoxystrobin, metribuzin, and imidacloprid, respectively. In general, sampling of pesticides varied markedly between the years and months. The highest concentrations were recorded in summer and autumn, and lower concentrations were recorded in spring and winter. Due to the sampling method and frequency (i.e., an approx. 20-day sampling period of composite flow-proportional sampling), the measured exposure concentrations can reflect chronic exposure to the ecosystem, but maximum and/or peak exposure concentrations are unlikely to be reflected (see the Supporting Information).

The exposure data for the three pesticides showed that 22%–50% of the measured values were below the respective limit of quantification (LOQ) (Supporting Information Tables S4, S6, S7, and S8). In the case of non-detected values (below LOQ), new values were generated as follows (see Supporting Information, Figure 4). First, the non-quantified records were temporarily assigned the value LOQ/2. Use of the LOQ/2 value has been common practice in assessing the potential risks of non-detected residues (Loos et al., 2018), but has been criticized for overestimating the risks of chemicals with PNEC below LOQ (von der Ohe et al., 2011). Second, this intermediate data set was used to derive a mean and standard deviation on the ln scale. Third, the resulting log-normal distribution was used to simulate new values in the range from 0 to LOQ to replace the non-detected values. The discretized version of the resulting exposure distribution was used as the prior probability distribution of the Exposure node.
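
The three steps can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: the function name, the example concentrations, and the LOQ value are hypothetical, and restricting the simulated values to the range 0 to LOQ by inverse-CDF sampling is one possible reading of the third step.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def impute_below_loq(values, loq, seed=1):
    """Replace records below the LOQ (given as None) with simulated values.

    Sketch of the three-step procedure; truncated sampling by the
    inverse-CDF method is an assumption of this example.
    """
    rng = random.Random(seed)
    # Step 1: temporarily assign LOQ/2 to the non-quantified records.
    interim = [loq / 2 if v is None else v for v in values]
    # Step 2: derive mean and standard deviation on the ln scale.
    logs = [math.log(v) for v in interim]
    fitted = NormalDist(mean(logs), stdev(logs))
    # Step 3: simulate replacements from the fitted log-normal,
    # restricted to the interval (0, LOQ].
    p_loq = fitted.cdf(math.log(loq))

    def draw():
        u = p_loq * (1.0 - rng.random())   # uniform on (0, p_loq]
        return math.exp(fitted.inv_cdf(u))

    return [draw() if v is None else v for v in values]

# Hypothetical concentrations in µg/L; None marks a record below the LOQ.
measured = [0.05, None, 0.20, None, 0.66, 0.03, None, 0.12]
completed = impute_below_loq(measured, loq=0.01)
print([round(v, 4) for v in completed])
```

Quantified records are returned unchanged, so only the non-detects contribute simulated values to the exposure distribution that is fitted afterwards.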

For the selected pesticides, data on toxic effects for several freshwater species representing various taxonomic groups were extracted from the NIVA RAdb and represent data from the ECOTOX data repository. The data set consisted of NOECs (no observed effect concentrations) for adverse effects on endpoints such as growth, reproduction, and population. For each chemical, our analysis included NOEC values representing different species, test durations, and times of effect observation, including multiple values for the same species (see Table 2). In traditional effect assessments, only the most sensitive value per species is often chosen to derive an SSD, although, in some cases, an average is also used. In cases where multiple NOEC values of the same species were present, the mean NOEC was used. The fitted distribution corresponds to a species sensitivity distribution (SSD), which is often fitted as a log-normal distribution (Belanger & Carr, 2020).

Table 2. Number (*n*) of means used to fit the distribution and species with multiple NOECs for the same substance

Substance | Endpoints | n |
---|---|---|
Metribuzin | Growth; Population | 11 |
Azoxystrobin | Growth; Population | 13 |
Imidacloprid | Growth; Population; Reproduction | 11 |

- Abbreviation: NOEC, no observed effect concentration.

### Data processing

Data preparation was carried out in R version 4.0.2 (Team, 2020) with packages including *tidyverse* (version 1.3.0) (Wickham et al., 2019), *dplyr* (version 1.0.2) (Wickham et al., 2020), and *readxl* (version 1.3.1) (Wickham & Bryan, 2019). To obtain probability distributions for the BN model from the exposure and effects data, log-normal distribution models were fitted to the data using the *R* package *MASS* (version 7.3-51.6) (Venables & Ripley, 2002).

In the case of exposure data below the LOQ, new values in the range from 0 to LOQ were simulated using the mean and standard deviation from the fitted log-normal distribution. To take into account the seasonal variation in pesticide exposure, a separate probability distribution was estimated for each season, defined as follows: Winter = Dec–Feb; Spring = Mar–May; Summer = Jun–Aug; and Autumn = Sep–Nov.
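
The seasonal grouping can be sketched as follows. This Python example is an illustration only (the study used R): the function name and the sample records are hypothetical, and the log-normal parameters are estimated here as the mean and population standard deviation of the ln-transformed concentrations, which is the maximum-likelihood estimate for a log-normal distribution.

```python
import math
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Season definition as given in the text.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def seasonal_ln_parameters(samples, min_n=2):
    """Return {season: (ln-scale mean, ln-scale SD)} for (date, µg/L) pairs.

    Seasons with fewer than min_n samples are skipped; min_n is an
    assumption of this sketch.
    """
    by_season = defaultdict(list)
    for day, concentration in samples:
        by_season[SEASON_BY_MONTH[day.month]].append(math.log(concentration))
    return {season: (mean(logs), pstdev(logs))
            for season, logs in by_season.items() if len(logs) >= min_n}

# Hypothetical sampling records (date, concentration in µg/L).
samples = [
    (date(2015, 6, 10), 0.05), (date(2015, 7, 1), 0.20),
    (date(2015, 10, 3), 0.10), (date(2016, 9, 20), 0.40),
    (date(2016, 1, 5), 0.02),
]
params = seasonal_ln_parameters(samples)
for season, (mu, sd) in params.items():
    print(f"{season}: mean = {mu:.3f}, SD = {sd:.3f} (ln µg/L)")
```

Skipping seasons with too few samples mirrors the exclusion of seasons with too few detected concentrations that is described in the Results.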

For the effect distribution, likewise, a log-normal distribution was fitted to the NOEC values available for each pesticide. However, while SSDs are traditionally used to derive a single PNEC value (Figure 1), we used the whole probability distribution of effects data in this study. For comparison with the traditional risk quotient calculation based on a PNEC, as described in the introduction, an HC5 was derived from a species sensitivity distribution using the package *ssdtools* (Thorley & Schwarz, 2018) (see the Supporting Information).
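
For a log-normal SSD, the HC5 is the 5th percentile of the fitted distribution. The following Python sketch shows that calculation for the ln-scale effect parameters of azoxystrobin listed in Table 4. It is not the *ssdtools* procedure used in the study, and the function name is hypothetical, so the result is not expected to match the reported HC5 exactly.

```python
import math
from statistics import NormalDist

def hc5_lognormal(mu_ln, sd_ln, fraction=0.05):
    """Concentration below which the given fraction of species is affected,
    assuming a log-normal SSD with ln-scale parameters mu_ln and sd_ln."""
    z = NormalDist().inv_cdf(fraction)   # standard normal quantile (about -1.645)
    return math.exp(mu_ln + z * sd_ln)

hc5 = hc5_lognormal(mu_ln=2.322, sd_ln=0.568)
print(f"HC5 = {hc5:.2f} µg/L")
```

With these parameters, the sketch gives an HC5 of about 4.0 µg/L, which is of the same order as the HC5 of 3.87 µg/L derived with *ssdtools*.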

### Parameterization of the Bayesian networks

The Bayesian networks were built in Netica (Norsys Software Corp., www.norsys.com). For each pesticide, a BN was built with an identical structure. For both exposure and effects nodes, the range was defined by the observed values of the given pesticide, and the intervals were discretized into 12 equidistant bins on a log10 scale. The fitted log-normal distributions were used to parameterize the parent nodes. The individual node description is shown in Table 3; further detailed information is shown in the Supporting Information (IV. Netica discretization and equation syntax).

Node/variable | Type of discretization | States |
---|---|---|
Exposure concentration distribution | C | 10 |
Effect concentration distribution | C | 10 |
Exposure–effect–ratio distribution | C | 8 |
Uncertainty factor | D | 7 |
Risk quotient distribution | C | 8 |

- Abbreviations: C, discretized continuous; continuous variables were binned into the states; D, discretized discrete; States, number of intervals of each node.
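
The discretization of a fitted log-normal distribution into bins that are equidistant on a log10 scale can be sketched as follows. This Python example is an illustration outside Netica: the function names and the concentration range are hypothetical, whereas the ln-scale parameters are the azoxystrobin exposure values reported in the Results.

```python
import math
from statistics import NormalDist

def log10_bin_edges(lower, upper, n_bins):
    """Bin edges that are equidistant on a log10 scale."""
    lo, hi = math.log10(lower), math.log10(upper)
    step = (hi - lo) / n_bins
    return [10 ** (lo + i * step) for i in range(n_bins + 1)]

def discretized_lognormal(edges, mu_ln, sd_ln):
    """Probability of each bin under a log-normal distribution,
    renormalized so that the covered range sums to 1."""
    fitted = NormalDist(mu_ln, sd_ln)
    cdf = [fitted.cdf(math.log(edge)) for edge in edges]
    mass = [upper - lower for lower, upper in zip(cdf, cdf[1:])]
    total = sum(mass)
    return [m / total for m in mass]

# Hypothetical range (µg/L); ln-scale parameters for azoxystrobin exposure.
edges = log10_bin_edges(0.001, 1.0, n_bins=12)
probabilities = discretized_lognormal(edges, mu_ln=-4.148, sd_ln=1.484)
for lower, upper, p in zip(edges, edges[1:], probabilities):
    print(f"{lower:.4g} to {upper:.4g} µg/L: {p:.3f}")
```

The renormalization step matters because the node range is bounded, whereas the log-normal distribution has probability mass outside any finite interval.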

All conditional probability tables of the BNs (Figure 3) were generated from equations, by the function “Equation to Table” in Netica (see the Supporting Information). The probability distribution of the nodes “Exposure Concentration (µg/L)” and “Effects Concentration (µg/L)” was calculated from their respective parent nodes by exp-transformation. The node “Exposure/Effect Ratio” was discretized into eight equidistant bins and calculated using the equation [Exposure Concentration (µg/L)]/[Effects Concentration (µg/L)]. Thereafter, the risk quotient distribution was derived by multiplying the “Exposure/Effect Ratio” by a precautionary factor. The precautionary factor can be applied to account for uncertainties in the effect assessment, similar to the use of an assessment factor in traditional risk assessment (Figure 1). This factor can be made transparent and standardized in a simple manner by considering the information used during the effect assessment, for example, number of data points, species, taxonomic groups, and region-specific species. In our model (Figure 1), the node “precautionary factor” has alternative levels that can be selected by the risk assessor, depending on the sources of uncertainty to be accounted for in the risk assessment. Diagnostic inference, and how we used it with the parameterized Bayesian network to derive an appropriate precautionary factor (see Figure 3), is described in more detail in the Results.
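
How a conditional probability table can be generated from an equation between discretized nodes is sketched below for the ratio node. This Python example approximates each table entry by sampling within the parent bins; it is not Netica's implementation, and the function names, the log-uniform sampling within bins, the bin edges, and the number of samples are assumptions made for illustration.

```python
import math
import random
from bisect import bisect_right

def sample_in_bin(rng, lower, upper):
    """Log-uniform draw within one bin (an assumption of this sketch)."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))

def ratio_cpt(exposure_edges, effect_edges, ratio_edges, n=2000, seed=1):
    """Approximate P(ratio bin | exposure bin, effect bin)
    for the equation ratio = exposure / effect."""
    rng = random.Random(seed)
    n_ratio_bins = len(ratio_edges) - 1
    cpt = {}
    for i in range(len(exposure_edges) - 1):
        for j in range(len(effect_edges) - 1):
            counts = [0] * n_ratio_bins
            for _ in range(n):
                x = sample_in_bin(rng, exposure_edges[i], exposure_edges[i + 1])
                e = sample_in_bin(rng, effect_edges[j], effect_edges[j + 1])
                k = bisect_right(ratio_edges, x / e) - 1
                k = min(max(k, 0), n_ratio_bins - 1)   # clamp to the outer bins
                counts[k] += 1
            cpt[i, j] = [c / n for c in counts]
    return cpt

# Hypothetical bin edges (µg/L for the parents, unitless for the ratio).
cpt = ratio_cpt(exposure_edges=[0.01, 0.1, 1.0],
                effect_edges=[1.0, 10.0, 100.0],
                ratio_edges=[1e-4, 1e-3, 1e-2, 1e-1, 1.0])
print(cpt[0, 0])   # row for the lowest exposure bin and the lowest effect bin
```

Each row of the table spreads probability over the ratio bins that are reachable from the given combination of parent bins. Multiplying the ratio by a precautionary factor to obtain the risk quotient node can be tabulated in the same way.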

After the Bayesian network was constructed and parameterized, a sensitivity analysis was carried out in Netica. The report showed that the risk quotient distribution is dominated by the exposure side over the effect side, which is most likely due to the wider range of concentrations.

In this way, a Bayesian network model is intended as a tool for calculating the risk quotient as a probability distribution that accounts for sources of variability and uncertainty such as temporal variability in exposure and taxonomic variability in effects.

## RESULTS AND DISCUSSION

### Diagnostic inference to derive an appropriate precautionary factor used in the Bayesian network

This section describes the parameterized version of the Bayesian network for each of the three pesticides, illustrated with azoxystrobin as an example. For comparison, the risk quotient was also calculated using the traditional single-value method (Figure 2A) as well as by the two intermediate options (Figure 2B,C). For the single-value exposure versions (Options A and C), the minimum (0.01 µg/L), mean (0.129 µg/L), and maximum (0.660 µg/L) of the measured concentrations were selected as alternative PEC values. The highest exposure concentration is usually used as the more conservative or protective choice. To make the traditional and probabilistic outputs more comparable, we used the mean PEC instead. For the single-value effect version (Options A and B), the PNEC values were derived from an HC5 of 3.87 µg/L divided by an assessment factor of 10, 5, 3, and 1 (Table 5). The Technical Guidance Document recommends the use of an assessment factor of 1–5 when deriving the PNEC from an SSD. We also applied an additional and more conservative assessment factor of 10, as the data set that we used does not fulfil all the requirements of the TGD (at least 10 NOECs and at least 8 taxonomic groups). The Technical Guidance Document also states that the assessment factor should be decided on a case-by-case basis “through consideration of sensitive endpoints, sensitive species, mode of toxic action and/or knowledge from structure-activity considerations” (Bruijn et al., 2002). Therefore, in this study, we present several assessment factors but primarily focus on an assessment factor of 5.

The probability distributions of exposure and/or effects data in Options B, C, and D were based on the fitted log-normal distribution with mean and standard deviation. The exposure distribution had a mean of −4.148 (ln µg/L), with a standard deviation of 1.484 (ln µg/L). The effect distribution had a mean of 2.322 (ln µg/L), with a standard deviation of 0.56 (ln µg/L).

The seasonal version of the Bayesian network was parameterized with exposure distributions based on seasonal mean values for the three pesticides. The winter season for all chemicals and the spring season for azoxystrobin had too few detected concentrations to derive a distribution and were therefore excluded from further analysis. In general, the mean concentrations in summer were higher than in spring and intermediate in autumn (Table 4). The exception was imidacloprid, which had higher concentrations in autumn.

Compound | Statistic | Exposure, spring ln (µg/L) | Exposure, summer ln (µg/L) | Exposure, autumn ln (µg/L) | Effect ln (µg/L) |
---|---|---|---|---|---|
Azoxystrobin | Mean | | −3.939 | −4.018 | 2.322 |
Azoxystrobin | SD | | 1.529 | 1.541 | 0.568 |
Metribuzin | Mean | −4.357 | −2.794 | −3.292 | 4.946 |
Metribuzin | SD | 0.966 | 1.416 | 1.363 | 2.432 |
Imidacloprid | Mean | −3.902 | −3.404 | −1.783 | 6.484 |
Imidacloprid | SD | 1.481 | 1.116 | 1.743 | 4.004 |

Before the parameterized Bayesian network model can be used to calculate the risk quotient, an appropriate precautionary factor should be set by the risk assessor. In our example, to follow a regulatory accepted method as closely as possible, we selected a precautionary factor that would yield a risk quotient similar to that of the SSD-based approach (Figure 2A). The derived ranges of risk quotients are shown in Table 5. The values of the precautionary factor corresponding to selected assessment factor values of 1, 5, and 10 were derived by diagnostic inference, that is, by instantiating the nodes for exposure concentration, effect concentration, and risk quotient (Figure 3). For the exposure and effect concentrations, the intervals were set according to the mean of the observed values. The intervals for the risk quotient were set according to Table 5. An example is shown in Figure 3, where the risk quotient was 0.0999 (see Table 5), so the risk quotient node interval is set to “0.03 to 0.1.” In this example, the resulting precautionary factor is 30. The precautionary factors found to correspond to the assessment factors are shown in Table 6. To explore the role of the assessment factor and the precautionary factor and their effect on the risk quotient, we chose precautionary factors of 3, 10, and 30 for Option C and 10, 30, and 100 for Option D for the first example with azoxystrobin (Figure 5). For all the seasonal versions of the Bayesian network, only one precautionary factor (100) was chosen to focus more on the exploration of the seasonal effects.

| AF | PNEC | PEC minimum (0.01) | PEC average (0.129) | PEC maximum (0.66) |
|---|---|---|---|---|
| 10 | 0.387 | 0.0258 | 0.3333 | 1.7041 |
| 5 | 0.775 | 0.0129 | 0.1665 | 0.8521 |
| 3 | 1.291 | 0.0077 | 0.0999 | 0.5112 |
| 1 | 3.873 | 0.0026 | 0.0333 | 0.1704 |

*Note*: The alternative PNECs are derived from the HC5 (see Figure 2A) with an assessment factor (AF) of 1, 3, 5, and 10.
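The arithmetic behind Table 5 can be sketched in a few lines. This is a minimal illustration, assuming that the HC5 equals the PNEC listed for AF = 1 (3.873), that PNEC = HC5 / AF, and that the risk quotient is PEC / PNEC; the function and variable names are ours and are not taken from the study's R scripts. Small differences from the tabulated values in the third or fourth decimal can arise because the table appears to use rounded PNECs.

```python
# Deterministic risk quotients: RQ = PEC / PNEC, with PNEC = HC5 / AF.
# Assumption: HC5 is the PNEC reported for AF = 1 in Table 5.
HC5 = 3.873
PEC = {"minimum": 0.01, "average": 0.129, "maximum": 0.66}
ASSESSMENT_FACTORS = (10, 5, 3, 1)


def pnec(hc5, af):
    """Predicted no-effect concentration for a given assessment factor."""
    return hc5 / af


def risk_quotient(pec, hc5, af):
    """Ratio of the predicted environmental concentration to the PNEC."""
    return pec / pnec(hc5, af)


# One row per assessment factor, one column per PEC level.
table = {
    af: {label: risk_quotient(conc, HC5, af) for label, conc in PEC.items()}
    for af in ASSESSMENT_FACTORS
}

for af, row in table.items():
    cells = "  ".join(f"{label}: {rq:.4f}" for label, rq in row.items())
    print(f"AF {af:>2}  PNEC {pnec(HC5, af):.3f}  {cells}")
```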

| (a) AF | PEC min (0.01) | PEC avg (0.129) | PEC max (0.66) |
|---|---|---|---|
| 10 | 30 | 30 | 30 |
| 5 | 30 | 10 | 10 |
| 3 | 10 | 3 | 10 |
| 1 | 1 | 3 | 3 |

| (b) AF | PEC min (0.01) | PEC avg (0.129) | PEC max (0.66) |
|---|---|---|---|
| 10 | 10 | 300 | 1000 |
| 5 | 10 | 100 | 300 |
| 3 | 3 | 30 | 300 |
| 1 | 1 | 30 | 100 |

*Note*: For each alternative risk quotient in Table 5, the related RQ interval was selected as evidence to derive the corresponding precautionary factor for Option C—the intermediate approach using effect distribution (a) and Option D—the fully probabilistic approach (b). The bold values are the ones used in the examples of the result section. AF, assessment factor; PEC, predicted environmental concentration.

### Risk quotient distributions predicted by the Bayesian network

The Bayesian networks for the different risk quotient calculation options (Figure 2) were run for azoxystrobin; an example is shown in Figure 4. The posterior probability distribution of the risk quotient node is shown for the different approaches and for alternative values of the assessment factor or precautionary factor, respectively (Figure 5). The colors range from green (no risk) to red (posing a risk). Across the approaches, the risk quotient distribution ranged from 0 to 3000. Higher assessment and precautionary factors increase the probability of the risk quotient exceeding 1.

An example of the Bayesian network for each of Options A–D (Figure 2) is shown in Figure 4. The assessment factor used in a risk assessment is usually decided by the risk assessor depending on the available toxicity test data. In this study, we explored the resulting risk quotient when using three alternative plausible assessment factor values for Options A and B, and three corresponding precautionary factor values (see Table 6) for Options C and D. In this example, the risk quotient was calculated using the following evidence: a mean PEC and a PNEC with an applied assessment factor of 5 (Options A and B) and a precautionary factor of 10 (Options C and D). Using the deterministic method, the risk quotient is estimated to be within the interval “0.1 to 0.3” with 100% probability (Figure 4A). In contrast, Options B–D show a wider risk quotient distribution, with probabilities spread over several risk levels. Options B and D have the highest probabilities in the intervals “0.003 to 0.01,” “0.01 to 0.03,” and “0.03 to 0.1.” Option C has the highest probability in the interval “0.1 to 0.3.”

Bar charts visualizing the risk quotient node of the Bayesian network for Options A–D and the selected assessment and precautionary factors are shown in Figure 5. When using an assessment factor of 1, 5, or 10, the deterministic option (Figure 5A) results in 100% probability of the risk quotient being in the interval “0.03 to 0.1,” “0.1 to 0.3,” or “0.3 to 1,” respectively. Option B uses an exposure distribution and the same assessment factors as Option A to calculate the risk quotient, which is distributed over the intervals from “0 to 0.0003” to “1 to 3.” For an assessment factor of 1, the probability of the risk quotient being in an interval higher than 0.1 is about 3.2%, whereas for an assessment factor of 5, it is 26.4%.

Option C in this example uses the precautionary factors derived in Table 6a. With a mean PEC and a precautionary factor of 30, the interval “0.3 to 1” has the highest probability. If a precautionary factor of 10 is chosen, however, the interval “0.1 to 0.3” has the highest probability (Figure 5C). The probability of the risk quotient being above 0.1 is less than 10% with a precautionary factor of 3, about 65% with a factor of 10, and about 100% with a factor of 30. The fully probabilistic approach (Option D) uses distributions for both exposure and effect, with precautionary factors of 10, 30, and 100 (Table 6b). The probability of the risk quotient being above 0.3 is about 4% with a precautionary factor of 10, about 12% with PF = 30, and about 40% with PF = 100 (Figure 5D).

As can be seen in Figure 5, the probabilistic approaches yield a distributed risk quotient. The general tendency is that the calculated risk quotient is similar across all approaches; nevertheless, the Bayesian network yields a more nuanced risk estimate and quantifies the uncertainty associated with the different risk quotient intervals. In other words, instead of a single risk quotient compared against one threshold (e.g., RQ > 1), probabilities for various risk levels (e.g., RQ > 0.1, RQ > 0.001, RQ < 1) can be derived. The intermediate approaches, which use a distribution for only exposure or only effect, also result in a more informative risk quotient than the traditional approach, but include more variability and/or uncertainty in effect or exposure, respectively. Therefore, Options B and C could be used whenever data are lacking for the fully probabilistic approach. The assessment and precautionary factors applied have a major impact on the probability of the risk quotient exceeding 1, which indicates an unacceptable effect for non-target organisms and aquatic organisms (Bruijn et al., 2002). In this example, the probabilistic approaches (Options B–D) only show the risk quotient exceeding 1 for high assessment and precautionary factors (Figure 5).

### Seasonal variation in risk quotients

A more temporally refined version of the Bayesian network was developed and used for calculating seasonal risk quotients for all three pesticides (see the Supporting Information). The precautionary factor was set to 100 as this was found to be the most appropriate in comparison with the deterministic method (Table 6). According to this model (Figure 6), the probability of the risk quotient for azoxystrobin exceeding 0.1 during summer is about 72%, while the probability of the risk quotient exceeding 1 is about 15%.

In comparison with the other two pesticides, azoxystrobin clearly showed a higher probability of exceeding the risk quotient levels of 0.1 to 0.3 in summer and autumn (Figure 6). Metribuzin and imidacloprid have a wider distribution for the risk quotient, mainly ranging from 0.0001 to 0.001. For imidacloprid, the spring and autumn probability distributions are the most similar, whereas for metribuzin the summer and autumn distributions are the most similar, with higher probabilities of the risk quotient exceeding 1 than in spring. This analysis illustrates how the Bayesian network approach can be used to identify periods with a high risk of environmental effects of individual pesticides. This outcome can in turn be used to assess the combined risk of multiple pesticides in specific periods.
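For intuition, the seasonal comparison can be approximated without a Bayesian network. The sketch below assumes that exposure and effect concentrations are independent and normally distributed on the ln scale with the Table 4 parameters, and that the risk quotient is the precautionary factor times the exposure/effect ratio. It is a continuous approximation using the metribuzin values, not a reproduction of the discretized Netica model, so its probabilities will not match Figure 6 exactly; all names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_rq_exceeds(threshold, mu_exposure, sd_exposure,
                    mu_effect, sd_effect, precautionary_factor=1.0):
    """P(PF * exposure / effect > threshold) for independent lognormals.

    The mu/sd arguments are means and standard deviations on the ln scale.
    """
    # ln(RQ) is normal: difference of two normals plus a constant shift.
    mu = mu_exposure - mu_effect + math.log(precautionary_factor)
    sd = math.hypot(sd_exposure, sd_effect)
    return 1.0 - normal_cdf((math.log(threshold) - mu) / sd)


# Metribuzin parameters from Table 4, as (mean, SD) in ln(µg/L).
METRIBUZIN_EXPOSURE = {
    "spring": (-4.357, 0.966),
    "summer": (-2.794, 1.416),
    "autumn": (-3.292, 1.363),
}
METRIBUZIN_EFFECT = (4.946, 2.432)
PRECAUTIONARY_FACTOR = 100  # value used for the seasonal networks

# Probability of the risk quotient exceeding 1 in each season.
seasonal_p = {
    season: prob_rq_exceeds(1.0, mu, sd, *METRIBUZIN_EFFECT,
                            precautionary_factor=PRECAUTIONARY_FACTOR)
    for season, (mu, sd) in METRIBUZIN_EXPOSURE.items()
}

for season, p in seasonal_p.items():
    print(f"{season}: P(RQ > 1) = {p:.3f}")
```

Under these assumptions, the approximation reproduces the qualitative pattern described above for metribuzin: summer and autumn have a higher probability of the risk quotient exceeding 1 than spring.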

### Evaluation of the Bayesian network approach for risk characterization

This study has demonstrated that Bayesian networks can account for quantified uncertainties and variabilities in a more coherent and transparent way than traditional risk characterization. When developing this Bayesian network approach, we aimed to follow important recommendations for probabilistic risk estimation given by EUFRAM (2006). We tried to accomplish this by combining the new methods with the conventional “deterministic” assessment to enable end users (e.g., regulators) to become familiar with the new methodology. Furthermore, the developed models follow well-known concepts described in the TGD wherever this was possible and logical. The TGD, for example, describes which assessment factor is appropriate depending on the available data and specifies minimum requirements for the number of taxonomic groups and species used for SSD modeling (More et al., 2019). In addition, the Bayesian network methodology provides a simple display of the results in bar plots (histograms) instead of cumulative probability curves. This was also pointed out by EUFRAM (2006), which noted that stakeholders are more likely to take up results if the results and the concepts used are as simple as possible and aligned with existing frameworks.

Bayesian networks are increasingly being used in environmental risk assessment (Moe, Wolf, et al., 2021). They can offer a transparent way of evaluating the required characterization of uncertainty for pesticide risk assessment as well as for ecological risk assessment in general (Carriger & Newman, 2012). Moreover, they are applied not only for risk estimation (e.g., risk quotient) but also to predict ecological effects of stressors more directly (e.g., decline in species abundance [Mitchell et al., 2021]) and to develop quantitative Adverse Outcome Pathways (Moe, Wolf, et al., 2021). Dreier et al. (2020) pointed out that the use of effect and exposure distributions allows for a competent risk assessment and communication approach. In their “ecotoxicity risk calculator,” they used joint probability curves, a risk curve-based approach that can show the connection between cumulative probability and magnitude of effect. Although this might be an advantage of using joint probability curves, probabilistic risk quotients can provide a better sense of the risk estimates and are useful for ranking and prioritizing among alternative risk scenarios (Campbell et al., 2000). Another probabilistic alternative to the risk quotient was introduced by van Straalen (2001) and has also been applied by Aldenberg et al. (2001); it defines the ecological risk (δ) as the probability that the environmental concentration exceeds the no effect concentration, while making use of the whole probability distributions. This method does not make use of an assessment factor; therefore, δ would correspond to the probability of our calculated Exposure/Effect ratio being >1 (e.g., Figure 3), that is, a risk quotient with the precautionary factor set to 1. However, this method does not allow for the calculation of different levels of risk.
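The relation between δ and the probabilistic risk quotient can be made explicit under one added assumption that is ours, not the cited authors': exposure $X$ and effect concentration $E$ are independent and normally distributed on the ln scale, with means $\mu_X$, $\mu_E$ and standard deviations $\sigma_X$, $\sigma_E$. With $\Phi$ denoting the standard normal distribution function:

$$
\delta = P(X > E) = P(\ln X - \ln E > 0) = \Phi\!\left(\frac{\mu_X - \mu_E}{\sqrt{\sigma_X^2 + \sigma_E^2}}\right)
$$

For a risk quotient $RQ = PF \cdot X / E$ with precautionary factor $PF$ and a risk level $t$, the same reasoning gives

$$
P(RQ > t) = \Phi\!\left(\frac{\mu_X - \mu_E + \ln PF - \ln t}{\sqrt{\sigma_X^2 + \sigma_E^2}}\right),
$$

so δ is the special case $PF = 1$, $t = 1$, whereas varying $t$ yields the probabilities for different risk levels.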

Especially in ecological systems, limited data and knowledge can hinder modeling efforts, as they constrain them to simpler model structures that involve more assumptions. In these cases, Bayesian network models can still be applied by making better use of different sources of information, including expert judgment (Hamilton & Pollino, 2012). Also, Bayesian networks can be developed as causal models, which can help to better understand pathways of hazard and vulnerability relations and thereby assist risk prioritization (Sperotto et al., 2017).

Carriger and Barron (2020) recently showed how a Bayesian network can estimate a probabilistic risk quotient for a single species by calculating the probability of an exposure distribution exceeding an effect distribution. Their Bayesian network estimated the risk by expanding the standard risk equation to include more uncertainties and variables that influence the risk. The networks that we have created use similar risk quotient calculations, though instead of focusing on one terrestrial species, we have included toxicity data for multiple aquatic species using a species sensitivity distribution. Also, Carriger and Barron (2020) stated that “the capabilities for performing diagnostic, mixed, and predictive inference make Bayesian networks especially useful for examining the causal factor that could lead to higher or lower risk outcomes.” The influence of different causal factors on the predicted risk in our case study will be explored in future work by including different scenarios of climate and pesticide application.

The networks that we developed use discretization of continuous variables and, due to this, lose some of the initial precision and information. This is commonly considered a shortcoming of Bayesian network models (Marcot, 2017). Nevertheless, a possible improvement can be to use dynamic discretization to enable higher resolution and lower uncertainty associated with the predictions (Carriger & Barron, 2020).
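The loss of precision from discretization can be illustrated with a small sketch. The bin edges below are an assumption: they extend the interval labels quoted in this article (“0 to 0.0003,” “0.03 to 0.1,” “1 to 3,” up to 3000) along the same 1–3–10 pattern, and the treatment of values that fall exactly on an edge is also assumed. The function name is hypothetical.

```python
from bisect import bisect_right

# Assumed risk quotient bin edges (see lead-in); not taken from the model files.
EDGES = [0, 0.0003, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3,
         1, 3, 10, 30, 100, 300, 1000, 3000]


def rq_interval(rq):
    """Return the label of the discretized interval that contains rq."""
    if not 0 <= rq < EDGES[-1]:
        raise ValueError("risk quotient outside the discretized range")
    # Index of the last edge that is <= rq (lower edge inclusive).
    i = bisect_right(EDGES, rq) - 1
    return f"{EDGES[i]:g} to {EDGES[i + 1]:g}"


# Deterministic risk quotients for the average PEC in Table 5.
for af, rq in [(1, 0.0333), (3, 0.0999), (5, 0.1665), (10, 0.3333)]:
    print(f"AF {af:>2}: RQ = {rq} falls in '{rq_interval(rq)}'")
```

With these edges, the risk quotients 0.0333 and 0.0999 differ by a factor of three but share one interval, which is the kind of information that dynamic discretization is meant to retain.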

Furthermore, F. A. M. Verdonck et al. (2003) pointed out that some uncertainties, such as those related to the choice of distribution, the model, and extrapolation, remain difficult to quantify; some of these may be overcome by using distribution models other than the ones used in this study. An alternative to the exposure modeling that we have carried out in this study was presented by Wolf and Tollefsen (2021), who showed how Bayesian distributional regression models could be used to better include spatiotemporal conditional variances in exposure assessment and still allow for a distributed PEC. Further refinement of the Bayesian network model presented here can make use of such statistical modeling for better estimation of the pesticide exposure distributions.

There are many possibilities for further development of the models presented here, for example, to better account for spatial and temporal variations in exposure and inter- versus intra-species variation in sensitivity in effect assessment. Nevertheless, we have demonstrated that this approach can offer a transparent way of evaluating the required characterization of uncertainty for pesticide risk assessment (Benford et al., 2018) as well as for ecological risk assessment in general (Carriger & Newman, 2012).

## CONCLUSION AND OUTLOOK

This study demonstrates that Bayesian network modeling is a promising tool for probabilistic calculation of a risk quotient to carry out risk assessment of pesticides. A probabilistic risk quotient is a more informative alternative to the traditional single-value risk quotient, which is often interpreted as a binary outcome. The Bayesian network approach provides more opportunities for interpretation, such as the probability that the risk quotient exceeds not only the conventional threshold of 1 but also other specified threshold values. The model presented here can easily be mapped to the main steps of traditional risk characterization frameworks. The Bayesian network approach can still apply a precautionary factor to account for additional uncertainties that are not captured by the exposure and effects distributions, corresponding to the assessment factor used in traditional risk assessment. Thus, Bayesian networks can offer a transparent way of evaluating the characterization of uncertainty required for pesticide risk assessment as well as for ecological risk assessment in general.

Our planned further development of this Bayesian network includes extending the model for cumulative risk assessment of pesticide mixtures in the aquatic ecosystem. Furthermore, we intend to incorporate climate and agricultural scenarios to predict the environmental risk of pesticides under alternative future conditions.

## ACKNOWLEDGMENT

This research was funded by ECORISK2050, which has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 813124 (H2020-MSCA-ITN-2018). K. E. Tollefsen was funded by NIVA's Computational Toxicology Program (www.niva.no/nctp).

## DISCLAIMER

The peer review for this article was managed by the Editorial Board without the involvement of S. Jannicke Moe.

## DATA AVAILABILITY STATEMENT

This work resides on the bioRxiv Preprint Server (bioRxiv.org; BIORXIV/2021/444913). The R scripts developed for data preparation and data used are available in the Supporting Information.

Bayesian network modeling was carried out using Netica 6.05 (www.norsys.com/). Files are added as supplementary information.