This is the fourth page at Level 1 of a three level web. It contains interpretations of analytical pesticide and metabolite data for project groundwater sites.


4: Interpretations (Level 1)

4.1 Organizing data to interpret it

We begin with a two-dimensional matrix with one row per pesticide or metabolite analytical result per sample. As of the cutoff date of December 31, 2023, this was about 85 analyses × about 220 samples, yielding nearly 20,000 rows. Matrix columns are in three groups:

Name of the analyte (pesticide or metabolite), anonymized identifier of the site, and other “result labels” that are used for grouping results but not used in computations;
Numerical or text values of potential causal factors (“independent variables”) about the site, the sampling point, the sample, or the chemical; such as soil organic matter, ecological region, presumed vulnerability, chemical sorption parameter K_{oc}, chemical degradation parameter t_{1/2}, field measurements such as water table depth and electrical conductance, ion concentrations measured in lab, and indication of whether or not the active ingredient (or metabolite parent) was definitely used or definitely not used in recent years at the site;
Analytical result: numerical result if any, qualifier such as “not detected”, and the applicable minimum detectable concentration (detection limit). This is a composite “dependent” variable.

The matrix contains a core of around 30 primary columns, thus the core is around 600,000 data cells through 2023. It will reach a core of a million data cells after the 2024 samples are analyzed.
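As a rough illustration of this layout, the sketch below models one matrix row and reproduces the size arithmetic. The field names are hypothetical and cover only a few of the roughly 30 core columns; the project's actual column names are not given on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultRow:
    """One row: one analyte result for one sample (field names are hypothetical)."""
    # Result labels: used for grouping, not in computations
    analyte: str
    site_id: str
    # A few example independent variables
    vulnerability: str
    koc: Optional[float]
    half_life_days: Optional[float]
    # Composite dependent variable
    value_ug_per_l: Optional[float]  # None when not detected
    qualifier: str
    detection_limit_ug_per_l: float


def matrix_size(n_analytes, n_samples, n_core_columns):
    """Return (rows, core data cells) for a one-row-per-result matrix."""
    rows = n_analytes * n_samples
    return rows, rows * n_core_columns


rows, cells = matrix_size(85, 220, 30)
print(rows, cells)  # 18700 561000
```

With the approximate counts quoted above, this gives 18,700 rows and 561,000 core cells, consistent with "nearly 20,000" and "around 600,000".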

There can also be secondary derived columns for summarization.

This matrix is created from the project’s confidential tabular database (Level 3). Because patterns in the independent variables might be usable to infer the detailed locations and identities of the cooperators, no version of this interpretive matrix has so far been made available to NYSDEC or the public. This web pageset reports only aggregated data from the matrix, taking various subsets of columns and rows.

4.2 Cross tabulations

A cross tabulation summarizes data from the full detail matrix in three steps: excerpt the rows for some part of the project’s sites, group the remaining rows according to values of at least one independent variable, then tally a dependent variable (analytical result) within each group. The “cross” aspect of cross tabulation enters when there is more than one independent variable. For example, we could group rows by site category and presumed aquifer vulnerability.

The simplest result combination is counting detections versus nondetections; a more elaborate combination is breaking out counts by result numerical ranges. The results of the mathematical crosstab operation, which can be implemented using a pivot table in Microsoft Excel¹, are numbers in a table with one column per grouping aspect and one row per group. The table can also be represented visually using a chart. This section includes four bar charts based on cross-tabulation results.
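The same operation can be sketched outside a spreadsheet. The following minimal Python example uses made-up rows and hypothetical field choices (it is not the project's pivot table setup): it excerpts rows, groups them by two independent variables, and tallies detections versus nondetections.

```python
from collections import Counter

# Hypothetical rows: (site_category, presumed_vulnerability, detected)
rows = [
    ("sod farm", "c-low", True),
    ("sod farm", "c-low", False),
    ("vineyard", "a-high", False),
    ("vineyard", "a-high", True),
    ("vineyard", "b-med", False),
    ("lake", "n/a", True),
]


def crosstab(rows, keep=lambda r: True):
    """Tally detections and nondetections per (category, vulnerability) group."""
    table = {}
    for category, vulnerability, detected in filter(keep, rows):
        counts = table.setdefault((category, vulnerability), Counter())
        counts["detect" if detected else "nondetect"] += 1
    return table


# Excerpt step: drop lake rows, keeping groundwater sites only
groundwater = crosstab(rows, keep=lambda r: r[0] != "lake")
for group, counts in sorted(groundwater.items()):
    print(group, counts["detect"], counts["nondetect"])
```

Each printed line corresponds to one row of the crosstab table: the group key followed by its two tallies.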

Data in these charts are usually scaled by dividing by the number of samples or the number of analyses, since the separate bars represent different numbers of samples and analyses, each of which is an opportunity for detection. Dividing the raw detection count by the number of opportunities adjusts for the different counts per plotted bar, so the height or width of a bar does not grow simply because the bar represents more samples. We then scale upward by 100 or 1000 to give whole numbers on the axis instead of small fractions. These scalings are not intended as percentages or permilles of samples or of tests performed, although they may also have that meaning.
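The scaling can be written as a one-line function. The sketch below uses made-up counts, and the function name is hypothetical.

```python
def scaled_rate(detections, opportunities, scale=100):
    """Detections per `scale` opportunities (samples or analyses)."""
    if opportunities <= 0:
        raise ValueError("need at least one opportunity for detection")
    return scale * detections / opportunities


print(scaled_rate(12, 40))               # 30.0 detections per 100 samples
print(scaled_rate(3, 2000, scale=1000))  # 1.5 detections per 1000 analyses
print(scaled_rate(50, 40))               # 125.0 detections per 100 samples
```

The last call shows one reason the scaled values need not be true percentages: when each sample is tested for many analytes, a bar scaled by sample count can exceed 100.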

There are two aspects that can influence patterns that are not adjusted for by scaling:

different chemical usage by different sites; not every categorical cooperator uses the same chemicals, and not all of the chemicals they use are tested for;
for plots that include all groundwater samples, which include categorical and long term sites, some site characteristics such as presumed vulnerability of the pesticide use zone to leaching are poorly known.

Table 1 shows the data summary underlying the following Figure 1. These combine all matrix rows for groundwater site samples (omitting only lake site samples) and break them into two chemical types, pesticides and metabolites, and three presumed aquifer vulnerabilities.

Figure 1 indicates that for the groundwater sites of this project the aquifer vulnerability presumptions are not a strong indicator of leaching potential to shallow groundwater, particularly not for metabolites which are more common at the presumed low vulnerability sites than the presumed high vulnerability sites. There is roughly equal detection of active ingredients in any of the vulnerability classes, and highest detection of metabolites in the lowest vulnerability class. One might expect the highest vulnerability sites to have higher detection frequency than the lower vulnerability sites; high vulnerability Long Island certainly has the lion’s share of detections in its extensive monitoring data.

Like other statistical methods, cross tabulations reflect correlation (conditions that occur together) rather than causation (one condition causing the other).

Table 1: Data underlying Figure 1.
Presumed vulnerability	Chemical type	Detections per 100 samples
‘a-high’	‘metabolites’	27.2
‘a-high’	‘pesticides’	27.2
‘b-med’	‘metabolites’	35.7
‘b-med’	‘pesticides’	33.3
‘c-low’	‘metabolites’	91.0
‘c-low’	‘pesticides’	32.1

Figure 1: Pesticide and metabolite detections at groundwater sites, 2022 and 2023, by presumed site vulnerability, scaled by sample count

A note about vulnerability: As described earlier, vulnerabilities of whole sites are assigned as follows:

High vulnerability: alluvium, glacial outwash, glacial lake sands (similar to most of Long Island)
Medium: some glacial tills without restrictive layers, OR mixed conditions high and low on same site
Low: some glacial tills having restrictive layers, glacial lake silt/clay deposits, organic/muck soils

Effect of chemical degradation potential

Standard pesticide testing in lab and field estimates the relative tendency of the pesticide or metabolite to degrade to simpler compounds by chemical or biochemical processes. The relative rate of degradation, in a first-order process that reduces concentration C over time t as C(t) = C(0) e^{-kt}, is represented by a half-life (t_{1/2}) in days; a higher value means more tendency to resist degradation, i.e. longer persistence. The k value in the first-order equation is related to the half-life by k = \frac{-\ln(0.5)}{t_{1/2}} = \frac{\ln 2}{t_{1/2}}.
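A minimal sketch of the first-order relationship follows. The function names and the 50-day half-life are illustrative only.

```python
import math


def rate_constant(half_life_days):
    """First-order rate constant k (1/day) from half-life: k = ln(2) / t_half."""
    return math.log(2) / half_life_days


def fraction_remaining(t_days, half_life_days):
    """C(t) / C(0) = exp(-k t) for first-order degradation."""
    return math.exp(-rate_constant(half_life_days) * t_days)


# Illustrative half-life of 50 days
for t in (50, 100, 200):
    print(t, round(fraction_remaining(t, 50), 4))
```

After one half-life half of the chemical remains, after two a quarter, and after four a sixteenth.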

Figure 2 (chemicals in Table 2) demonstrates that the longest lived analytes are much more often detected than short lived analytes. Higher concentrations are also detected more often for the longer lived analytes. As with the vulnerability chart, different kinds of pesticides (in terms of degradability) used by the different sites may explain part of the pattern.

Figure 2: Magnitudes of analyte detections at groundwater sites, 2022 and 2023, by soil halflife of analyte, scaled by analysis count

Table 2: Half-life ranges for detected active ingredients and metabolites grouped by soil half-life (t_{1/2})
Half-life range (days)	Detected active ingredients and metabolites (italics)
>200	diuron, metolachlor ESA, metolachlor OA
100-200	fluopyram, imidacloprid, mefentrifluconazole, oxadiazon, hydroxy atrazine, JSE76, terbacil
50-100	simazine, acetochlor ESA, bromacil
25-50	atrazine, myclobutanil, paclobutrazol, propiconazole, thiamethoxam, de ethyl atrazine
5-25	acetochlor OA, bentazon, glyphosate, S-metolachlor, carbaryl
	Field values used when available, “typical” values otherwise. All data from Hertfordshire PPDB ².
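Grouping analytes into ranges like those in Table 2 can be sketched as follows. The bin edges are taken from the table, but how values exactly on an edge are assigned, and the extra "0-5" bin, are assumptions made for illustration.

```python
from bisect import bisect_left

# Upper edges of the half-life ranges in days, from Table 2
EDGES = [5, 25, 50, 100, 200]
# "0-5" is a hypothetical extra bin for values below the table's lowest range
LABELS = ["0-5", "5-25", "25-50", "50-100", "100-200", ">200"]


def half_life_bin(days):
    """Return the range label for a half-life; edge values go to the lower bin."""
    return LABELS[bisect_left(EDGES, days)]


print(half_life_bin(150))  # 100-200
print(half_life_bin(200))  # 100-200 (edge value, assumed to fall in the lower bin)
print(half_life_bin(365))  # >200
```

The same pattern would apply to the sorption ranges by swapping in their edges and labels.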

Effect of chemical sorption to soil organic matter

Standard pesticide testing in labs estimates the tendency of the pesticide or metabolite to adsorb (stick) to soil organic matter. The relative affinity to adsorb, as opposed to staying in solution in water, is expressed by a K_{oc} or K_{foc} parameter value, higher meaning more tendency to adsorb. K_{foc} applies to a nonlinear (Freundlich) expression of sorption tendency as affected by the sorbed concentration.
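To make the distinction concrete, the sketch below uses the standard textbook forms of the two isotherms: linear, S = K_{oc} f_{oc} C, and Freundlich, S = K_{foc} f_{oc} C^{1/n}. The organic carbon fraction f_{oc}, the exponent 1/n, and all numeric values are illustrative assumptions, not project data.

```python
def linear_sorbed(c_water, koc, f_oc):
    """Sorbed concentration (mg/kg) for a linear isotherm: S = Koc * f_oc * C.

    c_water in mg/L, koc in L/kg (numerically equal to m^3/Mg),
    f_oc = mass fraction of organic carbon in the soil.
    """
    return koc * f_oc * c_water


def freundlich_sorbed(c_water, kfoc, f_oc, one_over_n):
    """Sorbed concentration for a Freundlich isotherm: S = Kfoc * f_oc * C**(1/n)."""
    return kfoc * f_oc * c_water ** one_over_n


# Illustrative values only
print(linear_sorbed(0.5, koc=100, f_oc=0.02))
print(freundlich_sorbed(0.5, kfoc=100, f_oc=0.02, one_over_n=0.9))
```

With 1/n = 1 the Freundlich form reduces to the linear one; with 1/n below 1, sorption is relatively stronger at low dissolved concentrations.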

Figure 3 (chemicals in Table 3) demonstrates that the stickier analytes are much less often detected than less sticky analytes. As with the vulnerability and half life charts, different kinds of pesticides (in terms of sorption) used by the different sites may explain some of the pattern.

Figure 3: Magnitudes of analyte detections at groundwater sites, 2022 and 2023, by Koc (or Kfoc) of analyte, scaled by analysis count

Table 3: Detected pesticides and metabolites grouped by range of organic sorption coefficients (Koc or Kfoc).
Koc range (m³/Mg)	Detected active ingredients and metabolites (italicized)
>1000	propiconazole*, glyphosate, oxadiazon
500-1000	diuron
200-500	imidacloprid, fluopyram, carbaryl, S-metolachlor
60-200	de ethyl atrazine, simazine, metolachlor, atrazine
0-60	dicamba, bentazon, metolachlor ESA, metolachlor OA*
	*K_{foc} used in absence of reported K_{oc} values. All data from Hertfordshire PPDB ³.

Detection patterns by category

The categories of different sites represent different purposes for using pesticides. For example, some land uses need weed management and others need insect and fungus control. Some categories manage pests coarsely and others need very fine control. The categories also treat different fractions of their sites differently. A golf course needs to devote most pest management effort to its immaculate greens (a small part of the site) and successively less to fairways and roughs. A sod farm occupies a large fraction of its property with its grass crop and manages it uniformly over the area. A vineyard treats the narrow lines of its fields that contain the vines and does not treat between them. A closed greenhouse may have only fugitive pesticide emissions from water overflowing from pots onto a solid greenhouse floor, then directed to a small zone outdoors next to the greenhouse.

There are many factors that could contribute to different pesticide escape from different categories. Are there nevertheless patterns between the categories?

Figure 4 arrays data similar to those in the sorption and half-life plots, rotated to horizontal bars to allow for longer bar labels. The turf and rights-of-way (ROW) categories have few samples and almost no detections, and thus should receive little weight in interpretation. Fruit and vegetable farms and sod farms have the most detections (normalized for different analysis counts). Both of these categories contain mucklands, whose influence also affected the first chart on this page about vulnerability.

Both cooperating sod farms are on presumed low vulnerability mucklands. They are in different ecological regions. The industry does not have many farms in total in New York, and a large majority of upstate sod farm land is on mucklands.

The three vegetable and fruit farms are very different in approach; that broader industry in upstate consists of hundreds of farms that are very varied in crop and spread widely across most ecoregions. The cooperating fruit farm grows perennial tree fruits, compared to both vegetable farms that grow annual crops and change them from year to year. One vegetable farm is on glacial outwash soil and the other on drained muckland.

Figure 4: Magnitudes of analyte detections at categorical groundwater sites, 2022 and 2023, by category, omitting upgradient sampling positions, scaled by analysis count

There are many, many factors in pesticide use and pesticide environmental transport and fate to consider if attempting to account for why certain concentrations occur. The crosstab plotting did not yield many new insights, except for the fact that standard vulnerability ratings based on surficial geology are not good predictors of the delivery of pesticide residues to shallow groundwater at low to very low concentrations.

4.3 TGUS conceptual model for comparison of chemicals

This is a second way of looking at the data matrix of independent variables versus analytical results, applying physics and chemistry to combine several parameters at once mathematically instead of using simple cross-tabulations and plots. There were not that many detections overall, because none of the sites use more than a few of the analyzed-for chemicals and upstate’s groundwater is highly localized, in contrast to Long Island’s extensive, much more uniform Upper Glacial and Magothy aquifers.

The TGUS (Theoretical Groundwater Ubiquity Score) model represents uniform pesticide spreading on surface soil and its subsequent leaching into preferential flow pathways downward to a shallow water table. It applies best to a site where outdoor pesticide applications are relatively uniform over an area, i.e. a subset of the categorical sites.

TGUS does not apply well to golf courses (too small areas treated), greenhouses (usually no outdoor pesticide use), Rights of Way (linear use rather than area), and other turf (too small areas treated).

TGUS does apply well within the project to fruit and vegetable farms, sod farms, vineyards, and some outdoor nurseries. These sites do include a sizeable majority of analyte detections.

The model draws from our and others’ experience with experiments in which surface-applied liquid dye spreads out relatively uniformly in a thin surface layer, then breaks through into rapidly forming narrow fingers that carry it much deeper into the soil. This happens quickly because water is applied with the dye to represent a rain event. Over time, if enough water is applied to saturate the soil pores again, more of the surface layer’s dye (or other solute) will leach into the preferential pathways, and almost none moves outside of the paths.

The model represents the vadose zone (unsaturated, above the water table) only, seeking to measure delivery to the water table from one pesticide application at the surface that has been partially washed downward by a storm event that saturates the upper layer (distribution zone).

Figure 5: Preferential path pesticide leaching model

Data to apply TGUS to project sites come from several columns of the project’s data matrix.

Application of TGUS leaching detection duration across categorical sites

The second TGUS equation represents the duration (T_{LRP}) for which one pesticide’s or metabolite’s concentration in shallow groundwater stays above the applicable analytical detection limit. This represents the ephemerality or persistence of a given chemical used in a certain way in a certain environmental context. A longer detectable duration comes from either higher concentrations or a lower analytical detection limit. Higher concentrations come from more usage per unit area (kilograms of active ingredient per hectare) or from a larger fraction of the applied amount reaching groundwater without being destroyed by chemical and biochemical processes. Very low detection limits, as low as 0.01 micrograms per liter in the project, come from more investment in machines and analytical expertise.
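The TGUS equations themselves are not reproduced on this page, so the sketch below is not the TGUS formula. It is a deliberately simplified stand-in that assumes only first-order decay from a hypothetical initial groundwater concentration, to show how concentration, detection limit, and half-life interact in the way described above.

```python
import math


def detectable_days(c0_ug_per_l, detection_limit_ug_per_l, half_life_days):
    """Days until first-order decay brings c0 down to the detection limit.

    Simplified stand-in, NOT the TGUS equation: T = t_half * log2(c0 / DL).
    A negative result means c0 never exceeds the detection limit.
    """
    return half_life_days * math.log2(c0_ug_per_l / detection_limit_ug_per_l)


# Hypothetical inputs
print(round(detectable_days(0.8, 0.1, 30), 1))   # 90.0
print(round(detectable_days(0.8, 0.01, 30), 1))  # longer with a lower detection limit
print(round(detectable_days(0.05, 0.1, 30), 1))  # -30.0
```

Even in this reduced form, raising the starting concentration or lowering the detection limit lengthens the detectable period, and a starting concentration below the detection limit yields a negative duration.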

Figure 6 applies the equation for T_{LRP} in two ways. The color shaded background combines all of the TGUS input factors for sites and chemicals yielding a 2-dimensional plot of the highest TGUS detectable leaching durations in days (dark red at upper right) to the lowest (dark blue at upper left); the color scale is to the right of the plot; negative values in deep blue indicate that the concentration will never exceed the detection limit. The background colors are based on 1 kilogram/hectare (about 1 pound/acre) active ingredient application rate, 0.1 μg/L detection limit, 1 cm thickness of distribution (initial mixing) zone, 1% of the treated land recharging preferentially, and average mineral soil parameters.

The vertical axis represents different chemical half lives, and the horizontal axis represents the ratio of a \xi term that combines application rate and detection limit to a mobility term K_{oc}.

Figure 6: Theoretical Groundwater Ubiquity Score leaching detection risk period overlaid with analytical detections, 2022-2023 groundwater sites

On this background the individual analytes detected in the project are represented with rectangles of sizes proportional to the number of detections. Individual chemical plotted points refine this based on product label maximum application rate and the detection limit applying to the analyte.

Except for the outliers glyphosate and bentazon, each at one sampling point only, all of the project’s detections have TGUS detectable leaching time scores of at least 100 days, which is in the light cyan color zone. This means that the chemicals being detected tend to have particular combinations of mobility (K_{oc} sorption) and persistence (t_{1/2} degradation half-life), as we saw in the cross tabulations of section 4.2. TGUS seems to integrate all factors well.

There are two detection outliers below the 100 day figure. The single glyphosate detection probably arrived by surface washoff to the sampling point. The bentazon detections are a mystery so far: bentazon was consistently present at 2-8 μg/L in all four samples from one well that lies between a large drainage ditch and a vegetable field, yet was not quickly recognized by the cooperator as something they use. (In our experience, cooperators recognize the product names but not necessarily all active ingredients as the analytical lab names them.)

Application of leaching detection duration to a single site’s pesticides used

The plots in this subsection display detections of used chemicals, non-detections of used chemicals, and detections of unused chemicals which may come from offsite sources. As in Figure 6, the background coloration is based on representative site soil organic matter and bulk density, 1 kg/ha application rates, 1 cm distribution zone thickness, and a 0.1 μg/L analytical detection limit. Individual chemicals at the site are plotted using their maximum annual application rates from the label (not necessarily how the owner or neighbor used them), and their own detection limit.
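The three plotted classes can be expressed as a small classification step. In the sketch below the function and label names are hypothetical, and treating unused, undetected chemicals as not plotted is an assumption based on the three classes listed above.

```python
from collections import Counter


def plot_class(used, detected):
    """Assign a chemical to one of the three plotted classes, or None."""
    if used and detected:
        return "used, detected"
    if used:
        return "used, not detected"
    if detected:
        return "not used, detected"
    return None  # assumed not plotted


# Hypothetical (used, detected) flags for the chemicals at one site
records = ([(True, True)] * 3 + [(True, False)] * 4
           + [(False, True)] * 2 + [(False, False)] * 5)
classes = (plot_class(used, detected) for used, detected in records)
tally = Counter(c for c in classes if c is not None)
print(tally)
```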

Figure 7, for a fruit farm in glacial outwash or lakeshore gravel, uses black to plot three detected, used analytes, yellow to plot four undetected, used analytes, and magenta to plot two unused but detected analytes. The used but undetected analytes are all left/lower than the used but detected analytes. The two unused but detected analytes could come from upgradient; at this particular site we do not have a good upgradient sampling point.

Figure 7: Gravelly fruit chemical detections and non-detections

Figure 8 for a sod farm in muck has three herbicide metabolites plotted in red that are not part of the sod farm’s operation on turfgrass; their detection is unexplained; perhaps there is upwelling from bedrock below the muck that was recharged through an uphill neighbor’s property? Mucklands (former lakes that filled in with organic matter that decayed incompletely) are in low lying areas which tend to be more groundwater discharge areas than groundwater recharge areas. The two metolachlor metabolites are particularly mobile compounds.

Detected imidacloprid, in black, was definitely used. Five undetected but used chemicals are plotted in blue. As for the Gravelly fruit site, undetected and used chemicals are below and to the left on the plot indicating shorter detectability periods than upper right. Glyphosate in particular is below the zero contour of T_{LRP} (vertical line).

Figure 8: Mucky sod chemical detections and nondetections

TGUS indexing seems very promising for comparing chemicals and contexts against one another. The earlier GUS ⁴, which inspired TGUS, does not explicitly consider application rates, detection limits, or site characteristics, and thus provides a much coarser vantage point.
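For comparison, the original GUS index of Gustafson (1989) uses only half-life and K_{oc}: GUS = log10(t_{1/2}) × (4 − log10(K_{oc})). The sketch below computes it with the class thresholds from that paper (above 2.8 "leacher", below 1.8 "non-leacher"); the input values are illustrative, not project data.

```python
import math


def gus(half_life_days, koc):
    """Groundwater Ubiquity Score: log10(t_half) * (4 - log10(Koc))."""
    return math.log10(half_life_days) * (4 - math.log10(koc))


def gus_class(score):
    """Gustafson's classes: above 2.8 leacher, below 1.8 non-leacher."""
    if score > 2.8:
        return "leacher"
    if score < 1.8:
        return "non-leacher"
    return "transitional"


# Illustrative chemicals
for t_half, koc in [(100, 100), (10, 1000), (30, 300)]:
    score = gus(t_half, koc)
    print(t_half, koc, round(score, 2), gus_class(score))
```

Neither function takes an application rate, detection limit, or soil property, which is the coarseness noted above.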

4.4 Machine Learning

We are experimenting with machine learning (an artificial intelligence technique to find patterns in data) using the same data matrix as cross-tabulation and TGUS. When that approach matures we will report findings here.


Last edited 2025-01-08, sp17 AT cornell.edu

Footnotes

¹ URL: https://support.microsoft.com/en-us/office/create-a-pivottable-to-analyze-worksheet-data-a9a84538-bfe9-40a9-a8e9-f99134456576
² University of Hertfordshire, UK. 2024. Pesticide Properties Database. URL: https://sitem.herts.ac.uk/aeru/ppdb/en/. Visited 2024-12-06.
³ University of Hertfordshire, UK. 2024. Pesticide Properties Database. URL: https://sitem.herts.ac.uk/aeru/ppdb/en/. Visited 2024-12-06.
⁴ Gustafson, D. I. (1989). Groundwater ubiquity score: A simple method for assessing pesticide leachability. Environmental Toxicology and Chemistry, 8(4), 339–357. URL: https://doi.org/10.1002/etc.5620080411.