Methods detail - tabular database

1. Objectives and organization for tabular database

The database is designed to facilitate data interpretation, which requires combining all sample test results with the contexts of the samples, sampling points, and sites. As of 2023-07-06, the database contained over 6700 measurement results on 172 samples taken from 82 sampling points.

The database uses a multi-table format that defines a hierarchy of entities: sites, sampling points, samples, and analytical results (Figure 1). Each row of each table contains columns that allow looking up the next higher level of the hierarchy or looking down into deeper levels. Beyond the columns needed for this linkage, the tables carry many additional columns to satisfy a mandate to submit (anonymized) data to NYSDEC’s agency-wide database, EQuIS.

Sites -- (one or more) --> Sampling points
Sampling points -- (one or more) --> Samples
Samples -- (lab measurements) --> Analyses
Sampling points -- (field measurements) --> Analyses

Figure 1: Tabular database hierarchy
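
The hierarchy in Figure 1 could be laid out in SQLite roughly as follows. This is only a minimal sketch: the table names, column names, and demo rows are hypothetical and are not the project's actual schema, which carries many more columns for EQuIS. It shows how key columns let an analytical result be looked up through its sampling point to its site, and how field measurements can attach to a sampling point without a sample.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical four-level hierarchy."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")
    con.executescript("""
        CREATE TABLE sites (
            site_id   INTEGER PRIMARY KEY,
            site_name TEXT NOT NULL
        );
        CREATE TABLE sampling_points (
            point_id   INTEGER PRIMARY KEY,
            site_id    INTEGER NOT NULL REFERENCES sites(site_id),
            point_name TEXT NOT NULL
        );
        CREATE TABLE samples (
            sample_id   INTEGER PRIMARY KEY,
            point_id    INTEGER NOT NULL REFERENCES sampling_points(point_id),
            sample_date TEXT NOT NULL
        );
        -- Lab results hang off a sample; field measurements hang off a
        -- sampling point directly, so sample_id may be NULL.
        CREATE TABLE analyses (
            analysis_id INTEGER PRIMARY KEY,
            point_id    INTEGER NOT NULL REFERENCES sampling_points(point_id),
            sample_id   INTEGER REFERENCES samples(sample_id),
            analyte     TEXT NOT NULL,
            value       REAL,
            unit        TEXT
        );
    """)
    # Made-up demo rows, for illustration only.
    con.execute("INSERT INTO sites VALUES (1, 'Demo site A')")
    con.execute("INSERT INTO sampling_points VALUES (1, 1, 'Well 1')")
    con.execute("INSERT INTO samples VALUES (1, 1, '2022-06-01')")
    con.executemany(
        "INSERT INTO analyses VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 1, "nitrate", 2.5, "mg/L"),                  # lab result
            (2, 1, None, "water temperature", 11.0, "deg C"),   # field reading
        ],
    )
    return con


def results_with_context(con):
    """Return every result joined upward to its sample, point, and site."""
    return con.execute("""
        SELECT s.site_name, p.point_name, sm.sample_date,
               a.analyte, a.value, a.unit
        FROM analyses AS a
        JOIN sampling_points AS p ON p.point_id = a.point_id
        JOIN sites AS s ON s.site_id = p.site_id
        LEFT JOIN samples AS sm ON sm.sample_id = a.sample_id
        ORDER BY a.analysis_id
    """).fetchall()


if __name__ == "__main__":
    for row in results_with_context(build_demo_db()):
        print(row)
```

The `LEFT JOIN` on samples keeps field measurements in the output even though they have no sample row; their sample date simply comes back empty.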

Data from earlier joint projects between Cornell and NYSDEC Bureau of Pest Management are held in an earlier, similarly organized database and in EQuIS.

Since before this project began, we have kept a separate database file that cumulatively compiles all published zip code level summaries of reported New York pesticide sales and use data, back to the origins of New York’s Pesticide Sales and Use Reporting System (PSUR). Earlier data in this series have been converted from product amounts to active ingredient amounts; later published data already include that conversion. The data structure allows producing maps of pesticide use, sales to end users, or the combination of the two, at the zip code level across New York. The maps can show individual active ingredients or (raw or weighted) sums across active ingredients, for individual years or summed across two or more years. Our PSUR cumulative database is designed for linkage to the QGIS mapping software.
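
As an illustration of the kind of summation behind such maps, the sketch below totals amounts per zip code over chosen years, record categories (use, sales, or both), and optionally a single active ingredient. The `psur` table, its column names, and the demo amounts are assumptions made for this example, not the layout or contents of our actual PSUR database; weighted sums are not shown.

```python
import sqlite3


def zip_totals(con, years, categories=("use", "sales"), ingredient=None):
    """Sum amounts per zip code for the given years and record categories.

    If `ingredient` is given, only that active ingredient is summed;
    otherwise the raw sum runs across all active ingredients.
    """
    year_marks = ",".join("?" * len(years))
    cat_marks = ",".join("?" * len(categories))
    sql = f"""
        SELECT zip_code, SUM(amount_kg)
        FROM psur
        WHERE year IN ({year_marks}) AND category IN ({cat_marks})
    """
    params = list(years) + list(categories)
    if ingredient is not None:
        sql += " AND active_ingredient = ?"
        params.append(ingredient)
    sql += " GROUP BY zip_code ORDER BY zip_code"
    return dict(con.execute(sql, params).fetchall())


def build_demo_psur():
    """Create a tiny in-memory table with made-up amounts (illustration only)."""
    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE psur (
            zip_code          TEXT NOT NULL,
            year              INTEGER NOT NULL,
            active_ingredient TEXT NOT NULL,
            category          TEXT NOT NULL,  -- 'use' or 'sales'
            amount_kg         REAL NOT NULL
        )
    """)
    con.executemany(
        "INSERT INTO psur VALUES (?, ?, ?, ?, ?)",
        [
            ("14850", 2020, "atrazine", "use", 10.0),
            ("14850", 2021, "atrazine", "use", 5.0),
            ("14850", 2021, "glyphosate", "sales", 2.0),
            ("10940", 2021, "glyphosate", "use", 4.0),
        ],
    )
    return con


if __name__ == "__main__":
    con = build_demo_psur()
    # Use plus sales, all ingredients, summed over two years.
    print(zip_totals(con, [2020, 2021]))
    # Use only, one ingredient, one year.
    print(zip_totals(con, [2021], categories=("use",), ingredient="atrazine"))
```

The resulting zip code to amount mapping is the sort of table that could be joined to a zip code boundary layer in QGIS for display.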

All of the data are stored in SQLite3, a free software database format. Earlier instances were maintained in Microsoft Access until we outgrew its size capacity. The sampling and analytical database is confidential because it includes the exact locations and names of the confidential cooperators.

More technical detail about the tabular database: HowTo

2. EQuIS destination of data

NYSDEC maintains EQuIS, an agency-wide database of environmental monitoring results from many sources related to NYSDEC programs. Any NYSDEC Bureau can draw upon the compiled monitoring results, hopefully providing synergy across programs. The database requires very thorough record-keeping: extensive “metadata” (data about data) are submitted with each data record so that someone other than the originator can understand its context. The data submitted include location (in our case very blurred), the time and date of a sample, and analytical results. The encodings of aspects such as chemical name and analytical procedure are highly standardized.

Cornell SWL assumes there is a mechanism to fulfill the public’s Freedom of Information requests from the EQuIS database. Because there is no apparent way to tag records with a level of confidentiality, we must be conservative about what to include: our categorical and long term site cooperators would not participate without confidentiality from both NYSDEC and the public. We therefore blur some aspects of the data we submit to EQuIS to prevent discovery of cooperator identity and location, directly or indirectly, by a NYSDEC employee or by the recipient of data under a Freedom of Information request.

Cornell SWL will begin submitting batches of project data to EQuIS in 2023 or 2024. Some prerequisites will have to be satisfied first, such as agreeing on how to encode blurred locations for confidential sites. For long term sites we may be able to use zip code centroids, as in earlier projects, but this comes much too close to disclosing business identities for categorical sites. Even naming the county is too detailed for some categories, for example sod farms, which are very sparse except in Orange County.

Data from earlier joint projects between Cornell and NYSDEC Bureau of Pest Management are already in EQuIS, which has given Cornell SWL experience with the submission process.

3. Reporting to cooperators

Cornell SWL will send annual data reports with interpretations to each confidential and lake cooperator. The contents and format differ slightly among categorical sites, lakes, and long term sites. The reports to categorical and long term cooperators are confidential and are marked as such. The lake reports are not confidential and go to NYSDEC, but we give the volunteers the option of whether or not their reports are disclosed publicly, such as on this website.

The first year (2022) reports to categorical cooperators were split into two parts: one with field data and ions data plus a sampling point map and a soil map, and the other repeating the sampling point map and providing pesticide results.

The lake cooperators received reports that combined ions and pesticide data. The reports include lake maps with sampling point locations.

No long term site samples were taken in 2022, so no reports will be issued until analytical data are available, which is planned for early 2024. We are providing soil or other maps on request to interested owners.

The individual reports also include a synopsis of results across all sites of the type.

4. Reporting to NYSDEC

Cornell composes an annual interpretive report for NYSDEC, to be submitted in the first quarter of the following year, assuming that all analytical data are available early in that quarter, as they were in early 2023. The report also draws upon the cumulative tabular database and includes cross-site interpretations.

Interpretations are also presented at annual review meetings between NYSDEC and Cornell, in PowerPoint format. Earlier joint projects made versions of these PowerPoint files available on the Cornell SWL website.

Since reports to NYSDEC and the annual presentations contain no confidential data, they can be made available for download from this website.


Last updated: 2023-08-17, sp17 AT cornell.edu.