The New York State Department of Environmental Conservation (Department), in cooperation with Cornell University (Cornell), presents pesticide sales and application data. This data is collected under the auspices of the Environmental Conservation Law Article 33, Title 12, known as the Pesticide Reporting Law (PRL). The pesticide annual report data is available to health researchers and the public. The reporting community, the Department, and the Department’s computer consultants at Cornell work together to provide the best information that can reasonably be compiled. However, the data is neither entirely accurate nor complete. On occasion the data will be updated.
Although the Department and Cornell have gone to great lengths to assure the quality of the data, there are still significant concerns regarding the validity of the data received from the regulated community. Users of the data are strongly cautioned about limitations of the data. Please review the following information prior to using the data for any purpose.
Annual report submissions are accepted by the Department at face value. Neither the Department nor Cornell can attest to the accuracy of the data. However, the data goes through a manual review and a more detailed review with various computer applications for obvious or likely errors. Follow-up with the applicators and distributors is conducted and corrections are made when possible.
The PRL allows the regulated community to submit pesticide records that are handwritten. Some of the data on these handwritten forms is not decipherable. Data that is unreadable is stored in the database as “illegible” and therefore those quantities of pesticides cannot be counted.
ZIP Code Issues:
The use of ZIP codes to define application and sales locations creates several problems. ZIP codes are postal delivery locations. Large wilderness areas and farmland may have few, or no, postal delivery locations. Mail is not delivered to these locations and they are, technically, not located within a ZIP code. Assigning a ZIP code to an application or intended application in these geographic areas is problematic.
ZIP codes may also include more than one contiguous location. Without additional address data there is no way to know where applications or intended applications occurred. For example, Harris Hill, Williamsville and portions of Clarence, NY all share one ZIP code.
In cases of special ZIP codes, which are unique to certain locations (for example, a single building or campus), the data is reassigned to the enclosing ZIP code that represents a larger area, if one exists. Otherwise the data is stored in the database within the "private" ZIP code category. Without this step, specific pesticide applications could be identified. This step is not necessary for data reported by county.
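The reassignment rule for special ZIP codes can be sketched as follows. This is only an illustration: the lookup table `ENCLOSING_ZIP`, the function name, and the placeholder ZIP codes are assumptions, not part of the Department's or Cornell's actual system.

```python
# Hypothetical lookup: special (single-site) ZIP code -> enclosing ZIP code.
# None means no enclosing ZIP code exists. All values are placeholders.
ENCLOSING_ZIP = {
    "00001": "00010",
    "00002": None,
}


def reporting_zip(zip_code: str) -> str:
    """Return the ZIP code category under which a record is summarized."""
    if zip_code not in ENCLOSING_ZIP:
        return zip_code  # ordinary ZIP code, reported as submitted
    enclosing = ENCLOSING_ZIP[zip_code]
    # Special ZIP code: use the enclosing area, else the "private" category.
    return enclosing if enclosing is not None else "private"
```

Under these assumptions, a record from a single-campus ZIP code is never summarized under its own code, so an individual application cannot be singled out.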
Quantities for some pesticides are reported using both weight-based and volume-based units. Rather than reject quantities reported using the inappropriate unit, the reports list both measurements as they were reported to the Department.
A product listed with a quantity of zero means that applications or intended applications of the product were made, but that the quantity was indecipherable, the reported unit of measure was invalid, or the quantity was negligible (less than 0.01).
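A minimal sketch of how a zero quantity could arise in the summaries is shown below. The set `VALID_UNITS`, the function name, and the half-up rounding mode are assumptions made for illustration; the source states only the three conditions and that quantities are rounded to two decimal positions.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP

VALID_UNITS = {"gallons", "pounds"}  # hypothetical list of accepted units


def displayed_quantity(raw: str, unit: str) -> Decimal:
    """Return the quantity as it might appear in a summary report."""
    zero = Decimal("0.00")
    if unit not in VALID_UNITS:
        return zero  # invalid unit of measure
    try:
        quantity = Decimal(raw)
    except InvalidOperation:
        return zero  # indecipherable value, e.g. "illegible"
    if not quantity.is_finite() or quantity < Decimal("0.01"):
        return zero  # negligible quantity
    # Rounding mode is an assumption; the source does not specify one.
    return quantity.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

A zero in the output therefore signals that a product was reported, not that none was used.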
The database may contain over-reporting of pesticides actually used or sold. There is no way to determine how much of the reported amounts are higher than they should be, but several factors contribute to this:
- Private applicators often return unused pesticides. The products may be returned a year or two after the initial purchase. The reporting system does not have a way to account for returns. Only the original sale is recorded. The database will show the sale, but not the return.
- Commercial Permit Holders (sellers of restricted-use pesticides) report sales of restricted pesticides to other distributors. These distributors may sell the same pesticide a second time, possibly to another distributor, who may sell it yet a third time. Each sale is reported. There is no way of identifying reports of multiple sales of a single volume of pesticide.
- Pesticide products are routinely diluted with inert material prior to application. Some applicators report the diluted amount applied, not the undiluted amount as required by the Department. The Department and Cornell review reports in an attempt to identify obvious occurrences of this error; however, not all occurrences are obvious or corrected.
Data are not reported by active ingredient. This makes the database different from most other pesticide use tracking databases, which may cause difficulties in comparing NYS reporting data with data from other states. However, the Department and Cornell have developed a mechanism for displaying active ingredient summaries for those products being reported.
Unknown Sales to Private Applicators:
Commercial Permit Holders must report sales of general use agricultural pesticides to certified private applicators. However, certified private applicators can also purchase general use agricultural pesticides from sellers who are not Commercial Permit Holders. Such sales, and the associated use information, are not captured by the PRL data.
All of the annual report information is self-reported. Many applicators, technicians and commercial permittees report that they made no applications or sales. There is no practical way to verify this.
Annual reports for individual reporters are not, as a rule, reviewed across years. However, there have been cases where annual reports were copied and submitted for more than one year.
Extensive computerized data quality assurance processes are followed in producing the final reports. The description below summarizes the methodology that is used to produce the annual PRL data report.
How the Data is Characterized:
Pesticide products were summarized using the EPA registration number, not the product name.
Pesticide products registered with one EPA number may have different product names. All registered product names are available in a separate report (Pesticide Products by Name and EPA Registration Number).
Non-standard applications and sales are flagged for separate reporting when:
- Sales or applications did not occur during the report year.
- Applications or sales occurred outside of New York State.
- A general use product was reported on Form 25 (Annual Report for Restricted Pesticide Sales form).
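The three flagging conditions above can be sketched as a simple check. The `Record` fields and the reason strings are hypothetical; the actual report layout is not described in this document.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    event_date: date      # date of the sale or application
    state: str            # two-letter state code
    form: str             # "25" for Form 25 (restricted pesticide sales)
    restricted_use: bool  # False for a general use product


def nonstandard_reasons(rec: Record, report_year: int) -> list[str]:
    """Return the reasons, if any, a record is flagged as non-standard."""
    reasons = []
    if rec.event_date.year != report_year:
        reasons.append("outside report year")
    if rec.state != "NY":
        reasons.append("outside New York State")
    if rec.form == "25" and not rec.restricted_use:
        reasons.append("general use product on Form 25")
    return reasons
```

A record with an empty list of reasons would go into the standard summaries; any other record would be reported separately.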
All quantities are rounded to two decimal positions.
The data summaries include information on data that were reported incompletely or incorrectly, because partial data may still have some informational value. These data are identified using a set of standard descriptions:
- No value reported for this field.
- Unreadable value reported for this field.
- An invalid EPA Registration Number is a number that does not match those EPA Registration Numbers for pesticide products registered at any time in New York State or by the EPA. An invalid county or ZIP code is a county or ZIP code that does not exist in New York State.
- Two values reported for one field on the report form, or a value that could not be mapped to the report form field for any reason.
Preliminary Quality Assurance for Paper Reports:
The contractor who performs the data entry of the paper reports also performs data quality checks, which include:
- Deciphering non-standard form submissions,
- Coding illegible/irregular values,
- Reformatting dates,
- Validating Certification & Permit IDs,
- Standardizing the city using a ZIP code look-up, and
- Duplicating dittoed fields.
Preliminary Quality Assurance for Electronic Reports:
Most of the electronic data are formatted using one of the bureau's software applications. When the pesticide reports are received at Cornell, a number of validation processes are performed. Cornell validates the file format and checks the data values as outlined here. This preliminary validation process enables the bureau to contact the report submitters for corrections in a timely fashion. Files are verified by checking the following:
- Are the required number of fields present?
- Are files named so the type of data can be identified (applications, sales etc.)?
- Do the fields contain required data types (numbers, characters etc.)?
- Are fields the expected length?
- Are DEC ID numbers (certification, business registration and sales permit numbers), county codes, zip codes, units of measure and dates valid values?
- Are EPA Registration Numbers formatted correctly?
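Several of the checks listed above can be illustrated with a short row-level validator. The four-field layout, the date format, and the error messages are assumptions for this sketch; the regular expression reflects the usual company-product (optionally company-product-distributor) shape of an EPA Registration Number.

```python
import re
from datetime import datetime

# Hypothetical layout: EPA Reg. No., quantity, ZIP code, date (MM/DD/YYYY).
EXPECTED_FIELDS = 4
EPA_REG_RE = re.compile(r"^\d+-\d+(-\d+)?$")


def check_row(fields: list[str]) -> list[str]:
    """Return a list of format problems found in one row of a report file."""
    if len(fields) != EXPECTED_FIELDS:
        return ["wrong number of fields"]
    epa_no, quantity, zip_code, applied = fields
    errors = []
    if not EPA_REG_RE.match(epa_no):
        errors.append("EPA Registration Number format")
    try:
        float(quantity)
    except ValueError:
        errors.append("quantity is not a number")
    if not (len(zip_code) == 5 and zip_code.isdigit()):
        errors.append("ZIP code length or type")
    try:
        datetime.strptime(applied, "%m/%d/%Y")
    except ValueError:
        errors.append("invalid date")
    return errors
```

A check like this confirms only that values are well formed; whether a ZIP code or registration number actually exists would require look-up tables.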
The electronic service bureau also accepts a limited number of reports that were not created in one of the bureau's software applications. These reports are closely reviewed and manually reformatted. These reviews include:
- County and ZIP code look-ups,
- EPA Registration Number look-ups, and
- Outreach to businesses and applicators.
Duplicate Data Removal:
Electronic reports are logged by ID number(s), which prevents most duplicate reports from getting into the system. However, we sometimes receive duplicate reports under different ID numbers (e.g. the business and one of the applicators file the same data in separate reports). Another source of duplication is the same report, or part of it, being filed both electronically and on paper.
To detect duplicate reports, we have developed QA processes that isolate potential duplicates. We then manually review these to identify the actual duplicates, which can be of either type described above.
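One way to isolate potential duplicates for manual review is to look for identical rows that appear under different report IDs. The sketch below is an assumption about how such a screen might work, not a description of the actual QA process; the function name and data shapes are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def potential_duplicates(reports: dict[str, list[tuple]], min_shared: int = 1):
    """Map each pair of report IDs to the number of distinct rows they share.

    `reports` maps a report ID to its rows; each row is a hashable tuple.
    Pairs sharing fewer than `min_shared` rows are left out.
    """
    seen_in = defaultdict(set)  # row -> IDs of the reports containing it
    for report_id, rows in reports.items():
        for row in rows:
            seen_in[row].add(report_id)
    shared = defaultdict(int)
    for ids in seen_in.values():
        for pair in combinations(sorted(ids), 2):
            shared[pair] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}
```

Because it counts shared rows rather than comparing whole reports, this screen would also surface partial duplicates, which a reviewer would then confirm or dismiss.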
As part of our standard quality assurance processes, the Department and Cornell identify reports that contain quantities that appear to fall outside of accepted parameters. Staff review reports containing these “out-of-range” quantities and the responsible applicators and businesses are contacted. Reporting errors are corrected by staff with the approval of the applicator or business. The corrected data is forwarded to Cornell and substituted for the original reports in the database. Common errors include misplaced or incorrectly data-entered decimal points and systematic computer-generated errors. Note that not all of the "out-of-range" data can be verified.
Applicators and sellers incorrectly report applications and sales for some cooling tower and wood preservative products in pounds rather than gallons. These errors dramatically inflate the reported use and sales of those products. Cornell is able to correct the units for those products from pounds to gallons. The corrected units are reflected in the data.
Cornell has an application that converts quantities of liquid products reported as dry ounces into fluid ounces.
Data revisions are also performed based on the error reports generated by Cornell's data validation application. Department staff contacts many of the businesses that appear on the report in order to correct the data which failed validation. Examples of errors handled in this way include reported EPA numbers that do not correspond to registered products or county codes that do not correspond to the address.
The primary data quality assurance process is our data validation application. This application implements a suite of validation rules, which consist of three primary types:
- fields checked against set up tables (e.g. zip code valid?),
- value checks (e.g. date month and day within valid ranges?), and
- presence checks (e.g. required fields reported?).
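The three rule types can be sketched in a few lines. The field names, the contents of `VALID_ZIPS`, and the error messages are hypothetical and stand in for the real set-up tables and rules, which are not listed in this document.

```python
import calendar

VALID_ZIPS = {"12345", "14221"}  # hypothetical set-up table
REQUIRED = ("epa_reg_no", "zip_code", "month", "day")


def validate(record: dict, year: int) -> list[str]:
    """Apply presence, look-up, and value checks to one record."""
    errors = []
    # Presence checks: every required field must be reported.
    for name in REQUIRED:
        if record.get(name) is None:
            errors.append(f"{name}: missing")
    # Look-up check: the ZIP code must appear in the set-up table.
    zip_code = record.get("zip_code")
    if zip_code is not None and zip_code not in VALID_ZIPS:
        errors.append("zip_code: not in set-up table")
    # Value checks: month and day must fall within valid ranges.
    month, day = record.get("month"), record.get("day")
    if month is not None and day is not None:
        if not 1 <= month <= 12:
            errors.append("month: out of range")
        elif not 1 <= day <= calendar.monthrange(year, month)[1]:
            errors.append("day: out of range")
    return errors
```

For example, a record dated February 30 passes the presence and look-up checks but fails the value check.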
The following types of edits are performed:
- format error,
- validation error,
- illegible, irregular or null value reported,
- formulation state,
- report year and date mismatch,
- date range > 1 year, and
- out-of-state data.
Data items identified by these edits are compiled into reports which the Department uses to contact the businesses and applicators for corrections (see Data Revision section).
Some data cleansing is also done by the data validation process. For example, California revision codes are stripped from EPA Registration Numbers and units of measure are matched against known spelling and punctuation variants. The initially reported value is always retained whenever a correction is made.
The final layer of data quality checking is a series of reviews performed by PSUR programmers and Department pesticide reporting staff. Department staff performs two different processes to improve the data:
- Cornell generates a report that compares aggregated data from year to year within counties. Large increases and decreases within counties in the quantities of products applied or sold are investigated and corrected if necessary.
- The Department reviews drafts of the final reports and works with Cornell to investigate anomalous values and make corrections as necessary.
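The year-to-year comparison within counties can be illustrated as follows. The tenfold threshold, the function name, and the keying by county and EPA Registration Number are assumptions; the source does not say what counts as a large change.

```python
def large_changes(prev: dict, curr: dict, ratio: float = 10.0) -> list[tuple]:
    """Flag county/product totals that changed by at least `ratio` times.

    `prev` and `curr` map (county, epa_reg_no) to the total quantity for
    two consecutive report years. Only keys present in both years with
    positive totals are compared.
    """
    flagged = []
    for key in sorted(prev.keys() & curr.keys()):
        before, after = prev[key], curr[key]
        if before > 0 and after > 0:
            if after / before >= ratio or before / after >= ratio:
                flagged.append((key, before, after))
    return flagged
```

Flagged totals are only candidates for investigation: a hundredfold jump may be a misplaced decimal point, or it may be a genuine change in use.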