Statistical FAQs

This is a list of commonly asked (and answered) statistical questions that apply to both DUMPStat and CARStat. Answers to statistical questions that apply only to DUMPStat appear below.

DUMPStat and CARStat
1. What do manual reporting limits do?
Manual reporting limits replace the numerical values of all laboratory-reported detection limits, which are used for quantifying nondetects. When enabled, they are used in calculations and also appear on graphs in place of the laboratory results. You can enable manual limits in Statistical Options or from Set Manual Reporting Limits on the Constituents menu. To specify a limit for a certain constituent, select Set Manual Reporting Limits. Choose a constituent from the Reporting Limits list, enter the desired value in the Manual Limit box and ensure that Enable Manual Reporting Limits is set. Then click OK to save your changes. Constituents that do not have values in the Limit column will continue to use the laboratory values even if Manual Reporting Limits are selected.

2. What is the difference between tolerance limits and prediction limits?
Tolerance limits provide coverage of a percentage of the total distribution of measurements (e.g., 95%) with a certain degree of confidence (e.g., 95%). Prediction limits provide coverage of 100% of the next k measurements with a given level of confidence (e.g., 95%). With 95% coverage, tolerance limits should be exceeded by no more than 5% of the measurements with 95% confidence, whereas prediction limits should be exceeded by none of the next k measurements with 95% confidence.

3. Should I use MDLs or PQLs for statistical analysis?
The detection limit is used to determine if an analyte is present in a sample and the quantification limit is used to make a quantitative determination of the amount of the analyte in the sample. USEPA has used the terms MDL (method detection limit) and PQL (practical quantification limit) to describe two specific approaches to estimating the detection and quantification limits, respectively. If we are comparing a concentration directly to a standard, then it must be greater than the quantification limit in order to provide a reliable estimate of whether or not the standard has actually been exceeded. If all that we care about is whether the analyte is present or absent in the sample, then measurements above the detection limit provide that information. Measurements above the quantification limit can be used directly in the previously described statistical methods; however, measurements below the quantification limit are considered to be censored, and the appropriate adjustments for censored data should be used. Both DUMPStat and CARStat use Aitchison's method to adjust for nondetects in computing normal and lognormal prediction limits. No statistical adjustment is required for nonparametric or Poisson prediction limits. The primary advantage of Aitchison's method over other alternatives (e.g., Cohen's method) is that it can accommodate varying reporting limits, which are quite common in practice.
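As an illustration, the textbook form of Aitchison's adjustment can be sketched as follows. This is a minimal sketch, not the DUMPStat or CARStat implementation (whose internals are not described here); the function name and inputs are hypothetical. The adjusted mean and variance are algebraically the same as treating each nondetect as zero, which is why the individual reporting limits never enter the calculation.

```python
from statistics import mean, variance


def aitchison_adjust(detects, n_nondetects):
    """Return (adjusted_mean, adjusted_sd) for a sample with nondetects.

    detects      -- list of quantified measurements (at least two)
    n_nondetects -- number of nondetect results in the same sample
    Hypothetical helper for illustration only.
    """
    n_detects = len(detects)
    if n_detects < 2:
        raise ValueError("need at least two quantified measurements")
    n = n_detects + n_nondetects

    detect_mean = mean(detects)
    detect_var = variance(detects)  # sample variance (n - 1 denominator)

    # Aitchison's adjusted mean and variance.
    adj_mean = (1 - n_nondetects / n) * detect_mean
    adj_var = ((n_detects - 1) / (n - 1)) * detect_var + (
        n_nondetects / n
    ) * (n_detects / (n - 1)) * detect_mean ** 2
    return adj_mean, adj_var ** 0.5


# Three quantified results and one nondetect.
adj_mean, adj_sd = aitchison_adjust([2.0, 4.0, 6.0], 1)
print(adj_mean, adj_sd)
```

For lognormal limits the same adjustment would presumably be applied to log-transformed detects; that detail is an assumption and is not stated above.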

4. How is the median value of the reporting limit for nondetect samples computed when there is an even number of samples?
The median is unambiguous when the number of data values is odd. When the number is even, more than one definition of the median is in common use. The definition we use is that the median is the smallest value that divides the data into two equal parts, that is, the lower of the two middle values. For example, the set {0, 0.5, 1, 2, 2, 5} has 1 as its median.
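This definition corresponds to what is often called the "low median". As a quick check outside of DUMPStat, Python's standard library provides it directly; the snippet below is only an illustration of the definition.

```python
from statistics import median_low

values = [0, 0.5, 1, 2, 2, 5]

# median_low returns the lower of the two middle values for an even count.
lower_median = median_low(values)
print(lower_median)  # prints 1
```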

5. Is the original sample included in DUMPStat's verification resampling numbers?
No. DUMPStat and CARStat use "Pass 1 of 1", "Pass 1 of 2", and "Pass 2 of 2" (and CARStat additionally uses 'None') to refer only to the number of resamples. Thus, choosing Pass 1 of 1 means that both the initial sample AND the (single) resample must exceed the limit for the exceedance to be verified. An alternative terminology includes the initial sample in the 'm' of "Pass n of m", in which case Pass 1 of 1 would mean that no resampling is used. Both terminologies are widely used, so take care to follow the strategy detailed in your permit.
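The resampling terminology can be made concrete with a small sketch. It assumes that a resample "passes" when it does not exceed the limit, that "Pass n of m" means at least n of the m resamples must pass to clear the initial exceedance, and that 'None' means no resampling. The function name and arguments are hypothetical and are not part of either program.

```python
# (required passes, number of resamples) for each plan name.
PLANS = {
    "Pass 1 of 1": (1, 1),
    "Pass 1 of 2": (1, 2),
    "Pass 2 of 2": (2, 2),
}


def exceedance_verified(initial, resamples, limit, plan):
    """Return True if the initial exceedance is verified under the plan."""
    if initial <= limit:
        return False  # no initial exceedance, nothing to verify
    if plan == "None":
        return True  # assumed: no resampling, initial exceedance stands
    required_passes, n_resamples = PLANS[plan]
    if len(resamples) != n_resamples:
        raise ValueError(f"{plan} expects {n_resamples} resample(s)")
    passes = sum(1 for value in resamples if value <= limit)
    # Verified only if too few resamples passed to clear the exceedance.
    return passes < required_passes


# Initial sample and the single resample both exceed the limit of 10.
print(exceedance_verified(12, [11], 10, "Pass 1 of 1"))
```

Under these assumptions, Pass 1 of 2 clears an exceedance when either resample passes, whereas Pass 2 of 2 requires both resamples to pass.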

DUMPStat
1. When can I use intra-well comparisons?
Intra-well comparisons should always be used when predisposal data are available. When no data prior to disposal of waste are available, the owner/operator must provide empirical justification that use of intra-well comparisons will not mask existing contamination at the facility. One good approach is to show that constituents of concern (e.g., VOCs) are not present in the wells and that naturally occurring constituents show no evidence of an increasing trend (e.g., using Sen's test).

2. What can I do if I have only one upgradient well?
With only one upgradient well, spatial variability and potential contamination are completely confounded (i.e. you can't tell one from the other). To perform upgradient versus downgradient comparisons and consider spatial variability you need a minimum of two upgradient wells.

3. What is the Control chart factor?
The Control chart factor is the multiplier that determines how many standard deviations above the mean the control chart limit is: SCL = mean + (factor * SD). You can modify the Control chart factor in the Statistical Options dialog box. There are two settings for the Control chart factor based on the number of samples. Also, factors that vary significantly from the default values will be highlighted with a cautionary color and/or limited to a range.
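The formula SCL = mean + (factor * SD) can be sketched as below. The background values and the factor of 4.5 are illustrative assumptions only (the default factors are not listed here), and the use of the sample standard deviation with an n - 1 denominator is also an assumption.

```python
from statistics import mean, stdev


def control_limit(background, factor):
    """Control chart limit: background mean plus factor times background SD."""
    return mean(background) + factor * stdev(background)


# Hypothetical background concentrations for one well and constituent.
background = [10, 12, 11, 13, 9, 11, 12, 10]
scl = control_limit(background, factor=4.5)
print(round(scl, 3))
```

A larger factor raises the limit, which lowers the false positive rate at the cost of a higher false negative rate.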

4. Do I have to take four independent samples from downgradient wells per semi-annual monitoring event?
The requirement for four semi-annual samples is for ANOVA only, which is a technique that DUMPStat does not use because it is inappropriate for groundwater monitoring. All other methods require a single semi-annual sample once the background is established.

5. Over what period of time can I take my background samples?
A minimum of eight background samples must be taken for prediction limits, tolerance limits and control charts. The samples must be independent and representative of seasonal and spatial variability at the site. Spatial and seasonal variability apply to naturally occurring constituents only (e.g., inorganics). Spatial variability is addressed by using intra-well comparisons and/or having multiple upgradient wells. Seasonal variability is addressed by collecting samples over a period of time that includes the seasons at which downgradient samples will be collected. For this reason, the eight background samples should be collected over a period of no less than one year, and preferably over a two-year period in which a constant sampling interval is used (e.g., quarterly sampling over a two-year period for intra-well comparisons and quarterly sampling over a one-year period from at least two upgradient wells for inter-well comparisons). However, all samples required to establish background should be collected prior to the date of statistical comparison as required by the regulations.

6. What is the minimum background sample size required to compute detection monitoring statistics?
A minimum of eight background samples (e.g., eight samples in each well for intra-well comparisons or four samples in each of two upgradient wells for inter-well comparisons) are required for a meaningful statistical evaluation.

7. If I am using intra-well comparisons should I continue to monitor the upgradient well(s)?
Yes. It is always wise to perform intra-well comparisons on both upgradient and downgradient wells. If an exceedance is seen in both upgradient and downgradient wells, it is usually good evidence that the potential impact is not from the site. Any data that help in evaluating off-site and/or seasonal, regional and climatic changes should be collected and investigated.

8. When are nonparametric prediction limits appropriate?
Nonparametric prediction limits are optimal in the sense that they make no assumptions regarding the specific form of the underlying distribution. However, as the number of wells and constituents increases, large numbers of background measurements (e.g., 16 or more) are required in order to have reasonable confidence. When the site-wide confidence level is poor (i.e. lower than 90%), alternatives based on Poisson prediction limits are often useful. Poisson prediction limits can be used regardless of detection frequency and their associated level of confidence is independent of the number of background measurements. Note that Poisson prediction limits are approximate in that many constituents will not have a Poisson distribution. For this reason, Poisson prediction limits should only be used when statistical power analysis reveals that there is an insufficient number of background measurements to justify the nonparametric approach. In addition, Poisson prediction limits should only be used with constituents with detection frequencies of less than 50%, whereas nonparametric prediction limits are valid regardless of detection frequency.
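To see why the required number of background measurements grows with the number of comparisons, consider the simplest case: the limit is the largest of n background measurements, there is no verification resampling, and all measurements are independent draws from the same continuous distribution. Under those assumptions the probability that none of the next k measurements exceeds the limit is n / (n + k). The sketch below is only this simplified case, not the site-wide confidence calculation performed by DUMPStat, and the function name is hypothetical.

```python
def nonparametric_confidence(n_background, k_future):
    """Confidence that the background maximum covers the next k measurements.

    Assumes no verification resampling and independent, identically
    distributed continuous data.
    """
    return n_background / (n_background + k_future)


for n in (8, 16, 32):
    for k in (1, 10, 50):
        print(n, k, round(nonparametric_confidence(n, k), 3))
```

Verification resampling raises these confidence levels considerably, which is why it is used in practice.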

9. What should I do for VOCs?
VOCs are not naturally occurring and therefore they should not be found in background groundwater samples. For VOCs, verified exceedance of the appropriate quantification limit is an indication of a significant exceedance. Do not apply the previously described statistical methods to VOCs unless you are doing assessment or corrective action monitoring and are attempting to determine if a known release of these compounds is getting better or worse or exceeds a standard. Alternatively, if VOCs are detected in upgradient wells due to an offsite source, statistical comparison (i.e. up vs. down) may be appropriate.

10. How do control charts deal with multiple comparisons?
As described, combined Shewhart-CUSUM control charts do not explicitly adjust for multiple comparisons. The effects of verification resampling and the increasing number of comparisons produced by multiple wells and constituents generally balance the site-wide false positive and false negative rates at reasonable levels; however, there is no statistical guarantee that they will. Please note that when using control charts it is particularly important to determine site-wide false positive and false negative rates via simulation. Certain states (e.g., California) require that you select the control chart factor based on generating a 5% site-wide false positive rate. DUMPStat allows the user to input the factor in the Statistical Options item of the Settings Menu, and the Intra-well Control Chart Power Analysis can be used to determine the site-wide false positive rate for varying choices of the control chart factor.
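The idea of estimating a site-wide false positive rate by simulation can be sketched as follows. This is a deliberately simplified Monte Carlo, not the Intra-well Control Chart Power Analysis: it compares a single new measurement per well/constituent to mean + factor * SD, omits the CUSUM and verification resampling, and assumes independent, normally distributed data with no contamination. All names and default values are hypothetical.

```python
import math
import random


def sitewide_false_positive_rate(factor, n_comparisons, n_background=8,
                                 n_sims=500, seed=1):
    """Fraction of simulated clean monitoring events with any exceedance."""
    rng = random.Random(seed)
    events_with_exceedance = 0
    for _ in range(n_sims):
        any_exceedance = False
        for _ in range(n_comparisons):
            # Background and new measurement come from the same distribution.
            background = [rng.gauss(0.0, 1.0) for _ in range(n_background)]
            bg_mean = sum(background) / n_background
            bg_sd = math.sqrt(
                sum((x - bg_mean) ** 2 for x in background) / (n_background - 1)
            )
            new_value = rng.gauss(0.0, 1.0)
            if new_value > bg_mean + factor * bg_sd:
                any_exceedance = True
        if any_exceedance:
            events_with_exceedance += 1
    return events_with_exceedance / n_sims


# 20 comparisons could be, for example, 5 wells times 4 constituents.
for factor in (2.0, 3.0, 4.5):
    print(factor, sitewide_false_positive_rate(factor, n_comparisons=20))
```

Repeating the run over a range of factors shows which value brings the simulated site-wide rate close to a target such as 5%.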

11. When computing tests of normality and lognormality what data should be used?
Tests of distributional form should only be performed on background data or data that are known with certainty not to be influenced by the facility. This would typically exclude use of downgradient data.

12. How do I adjust for seasonal variability?
In general, you can't adjust for seasonal variability because you typically do not have a large enough number of samples in each season to provide a reliable estimate of the effect. This is not a big problem because seasonal variability is incorporated into the usual estimate of the background standard deviation, even if it is not explicitly modeled as a separate variance component. Gibbons (1994a) and Gilbert (1987) provide methods for seasonally adjusted trend estimators and this topic is also discussed in the new ASTM guidance D6312-98. Note that collecting samples over a 12 month period is generally sufficient to incorporate seasonal variability into the background standard deviation.

13. Should I ever use ANOVA?
ANOVA is an extremely useful statistical tool for designed experiments with random sampling. Unfortunately, groundwater monitoring data do not enjoy such luxuries. Spatial variability becomes confounded with upgradient versus downgradient comparisons and, in general, ANOVA can be more sensitive to spatial variability (i.e. small but consistent differences) than to a real release (i.e. a large but highly variable increase). The reason is that ANOVA compares between-well variability to within-well variability. In the absence of contamination, within-well variability is a combination of temporal variability and analytic variability, whereas between-well variability is due to spatial variability. Since spatial variability is invariably large relative to the combination of temporal and analytic variability, the ANOVA will conclude that the ratio of between-well variability to within-well variability is significantly larger than one. Of course, the assumption of ANOVA is that under the null hypothesis (i.e. no contamination) all wells are drawn from the same distribution with the same population mean. This assumption is justifiable under random sampling. However, it is not justified in natural systems in which initial conditions are already different, for example due to natural spatial variability.
One good application of ANOVA is in testing whether or not the amount of spatial variability is statistically significant. Here we simply restrict the analysis to the upgradient or background wells (which could not be affected by a release from the site), and if a significant F-statistic results then we can conclude that there is significant spatial variability. However, even in the absence of a significant ANOVA, spatial variability may still be appreciable but simply not present in the small number of available upgradient or background wells.
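The spatial variability check on upgradient wells amounts to a one-way ANOVA with wells as groups. A minimal sketch of the F-statistic is shown below; the well data are made up for illustration, and DUMPStat itself does not use ANOVA for detection monitoring. The resulting F must still be compared with a critical value for (k - 1, N - k) degrees of freedom from tables or statistical software.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of groups (wells)."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total

    # Between-well sum of squares: spread of well means about the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-well sum of squares: spread of measurements about their well mean.
    ss_within = sum(
        sum((x - sum(group) / len(group)) ** 2 for x in group)
        for group in groups
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical measurements from three upgradient wells.
upgradient_wells = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(upgradient_wells)
print(f_stat)
```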

14. Does nonparametric ANOVA correct the limitations of its parametric counterpart?
The only difference between nonparametric and parametric ANOVA is that the nonparametric ANOVA does not assume a specific distributional form for the concentration measurements whereas the parametric ANOVA assumes normality. Both models assume independence of the measurements and homogeneity of variance and both models are severely compromised by spatial variability.

15. Are different methods required for comparison to ACLs and MCLs?
When comparing measurements to a standard, the same approach is used (e.g., a 95% upper confidence limit for the mean of the last four measurements) regardless of how the standard was derived.

16. If I have constituents with detection frequencies less than 25% for intra-well or less than 50% for inter-well comparisons do I have to wait until I have a minimum of 13 background samples before I begin computing statistics?
No. For inter-well comparisons remember that the number of background samples is pooled over all upgradient wells so with eight samples in each of two wells you have 16 background samples. For intra-well comparisons 13 background samples are required for a nonparametric prediction limit with one verification resample but only eight background samples are required with two verification resamples (i.e. fail the first and pass either one of two verification resamples). Alternatively, Poisson prediction limits can be used with as few as four background samples regardless of detection frequency.

17. Do I have to conduct a statistical analysis if VOCs are detected only in the downgradient wells?
Verified quantification of VOCs in a downgradient well is a statistical exceedance in and of itself. No statistical comparisons are required.

18. Do I need to compute statistics when all of the background data are below the MDL/PQL/LOQ?
The LOQ and PQL are both quantification limit estimates, whereas the MDL is an estimate of a detection limit. For statistical purposes, the smallest measured concentration is the quantification limit (QL; e.g., PQL or LOQ); therefore, if all values in the upgradient wells are nonquantifiable, the prediction limit becomes the QL. Our level of confidence in this decision rule is based on the number of background measurements, the number of comparisons and the verification resampling strategy. If we have a small background sample size (e.g., the minimum of eight background measurements) and nothing is detected, there is still an appreciable probability that the true detection frequency is greater than zero. Since there are typically far more downgradient wells than upgradient wells, we will have a greater chance of detecting the constituent in a downgradient well, thereby giving the appearance of a potential release. For this reason, even when nothing is detected in background, confidence levels associated with using the QL as the nonparametric prediction limit should be determined. Note that this does not apply to VOCs, which should not be detected in clean background wells with any frequency.

19. Do I need to compute statistics when all of the downgradient data are below the MDL/PQL/LOQ?
Statistical computations are based on background data only. The fact that a constituent has never been detected and/or quantified in a downgradient well is irrelevant to the statistical analysis; however, it may indicate that the constituent adds little to the monitoring program and should be eliminated from the suite of constituents used for statistical analysis.

20. My regulator doesn't want to see nonparametric limits, but DUMPStat automatically uses them when the data are neither normally nor lognormally distributed - what can I do?
In the DUMPStat statistical options, the "Rare Event Statistics" setting can be used to override the choice of nonparametric limits, even for constituents with high detection frequencies. When "Poisson" is selected, you will never get a nonparametric limit. When computing a prediction limit, if the detection frequency is insufficient to compute a parametric limit (a "rare event"), you will get either a nonparametric limit or a Poisson limit, depending on the "Rare Event Statistics" setting. For inter-well comparisons, if the detection frequency is sufficient to compute a parametric limit, the background data are first tested for normality. If they pass this test, you will get a normal limit. If they fail, the data are tested for lognormality, and if they are found to be lognormally distributed, you will get a lognormal limit. If both tests fail, the "Rare Event Statistics" setting is used even though the detection frequency is high: if "Nonparametric" is selected you will get a nonparametric limit, and if "Poisson" is selected you will get a normal limit even though the data failed the normality test.
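The selection logic described above for inter-well comparisons can be summarized in a short sketch. The function and argument names are hypothetical; only the branching follows the description, and the detection-frequency and distribution tests themselves are taken as given inputs.

```python
def select_limit_type(detection_sufficient, passes_normality,
                      passes_lognormality, rare_event_setting):
    """Return the type of prediction limit implied by the description above.

    rare_event_setting is either "Nonparametric" or "Poisson".
    """
    if rare_event_setting not in ("Nonparametric", "Poisson"):
        raise ValueError("rare_event_setting must be Nonparametric or Poisson")

    if not detection_sufficient:
        # Rare event: the setting decides directly.
        return rare_event_setting.lower()
    if passes_normality:
        return "normal"
    if passes_lognormality:
        return "lognormal"
    # Both distribution tests failed: the setting is used as an override.
    if rare_event_setting == "Nonparametric":
        return "nonparametric"
    return "normal"  # "Poisson" selected: normal limit despite failed test


print(select_limit_type(True, False, False, "Poisson"))  # prints normal
```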

21. Why do I see trends on my intra-well control charts that aren't there on my time series graphs for the same wells and constituents?
The trend detection for intra-well control charts is one-tailed - that is, only increasing trends are sought. In contrast, trend detection in time series (implemented in DUMPStat version 2.1.1) is two-tailed, finding both increasing and decreasing trends. In this case the area under the curve in each tail is half of the area in a one-tailed test, so that a trend in time series must be more pronounced to be detected. The same Sen's test is being used for each analysis, but results can differ based on the 'tailed-ness' of the detection.
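The effect of 'tailed-ness' can be illustrated with standard normal critical values. The 5% significance level used here is an assumption for illustration (the level used by DUMPStat is not stated above), and the snippet is not the program's Sen's test implementation.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level, for illustration only

# One-tailed test: all of alpha sits in the upper tail.
one_tailed_z = NormalDist().inv_cdf(1 - alpha)
# Two-tailed test: alpha is split evenly between the two tails.
two_tailed_z = NormalDist().inv_cdf(1 - alpha / 2)

print(round(one_tailed_z, 3), round(two_tailed_z, 3))  # prints 1.645 1.96
```

A test statistic between these two values would be flagged as an increasing trend by the one-tailed test but not by the two-tailed test.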

22. What would be sufficient for pairs with insufficient data?
Surface and air monitoring use the same minimum number of background samples as the rest of the analyses, as chosen in the statistical options. If the number of pairs of samples for a particular constituent/well combination is less than or equal to the minimum number of samples, the UCLs for the upstream and downstream sample points cannot be computed.

23. Two-sided prediction limits & pH
The pH measurement differs from concentration measurements in that there is a numerical maximum and minimum that a result must fall within to be considered acceptable. To account for this, the prediction limits treat the constituent pH differently from others by computing a two-sided limit. This is one of the few places where prediction limits are more useful than control charts. While the Shewhart portion of the control chart does account for the two-sided nature of pH, the CUSUM measure on the control chart is designed to identify significant increases, and is therefore not a useful indicator for decreases in pH.
DUMPStat will identify only one constituent name as 'pH'. More specifically, amongst all data records where 'pH' occurs as either the whole constituent name, or as the first distinct word in a constituent, DUMPStat will designate only one name as being pH. Thus, 'pH' or 'pH field' could be the constituent name. However, 'phenol' would not.
If your database contains records where more than one name could be identified as 'pH', it is important that you alias all related names to a single choice. Otherwise, the statistical analyses will not collect all the relevant data records. It does not matter which name you choose to be the 'dominant' pH.
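To find names in your own data that may need aliasing, a check along the following lines could be used before import. It is a rough sketch of the rule described above ('pH' as the whole name or as the first distinct word); the case-insensitive match and the way words are split are assumptions, and the function is hypothetical rather than part of DUMPStat.

```python
import re


def could_be_ph(constituent_name):
    """Return True if 'pH' is the whole name or its first distinct word."""
    words = re.findall(r"[A-Za-z0-9]+", constituent_name)
    return bool(words) and words[0].lower() == "ph"


names = ["pH", "pH field", "pH, lab", "phenol", "Phosphorus"]
ph_candidates = [name for name in names if could_be_ph(name)]
print(ph_candidates)  # prints ['pH', 'pH field', 'pH, lab']
```

If more than one candidate is listed, alias them all to a single name.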