Assess Goodness-of-Fit (EVA Theory)

The purpose of Goodness-of-Fit tests is to investigate how well a given sample of data follows a given probability distribution.

Reliable prediction of the whole system’s maximum wall loss strongly depends on the fitted model’s quality. It is important that the estimated extreme value distribution describes the available data well.

In IMS, the goodness of fit assessment can be made based on statistical tests and graphical methods.

The implemented statistical tests are:

The K-S (Kolmogorov-Smirnov) Normality test; and
The A-D (Anderson-Darling) Normality test.

The graphical methods are:

Variate Plot,
Probability Plots,
Quantile Plot, and
Exceedance Probability Plot.

One should not reach a verdict based on the statistical tests alone. It is important to also review the graphical methods (the charts) before a conclusion is drawn. In case of a poor fit, one should consider defining a new stratification, since a homogeneous spread is required for a good fit. If both the statistical tests and the graphical methods indicate a poor fit and stratification does not improve the fit, one can assume that Wall Loss cannot be accurately estimated with the Gumbel distribution that IMS EVA fits.

Take Note

Many samples will not pass the statistical fit tests. This does not necessarily invalidate the ability to use the EVA results. However, you should always evaluate the predicted Wall Loss and make sure the Corrosion behavior is understood before using the EVA results.

Kolmogorov-Smirnov (K-S) Normality test

This test is used to assess the overall quality of fit.

The test works with a hypothesis:

Formulate H0: Distribution fits OK (H1: not OK)
Determine significance level: usually 0.05
Calculate Test Statistic
Calculate p-value: Probability of an outcome at least as extreme as the observed one, given H0
If p-value < significance level: Reject H0
If p-value > significance level: Do not reject H0

The higher the p-value, the weaker the evidence against the assumed model. A high p-value does not prove that the model is correct; it only means that the data do not contradict it.

K-S puts equal weight on all points of the distribution and measures how far the empirical distribution of the data is from the hypothesized one:

$$D_n = \max_{1 \le i \le n} \left| F_n(x_i) - F(x_i) \right|$$

where F is the hypothesized distribution, F_n is the empirical distribution function, and x_i is the ordered sample data.
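The hypothesis steps and the K-S statistic can be sketched in a few lines of Python. This is a minimal illustration, not the IMS implementation: the sample values, the Gumbel parameters `loc` and `scale`, and all function names are hypothetical. The p-value uses the asymptotic Kolmogorov series, which assumes a fully specified distribution; when the parameters are estimated from the same data, it tends to be optimistic.

```python
import math

def gumbel_cdf(x, loc, scale):
    """Gumbel (maximum) CDF: F(x) = exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical distribution and the model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF steps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_pvalue_asymptotic(d, n, terms=100):
    """Asymptotic Kolmogorov p-value for a fully specified hypothesized CDF."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Hypothetical wall-loss maxima and hypothetical fitted parameters.
wall_loss = [0.8, 1.1, 1.3, 1.6, 2.4]
loc, scale = 1.2, 0.5
alpha = 0.05  # significance level, as in the steps listed earlier

d = ks_statistic(wall_loss, lambda x: gumbel_cdf(x, loc, scale))
p_value = ks_pvalue_asymptotic(d, len(wall_loss))
reject_h0 = p_value < alpha
print(f"D = {d:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

Checking both sides of each step of the empirical distribution is a common refinement, because the largest gap can occur just before a jump as well as just after it.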

Anderson-Darling (A-D) Normality test

This test is often preferred over other tests for the Gumbel distribution because it is more sensitive to deviations in the tails of the distribution. The fit in the tail is important for extrapolation.

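The test statistic is

$$A^2 = -n - S$$

where

$$S = \sum_{i=1}^{n} \frac{2i - 1}{n} \left[ \ln F(x_i) + \ln\left(1 - F(x_{n+1-i})\right) \right]$$

with F being the CDF (Cumulative Distribution Function) of the hypothesized distribution, n the sample size, and x_i the ordered sample data.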

Mathematically, the method determines the difference between the probabilities of each sample data point and its corresponding ideal value in the Gumbel distribution and then sums those differences to output a result in the form of a single numerical value.

Failing the test tells us that it is unlikely that the sample data is drawn from a Gumbel type distribution.
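As a rough sketch of how the A-D statistic is computed, the snippet below uses the standard textbook form of the statistic with a Gumbel CDF. The sample, the parameters `loc` and `scale`, and the function names are hypothetical, and no critical values are included, since those depend on the distribution and on whether its parameters were estimated from the data.

```python
import math

def gumbel_cdf(x, loc, scale):
    """Gumbel (maximum) CDF: F(x) = exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))

def anderson_darling(sample, cdf):
    """A^2 = -n - sum((2i - 1)/n * [ln F(x_i) + ln(1 - F(x_{n+1-i}))])."""
    xs = sorted(sample)
    n = len(xs)
    s = 0.0
    for i in range(1, n + 1):
        f_low = cdf(xs[i - 1])   # F(x_i), i-th smallest observation
        f_high = cdf(xs[n - i])  # F(x_{n+1-i}), i-th largest observation
        # The logarithms blow up near F = 0 and F = 1, which is what makes
        # the statistic sensitive to misfit in the tails.
        s += (2 * i - 1) / n * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - s

# Hypothetical wall-loss maxima and hypothetical fitted parameters.
wall_loss = [0.8, 1.1, 1.3, 1.6, 2.4]
a2 = anderson_darling(wall_loss, lambda x: gumbel_cdf(x, 1.2, 0.5))
print(f"A^2 = {a2:.3f}")
```

Note that the sketch fails with a math domain error if the model CDF evaluates to exactly 0 or 1 for an observation, which can happen numerically for points far out in the tail.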

Variate Plot

This plot is used to assess whether the data stay within the confidence bands or contain outliers that require further evaluation or inspection. It is also the most useful plot to understand.

Variate Plot.

The plot shows the reduced variate vs. the max wall loss in the whole HX (or stratum). The brown line represents the model and the extreme value is also shown. The measurements are in blue. If they are on or close to the model line, the fit is good.
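To make the axes concrete, here is a small sketch of the usual Gumbel reduced variate and the corresponding straight model line. It assumes the standard definition y = -ln(-ln(p)); the parameter values and function names are hypothetical and are not taken from IMS.

```python
import math

def reduced_variate(p):
    """Gumbel reduced variate y = -ln(-ln(p)) for a cumulative probability p in (0, 1)."""
    return -math.log(-math.log(p))

def model_line(y, loc, scale):
    """Wall loss predicted by a Gumbel model at reduced variate y (a straight line in y)."""
    return loc + scale * y

loc, scale = 1.2, 0.5  # hypothetical fitted Gumbel parameters
for p in (0.5, 0.9, 0.99):
    y = reduced_variate(p)
    print(f"p = {p:.2f}  reduced variate = {y:.3f}  model wall loss = {model_line(y, loc, scale):.3f}")
```

Because the Gumbel model is linear in the reduced variate, measurements from a well-fitting sample fall close to a straight line in this kind of plot.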

The CBs (Confidence Bounds) for different CIs (Confidence Intervals) are shown. These indicate the model uncertainty. By default, the 99% CI is used in the calculations, i.e. Max Wall Loss = Upper CB for the 99% CI. This means that there is a 99% chance that the true Max Wall Loss (for all tubes in the HX / Strat) will be equal to or less than the calculated value. Advanced users may decide to lower the CI to e.g. 90%. This makes the calculations less conservative and moves the NID out. The User’s choice of CI is not limited to the percentages shown in the plot.

Probability Plots

These plots are also used to assess whether the data follow an assumed distribution.

Probability Plots.

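Given an ordered sample x_1 ≤ ... ≤ x_n from a population with the estimated CDF (Cumulative Distribution Function), F, the probability plot consists of the points:

$$\left\{ \left( F(x_i),\; p_i \right) : i = 1, \dots, n \right\}$$

where

$$p_i = \mathrm{BETAINV}(0.5,\; i,\; n - i + 1)$$

with BETAINV = inverse of the beta distribution,

0.5 is a probability, and

i and n - i + 1 are the Shape parameters.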

If F is a good model for the data, the points of the probability plot should form approximately a straight line. Departures from this straight line indicate departures from the specified distribution and provide evidence for fitting a different model. The red dots are the observations, while the blue line is the reference line. The green lines give the 95% CI.

The second plot shows the cumulative probability against the ordered sample. Here the blue dots are the empirical probabilities and the red line is the model’s probability. We want the empirical probabilities to be close to the model’s probabilities. (When they are close and the two probabilities are plotted against each other, the points approximate a straight line, which is the first plot described above.)
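The plotting positions can be illustrated with a short sketch. It assumes that the empirical probabilities are median ranks, i.e. BETAINV(0.5, i, n - i + 1) for the i-th of n ordered observations; the shape parameters i and n - i + 1 are an assumption, as are the sample, the Gumbel parameters, and every function name. For integer shape parameters the beta CDF equals a binomial tail sum, so it can be inverted by bisection without external libraries.

```python
import math

def beta_cdf_int(x, a, b):
    """Beta CDF for positive integers a, b, via P(Binomial(a + b - 1, x) >= a)."""
    n = a + b - 1
    return sum(math.comb(n, k) * x**k * (1.0 - x)**(n - k) for k in range(a, n + 1))

def betainv_int(prob, a, b, tol=1e-12):
    """Invert beta_cdf_int by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf_int(mid, a, b) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def median_ranks(n):
    """Median-rank plotting positions p_i for i = 1..n (assumed form)."""
    return [betainv_int(0.5, i, n - i + 1) for i in range(1, n + 1)]

def gumbel_cdf(x, loc, scale):
    """Gumbel (maximum) CDF: F(x) = exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))

# Hypothetical wall-loss maxima and hypothetical fitted parameters.
wall_loss = sorted([0.8, 1.1, 1.3, 1.6, 2.4])
ranks = median_ranks(len(wall_loss))
for x, p in zip(wall_loss, ranks):
    # One point of the probability plot: model probability vs. empirical probability.
    print(f"x = {x:.2f}  model F(x) = {gumbel_cdf(x, 1.2, 0.5):.3f}  empirical p = {p:.3f}")
```

If the model fits, the two printed probabilities are close for every observation, which is what the straight-line pattern in the plot expresses.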

Quantile Plot

This plot is also used to assess whether the data follow an assumed distribution. This is the inverse of the Probability Plot shown above and thus specifically provides a better indication of the fit quality in the tail.

Quantile Plot.

A quantile plot is a graphical method for comparing two probability distributions by plotting their quantiles against each other. (The Probability Plot on the previous tab is thus also a quantile plot.)

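Given an ordered sample x_1 ≤ ... ≤ x_n from a population with the estimated CDF, F, the quantile plot consists of the points:

$$\left\{ \left( F^{-1}(p_i),\; x_i \right) : i = 1, \dots, n \right\}$$

where

$$p_i = \mathrm{BETAINV}(0.5,\; i,\; n - i + 1)$$

with BETAINV = inverse of the beta distribution,

0.5 is a probability, and

i and n - i + 1 are the Shape parameters.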

Like the Probability Plot, if F is a good model for the data, the points of the quantile plot should form approximately a straight line. The plot displays the observations (red dots) and the reference line (blue line) together with the 95% confidence bounds (green lines). Points above the line indicate that the model underestimates the observations, and points below the line indicate that it overestimates them. This method puts more weight on the tail of the model.
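The comparison of model quantiles with observations can be sketched as follows. Several things here are assumptions rather than the IMS implementation: the observations are taken to be on the vertical axis, the Gumbel parameters and sample are hypothetical, and Benard's approximation (i - 0.3) / (n + 0.4) is swapped in for the exact median rank to keep the example short.

```python
import math

def gumbel_quantile(p, loc, scale):
    """Inverse Gumbel CDF: F^{-1}(p) = loc - scale * ln(-ln(p))."""
    return loc - scale * math.log(-math.log(p))

def benard_positions(n):
    """Benard's approximation to the median ranks (a stand-in for BETAINV)."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def quantile_plot_points(sample, loc, scale):
    """Pairs (model quantile, observation) for the ordered sample."""
    xs = sorted(sample)
    ps = benard_positions(len(xs))
    return [(gumbel_quantile(p, loc, scale), x) for p, x in zip(ps, xs)]

def side_of_line(model_q, observed):
    """Where the point lies relative to the reference line observed = model."""
    if observed > model_q:
        return "above: model underestimates"
    if observed < model_q:
        return "below: model overestimates"
    return "on the line"

# Hypothetical wall-loss maxima and hypothetical fitted parameters.
points = quantile_plot_points([0.8, 1.1, 1.3, 1.6, 2.4], loc=1.2, scale=0.5)
for model_q, observed in points:
    print(f"model = {model_q:.3f}  observed = {observed:.2f}  {side_of_line(model_q, observed)}")
```

The last points in the list correspond to the largest observations, so their position relative to the line is what matters most for extrapolation.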

Exceedance Probability Plot

This plot also provides a good indication of the fit quality in the tail.

The exceedance probability plot consists of points expressing the empirical probabilities of exceedance and the model probabilities of exceedance, both plotted against the ordered sample.

Exceedance Probability Plot.

Empirical probabilities of exceedance:

$$1 - p_i, \qquad p_i = \mathrm{BETAINV}(0.5,\; i,\; n - i + 1)$$

Model probabilities of exceedance:

$$1 - F(x_i)$$

with BETAINV = inverse of the beta distribution,

0.5 is a probability, and

i and n - i + 1 are the Shape parameters.

If the empirical probabilities (red dots) lie below the model probabilities (blue line) it implies that the model is overestimating the observed data. In the case that they lie above, it suggests that the fitted model underestimates the measurements. To better illustrate the fit of the tail, the probabilities are plotted on a log scale.
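A minimal sketch of the two exceedance series is shown below. It assumes a Gumbel model with hypothetical parameters and, for brevity, uses Benard's approximation (i - 0.3) / (n + 0.4) in place of the exact median rank; the sample and function names are also hypothetical.

```python
import math

def gumbel_exceedance(x, loc, scale):
    """Model probability of exceedance 1 - F(x), computed via expm1 for small values."""
    return -math.expm1(-math.exp(-(x - loc) / scale))

def exceedance_points(sample, loc, scale):
    """Rows of (observation, empirical exceedance, model exceedance)."""
    xs = sorted(sample)
    n = len(xs)
    rows = []
    for i, x in enumerate(xs, start=1):
        p_i = (i - 0.3) / (n + 0.4)  # Benard's approximation (assumption)
        rows.append((x, 1.0 - p_i, gumbel_exceedance(x, loc, scale)))
    return rows

# Hypothetical wall-loss maxima and hypothetical fitted parameters.
rows = exceedance_points([0.8, 1.1, 1.3, 1.6, 2.4], loc=1.2, scale=0.5)
for x, empirical, model in rows:
    verdict = "model overestimates" if empirical < model else "model underestimates"
    # log10 mirrors the log-scaled probability axis used to show the tail.
    print(f"x = {x:.2f}  log10(emp) = {math.log10(empirical):.2f}  "
          f"log10(model) = {math.log10(model):.2f}  {verdict}")
```

Using `expm1` avoids loss of precision when the exceedance probability is very small, which is exactly the region the log scale is meant to reveal.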