Introductory overview of identifiability analysis
Parameter identifiability analysis assesses whether it is theoretically possible to estimate unique parameter values from data
Identifiability analysis is therefore an essential technique that should be adopted more routinely in practice, alongside complementary methods such as uncertainty analysis and evaluation of model performance
Understand that non-uniqueness can occur even with ideal and/or noise-free data
In reality, we typically have incomplete knowledge and data to adequately conceptualise and simulate a system
It is recognized as good practice to work with multiple model structure hypotheses
There are four key, complementary methods for measuring how well uncertainty has been reduced
Non-identifiability means the modeler does not have the information needed to choose between alternative models
Modelers should think more systematically and strategically about what type of information is needed to estimate the parameters in their models, and adoption of the tools of identifiability analysis should increase.
three high-level sources of parameter non-uniqueness
Source I is the model structure
Source II is the input (forcing) dataset
If the dynamics related to a parameter are not activated, then no information is available to estimate that parameter; the requirement that inputs sufficiently activate the model dynamics is referred to as persistence of excitation.
Indeed, when a mode of a model is not properly excited, the corresponding parameters cannot be recovered; a sensitivity analysis can reveal such cases.
Source III consists of model and observation errors.
It can, however, be useful to identify the most plausible parameter vector, but this may not be possible due to the characteristics of the errors
Consider an obvious example of non-identifiability involving estimation of parameters related to snow processes in a hydrological model
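The snow example can be made concrete with a toy sketch (model and numbers are hypothetical, for illustration only): a degree-day melt routine driven by a temperature series that never rises above freezing produces identical output for every value of the melt parameter.

```python
# Toy degree-day snowmelt sketch (hypothetical model, for illustration):
# daily melt is ddf * max(T, 0), i.e. melt only occurs above freezing.
def total_melt(temps, ddf):
    """Total melt for a temperature series and degree-day factor ddf."""
    return sum(ddf * max(t, 0.0) for t in temps)

# A forcing series that never rises above freezing: the melt dynamics are
# never activated, so the data carry no information about ddf.
cold_temps = [-5.0, -3.2, -8.1, -1.4, -0.2]
outputs = {ddf: total_melt(cold_temps, ddf) for ddf in (1.0, 2.0, 5.0)}
# Every candidate ddf yields the same (zero) melt: ddf is non-identifiable
# with this forcing, exactly as described above.
```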
Source I is traditionally the core concern of so-called theoretical, "structural" or "a priori" identifiability
Structural identifiability involves analysis of the equations of the model and can be undertaken without observational data.
A parameter is globally identifiable if a unique value is obtained over the entire parameter space
A parameter is locally identifiable if its value is unique only within a neighbourhood of a point in parameter space
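A minimal illustration of the distinction (the model is hypothetical): with output y = theta**2 and a noise-free observation y = 4, both theta = 2 and theta = -2 fit perfectly, so theta is locally but not globally identifiable.

```python
# Hypothetical scalar model y = theta**2 with a noise-free observation.
def objective(theta, y_obs=4.0):
    """Squared-error objective for the model y = theta**2."""
    return (theta ** 2 - y_obs) ** 2

# Two distinct, isolated optima fit the data equally well: each is locally
# identifiable, but theta is globally non-identifiable.
optima = [objective(2.0), objective(-2.0)]
nearby = objective(1.9)   # moving off either optimum worsens the fit
```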
Identifiability is by definition a binary problem, i.e., a parameter is either identifiable or non-identifiable given the model structure and the type of data available.
Uniqueness and identifiability therefore depend on the objective function used.
In a multi-objective parameter identification setting, the aim is to check the uniqueness of a given vector of parameter values on the “Pareto front”
Identifiability can be assessed without observations by using “synthetic data”
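A sketch of the synthetic-data idea (model and grid are hypothetical): simulate observations from known "true" parameters, then check whether calibration recovers that vector uniquely, with no field data involved.

```python
import numpy as np

# Hypothetical linear model y = a*x + b with known "true" parameters.
x = np.linspace(0.0, 1.0, 21)
a_true, b_true = 2.0, -0.5
y_synth = a_true * x + b_true                 # noise-free synthetic data

def sse(a, b):
    """Sum of squared errors against the synthetic observations."""
    return float(np.sum((a * x + b - y_synth) ** 2))

# Coarse grid search: a single grid point reproduces the data (nearly)
# exactly, suggesting (a, b) is identifiable from this data type.
grid = [(a, b) for a in np.linspace(-5, 5, 101)
               for b in np.linspace(-5, 5, 101)]
best = min(grid, key=lambda p: sse(*p))
```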
Flat surfaces in some direction, such that points along that direction with different parameter values give the same objective function value; these cases are both locally and globally non-identifiable.
Distinct peaks with an equal objective function value. These cases are locally identifiable, but globally non-identifiable.
A model is identifiable if there are no flat surfaces at an optimum and, in the case that distinct optima exist, there is only one global optimum.
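A sketch of a flat direction (the model is hypothetical): when the output depends on two parameters only through their product, the objective is constant along the curve a*b = const, so no point on that curve is an identifiable optimum in the sense above.

```python
import numpy as np

# Hypothetical model y = (a * b) * x: a and b enter only via their product.
x = np.linspace(0.1, 1.0, 10)
y_obs = 6.0 * x                       # data generated with a * b = 6

def sse(a, b):
    """Squared-error objective; flat along the ridge a * b = 6."""
    return float(np.sum((a * b * x - y_obs) ** 2))

# Very different parameter vectors on the ridge give identical (zero)
# objective values: a flat direction, hence non-identifiability.
ridge_values = [sse(a, b) for a, b in [(1.0, 6.0), (2.0, 3.0), (12.0, 0.5)]]
off_ridge = sse(2.0, 2.0)             # leaving the ridge degrades the fit
```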
Dotty plots show the objective function value against the value of one parameter (or a pair of parameters), for samples in which all parameters are varied simultaneously.
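The data behind a dotty plot can be sketched as follows (model hypothetical): sample all parameters at random, then record each sampled value of one parameter against the resulting objective. For a non-identifiable model, near-optimal objective values appear across the whole range of the parameter, giving the characteristic flat lower envelope.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical non-identifiable model: output depends only on a * b.
x = np.linspace(0.1, 1.0, 10)
y_obs = 6.0 * x

def sse(a, b):
    return float(np.sum((a * b * x - y_obs) ** 2))

# Monte Carlo sample of BOTH parameters; keep (a, objective) pairs, which
# is exactly the data a dotty plot of parameter a would display.
samples = rng.uniform(0.5, 12.0, size=(5000, 2))
dots = [(a, sse(a, b)) for a, b in samples]

# The best-fitting samples have a-values spread across most of the range:
# no part of a's range can be ruled out (a flat lower envelope).
best_a = [a for a, s in sorted(dots, key=lambda d: d[1])[:100]]
spread = max(best_a) - min(best_a)
```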
Intuitively, one might think a flat surface would be identified by a zero gradient, but the situation is more complex when evaluating the gradient at a point.
If, at a point, the first derivative is zero (a stationary point) and the second derivative is also zero, but the third derivative is non-zero, the point is an inflection point (a "saddle point"), not an optimum
Non-identifiability due to a flat objective function therefore requires the first, second and third derivatives to all be zero, as well as further higher-order derivatives, although in practice these may have little effect.
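A one-dimensional caution for these derivative tests (the function is hypothetical): f(theta) = (theta - 1)**4 has zero first, second and third derivatives at theta = 1, yet that point is still a unique, identifiable minimum, because the fourth derivative is positive.

```python
# Hypothetical objective f(theta) = (theta - 1)**4.
def f(theta):
    return (theta - 1.0) ** 4

# Analytical derivatives of f evaluated at theta = 1:
d1 = 4 * (1.0 - 1.0) ** 3    # first derivative  = 0
d2 = 12 * (1.0 - 1.0) ** 2   # second derivative = 0
d3 = 24 * (1.0 - 1.0)        # third derivative  = 0
d4 = 24                      # fourth derivative > 0

# Despite three vanishing derivatives, the optimum is not flat: any step
# away from theta = 1 increases the objective.
off_optimum = f(1.1)
```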
The flatness is referred to as lack of sensitivity
For an optimum, the first partial derivatives should be zero, and a matrix of second partial derivatives is then constructed, considering pairs of parameters together (including second partial derivatives of each parameter with respect to itself). This is known as the Hessian matrix.
If none of the Hessian's eigenvalues is zero, the response surface is not flat in any direction.
The point is an identifiable minimum if all eigenvalues are positive, so that the curvature is positive and the objective function increases in every direction away from the point.
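The eigenvalue test can be sketched numerically (helper and objectives hypothetical): build the Hessian by central finite differences at a candidate optimum, then inspect its eigenvalues, where a near-zero eigenvalue flags a flat direction.

```python
import numpy as np

def numerical_hessian(f, p, h=1e-4):
    """Matrix of second partial derivatives via central finite differences."""
    p = np.asarray(p, dtype=float)
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            pp = p.copy(); pp[i] += h; pp[j] += h
            pm = p.copy(); pm[i] += h; pm[j] -= h
            mp = p.copy(); mp[i] -= h; mp[j] += h
            mm = p.copy(); mm[i] -= h; mm[j] -= h
            H[i, j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * h * h)
    return H

# Objective with a flat direction: depends only on the sum a + b.
flat = lambda p: (p[0] + p[1] - 3.0) ** 2
# Well-posed objective: each parameter pinned down separately.
good = lambda p: (p[0] - 2.0) ** 2 + (p[1] - 1.0) ** 2

eig_flat = np.linalg.eigvalsh(numerical_hessian(flat, [2.0, 1.0]))
eig_good = np.linalg.eigvalsh(numerical_hessian(good, [2.0, 1.0]))
# eig_flat has one (near-)zero eigenvalue: a non-identifiable direction.
# eig_good has all-positive eigenvalues: an identifiable minimum.
```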
Identifiability analysis (IA) is closely related to sensitivity analysis (SA)
If the objective function is insensitive to a parameter, the objective function is flat in that parameter's direction, and the parameter is not identifiable.
Even if the objective function is sensitive to all parameters, it is not guaranteed that the parameters are identifiable
Sensitivity analysis is therefore an effective tool for identifying non-identifiability, but not for ruling it out.
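The caveat can be illustrated with a sketch (the objective is hypothetical): an objective sensitive to each parameter individually can still be non-identifiable, because a joint, compensating change leaves it unchanged.

```python
# Hypothetical objective depending only on the difference a - b.
def objective(a, b):
    return (a - b - 1.0) ** 2

base = objective(2.0, 1.0)             # on the optimum: a - b = 1
sens_a = objective(2.1, 1.0) - base    # perturbing a alone changes the fit
sens_b = objective(2.0, 1.1) - base    # perturbing b alone changes the fit
joint = objective(2.5, 1.5) - base     # compensating change: no effect
# One-at-a-time sensitivity is present for both parameters, yet the pair
# (a, b) is non-identifiable along the line a - b = 1.
```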
In addition to being non-unique, parameters estimated with noisy data may also be biased
Improving identifiability is often non-trivial
Benefits of identifiability analysis
If the uncertainty induced by non-identifiability does not change a decision, then perhaps it can be ignored
Eliminating the symptoms of non-identifiability, however, typically requires invasive changes to the model or model identification procedure, for example using different or transformed parameters, selecting specific data periods, changing model structure, and/or using a more sophisticated objective function.
There is, however, still much to be done to make practices for assessing identifiability easier to apply, and for them to be commonly applied.