As part of an ongoing series, the NAFEMS Simulation Governance & Management Working Group present a discussion about the difference between model calibration and model validation. As well as reading the conversation here, you can watch the video on YouTube.
The discussion was hosted by Gregory Westwater (GW, Emerson Automation Solutions) and William Oberkampf (WO, consultant), both members of the NAFEMS Simulation Governance & Management Working Group.
GW: Model calibration and model validation, how do you define each? And what's the distinction that you'd like people to understand?
WO: Let's talk about the definition of model validation first. Model validation is a model accuracy assessment relative to experimental data. In one sentence, that's what it is. So, you've got to have experimental data; it's not a code-to-code comparison. And the simulation whose accuracy you're assessing is relative to the experiment that you're doing, whatever its relationship is to the real system: it can be a component, a sub-assembly, etc. So, it's an accuracy assessment. Whereas calibration (sometimes called model calibration, model updating or model improvement) is really a model improvement activity. It is adding information, usually from experimental data, to the model to improve its accuracy or, you could say, to improve its predictive capability. So those are the two basic concepts.
GW: A question that I hear sometimes at conferences is, "Is model calibration always appropriate, or are there times that it's not appropriate?"
WO: Model calibration is almost always appropriate, especially in solid mechanics or structural dynamics. It's been done that way forever. In fluid mechanics and some other fields, it's not done as much. Let's talk about some examples where it is absolutely needed and critical to developing models. Model calibration or model updating, sometimes called parameter estimation, is most commonly done in structural dynamics, sometimes in solid mechanics, but particularly in structural dynamics.
Let's take the example of vibration of a structure in structural dynamics. You have a situation where you have sub-models of, let's say, assemblies of structures, like bolted joints, riveted joints, glued joints. Well, the physics in those connections is extremely complicated. In fact, we can't even really write down very good equations for them. So, we have models, you could call them sub-models, that represent the deformation and also the stiffness and damping of these joints, and each of these can be a matrix or a tensor. And so, what you do in that situation is you take, say, the structure or substructure of interest and you vibrate it, so you excite all the different modes in the structure. Then, you take a lot of experimental data on the vibrating modes of the structure and you compute the inverse solution to the model. The reason it has to be an inverse solution is to allow you to best estimate what those unknown parameters, specifically stiffness and damping (and these can be tensors), are. You then optimize, which is another way to say it, or you calibrate those parameters.
The final point is, these parameters can be scalar values for each element of the tensor. Or they can be probability distributions, that is, they are non-deterministic. So, either way, that is the calibration step that is needed. In fluid mechanics, you don't do that very often; it's far more common in structural dynamics.
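As an illustration of the inverse step described above (not part of the original conversation), here is a minimal sketch in Python. It assumes a single-degree-of-freedom oscillator with a known mass and unknown scalar stiffness and damping, noise-free synthetic "measurements", and a plain grid search standing in for a real optimizer. All names and numbers are hypothetical.

```python
import math

def frf_amplitude(omega, mass, stiffness, damping):
    """Steady-state displacement amplitude per unit force of a 1-DOF oscillator."""
    return 1.0 / math.sqrt((stiffness - mass * omega**2) ** 2 + (damping * omega) ** 2)

def misfit(stiffness, damping, mass, omegas, measured):
    """Sum of squared differences between model and measured amplitudes."""
    return sum(
        (frf_amplitude(w, mass, stiffness, damping) - a) ** 2
        for w, a in zip(omegas, measured)
    )

def calibrate(mass, omegas, measured, k_grid, c_grid):
    """Inverse solution by brute force: pick the (k, c) pair with the smallest misfit."""
    best = min(
        (misfit(k, c, mass, omegas, measured), k, c)
        for k in k_grid
        for c in c_grid
    )
    return best[1], best[2]

# Hypothetical "experiment": synthetic data generated from assumed true values.
MASS = 2.0                                   # known, measured independently
K_TRUE, C_TRUE = 800.0, 4.0                  # unknown to the analyst in practice
OMEGAS = [10.0 + 0.5 * i for i in range(41)]  # excitation sweep, rad/s
MEASURED = [frf_amplitude(w, MASS, K_TRUE, C_TRUE) for w in OMEGAS]

# Candidate parameter values to search over (assumed ranges).
K_GRID = [600.0 + 10.0 * i for i in range(41)]
C_GRID = [1.0 + 0.5 * i for i in range(13)]

k_hat, c_hat = calibrate(MASS, OMEGAS, MEASURED, K_GRID, C_GRID)
print(f"calibrated stiffness = {k_hat}, calibrated damping = {c_hat}")
```

In a real joint model the unknowns would be matrix or tensor entries, possibly probability distributions, and the search would use a proper optimization or Bayesian method; the structure of the problem (measure the response, then solve backwards for the parameters) is the same.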
GW: That's definitely where I saw my first example of it, where a simulation user discovered that they could tune, and get the answer they wanted, by changing some of the numerical settings. So, we dove into it to really understand, and did our homework, to know what was going on, what it was we were representing, and how we were influencing that. That was an early learning opportunity for me. So, what are some of the pitfalls of model calibration? I think, probably, the most common presumption is it lulls us into this false sense of security.
WO: As you said, you could call it model tuning. There are some recommendations, or you could say rules, on how you can do this. So, the parameters that are fully defensible to tune, update, or calibrate are the parameters in the model or sub-model that you cannot measure independent of the system you're interested in. For example, take a bolted joint: if you take that bolt out and you take the joint apart, all of that physics is gone from the system; it only exists when the system is together. The complexity of that sub-model exists only in the assembled system, which means you cannot measure those parameters directly.
However, Young's modulus, you can measure independent of the structure, it's a characteristic of the material. But you must calibrate complex physics pieces. So those are very defensible, that's the only way to do it. Now, there are some things that you should not calibrate or update. For example, Young's modulus, if you have it for all the different elements in the structure, it is inappropriate to readjust, or retune, Young's modulus. Because you can say, ‘I can measure that one very accurately on all the different pieces.’ So that's an example where it's fully defensible, and needed, and one where it's not defensible.
GW: I get the sense that model calibration is a step that precedes model validation. Is that a correct understanding?
WO: Absolutely. Let's take the case of structural dynamics calibration and validation. That sequence has been done for many, many decades. Let's suppose you have a built-up structure, because that's always the best example. It can be the complete structure or it can be a sub-assembly, it makes no difference. So, let's suppose you have an assembly or a sub-assembly, and let's suppose you can test it in the lab. You excite it and, using that data, you calibrate the needed input parameters that you cannot measure. So that's the calibration step. You can then go on to do model validation. As I mentioned at the beginning, model validation is an accuracy assessment of the model. So sometimes people say, ‘All right. I want to compare with the same data I used to calibrate.’ And I say, ‘No, no, no, no. You're fooling yourself. Don't do that.’ You can do it, but what you'll find is, ‘Oh, the model looks great.’ But the model is not that good.
GW: Yeah, you need to perturb the system a little bit and see how it changes.
WO: That's right. What you should do in the validation stage is to find some experimental data that has not been used for calibration. It can be, for example, a different loading on the structure, or a different temperature of the structure, whatever it is, as long as the model has never been tuned with that data. You then keep the parameters you calibrated and compare with this new data to see how well the simulation agrees with this new experimental data. So, that's the brief explanation.
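The difference between checking against the calibration data and checking against new data can be sketched with a toy example (again, not from the original conversation). It assumes a hypothetical "experiment" that is a slightly stiffening spring, a linear model whose stiffness is calibrated by least squares on small displacements, and a held-out set of larger displacements used only for validation. All values are invented for illustration.

```python
def calibrate_stiffness(displacements, forces):
    """Least-squares fit of F = k * x through the origin."""
    numerator = sum(f * x for x, f in zip(displacements, forces))
    denominator = sum(x * x for x in displacements)
    return numerator / denominator

def mean_relative_error(k_model, displacements, forces):
    """Average of |model - experiment| / |experiment| over a data set."""
    errors = [abs(k_model * x - f) / abs(f) for x, f in zip(displacements, forces)]
    return sum(errors) / len(errors)

# Hypothetical experiment: the real spring stiffens at larger displacement,
# which the linear model does not know about.
K_LINEAR, K_CUBIC = 1000.0, 2.0e5

def synthetic_experiment(x):
    return K_LINEAR * x + K_CUBIC * x**3

calibration_x = [0.001, 0.002, 0.003, 0.004, 0.005]   # used to tune the model
validation_x = [0.020, 0.025, 0.030]                  # never used for tuning

calibration_f = [synthetic_experiment(x) for x in calibration_x]
validation_f = [synthetic_experiment(x) for x in validation_x]

k_hat = calibrate_stiffness(calibration_x, calibration_f)

# Comparing against the calibration data flatters the model...
error_on_calibration_data = mean_relative_error(k_hat, calibration_x, calibration_f)
# ...while a different loading gives an honest accuracy assessment.
error_on_validation_data = mean_relative_error(k_hat, validation_x, validation_f)

print(f"error vs. calibration data: {error_on_calibration_data:.2%}")
print(f"error vs. held-out data:    {error_on_validation_data:.2%}")
```

The calibrated parameter is frozen before the validation comparison; only the data changes. The mean relative error used here is one simple choice of accuracy measure, not a recommended validation metric.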
GW: On the subject of model calibration, I remember early on in my career, I was asked to do a thermal analysis. We had an extended structure that was hot at one end and there was a sensitive device at the other end and we wanted to make sure we weren't going to overheat that device. We had some test data that we were working with and I discovered very quickly that there was a lot of uncertainty around the convection coefficients and emissivity for radiation, as well as what contact losses we were getting between parts. And so, with three variables or more to play with, it was really easy to get an exact answer at that location of interest.
What saved me, or kept me from blundering into making a novice mistake, was the fact that we'd had the foresight to have an array of thermocouples along this whole structure. I very quickly realized that I should not tune to that one location of interest; if I had the physics right, the model would give me the best fit for the entire structure.
That was a real wake-up call to me. Now, whenever I'm working with people who are calibrating a model, I encourage them to think about it in that way, i.e., get response data throughout the structure, not just at the one point of interest.
WO: That's a good example. Heat transfer's another field where you can't do without calibration because emissivity, for one, is a complicated parameter. And then you have contact resistance, another really complicated one. And so those and some others typically have to be calibrated first. With a broad set of data, like you said, you had thermocouples over the structure, you can tune the parameters, that's fine, this is a calibration step. In your example, you have to try to tune it where you have a set of thermocouple data over the entire structure. Because if you just do it in one point, you could tune it where you're going to match it exactly. But you're fooling yourself if you think your model is that good. Because every model and sub-model has approximations and assumptions built into it, so take as much data as you can.
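The point about tuning to a single location can be sketched as follows (an illustration added here, not from the original conversation). It assumes a simplified one-dimensional fin model with an adiabatic tip, two tunable parameters (a lumped convection parameter and a contact temperature drop at the hot end), and synthetic thermocouple readings. The model, names, and numbers are all hypothetical.

```python
import math

ROD_LENGTH = 0.5     # m, assumed
T_AMBIENT = 25.0     # deg C, assumed
T_SOURCE = 200.0     # deg C at the hot end before any contact loss, assumed

def temperature(x, fin_parameter, contact_drop):
    """Fin with adiabatic tip; fin_parameter lumps sqrt(h * P / (k * A))."""
    theta_base = T_SOURCE - contact_drop - T_AMBIENT
    ratio = math.cosh(fin_parameter * (ROD_LENGTH - x)) / math.cosh(fin_parameter * ROD_LENGTH)
    return T_AMBIENT + theta_base * ratio

# Synthetic thermocouple array along the rod, generated from assumed true values.
FIN_TRUE, DROP_TRUE = 4.0, 15.0
POSITIONS = [0.1 * i for i in range(6)]
READINGS = [temperature(x, FIN_TRUE, DROP_TRUE) for x in POSITIONS]
TIP_READING = READINGS[-1]   # the "location of interest" at the far end

def drop_that_matches_tip(fin_parameter):
    """For any fin_parameter, solve for the contact drop that hits the tip reading exactly."""
    return T_SOURCE - T_AMBIENT - (TIP_READING - T_AMBIENT) * math.cosh(fin_parameter * ROD_LENGTH)

def rms_error_over_array(fin_parameter, contact_drop):
    squared = [(temperature(x, fin_parameter, contact_drop) - t) ** 2
               for x, t in zip(POSITIONS, READINGS)]
    return math.sqrt(sum(squared) / len(squared))

results = {}
for fin_parameter in (3.0, 3.5, 4.0):
    drop = drop_that_matches_tip(fin_parameter)
    tip_error = abs(temperature(ROD_LENGTH, fin_parameter, drop) - TIP_READING)
    results[fin_parameter] = (drop, tip_error, rms_error_over_array(fin_parameter, drop))
    print(f"fin_parameter={fin_parameter}: contact_drop={drop:.1f}, "
          f"tip error={tip_error:.2e}, RMS over array={results[fin_parameter][2]:.2f}")
```

Every candidate matches the tip reading essentially exactly, yet only one of them also reproduces the readings along the rest of the rod. With two or more free parameters and a single measurement point, an exact match says very little about whether the physics is right.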
GW: It'd be really easy to look good on day one, but not so good when the field failures start showing up months later.
WO: …or when your boss says, ‘This system came apart!’
*This conversation transcript has been edited and condensed for readability.