Simulation Predictions Within the Validation Domain

A Discussion with the NAFEMS SGMWG

NAFEMS SGMWG - August 17th 2023

As part of an ongoing series, the NAFEMS Simulation Governance & Management Working Group present a discussion on Model Predictive Capability. As well as reading the conversation here, you can watch the video on YouTube.

The discussion was hosted by Gregory Westwater (GW, Emerson Automation Solutions) and William Oberkampf (WO, consultant), both members of the NAFEMS Simulation Governance & Management Working Group.

GW: We are going to get a little bit deeper into what the validation domain is. In the past, I've described it as the range of sizes, conditions, et cetera, for which the model has been validated, typically by comparing simulation results to either test data or some other known and accepted solution. In my choice of terminology there was a range, and I'm curious if that's a valid description. That would imply that if, for example, I've got a simulation and I've validated it at room temperature and at 400 and 800 degrees Fahrenheit, I have validated it for the entire range in between. Is that valid thinking? Or is validation more about just those discrete points?

WO: This is an important conceptual point to understand. When we are talking about validation here, we usually refer to it as computational simulation validation. This is not the systems engineering meaning of validation, which has to do with meeting requirements and is not what we're talking about here. What we specifically mean by validation, whether it's over a domain or at a point, is computational model accuracy assessment. That's all we mean. It is not a statement of good, bad, or adequate. It is just the conditions for which we have assessed the computational simulation accuracy for whatever response quantities we're interested in.

So, more specifically to your point, we typically have various conditions in which we obtain experimental data (it can be in a lab, in a complete system, or in a fully represented environment; it doesn't make any difference where it is). The way we describe these conditions is that there's an input space that defines all the physical characteristics of the system, and then there are the loading conditions, the excitation conditions. So, there are two big groups of what's called input data to the model. Whenever we do an experiment, that is a point in that space, and this space happens to be a high dimensional space. So, when we say we have validated the model, it doesn’t mean it’s good or bad; it means we have assessed the accuracy for those conditions that we have experimental data for.

Your question was, what happens if we have multiple points of data? That's good, it’s typically what we do have. And so, we can say we have assessed accuracy of the model at all those various points.

But what about accuracy in between those points? That's what we mean by ‘inside the validation domain.’ That's a good description of what we mean.

The thing that makes it relatively complicated is that this is a space, a high dimensional space. It's more than just temperature or pressure: it's material properties, it's geometry characteristics, it's all different kinds of things. What we mean by the validation domain is that we're inside the boundary of this space. I wanted to clarify that because it's a very precise concept.

GW: So, if I'm understanding it correctly, technically the validation is at those discrete points. But we define the validation domain as the cloud of conditions, sizes, etc. that's covered by the data set that we have.

WO: Exactly right.
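As a rough illustration of the "cloud of conditions" idea, here is a minimal Python sketch that checks whether a query condition falls within the per-input ranges spanned by a set of validation points. The input names and values are hypothetical, and the bounding box is only a crude stand-in for the validation domain: it is larger than the region actually enclosed by the data, so a point can pass this check and still be far from any experiment.

```python
from typing import Dict, List, Tuple

# Hypothetical validation points: each dict holds the inputs of one experiment.
VALIDATION_POINTS: List[Dict[str, float]] = [
    {"temperature_F": 70.0, "pressure_psi": 100.0},
    {"temperature_F": 400.0, "pressure_psi": 500.0},
    {"temperature_F": 800.0, "pressure_psi": 300.0},
]


def bounding_box(points: List[Dict[str, float]]) -> Dict[str, Tuple[float, float]]:
    """Return {input_name: (min, max)} taken over all validation points."""
    names = points[0].keys()
    return {n: (min(p[n] for p in points), max(p[n] for p in points)) for n in names}


def inside_box(query: Dict[str, float], points: List[Dict[str, float]]) -> bool:
    """True if every input of `query` lies within the per-input min/max range."""
    box = bounding_box(points)
    return all(lo <= query[n] <= hi for n, (lo, hi) in box.items())


if __name__ == "__main__":
    print(inside_box({"temperature_F": 600.0, "pressure_psi": 400.0}, VALIDATION_POINTS))
    print(inside_box({"temperature_F": 900.0, "pressure_psi": 400.0}, VALIDATION_POINTS))
```

A tighter test would use the convex hull of the validation points rather than a box, at the cost of more machinery as the number of inputs grows.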

GW: We’ve had some conversations about interpolation. A simple analogy would be for me to say, ‘Oh, this is like having a 2D curve and doing a curve fit, I'm interpolating between those data points.’ I suspect it’s not quite that simple, and that there are some additional pitfalls once we get into this multidimensional space.

WO: Thinking of in between points, as you described it, as interpolation is a reasonable concept. But it's not exactly right because interpolation means you take data from whatever points, and whatever the dimensionality of the space is, makes no difference.

We typically think of interpolation in a two-dimensional space. Here we have a high dimensional space, but the concept still holds. What we're doing in this validation domain is not interpolating solutions in between points; we calculate any point we want in this space. You can think of it as interpolation, but another way to say it is: we're using the computational model (and we're usually talking about physics-based models) to compute the simulation result, given the input, at any point in this space that we're interested in. In the validation domain, we think of those points as being inside the boundary.

Another way to say it is: We are near the points at which we have assessed model accuracy.

GW: That helped turn on a light bulb for me. If I'm truly interpolating, I would just use my test data for those two validation points and do the interpolation. I wouldn't run the simulation a second time. So, it is a second calculation that's got all the attendant chances for things to be different.

WO: That's right. If computational simulation wasn't as powerful as it is today, that's exactly what you would do with experimental data. You’d ask: what is the stress condition in between these two loadings? Then you'd interpolate the experimental data. But here we recalculate what we want at any condition in this space.
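The difference between interpolating data and recomputing with the model can be sketched in a few lines of Python. This example uses the textbook Euler-Bernoulli tip deflection of an end-loaded cantilever, P L^3 / (3 E I), and all numerical values are hypothetical. For simplicity, the "measurements" at the two tested beam lengths are taken to be the formula's own values, which is an assumption made purely for illustration.

```python
def tip_deflection(load_n: float, length_m: float, e_pa: float, i_m4: float) -> float:
    """Euler-Bernoulli tip deflection of an end-loaded cantilever: P L^3 / (3 E I)."""
    return load_n * length_m**3 / (3.0 * e_pa * i_m4)


def linear_interpolate(x: float, x0: float, y0: float, x1: float, y1: float) -> float:
    """Straight-line interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical beam: 1 kN tip load, steel-like modulus, assumed second moment of area.
LOAD_N, E_PA, I_M4 = 1000.0, 200e9, 8.0e-6
LENGTH_A, LENGTH_B, LENGTH_QUERY = 1.0, 2.0, 1.5  # metres

# Stand-ins for test data at the two tested lengths.
deflection_a = tip_deflection(LOAD_N, LENGTH_A, E_PA, I_M4)
deflection_b = tip_deflection(LOAD_N, LENGTH_B, E_PA, I_M4)

# Option 1: interpolate the data. Option 2: rerun the model at the new condition.
interpolated = linear_interpolate(LENGTH_QUERY, LENGTH_A, deflection_a, LENGTH_B, deflection_b)
recomputed = tip_deflection(LOAD_N, LENGTH_QUERY, E_PA, I_M4)

if __name__ == "__main__":
    print(f"interpolated: {interpolated:.6e} m")
    print(f"recomputed:   {recomputed:.6e} m")
    print(f"ratio:        {interpolated / recomputed:.4f}")
```

Because deflection scales with the cube of the length, the straight-line interpolation overshoots the recomputed value at the midpoint, while the model carries the physics of the intermediate condition with it.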

GW: So now, when we've done our validation, that helps us quantify or give a good idea of our level of accuracy, or perhaps the uncertainty associated with the solutions that our simulation tools are giving us. So then, do we assume that within the validation domain that accuracy is constant, or at least representative across the domain space that we've defined?

WO: You can assume it's constant, but that's not a very good assumption. Let's suppose you have five points of these input quantities, either from the system or from the boundary conditions or excitation. You can assess the accuracy of the computational model at those five conditions, and you will find that the accuracy of the model is different at all five points; you may even have a hundred points. And you think, "Why would it be different?" We must remember that when we make a computational model, there is always a set of approximations and assumptions. Every model has these.

For example, in Solid Mechanics you could talk about the Euler-Bernoulli Beam Equation, or the assumption of a homogeneous or isotropic material. In Fluid Mechanics, you could talk about laminar or turbulent flow, or where there is transition. All of these are assumptions or approximations. As you vary the input conditions, those assumptions are better in some places than others. For example, in Solid Mechanics, is it small deflection theory or small deformation, or is it large? You use different models, so the approximation can be better in one part of the space than another. A better way to think about it is that the accuracy of the model varies over this space.
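A familiar analogy for an approximation whose quality depends on where you are in the input space is the small-angle approximation sin(θ) ≈ θ, which underlies many small-deflection formulations. The sketch below is an illustration chosen for this write-up, not an example from the discussion; it simply tabulates how the relative error of the approximation grows with angle.

```python
import math


def small_angle_relative_error(angle_deg: float) -> float:
    """Relative error of approximating sin(theta) by theta (theta in radians)."""
    theta = math.radians(angle_deg)
    return abs(theta - math.sin(theta)) / math.sin(theta)


if __name__ == "__main__":
    for angle in (1.0, 5.0, 15.0, 30.0, 60.0):
        print(f"{angle:5.1f} deg: relative error {small_angle_relative_error(angle):.4%}")
```

The same model form is nearly exact at one end of the range and noticeably off at the other, which is why a single accuracy figure for the whole validation domain is rarely a good description.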

GW: As we make predictions within the validation domain, is that where we just need to use engineering judgment to make sure that we're choosing an appropriate conservative margin around that uncertainty?

WO: You could do that if you are pressed for time. And a lot of times, real computational engineers are. This isn't an academic exercise; you've got to produce numbers!

If you wanted to calculate it a little bit better, you could construct an interpolation function in this space. If you have the software, you can do it pretty easily. And you can then estimate what the error is in the model at any point in this space, and you'll see it varies over the whole space.
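One way such an interpolation function could be built is inverse-distance weighting of the errors observed at the validation points. The Python sketch below assumes the inputs have already been scaled to comparable ranges; the sample points and error values are hypothetical, and inverse-distance weighting is only one of many possible choices (the discussion does not name a specific method).

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical data: (scaled inputs, observed relative model error at that point).
ERROR_SAMPLES: List[Tuple[Tuple[float, float], float]] = [
    ((0.0, 0.0), 0.03),
    ((1.0, 0.0), 0.05),
    ((0.0, 1.0), 0.04),
    ((1.0, 1.0), 0.12),
]


def idw_error_estimate(
    query: Sequence[float],
    samples: Sequence[Tuple[Sequence[float], float]],
    power: float = 2.0,
) -> float:
    """Estimate model error at `query` by inverse-distance weighting of samples."""
    weight_sum = 0.0
    weighted_error = 0.0
    for inputs, error in samples:
        distance = math.dist(query, inputs)
        if distance == 0.0:
            return error  # query coincides with a validation point
        weight = 1.0 / distance**power
        weight_sum += weight
        weighted_error += weight * error
    return weighted_error / weight_sum


if __name__ == "__main__":
    for q in [(0.5, 0.5), (0.9, 0.9), (1.0, 1.0)]:
        print(q, round(idw_error_estimate(q, ERROR_SAMPLES), 4))
```

The estimate reproduces the observed error at each validation point and stays between the smallest and largest observed errors elsewhere, so it cannot anticipate a larger error hiding between the points.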

So, if your boss asks you, ‘What do you think the model form uncertainty is?’ (you could also call it structural uncertainty; it's due to the assumptions and approximations), you could say, ‘This is an estimate of what the model accuracy is over the entire domain where we have data.’

And then, he could say, ‘What are you going to do with that number?’ And you'd say, ‘I’m going to use that to increase my margin in the uncertainty in my predictions.’ And then, you could say, ‘A prediction at a given point (and it can be in between these points) has these kinds of uncertainties.’ This one is called model form uncertainty, but you can have many others too, like variability in Young's modulus, or variability in input flow rate, or any kind of loading on the structure, or the acoustic environment. All those things are called inputs.

GW: I think that idea of interpolating the accuracy, or the error, across that validation domain, as you said, sounds like a very straightforward solution assuming that you've got a pretty good fit to what that data looks like.

In my own personal experience, I think a lot of our uncertainty has come from the test side. We can have eight conditions where our accuracy is within 5%, and then we'll have one condition where we're at 15%. And if we retest that two more times, then it cleans it up. So, all the same discussions could go on looking at the test side of this as well, I think.

WO: Throughout, we’ve been concentrating on uncertainties like model form uncertainty and input uncertainty. But there are also experimental uncertainties. Experiments are never perfect; never have been, never will be. So in this interpolation of the model form uncertainty, you should try to take that into account. For, let's say, deformation, local strain or local stress, the experimentalist might say, ‘My estimated total uncertainty for these measurements is x.’ That's additional uncertainty. It has nothing to do with the model; it has to do with the experimental data. So, don't forget about that. Those are the fine points when you get into it. You could say you get less and less certain of your computational results, and that's a good thing!
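How the separate contributions are rolled up into one margin is not specified in the discussion. As a hedged sketch, the Python below shows two common conventions side by side: a root-sum-square combination, which assumes the components are independent and random in character, and a plain sum, which is more conservative and is sometimes preferred when a component such as model form uncertainty is not treated as random. The numerical values are hypothetical.

```python
import math


def combine_rss(*components: float) -> float:
    """Root-sum-square combination (assumes independent components)."""
    return math.sqrt(sum(c * c for c in components))


def combine_linear(*components: float) -> float:
    """Plain sum of magnitudes (conservative upper bound)."""
    return sum(abs(c) for c in components)


# Hypothetical relative uncertainties for one predicted response quantity.
MODEL_FORM = 0.06      # estimated from comparison with experiments
EXPERIMENTAL = 0.03    # reported by the experimentalist
INPUT_VARIATION = 0.02 # e.g. spread in a material property

if __name__ == "__main__":
    print(f"RSS:    {combine_rss(MODEL_FORM, EXPERIMENTAL, INPUT_VARIATION):.3f}")
    print(f"Linear: {combine_linear(MODEL_FORM, EXPERIMENTAL, INPUT_VARIATION):.3f}")
```

Which convention is appropriate depends on how each uncertainty was characterised; the point of the sketch is only that the experimental contribution enters the total alongside the model's own.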

GW: That mirrors my own career path. Straight out of college, I was really confident, and 10 years in, I was like, "Man, there's so much stuff I don't know yet!"

WO: Yeah. When you get to my age, Greg, you're not certain about anything!

GW: One final question: is there a valid concept for judging whether you’re close enough to the validation domain? How do we evaluate that?

WO: Suppose you are very near one of these points where you have experimental data. It could be laboratory data that is not at the full conditions you want, i.e., it doesn't have the proper temperature or loading, or there are differences in the structure; it could even be only a piece of the structure. Even if you are outside this validation domain and still ‘relatively close’, you have to ask yourself: what does close mean? This is a multidimensional space!

You could aim for a position where you're pretty confident that the error that you've estimated, both computational and experimental, even outside this validation domain, is a reasonable estimation of the error or the uncertainty that you like.
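One possible way to put a number on "close" in a multidimensional space is to scale each input by the range covered by the validation data and then measure the distance to the nearest validation point. The Python sketch below does this for hypothetical temperature and pressure inputs. The choice of range scaling and Euclidean distance is an assumption made for illustration, and what distance counts as acceptable remains an engineering judgment.

```python
import math
from typing import List, Sequence

# Hypothetical validation points: (temperature in deg F, pressure in psi).
VALIDATION_POINTS: List[Sequence[float]] = [
    (70.0, 100.0),
    (400.0, 500.0),
    (800.0, 300.0),
]


def scale(point: Sequence[float], lows: Sequence[float], highs: Sequence[float]) -> List[float]:
    """Map each input so the validation data spans 0 to 1 (requires high > low)."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(point, lows, highs)]


def nearest_scaled_distance(query: Sequence[float], points: List[Sequence[float]]) -> float:
    """Distance from `query` to the nearest validation point in scaled input space."""
    dims = range(len(query))
    lows = [min(p[i] for p in points) for i in dims]
    highs = [max(p[i] for p in points) for i in dims]
    scaled_query = scale(query, lows, highs)
    return min(math.dist(scaled_query, scale(p, lows, highs)) for p in points)


if __name__ == "__main__":
    # 850 deg F lies just beyond the highest validated temperature.
    print(round(nearest_scaled_distance((850.0, 300.0), VALIDATION_POINTS), 4))
    # 1500 deg F is a much larger step outside the data.
    print(round(nearest_scaled_distance((1500.0, 300.0), VALIDATION_POINTS), 4))
```

A value that is a small fraction of 1 means the query sits a small fraction of the validated range away from real data; values approaching or exceeding 1 indicate a step comparable to the whole range that was tested.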

Our next video will be on larger extrapolations, for when you get very far away from where you have data or where you don’t have data. And just as a preview of this, we actually do that much more often than we ever think we do.

*This conversation transcript has been edited and condensed for readability.
