
Releases: UsmanovSla/MGD-RSS

MGVD-SI Methodology of Data Generation and Validation for Artificial Intelligence Models in Civil Engineering

21 Feb 22:22


The implementation of artificial intelligence methods in civil engineering is constrained by the lack of high-quality and representative data required for training and validation. This work formulates a general theoretical–methodological framework for data generation and validation based on physically consistent modeling of engineering systems, stochastic simulation, and statistical verification of agreement between simulated and real data. The objective is to establish a framework enabling the production of synthetic data that preserve the key statistical and structural characteristics of real engineering processes.
The proposed approach is based on Bayesian calibration of deterministic physical models and their extension through a general stochastic representation of system states, illustrated using Markov processes and Monte Carlo methods. The variability of the observed quantities is modeled as a combination of deterministic system response and random effects representing systemic uncertainty, while empirical distributions are employed to preserve realistic distributional structures. The resulting hybrid model enables simulation of the posterior predictive distribution and the systematic generation of synthetic datasets under varying conditions.
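The hybrid structure described above (deterministic response, Markov-driven system state, and empirically resampled random effects) can be illustrated with a minimal Monte Carlo sketch. Everything named here is an assumption made for illustration and is not taken from the release: the two-state chain, the transition probabilities in `TRANSITIONS`, the per-state offsets in `STATE_EFFECT`, the linear `deterministic_response`, and the use of a small list `theta_draws` as a stand-in for samples from a Bayesian posterior.

```python
import random

# Hypothetical two-state process: 0 = nominal operation, 1 = disturbed operation.
TRANSITIONS = {0: (0.9, 0.1), 1: (0.6, 0.4)}  # assumed transition probabilities
STATE_EFFECT = {0: 0.0, 1: 4.0}               # assumed additive effect of each state


def deterministic_response(load, theta):
    """Placeholder physical model: response grows linearly with load."""
    intercept, slope = theta
    return intercept + slope * load


def simulate_series(loads, theta_draws, residuals, rng):
    """Generate one synthetic series from the hybrid model."""
    # One parameter draw per series stands in for a posterior sample.
    theta = rng.choice(theta_draws)
    state = 0
    series = []
    for load in loads:
        # Markov step: the next state depends only on the current state.
        state = rng.choices((0, 1), weights=TRANSITIONS[state])[0]
        # Random effect: resample observed residuals to keep their empirical shape.
        noise = rng.choice(residuals)
        series.append(deterministic_response(load, theta) + STATE_EFFECT[state] + noise)
    return series


def generate_dataset(n_series, loads, theta_draws, residuals, seed=0):
    """Monte Carlo loop: repeat the simulation to build a synthetic dataset."""
    rng = random.Random(seed)
    return [simulate_series(loads, theta_draws, residuals, rng) for _ in range(n_series)]
```

Calling `generate_dataset(100, loads=[1.0, 2.0, 3.0], theta_draws=[(10.0, 0.5)], residuals=[-0.2, 0.0, 0.2])` returns 100 synthetic series of length 3. Drawing parameters per series and noise per observation approximates sampling from a posterior predictive distribution, provided `theta_draws` really are posterior samples.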
The methodology includes a multi-stage validation framework that combines parametric and non-parametric tests with interval estimation. In a case study of a robotic masonry process, simulated and real data showed a high degree of distributional agreement: goodness-of-fit tests found no statistically significant difference between the empirical and synthetic data in more than 75% of repetitions.
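The "share of repetitions without a significant difference" reported above can be computed along the following lines. This is a sketch under stated assumptions, not the validation procedure of the release: it uses the two-sample Kolmogorov-Smirnov test as one possible non-parametric goodness-of-fit test, with the standard large-sample critical value, and the names `ks_statistic`, `ks_rejects`, and `agreement_rate` are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Largest vertical distance between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so that ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_rejects(a, b, alpha=0.05):
    """True if the samples differ significantly (asymptotic critical value)."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))  # about 1.358 for alpha = 0.05
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))


def agreement_rate(real, simulate, repetitions=200, alpha=0.05):
    """Share of repetitions in which the test finds no significant difference."""
    passes = sum(not ks_rejects(real, simulate(), alpha) for _ in range(repetitions))
    return passes / repetitions
```

Here `real` is the observed sample and `simulate` is any zero-argument function returning a fresh synthetic sample. The large-sample critical value is only approximate for small samples, where an exact or permutation-based test would be a safer choice.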
The proposed framework applies to a wide range of civil engineering tasks, in particular the analysis and numerical modeling of structural systems, material nonlinearities, and time-dependent degradation processes, as well as probabilistic reliability assessment and process planning and optimization. The work systematically integrates simulation modeling, probabilistic inference, and synthetic data generation, thereby establishing a theoretical foundation for implementing artificial intelligence models when empirical data are limited.