Skip to content

Latest commit

 

History

History
378 lines (312 loc) · 20.1 KB

File metadata and controls

378 lines (312 loc) · 20.1 KB

MTRsim C++ Port — Project Plan

Overview

This document describes the plan to convert the MTRsim MATLAB codebase to C++. The deliverable is a CMake-based project providing:

  • libmtrsim — a static library containing all simulation and analysis algorithms
  • mtrsim — a CLI executable that drives the simulation via configuration file and command-line flags
  • A CTest suite of unit tests

Package management is handled by vcpkg.


1. MATLAB → C++ Functional Mapping

The following table maps each MATLAB source file to its C++ counterpart inside libmtrsim.

MATLAB File C++ Module (proposed) Role
simulate_MTRs.m mtrsim_cli (executable) Top-level driver
PGRF_simulation.m PGRFSimulation Plurigaussian RF simulation
gp_generator.m GPGenerator Gaussian process field via Kronecker Cholesky
select_AR.m AssignmentRule Threshold selection using mvn CDF
eval_AR.m AssignmentRule Apply thresholds to classify voxels
qsimvn.m QSimVN Quasi-Monte Carlo multivariate normal CDF
norminv_code.m (inline) Inverse standard normal CDF
calc_ODF.m ODFCalculator Histogram ODF from Euler angles
sample_orientation_from_ODF.m ODFSampler Draw single orientation
sample_N_orientations_from_ODF.m ODFSampler Draw N orientations (batch)
symmetric_euler_angles.m CrystalSymmetry (wraps EbsdLib HexagonalOps) Generate symmetrically equivalent orientations
unit_triangle_IPF_coords.m IPFMapper (wraps EbsdLib generateIPFColor) Project orientations to IPF triangle
IPF_colors.m IPFMapper (wraps EbsdLib generateIPFColor) Assign RGB colors in IPF triangle
view_IPF_map.m IPFMapper Build IPF map pixel array
convert_ODF_to_PF.m PoleFigure (wraps EbsdLib generatePoleFigure) Convert ODF to pole figure intensities
load_MTR_data.m MTRDataLoader Load EBSD data from CSV files
plot_AR.m (output data only — no GUI) Assignment rule visualization
plot_PF.m (output data only — no GUI) Pole figure visualization
generate_inverse_pole_figure.m IPFMapper Full IPF generation

2. MATLAB Built-in Functions That Need Replacing

The following MATLAB built-ins are used throughout the code and have no direct C++ standard library equivalent. Each is addressed either by an external library or a small custom implementation.

2.1 Linear Algebra

MATLAB Function Usage Location C++ Replacement
chol(A) gp_generator.m Eigen::LLT<> or Eigen::LDLT<>
A * B (dense matrix multiply) gp_generator.m Eigen operator *
reshape(x, m, n) gp_generator.m Eigen::Map<>
eye(n) gp_generator.m, select_AR.m Eigen::MatrixXd::Identity()
zeros(m, n) throughout Eigen::MatrixXd::Zero()
ones(n, 1) throughout Eigen::VectorXd::Ones()
diag(A) qsimvn.m A.diagonal()
norm(v) load_MTR_data.m v.norm()
sqrt(max(diag(c), 0)) qsimvn.m element-wise via Eigen
kron(A, B) * vec(X) (implicit) gp_generator.m manual Kronecker-structure multiply

2.2 Special / Statistical Functions

MATLAB Function Usage Location C++ Replacement
erfc(x) qsimvn.m (phi) std::erfc (<cmath>, C++11)
erfcinv(x) norminv_code.m (norminv) boost::math::erfc_inv
norminv(p) norminv_code.m, qsimvn.m Implemented via boost::math::erfc_inv
rand(m, n) throughout std::uniform_real_distribution
randn(m, n) gp_generator.m std::normal_distribution
randperm(N) sample_N_orientations_from_ODF.m std::shuffle on std::iota
primes(n) qsimvn.m Sieve of Eratosthenes (small custom utility)

2.3 Array / Indexing Operations

MATLAB Function Usage Location C++ Replacement
cumsum(v) sample_N_orientations_from_ODF.m std::partial_sum
histc(u, edges) sample_N_orientations_from_ODF.m custom histogram (loop or std::upper_bound)
hist(v, bins) calc_ODF.m custom histogram
unique(v) view_IPF_map.m, calc_ODF.m std::sort + std::unique
find(condition) throughout std::find_if, range loops
sort(v) load_MTR_data.m std::sort
max(A, [], dim) unit_triangle_IPF_coords.m Eigen rowwise().maxCoeff()
fix(x) calc_ODF.m std::trunc
mod(a, b) throughout std::fmod
abs(x) throughout std::abs / Eigen::abs
diff(v) sample_N_orientations_from_ODF.m std::adjacent_difference

2.4 Data I/O

MATLAB Function Usage Location C++ Replacement
load file.mat (v7.3 / HDF5) simulate_MTRs.m, calc_ODF.m HDF5 C++ API (hdf5[cpp] via vcpkg)
save file.mat simulate_MTRs.m HDF5 C++ API (hdf5[cpp] via vcpkg)
csvread(file) load_MTR_data.m custom CSV reader or fast-cpp-csv-parser
csvwrite(file) simulate_MTRs.m (commented) std::ofstream with formatted output

Note: The one legacy MATLAB v5 .mat file (simulation_ODF.mat) was converted to HDF5 in Phase 1 via tools/convert_simulation_ODF.py. All remaining .mat files were already MATLAB v7.3 (HDF5) format. No matio dependency is required.

2.5 Visualization / Image Output

MATLAB Function Usage Location C++ Replacement
imwrite(a, file) simulate_MTRs.m, view_IPF_map.m stb_image_write (header-only)
figure, image, scatter, plot throughout Not ported — output PNG/CSV instead
colormap('jet') several files Custom jet colormap function (trivial math)
interp1 simulate_MTRs.m std::lerp or linear interpolation helper

Visualization note: The C++ executable will not replicate the MATLAB GUI plots. Instead, outputs (MTR assignment map, IPF map, orientation data) will be written to PNG images and CSV/HDF5 files that can be visualized externally (e.g., ParaView, Python/matplotlib, MATLAB).


3. Required External Libraries

3.1 Core / Required

Library vcpkg Port Purpose Why This Library
Eigen3 eigen3 Dense linear algebra: matrix construction, Cholesky decomposition, matrix-matrix products, reshape/map semantics The gp_generator Kronecker-Cholesky algorithm and qsimvn Cholesky reordering (chlrdr) are the computational core of the simulation. Eigen provides LLT/LDLT Cholesky, optimized BLAS-level matrix multiply, and Map<> for zero-copy reshape. Header-only; no additional runtime dependency.
Boost.Math boost-math Inverse error function complement: boost::math::erfc_inv The inverse normal CDF (norminv in MATLAB) is computed as z = -sqrt(2) * erfcinv(2*p). erfcinv is not in the C++ standard library. Boost.Math provides a high-quality, tested implementation. Only the Math sub-library is needed; the full Boost install is not required.
HDF5 (C++ API) hdf5[cpp,zlib] Read and write all .mat/.h5 data files All input data files are now in HDF5 format (five were already MATLAB v7.3; one was converted in Phase 1). The hdf5[cpp] feature provides the hdf5::hdf5_cpp-shared target which exposes the full HDF5 C++ API.
stb stb Write PNG/JPEG image output imwrite in simulate_MTRs.m saves the IPF map as a JPEG. stb_image_write.h is a single-header library that writes PNG and JPEG with no external dependencies.
CLI11 cli11 Command-line argument parsing The simulation parameters (volume dimensions, voxel spacing, correlation lengths, volume fractions, output paths) are currently hardcoded. CLI11 provides a clean, modern C++11 API for defining and parsing CLI flags and options. Header-only; integrates trivially with CMake.
nlohmann-json nlohmann-json JSON configuration file support A JSON config file is the natural replacement for the parameter block at the top of simulate_MTRs.m. nlohmann/json is the de facto standard for JSON in modern C++, is header-only, and is available in vcpkg.
Catch2 catch2 Unit testing framework (CTest integration) Catch2 provides BDD-style and traditional C++ unit tests and integrates directly with CTest via catch_discover_tests. It is more lightweight than Google Test and easier to set up with vcpkg.

3.2 External (Pre-built — Not via vcpkg)

Library Integration Purpose Why This Library
EbsdLib find_package(EbsdLib) after local build+install Crystallographic symmetry operators, IPF coloring, pole figure generation, Euler-angle orientation transforms EbsdLib provides production-grade implementations of the three most difficult modules to port: crystal symmetry (replaces the hand-coded HCP operator table in symmetric_euler_angles.m), IPF coloring (replaces unit_triangle_IPF_coords.m + IPF_colors.m via HexagonalOps::generateIPFColor), and pole figure computation (replaces convert_ODF_to_PF.m via HexagonalOps::generatePoleFigure). The OrientationTransformation namespace also provides eu2om(), eu2qu(), and companion functions for all orientation representation conversions. EbsdLib is available at ../EbsdLib (sibling directory) and must be built and installed before configuring MTRsim (see §7.6).

3.3 Recommended (Strong Preference)

Library vcpkg Port Purpose Why This Library
spdlog spdlog Structured progress and diagnostic logging The MATLAB code uses fprintf extensively for simulation progress (e.g., "building mean vector...", "% complete: 0%"). spdlog provides thread-safe, leveled logging with minimal overhead.

3.4 Optional / To Be Decided

Library vcpkg Port Purpose Notes
OpenMP (compiler flag) Parallelism for voxel loops The orientation-sampling loop (for i = 1:N) and ODF histogram loops are embarrassingly parallel. OpenMP is compiler-native (no vcpkg port needed) and is the lowest-friction parallelism option.
{fmt} fmt String formatting If spdlog is included, {fmt} comes along. Can also be used directly to replace fprintf-style formatted output in the CLI.
fast-cpp-csv-parser fast-cpp-csv-parser CSV input for load_MTR_data The load_MTR_data.m script reads EBSD data from CSV files. A lightweight CSV parser avoids hand-rolling parsing. Alternative: implement manually with std::getline.

4. Libraries Explicitly Not Needed

The following MATLAB toolboxes / categories do NOT require external C++ libraries because equivalent functionality exists in the C++ standard library or can be trivially implemented:

  • Trigonometry & transcendentals (sin, cos, acos, atan2, exp, sqrt, erfc) → <cmath>
  • Pseudo-random numbers (rand, randn) → <random> (Mersenne Twister MT19937)
  • Sorting & searching (sort, unique, find) → <algorithm>
  • Cumulative sum (cumsum) → <numeric> std::partial_sum
  • Basic file I/O (csvread, csvwrite) → <fstream>
  • Prime sieve (primes(n)) → ~15-line Sieve of Eratosthenes (used only in qsimvn)
  • Histogram (hist, histc) → ~10-line custom function using std::upper_bound

5. Project Directory Structure

MTR_sim/
├── CMakeLists.txt                  # Top-level CMake
├── CMakePresets.json               # Shared CI presets
├── CMakeUserPresets.json           # Local developer presets (not committed)
├── vcpkg.json                      # vcpkg manifest
├── cmake/
│   └── triplets/
│       └── arm64-osx-v11.cmake    # Custom macOS arm64 dynamic triplet
├── src/
│   └── libmtrsim/                  # libmtrsim library (headers + sources co-located)
│       ├── CMakeLists.txt
│       ├── AssignmentRule.hpp
│       ├── AssignmentRule.cpp
│       ├── CrystalSymmetry.hpp
│       ├── CrystalSymmetry.cpp
│       ├── GPGenerator.hpp
│       ├── GPGenerator.cpp
│       ├── IPFMapper.hpp
│       ├── IPFMapper.cpp
│       ├── MTRDataLoader.hpp
│       ├── MTRDataLoader.cpp
│       ├── ODFCalculator.hpp
│       ├── ODFCalculator.cpp
│       ├── ODFSampler.hpp
│       ├── ODFSampler.cpp
│       ├── PGRFSimulation.hpp
│       ├── PGRFSimulation.cpp
│       ├── PoleFigure.hpp
│       ├── PoleFigure.cpp
│       ├── QSimVN.hpp
│       ├── QSimVN.cpp
│       └── SimulationParams.hpp
├── app/                            # CLI executable
│   ├── CMakeLists.txt
│   └── main.cpp
├── data/                           # Input data files (HDF5)
│   ├── blank_ODF.mat               # Already HDF5 (MATLAB v7.3)
│   ├── simulation_ODF.mat          # Original MATLAB v5 (superseded)
│   ├── simulation_ODF.h5           # Converted HDF5 (Phase 1 output)
│   ├── uniformODF.mat              # Already HDF5 (MATLAB v7.3)
│   ├── PF_indexed.mat              # Already HDF5 (MATLAB v7.3)
│   └── PF_coords_for_ODF.mat       # Already HDF5 (MATLAB v7.3)
├── tests/                          # CTest / Catch2 unit tests
│   ├── CMakeLists.txt
│   ├── test_placeholder.cpp
│   ├── test_gp_generator.cpp       # (future)
│   ├── test_qsimvn.cpp             # (future)
│   ├── test_odf_sampler.cpp        # (future)
│   ├── test_crystal_symmetry.cpp   # (future)
│   └── test_assignment_rule.cpp    # (future)
├── matlab/                         # Original MATLAB source (reference)
└── tools/                          # Helper scripts
    └── convert_simulation_ODF.py   # Phase 1: MAT v5 → HDF5 conversion

6. vcpkg Manifest (vcpkg.json)

{
  "$schema": "https://raw.githubusercontent.com/microsoft/vcpkg/master/scripts/vcpkg.schema.json",
  "name": "mtrsim",
  "version": "1.0.0",
  "dependencies": [
    { "name": "eigen3" },
    { "name": "boost-math" },
    { "name": "hdf5", "features": ["cpp", "zlib"] },
    { "name": "stb" },
    { "name": "cli11" },
    { "name": "nlohmann-json" },
    { "name": "spdlog" }
  ],
  "features": {
    "tests": {
      "description": "Unit tests via Catch2",
      "dependencies": [ { "name": "catch2" } ]
    }
  }
}

7. Key Implementation Decisions

7.1 MAT File Input Strategy

All six input data files are now accessible as HDF5:

  • Five files (blank_ODF.mat, uniformODF.mat, PF_coords_for_ODF.mat, PF_indexed.mat, uniformPF.mat) were already MATLAB v7.3 (HDF5) format and can be read directly by the HDF5 C++ API.
  • One file (simulation_ODF.mat) was MATLAB v5 format and was converted to simulation_ODF.h5 in Phase 1 via tools/convert_simulation_ODF.py.

The C++ code reads all data using hdf5[cpp] (hdf5::hdf5_cpp-shared target). No matio dependency is required.

7.2 No GUI / Visualization in Library

All MATLAB figure, image, scatter, and plot calls are visualization-only and are not part of the simulation algorithm. The C++ library will instead:

  • Write the MTR assignment map as a PNG (via stb_image_write).
  • Write the IPF map as a PNG.
  • Write orientation data (Euler angle triples + spatial coordinates) as CSV or HDF5.

7.3 Random Number Generator

MATLAB's rand/randn use a global RNG state. The C++ port will use a single std::mt19937_64 engine seeded from std::random_device, passed by reference to functions that require it. This enables reproducible runs via a --seed CLI flag.

7.4 Cholesky Structure in gp_generator

The Kronecker product property kron(Rx, kron(Ry, Rz)) * vec(Z) = Rz * reshape(Z) * Ry * Rx is the mathematical core of the fast GP generator. This requires three separate Cholesky factorizations and careful index arithmetic. Eigen's LLT<MatrixXd> will be used for each directional covariance matrix.

7.5 qsimvn Port

The Genz quasi-Monte Carlo MVN CDF algorithm requires:

  • Cholesky decomposition with pivoting (chlrdr) → Eigen::LDLT with setPivot
  • Standard normal CDF (phi) → std::erfc
  • Inverse standard normal CDF (phinv) → boost::math::erfc_inv
  • The prime-based quasi-random sequence → custom Sieve of Eratosthenes

7.6 EbsdLib Integration

EbsdLib (../EbsdLib) is integrated as a pre-built external dependency located one directory level above MTRsim. It must be built and installed separately before configuring MTRsim.

Step 1 — Build and install EbsdLib:

# Configure (create a CMakeUserPresets.json in EbsdLib with the arm64-osx-v11 triplet,
# matching the VCPKG settings from MTRsim's CMakeUserPresets.json)
cmake -S /Users/mjackson/Workspace7/EbsdLib \
      -B /Users/mjackson/Workspace7/DREAM3D-Build/EbsdLib-Rel \
      -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_TOOLCHAIN_FILE=/opt/local/vcpkg/scripts/buildsystems/vcpkg.cmake \
      -DVCPKG_TARGET_TRIPLET=arm64-osx-v11 \
      -DVCPKG_OVERLAY_TRIPLETS=/Users/mjackson/Workspace7/MTR_sim/cmake/triplets \
      -DVCPKG_INSTALLED_DIR=/Users/mjackson/Workspace7/vcpkg-installed-ebsdlib \
      -DEbsdLib_ENABLE_HDF5=ON \
      -DEbsdLib_BUILD_H5SUPPORT=OFF \
      -DEbsdLib_ENABLE_MULTICORE=OFF \
      -DEbsdLib_BUILD_TESTS=OFF \
      -DCMAKE_INSTALL_PREFIX=/Users/mjackson/Workspace7/DREAM3D-Build/EbsdLib-Rel

# Build and install
cmake --build /Users/mjackson/Workspace7/DREAM3D-Build/EbsdLib-Rel
cmake --install /Users/mjackson/Workspace7/DREAM3D-Build/EbsdLib-Rel

Step 2 — Configure MTRsim (the CMAKE_PREFIX_PATH in CMakeUserPresets.json already points to the EbsdLib install prefix above):

cmake --preset mtrsim-Rel

EbsdLib → MTRsim module mapping:

EbsdLib API MTRsim Module Replaces
HexagonalOps::getNumSymOps() / getMatSymOpD(i) CrystalSymmetry::expand() symmetric_euler_angles.m
HexagonalOps::generateIPFColor(e0,e1,e2,dir0,dir1,dir2,false) IPFMapper::eulerToColors() unit_triangle_IPF_coords.m + IPF_colors.m
HexagonalOps::generatePoleFigure(PoleFigureConfiguration_t&) PoleFigure::fromODF() convert_ODF_to_PF.m
OrientationTransformation::eu2om<>() (inline in CrystalSymmetry) rotation matrix math

Key EbsdLib headers:

EbsdLib/Source/EbsdLib/LaueOps/HexagonalOps.h
EbsdLib/Source/EbsdLib/Core/OrientationTransformation.hpp
EbsdLib/Source/EbsdLib/Utilities/PoleFigureUtilities.h

CMake target: EbsdLib::EbsdLib


8. Suggested Implementation Order

  1. MAT file Conversion - Convert the data/simulation_ODF.mat to hdf5 based matlab file
  2. Setup — CMake skeleton, vcpkg manifest, CI stub
  3. Data conversiontools/convert_mat_to_hdf5.m, verify HDF5 round-trip
  4. Core numericsQSimVN, AssignmentRule (select_AR, eval_AR)
  5. GP GeneratorGPGenerator (Kronecker Cholesky)
  6. PGRF SimulationPGRFSimulation (ties 3 & 4 together)
  7. ODFODFCalculator, ODFSampler, CrystalSymmetry
  8. Visualization outputsIPFMapper, PNG export
  9. CLI executablemain.cpp, JSON config, CLI11 flags
  10. Unit tests — numerical correctness against MATLAB reference outputs
  11. Integration test — end-to-end simulation run, compare statistics to MATLAB

9. Licensing Notes

  • qsimvn: The original Alan Genz algorithm (BSD-style license, preserved in matlab/qsimvn.m) must be acknowledged in the C++ port's QSimVN.cpp source header and in any distribution documentation.
  • HDF5: BSD-like license. Compatible with the AFRL distribution statement.
  • All other libraries (Eigen3, Boost.Math, stb, CLI11, nlohmann-json, Catch2, spdlog) use MIT, BSL-1.0, or MPL-2.0 licenses compatible with the AFRL distribution statement.