Skip to content

A header-only C++ library for mean and standard error estimation of moments, cumulants, and factorial cumulants

License

Notifications You must be signed in to change notification settings

vlvovch/sample-moments

Repository files navigation

sample-moments

DOI

sample-moments is a header-only C++ library to calculate sample averages and standard error estimates of (joint) (higher-order) moments, central moments and cumulants for one or two random variables. It also provides factorial moments/factorial cumulants and ratios, with analytic standard-error estimates via the Delta method.

Such a task arises when one tries to estimate various properties of a distribution from a finite set of observations sampled from that distribution. Common examples of such properties are given by the leading moments of the distribution, for instance the mean, variance, skewness, kurtosis etc. And while the calculation of the sample mean and its standard error is well known and straightforward, estimation of sample statistics (and their errors) that are functions of higher-order moments is more involved.

This library calculates sample average of any (joint) (central) moment or cumulants for one or two variables from a set of observations. The standard error estimate is calculated as well, to leading order in 1/√nobs via the Delta method (see e.g. M. Kendall and A. Stuart, ''The advanced theory of statistics'').

Prerequisites

A compiler with C++11 support.

CMake v3.8+ to build samples and tests.

Usage

Include SampleMoments.h header file to use the library in C++ code. Then use either NumberStatistics or TwoNumberStatistics objects to calculate sample moments and/or cumulants from single or two variable observations. For example (see example_basic.cpp:

#include <SampleMoments.h>  // The library
#include <random>           // Random number generation
#include <iostream>         // Output

// Sample usage with the normal distribution
int main() {
  // Parameters of the normal distribution
  double mean  = 10.;
  double sigma = 0.5;

  // Random number generator from the normal distribution
  std::mt19937 rng;
  std::normal_distribution<double> distribution(mean, sigma);
  
  // The number of observations
  int number_of_observations = 100;

  // Populate the observations
  std::vector<double> observations;
  for(int i = 0; i < number_of_observations; ++i) {
    double N = distribution(rng);
    observations.push_back(N);
  }

  // Analyze the observations
  SampleMoments::NumberStatistics stats(observations);

  std::cout << "                      Sample mean: " << stats.GetMean()
  << " +- " << stats.GetMeanError() << std::endl;
  std::cout << "                  Sample variance: " << stats.GetVariance()
  << " +- " << stats.GetVarianceError() << std::endl;
  std::cout << "                  Sample skewness: " << stats.GetCentralMoment(3)
  << " +- " << stats.GetCentralMomentError(3) << std::endl;
  std::cout << "           Sample excess kurtosis: " << stats.GetCumulant(4)
  << " +- " << stats.GetCumulantError(4) << std::endl;
  std::cout << "Sample normalized excess kurtosis: " << stats.GetCumulantRatio(4, 2)
  << " +- " << stats.GetCumulantRatioError(4, 2) << std::endl;
  
  return 0;
}

A sample output is the following:

                      Sample mean: 9.99956 +- 0.00157491
                  Sample variance: 0.248034 +- 0.00110527
                  Sample skewness: -0.000286803 +- 0.00093758
           Sample excess kurtosis: -0.000880853 +- 0.000896272
Sample normalized excess kurtosis: -0.00355133 +- 0.00361298

See examples for more use cases and annotated header files for all the features. Examples cover basic cumulants as well as factorial moments/cumulants.

Building / Testing

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DINCLUDE_TESTS=ON
cmake --build build
ctest --test-dir build

Thread safety

Library classes (NumberStatistics, TwoNumberStatistics) are not thread-safe. If you access a single instance from multiple threads, guard calls with external synchronization or use separate instances per thread.

Same applies to the global partition cache initialization via SampleMoments::PrecomputePartitions. While optional and usually unnecessary, if called, it should be done from a single thread before using the library in a multi-threaded context.

Notes

  • By default, calculation of sample moments is supported up to 16th order, and standard error estimates up to 8th order. To increase this limit, e.g. increase 16/8 to 32/16 call the constructor as follows: SampleMoments::NumberStatistics(observations, 32).
  • The library uses double-precision floating-point arithmetic to store the values of the moments. Beware of the round-off errors, especially when the first moment is large but the (higher-order) central moments are small. See normal_distribution.cpp for an example on how to deal with this.

Copyright (C) 2021-2025 Volodymyr Vovchenko

About

A header-only C++ library for mean and standard error estimation of moments, cumulants, and factorial cumulants

Resources

License

Stars

Watchers

Forks

Packages

No packages published