Detailed Description

A class to perform basic 1D statistics. Each method is static.

Author: Mathias Bavay

Date: 2009-01-20

#include <libinterpol1D.h>

Static Public Member Functions
static double	min_element (const std::vector< double > &X)

static double	max_element (const std::vector< double > &X)

static std::vector< double >	quantiles (const std::vector< double > &X, const std::vector< double > &quartiles)
	This function returns a vector of quantiles. The vector does not have to be sorted. See https://secure.wikimedia.org/wikipedia/en/wiki/Quartile for more. This code is heavily inspired by Ken Wilder, https://sites.google.com/site/jivsoft/Home/compute-ranks-of-elements-in-a-c---array-or-vector (quantile method, replacing the nth-element call by direct access to a sorted vector).

static std::vector< double >	quantiles_core (std::vector< double > X, const std::vector< double > &quartiles)
	This function returns a vector of quantiles, but does not filter out nodata values! The vector does not have to be sorted. See https://secure.wikimedia.org/wikipedia/en/wiki/Quartile for more. This code is heavily inspired by Ken Wilder, https://sites.google.com/site/jivsoft/Home/compute-ranks-of-elements-in-a-c---array-or-vector (quantile method, replacing the nth-element call by direct access to a sorted vector).

static std::vector< double >	derivative (const std::vector< double > &X, const std::vector< double > &Y)
	This function returns the vector of local derivatives, given a vector of abscissae and ordinates. The vectors must be sorted by ascending x. The derivatives will be centered if possible, left or right otherwise, or nodata if nothing else can be computed.

static void	sort (std::vector< double > &X, std::vector< double > &Y, const bool &keep_nodata=true)
	This function sorts the X and Y vectors by increasing X. Unless keep_nodata is set, the nodata values (both in X and Y) are removed, meaning that the vector length might not be kept.

static void	equalBin (const unsigned int k, std::vector< double > &X, std::vector< double > &Y)
	Data binning method. This bins the data into k classes of equal width (see https://en.wikipedia.org/wiki/Data_binning).

static void	equalCountBin (const unsigned int k, std::vector< double > &X, std::vector< double > &Y)
	Data binning method. This bins the data into k classes containing an equal number of elements (see https://en.wikipedia.org/wiki/Data_binning). The number of elements per class is adjusted in order to reduce unevenness between classes: for example, when distributing 100 elements in 8 classes, this will generate 4 classes of 13 elements and 4 classes of 12 elements.

static double	weightedMean (const double &d1, const double &d2, const double &weight=1.)
	This function returns the weighted arithmetic mean of two numbers. A weight of 0 returns d1, a weight of 1 returns d2, a weight of 0.5 returns a centered mean. See https://secure.wikimedia.org/wikipedia/en/wiki/Weighted_mean for more.

static double	weightedMean (const std::vector< double > &vecData, const std::vector< double > &weight)
	This function returns the weighted arithmetic mean of a vector. See https://secure.wikimedia.org/wikipedia/en/wiki/Weighted_mean for more.

static double	arithmeticMean (const std::vector< double > &vecData)

static double	getMedian (const std::vector< double > &vecData, const bool &keep_nodata=true)

static double	getMedianAverageDeviation (std::vector< double > vecData, const bool &keep_nodata=true)

static double	variance (const std::vector< double > &X)
	Computes the variance of a vector of data. It is computed using a compensated variance algorithm (see https://secure.wikimedia.org/wikipedia/en/wiki/Algorithms_for_calculating_variance) in order to be more robust to small variations around the mean.

static double	std_dev (const std::vector< double > &X)

static double	covariance (const std::vector< double > &z1, const std::vector< double > &z2)

static double	corr (const std::vector< double > &z1, const std::vector< double > &z2)
	Computes the Pearson product-moment correlation coefficient. This should be equivalent to the default R "cor" function.

static double	Pearson (const std::vector< double > &X, const std::vector< double > &Y)
	Computes the Pearson product-moment correlation coefficient in a more numerically efficient manner than "corr".

static double	R2 (const std::vector< double > &obs, const std::vector< double > &sim)
	Computes the R2 coefficient of determination. See https://en.wikipedia.org/wiki/Coefficient_of_determination and https://en.wikipedia.org/wiki/Fraction_of_variance_unexplained.

static double	NashSuttcliffe (const std::vector< double > &obs, const std::vector< double > &sim)
	Computes the Nash-Sutcliffe model efficiency coefficient for two vectors. It is assumed that the same indices contain the same timesteps. A value of 1 means a perfect match; a value of zero means that no variance is reproduced (see https://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient).

static double	getBoxMuller ()
	Box–Muller method for normally distributed random numbers.

static void	LinRegression (const std::vector< double > &X, const std::vector< double > &Y, double &a, double &b, double &r, std::string &mesg, const bool &fixed_rate=false)
	Computes the linear regression coefficients fitting the points given as X and Y in two vectors. The linear regression has the form Y = aX + b with a regression coefficient r (it is nodata safe).

static void	NoisyLinRegression (const std::vector< double > &in_X, const std::vector< double > &in_Y, double &A, double &B, double &R, std::string &mesg, const bool &fixed_rate=false)
	Computes the linear regression coefficients fitting the points given as X and Y in two vectors. The linear regression has the form Y = aX + b with a regression coefficient r. If the regression coefficient is not good enough, it tries to remove bad points (up to 15% of the initial data set can be removed, keeping at least 4 points).

static void	twoLinRegression (const std::vector< double > &in_X, const std::vector< double > &in_Y, const double &bilin_inflection, std::vector< double > &coeffs)
	Computes the bi-linear regression coefficients fitting the points given as X and Y in two vectors. We consider that the regression can be made with 2 linear segments with a fixed inflection point. It relies on Interpol1D::NoisyLinRegression.

static void	LogRegression (const std::vector< double > &X, const std::vector< double > &Y, double &a, double &b, double &r, std::string &mesg)
	Computes the log regression coefficients fitting the points given as X and Y in two vectors. The log regression has the form Y = a*ln(X) + b with a regression coefficient r (it is nodata safe).

static void	ExpRegression (const std::vector< double > &X, const std::vector< double > &Y, double &a, double &b, double &r, std::string &mesg)
	Computes the power regression coefficients fitting the points given as X and Y in two vectors. The power regression has the form Y = b*X^a with a regression coefficient r (it is nodata safe).

Member Function Documentation

◆ arithmeticMean()

double mio::Interpol1D::arithmeticMean ( const std::vector< double > & vecData )

static

◆ corr()

double mio::Interpol1D::corr	(	const std::vector< double > &	X,
		const std::vector< double > &	Y
	)

static

Computes the Pearson product-moment correlation coefficient. This should be equivalent to the default R "cor" function.

Parameters

X	first vector of data
Y	second vector of data

Returns: correlation coefficient
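
As an illustration of what a Pearson product-moment coefficient computes, here is a minimal standalone sketch. It is not the Interpol1D implementation: the name pearson_sketch is hypothetical, and nodata filtering is left out on purpose.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): Pearson's r computed as
// sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)), with dx and dy the deviations
// from the respective means. No nodata handling in this sketch.
double pearson_sketch(const std::vector<double>& X, const std::vector<double>& Y)
{
    assert(X.size() == Y.size() && X.size() >= 2);
    const double n = static_cast<double>(X.size());

    double meanX = 0., meanY = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) {
        meanX += X[i];
        meanY += Y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxy = 0., sxx = 0., syy = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) {
        const double dx = X[i] - meanX;
        const double dy = Y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    // The result is NaN if either vector is constant (zero spread).
    return sxy / std::sqrt(sxx * syy);
}
```

Two vectors that are exact linear functions of each other give +1 or -1 depending on the sign of the slope.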

◆ covariance()

double mio::Interpol1D::covariance	(	const std::vector< double > &	z1,
		const std::vector< double > &	z2
	)

static

◆ derivative()

std::vector< double > mio::Interpol1D::derivative	(	const std::vector< double > &	X,
		const std::vector< double > &	Y
	)

static

This function returns the vector of local derivatives, given a vector of abscissae and ordinates. The vectors must be sorted by ascending x. The derivatives will be centered if possible, left or right otherwise, or nodata if nothing else can be computed.

Parameters

X	vector of abscissae
Y	vector of ordinates

Returns: vector of local derivatives
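
The centered / left / right / nodata fallback described above can be sketched as follows. This is an assumption-laden illustration, not the library code: derivative_sketch is a hypothetical name, the nodata marker is assumed to be -999, the centered form used here is the difference across both neighbours, and nodata values inside the inputs are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): finite-difference derivatives
// on abscissae sorted in ascending order.
std::vector<double> derivative_sketch(const std::vector<double>& X, const std::vector<double>& Y)
{
    const double nodata = -999.; // assumed nodata marker, for this sketch only
    assert(X.size() == Y.size());
    const std::size_t n = X.size();
    std::vector<double> der(n, nodata);

    for (std::size_t i = 0; i < n; ++i) {
        // A neighbour is usable only if it exists and has a distinct abscissa.
        const bool has_left = (i > 0) && (X[i] != X[i-1]);
        const bool has_right = (i + 1 < n) && (X[i+1] != X[i]);

        if (has_left && has_right)
            der[i] = (Y[i+1] - Y[i-1]) / (X[i+1] - X[i-1]); // centered
        else if (has_left)
            der[i] = (Y[i] - Y[i-1]) / (X[i] - X[i-1]);     // left
        else if (has_right)
            der[i] = (Y[i+1] - Y[i]) / (X[i+1] - X[i]);     // right
        // otherwise the entry keeps the nodata marker
    }
    return der;
}
```

With a single point, neither neighbour exists and the only entry stays at the nodata marker.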

◆ equalBin()

void mio::Interpol1D::equalBin	(	const unsigned int	k,
		std::vector< double > &	X,
		std::vector< double > &	Y
	)

static

Data binning method. This bins the data into k classes of equal width (see https://en.wikipedia.org/wiki/Data_binning).

Parameters

k	number of classes
X	vector of abscissae
Y	vector of ordinates

◆ equalCountBin()

void mio::Interpol1D::equalCountBin	(	const unsigned int	k,
		std::vector< double > &	X,
		std::vector< double > &	Y
	)

static

Data binning method. This bins the data into k classes containing an equal number of elements (see https://en.wikipedia.org/wiki/Data_binning). The number of elements per class is adjusted in order to reduce unevenness between classes: for example, when distributing 100 elements in 8 classes, this will generate 4 classes of 13 elements and 4 classes of 12 elements.

Parameters

k	number of classes
X	vector of abscissae
Y	vector of ordinates
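
The class-size arithmetic in the 100-elements-in-8-classes example can be reproduced with the sketch below. The helper name bin_sizes_sketch is hypothetical, and giving the extra elements to the first classes is an assumption; the text above does not say which classes receive them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): sizes of k classes that hold
// n elements as evenly as possible.
std::vector<std::size_t> bin_sizes_sketch(const std::size_t n, const unsigned int k)
{
    assert(k > 0);
    const std::size_t base = n / k;  // minimum number of elements per class
    const std::size_t extra = n % k; // this many classes get one more element
    std::vector<std::size_t> sizes(k, base);
    for (std::size_t i = 0; i < extra; ++i)
        sizes[i] += 1; // assumption: the first classes take the remainder
    return sizes;
}
```

For n = 100 and k = 8, the base size is 12 and the remainder is 4, which gives the 4 classes of 13 and 4 classes of 12 quoted above.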

◆ ExpRegression()

void mio::Interpol1D::ExpRegression	(	const std::vector< double > &	X,
		const std::vector< double > &	Y,
		double &	a,
		double &	b,
		double &	r,
		std::string &	mesg
	)

static

Computes the power regression coefficients fitting the points given as X and Y in two vectors. The power regression has the form Y = b*X^a with a regression coefficient r (it is nodata safe).

Parameters

X	vector of X coordinates
Y	vector of Y coordinates (same order as X)
a	slope of the regression
b	origin of the regression
r	regression coefficient
mesg	information message if something fishy is detected

◆ getBoxMuller()

double mio::Interpol1D::getBoxMuller ( )

static

Box–Muller method for normally distributed random numbers.

This generates a normally distributed signal of mean=0 and std_dev=1. For numerical reasons, the extremes will always be less than 7 * std_dev from the mean. See https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform

Note: Do not forget to initialize the (pseudo) random number generator! Something like "srand( static_cast<unsigned int>(time(nullptr)) );"

Returns: normally distributed number
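
For readers unfamiliar with the transform, a minimal sketch of the basic Box-Muller form follows. The function names are hypothetical and the way the uniform deviates are drawn from rand() is an assumption, not necessarily what getBoxMuller() does internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: basic Box-Muller transform mapping two uniform
// deviates, u1 in (0,1] and u2 in [0,1), to one standard normal deviate.
double box_muller_sketch(const double u1, const double u2)
{
    assert(u1 > 0. && u1 <= 1.);
    const double two_pi = 2. * std::acos(-1.);
    return std::sqrt(-2. * std::log(u1)) * std::cos(two_pi * u2);
}

// Hypothetical wrapper drawing the uniforms from rand(); the caller is
// expected to have seeded the generator with srand() beforehand.
double box_muller_rand_sketch()
{
    const double u1 = (std::rand() + 1.) / (RAND_MAX + 1.); // in (0,1], avoids log(0)
    const double u2 = std::rand() / (RAND_MAX + 1.);        // in [0,1)
    return box_muller_sketch(u1, u2);
}
```

In this sketch the smallest possible u1 is 1/(RAND_MAX+1), so the magnitude of the result cannot exceed sqrt(2*ln(RAND_MAX+1)). For a 31-bit rand() that is roughly 6.6, which is consistent with the bound of 7 * std_dev mentioned above.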

◆ getMedian()

double mio::Interpol1D::getMedian	(	const std::vector< double > &	vecData,
		const bool &	keep_nodata = true
	)

static

◆ getMedianAverageDeviation()

double mio::Interpol1D::getMedianAverageDeviation	(	std::vector< double >	vecData,
		const bool &	keep_nodata = true
	)

static

◆ LinRegression()

void mio::Interpol1D::LinRegression	(	const std::vector< double > &	X,
		const std::vector< double > &	Y,
		double &	a,
		double &	b,
		double &	r,
		std::string &	mesg,
		const bool &	fixed_rate = false
	)

static

Computes the linear regression coefficients fitting the points given as X and Y in two vectors. The linear regression has the form Y = aX + b with a regression coefficient r (it is nodata safe).

Parameters

X	vector of X coordinates
Y	vector of Y coordinates (same order as X)
a	slope of the linear regression
b	origin of the linear regression
r	absolute value of linear regression coefficient
mesg	information message if something fishy is detected
fixed_rate	force the lapse rate? (default=false)
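
To make the roles of a, b and r concrete, here is a minimal ordinary least-squares sketch for Y = aX + b. It is not the library routine: lin_regression_sketch is a hypothetical name, nodata filtering, the mesg output and the fixed_rate option are omitted, and returning r = 1 for a perfectly flat Y is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): least-squares fit of
// Y = a*X + b, with r returned as the absolute value of Pearson's r.
void lin_regression_sketch(const std::vector<double>& X, const std::vector<double>& Y,
                           double& a, double& b, double& r)
{
    assert(X.size() == Y.size() && X.size() >= 2);
    const double n = static_cast<double>(X.size());

    double meanX = 0., meanY = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) {
        meanX += X[i];
        meanY += Y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxx = 0., sxy = 0., syy = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) {
        const double dx = X[i] - meanX;
        const double dy = Y[i] - meanY;
        sxx += dx * dx;
        sxy += dx * dy;
        syy += dy * dy;
    }
    assert(sxx > 0.); // all X identical: the slope is undefined

    a = sxy / sxx;         // slope
    b = meanY - a * meanX; // origin (intercept)
    // Sketch-only convention: a constant Y is fitted exactly, so report r = 1.
    r = (syy > 0.) ? std::fabs(sxy) / std::sqrt(sxx * syy) : 1.;
}
```

Points lying exactly on a line recover its slope and intercept, with r equal to 1.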

◆ LogRegression()

void mio::Interpol1D::LogRegression	(	const std::vector< double > &	X,
		const std::vector< double > &	Y,
		double &	a,
		double &	b,
		double &	r,
		std::string &	mesg
	)

static

Computes the log regression coefficients fitting the points given as X and Y in two vectors. The log regression has the form Y = a*ln(X) + b with a regression coefficient r (it is nodata safe).

Parameters

X	vector of X coordinates
Y	vector of Y coordinates (same order as X)
a	slope of the regression
b	origin of the regression
r	regression coefficient
mesg	information message if something fishy is detected
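
One common way to obtain a fit of the form Y = a*ln(X) + b is to run a linear least-squares fit on the transformed abscissae ln(X). The sketch below illustrates that idea under stated assumptions: log_regression_sketch is a hypothetical name, points with X <= 0 are simply skipped, and the r and mesg outputs are omitted. Whether LogRegression() proceeds this way internally is not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): fits Y = a*ln(X) + b by
// ordinary least squares on the pairs (ln(X), Y).
void log_regression_sketch(const std::vector<double>& X, const std::vector<double>& Y,
                           double& a, double& b)
{
    assert(X.size() == Y.size());
    std::vector<double> u, v;
    for (std::size_t i = 0; i < X.size(); ++i) {
        if (X[i] <= 0.) continue; // ln(X) is undefined: skip (sketch-only choice)
        u.push_back(std::log(X[i]));
        v.push_back(Y[i]);
    }
    assert(u.size() >= 2);
    const double n = static_cast<double>(u.size());

    double meanU = 0., meanV = 0.;
    for (std::size_t i = 0; i < u.size(); ++i) {
        meanU += u[i];
        meanV += v[i];
    }
    meanU /= n;
    meanV /= n;

    double suu = 0., suv = 0.;
    for (std::size_t i = 0; i < u.size(); ++i) {
        suu += (u[i] - meanU) * (u[i] - meanU);
        suv += (u[i] - meanU) * (v[i] - meanV);
    }
    assert(suu > 0.); // all ln(X) identical: the slope is undefined

    a = suv / suu;
    b = meanV - a * meanU;
}
```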

◆ max_element()

double mio::Interpol1D::max_element ( const std::vector< double > & X )

static

◆ min_element()

double mio::Interpol1D::min_element ( const std::vector< double > & X )

static

◆ NashSuttcliffe()

double mio::Interpol1D::NashSuttcliffe	(	const std::vector< double > &	obs,
		const std::vector< double > &	sim
	)

static

Computes the Nash-Sutcliffe model efficiency coefficient for two vectors. It is assumed that the same indices contain the same timesteps. A value of 1 means a perfect match; a value of zero means that no variance is reproduced (see https://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient).

Parameters

obs	vector of observed data
sim	vector of simulated data

Returns: Nash-Sutcliffe coefficient, in the interval ]-∞, 1]
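
The textbook definition of the coefficient is 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). A minimal sketch of that formula follows; the name nse_sketch is hypothetical and nodata values are not handled, so this is an illustration rather than the library code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): Nash-Sutcliffe efficiency,
// i.e. one minus the ratio of the squared model error to the spread of the
// observations around their mean.
double nse_sketch(const std::vector<double>& obs, const std::vector<double>& sim)
{
    assert(obs.size() == sim.size() && !obs.empty());

    double mean_obs = 0.;
    for (std::size_t i = 0; i < obs.size(); ++i) mean_obs += obs[i];
    mean_obs /= static_cast<double>(obs.size());

    double err = 0., spread = 0.;
    for (std::size_t i = 0; i < obs.size(); ++i) {
        err += (obs[i] - sim[i]) * (obs[i] - sim[i]);
        spread += (obs[i] - mean_obs) * (obs[i] - mean_obs);
    }
    assert(spread > 0.); // constant observations: the coefficient is undefined
    return 1. - err / spread;
}
```

A simulation identical to the observations gives 1, while a simulation that always returns the mean of the observations gives 0.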

◆ NoisyLinRegression()

void mio::Interpol1D::NoisyLinRegression	(	const std::vector< double > &	in_X,
		const std::vector< double > &	in_Y,
		double &	A,
		double &	B,
		double &	R,
		std::string &	mesg,
		const bool &	fixed_rate = false
	)

static

Computes the linear regression coefficients fitting the points given as X and Y in two vectors. The linear regression has the form Y = aX + b with a regression coefficient r. If the regression coefficient is not good enough, it tries to remove bad points (up to 15% of the initial data set can be removed, keeping at least 4 points).

Parameters

in_X	vector of X coordinates
in_Y	vector of Y coordinates (same order as X)
A	slope of the linear regression
B	origin of the linear regression
R	linear regression coefficient
mesg	information message if something fishy is detected
fixed_rate	force the lapse rate? (default=false)

◆ Pearson()

double mio::Interpol1D::Pearson	(	const std::vector< double > &	X,
		const std::vector< double > &	Y
	)

static

Computes the Pearson product-moment correlation coefficient in a more numerically efficient manner than "corr".

Parameters

X	first vector of data
Y	second vector of data

Returns: correlation coefficient

◆ quantiles()

std::vector< double > mio::Interpol1D::quantiles	(	const std::vector< double > &	X,
		const std::vector< double > &	quartiles
	)

static

This function returns a vector of quantiles. The vector does not have to be sorted. See https://secure.wikimedia.org/wikipedia/en/wiki/Quartile for more. This code is heavily inspired by Ken Wilder, https://sites.google.com/site/jivsoft/Home/compute-ranks-of-elements-in-a-c---array-or-vector (quantile method, replacing the nth-element call by direct access to a sorted vector).

Parameters

X	vector to classify
quartiles	vector of quantile levels to compute, each between 0 and 1

Returns: vector of ordinates of the quantiles
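
A quantile lookup by direct access to a sorted copy can be sketched as below. The linear interpolation between the two neighbouring ranks is an assumption about the method, and the name quantiles_sketch is hypothetical; nodata filtering is not included.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): quantiles read from a sorted
// copy of X, interpolating linearly between the two closest ranks.
// X is taken by value so the caller's vector is left untouched.
std::vector<double> quantiles_sketch(std::vector<double> X, const std::vector<double>& levels)
{
    assert(!X.empty());
    std::sort(X.begin(), X.end());

    std::vector<double> out;
    out.reserve(levels.size());
    for (std::size_t i = 0; i < levels.size(); ++i) {
        const double q = levels[i];
        assert(q >= 0. && q <= 1.);
        const double pos = q * static_cast<double>(X.size() - 1); // fractional rank
        const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
        const std::size_t hi = static_cast<std::size_t>(std::ceil(pos));
        const double h = pos - static_cast<double>(lo); // weight of the upper rank
        out.push_back((1. - h) * X[lo] + h * X[hi]);
    }
    return out;
}
```

Passing the levels 0.25, 0.5 and 0.75 yields the three quartiles in one call.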

◆ quantiles_core()

std::vector< double > mio::Interpol1D::quantiles_core	(	std::vector< double >	X,
		const std::vector< double > &	quartiles
	)

static

This function returns a vector of quantiles, but does not filter out nodata values! The vector does not have to be sorted. See https://secure.wikimedia.org/wikipedia/en/wiki/Quartile for more. This code is heavily inspired by Ken Wilder, https://sites.google.com/site/jivsoft/Home/compute-ranks-of-elements-in-a-c---array-or-vector (quantile method, replacing the nth-element call by direct access to a sorted vector).

Parameters

X	vector to classify (nodata values processed as normal values)
quartiles	vector of quantile levels to compute, each between 0 and 1

Returns: vector of ordinates of the quantiles

◆ R2()

double mio::Interpol1D::R2	(	const std::vector< double > &	obs,
		const std::vector< double > &	sim
	)

static

Computes the R2 coefficient of determination. See https://en.wikipedia.org/wiki/Coefficient_of_determination and https://en.wikipedia.org/wiki/Fraction_of_variance_unexplained.

Parameters

obs	vector of observed data
sim	vector of simulated data

Returns: coefficient of determination

◆ sort()

void mio::Interpol1D::sort	(	std::vector< double > &	X,
		std::vector< double > &	Y,
		const bool &	keep_nodata = true
	)

static

This function sorts the X and Y vectors by increasing X. Unless keep_nodata is set, the nodata values (both in X and Y) are removed, meaning that the vector length might not be kept.

Parameters

X	vector of abscissae
Y	vector of ordinates
keep_nodata	should nodata values be kept? (default=true)
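
Sorting two parallel vectors by the first one is usually done by zipping them into pairs. The sketch below shows that approach under assumptions: sort_pairs_sketch is a hypothetical name, the nodata marker is taken to be -999, and a point is dropped when keep_nodata is false and either coordinate is nodata.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Interpol1D): sorts X and Y together by
// increasing X, optionally dropping points that contain a nodata value.
void sort_pairs_sketch(std::vector<double>& X, std::vector<double>& Y, const bool keep_nodata = true)
{
    const double nodata = -999.; // assumed nodata marker, for this sketch only
    assert(X.size() == Y.size());

    std::vector< std::pair<double, double> > pairs;
    pairs.reserve(X.size());
    for (std::size_t i = 0; i < X.size(); ++i) {
        const bool is_nodata = (X[i] == nodata || Y[i] == nodata);
        if (!keep_nodata && is_nodata) continue; // drop the whole point
        pairs.push_back(std::make_pair(X[i], Y[i]));
    }

    // std::pair compares by first (X), then by second (Y) on ties.
    std::sort(pairs.begin(), pairs.end());

    X.clear();
    Y.clear();
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        X.push_back(pairs[i].first);
        Y.push_back(pairs[i].second);
    }
}
```

With keep_nodata set to false the output vectors can be shorter than the inputs, which is why callers should not rely on the original length.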

◆ std_dev()

double mio::Interpol1D::std_dev ( const std::vector< double > & X )

static

◆ twoLinRegression()

void mio::Interpol1D::twoLinRegression	(	const std::vector< double > &	in_X,
		const std::vector< double > &	in_Y,
		const double &	bilin_inflection,
		std::vector< double > &	coeffs
	)

static

Computes the bi-linear regression coefficients fitting the points given as X and Y in two vectors. We consider that the regression can be made with 2 linear segments with a fixed inflection point. It relies on Interpol1D::NoisyLinRegression.

Parameters

in_X	vector of X coordinates
in_Y	vector of Y coordinates (same order as X)
bilin_inflection	inflection point abscissa
coeffs	a,b,r coefficients in a vector

◆ variance()

double mio::Interpol1D::variance ( const std::vector< double > & X )

static

Computes the variance of a vector of data. It is computed using a compensated variance algorithm (see https://secure.wikimedia.org/wikipedia/en/wiki/Algorithms_for_calculating_variance) in order to be more robust to small variations around the mean.

Parameters

X	vector of data

Returns: variance or IOUtils::nodata
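
The compensated two-pass algorithm from the referenced article can be sketched as follows. This is an illustration, not the library code: variance_sketch is a hypothetical name, the sample (n - 1) normalisation is an assumption, and the nodata return path is replaced by an assertion.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): compensated two-pass variance.
// The second pass also accumulates the plain sum of deviations, which would be
// zero in exact arithmetic, and uses it to correct the rounding error of the mean.
double variance_sketch(const std::vector<double>& X)
{
    assert(X.size() >= 2);
    const double n = static_cast<double>(X.size());

    double mean = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) mean += X[i];
    mean /= n;

    double sum2 = 0., sum3 = 0.;
    for (std::size_t i = 0; i < X.size(); ++i) {
        const double d = X[i] - mean;
        sum2 += d * d; // sum of squared deviations
        sum3 += d;     // compensation term
    }
    return (sum2 - sum3 * sum3 / n) / (n - 1.); // sample variance (assumed)
}
```

Shifting all values by a large constant leaves the result unchanged here, which is the case where the naive sum-of-squares formula tends to lose precision.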

◆ weightedMean() [1/2]

double mio::Interpol1D::weightedMean	(	const double &	d1,
		const double &	d2,
		const double &	weight = 1.
	)

static

This function returns the weighted arithmetic mean of two numbers. A weight of 0 returns d1, a weight of 1 returns d2, a weight of 0.5 returns a centered mean. See https://secure.wikimedia.org/wikipedia/en/wiki/Weighted_mean for more.

Parameters

d1	first value
d2	second value
weight	weight to apply to the mean

Returns: weighted arithmetic mean
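
The behaviour described for weights of 0, 1 and 0.5 corresponds to a linear blend of the two values. A minimal sketch, using the hypothetical name weighted_mean2_sketch and ignoring nodata inputs:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Interpol1D): linear blend of d1 and d2,
// where weight is the share given to d2.
double weighted_mean2_sketch(const double d1, const double d2, const double weight)
{
    return (1. - weight) * d1 + weight * d2;
}
```

Weights outside [0, 1] extrapolate beyond the two values in this sketch; whether the library accepts them is not stated here.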

◆ weightedMean() [2/2]

double mio::Interpol1D::weightedMean	(	const std::vector< double > &	vecData,
		const std::vector< double > &	weight
	)

static

This function returns the weighted arithmetic mean of a vector. See https://secure.wikimedia.org/wikipedia/en/wiki/Weighted_mean for more.

Parameters

vecData	vector of values
weight	weights to apply to the mean

Returns: weighted arithmetic mean
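
For a vector, the weighted arithmetic mean is sum(w_i * x_i) / sum(w_i). The sketch below implements that definition under the hypothetical name weighted_mean_sketch; it assumes one weight per value and does not filter nodata values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Interpol1D): weighted arithmetic mean,
// normalised by the sum of the weights.
double weighted_mean_sketch(const std::vector<double>& vecData, const std::vector<double>& weight)
{
    assert(vecData.size() == weight.size() && !vecData.empty());
    double num = 0., den = 0.;
    for (std::size_t i = 0; i < vecData.size(); ++i) {
        num += weight[i] * vecData[i];
        den += weight[i];
    }
    assert(den != 0.); // weights summing to zero leave the mean undefined
    return num / den;
}
```

With equal weights this reduces to the plain arithmetic mean.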
