
Random facts

Benchmarks

In order to evaluate how we are doing performance-wise, we regularly run benchmarks, profilers, etc. Sometimes it is also instructive to compare how versions that are years apart perform.

Between versions svn450 and svn988

Version 450 still has a feature set that allows comparing enough things related to resampling and reaccumulation. It does not yet support arbitrary parameters (a feature that proved quite costly) and its resampling/reaccumulation code was quite broken (it made some assumptions that helped performance but made it very dependent on the initial sampling rate). Enough filters were implemented, but the most time-consuming filters were not available in this version.

Please also keep in mind that many bugs have been fixed between these versions, lots of extra safety checks and warnings have been added, and many features and much more flexibility have been added, all of which comes with some extra computational cost. Therefore some versions in between were significantly slower than both.

Several workloads have been compared: a single SMET file with one year of data, representing a typical, easy SNOWPACK simulation; the same input, but with a WINDOW_SIZE large enough that ten-day gaps could be re-interpolated, as usually set up for operational purposes; and 10 SMET files with one year of data each, representing a very simplistic Alpine3D simulation.

Workload          svn450    svn988   speedup
Snowpack           9.845     0.704     14
Snowpack-opera    12.973     1.948     6.7
Alpine3d         130.412    11.497    11.4

For the IOUtils::seek optimization

Since the sampling rate is constant most of the time, we can reduce the search window around a predicted position. The 12-year WFJ data set has been used with data_converter (without writing the data out) to benchmark the effect of this optimization (average of 10 runs, after one preconditioning run).
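
A minimal sketch of this idea, assuming strictly increasing timestamps and taking the window half-width as a fraction of the data set size (the function name, signature and fallback strategy are illustrative, not the actual IOUtils::seek code):

```cpp
#include <vector>
#include <algorithm>
#include <cstddef>

// Illustrative sketch: return the index of the first timestamp >= target in a
// sorted, strictly increasing vector. A reduced window around a predicted
// position is tried first; only when the target is not guaranteed to lie
// inside it do we fall back to a binary search over the whole vector.
size_t seek_with_hint(const std::vector<double>& timestamps, double target,
                      size_t predicted_idx, double window_fraction = 0.15)
{
	const size_t n = timestamps.size();
	const size_t half_window = static_cast<size_t>(window_fraction * static_cast<double>(n)) + 1;

	// Clamp the reduced window [lo, hi) to the valid index range
	const size_t lo = (predicted_idx > half_window) ? predicted_idx - half_window : 0;
	const size_t hi = std::min(predicted_idx + half_window, n);

	if (lo < hi && timestamps[lo] <= target && (hi == n || target <= timestamps[hi - 1])) {
		// The answer provably lies within [lo, hi): search only this window
		return static_cast<size_t>(std::lower_bound(timestamps.begin() + lo,
		                                            timestamps.begin() + hi, target)
		                           - timestamps.begin());
	}

	// Fallback: full binary search over the whole data set
	return static_cast<size_t>(std::lower_bound(timestamps.begin(), timestamps.end(), target)
	                           - timestamps.begin());
}
```

With a constant sampling rate, the predicted index can simply be the previously found index plus the expected number of steps, so nearly all lookups stay within the reduced window; the table below shows the measured gain is modest but consistent.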

Optimization      total duration
none                 8.201037
+/- 15%              8.182136
+/- 10%              8.150069

Profiling

Version 1086 has been profiled by putting timers in IOManager, running meteo_reading on one example station (one data point) as well as converting 2 months of data from an example station (multiple data points). The filters are two-step min_max filters (a hard filter followed by a soft one), ILWR is multiplied by 0 and has -999 added (forcing it to the nodata value), the resampling is linear except for VW (nearest neighbor) and HNW (hourly accumulation), and the only data generator is an Unsworth generator on ILWR.
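
A minimal sketch of what such instrumentation can look like, assuming a simple accumulating timer wrapped around the reading, processing and generating stages (the class and the stage boundaries are illustrative, not MeteoIO's actual IOManager code):

```cpp
#include <iostream>
#include <string>
#include <ctime>

// Illustrative accumulating timer: each start()/stop() pair adds the
// elapsed CPU time to a running total that is printed at the end.
class StageTimer {
	public:
		StageTimer(const std::string& i_name) : name(i_name), total(0.), start_time(0.) {}
		void start() { start_time = static_cast<double>(std::clock()) / CLOCKS_PER_SEC; }
		void stop() { total += static_cast<double>(std::clock()) / CLOCKS_PER_SEC - start_time; }
		void report() const { std::cout << name << ": " << total << " s\n"; }
	private:
		std::string name;
		double total, start_time;
};

// Hypothetical usage around the three stages of interest, so that their
// share of the total run time can be compared as in the tables below
int main() {
	StageTimer t_read("reading"), t_proc("processing"), t_gen("generating");

	t_read.start();
	// ... read the raw data from the plugin ...
	t_read.stop();

	t_proc.start();
	// ... filter and resample the data ...
	t_proc.stop();

	t_gen.start();
	// ... run the data generators ...
	t_gen.stop();

	t_read.report(); t_proc.report(); t_gen.report();
	return 0;
}
```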

The influence of resampling multiple data points can easily be seen in the increased share of data processing when converting 2 months of data. The profiling has been performed on an 8-year-old Pentium M 1.2 GHz laptop with 1.5 GB of RAM.

Stage          one data point      %    multiple data points      %
reading              0.205      100%            0.228            64%
processing           2.5e-4       0%            0.105            29%
generating           8.7e-5       0%            0.0246            7%
total                0.205      100%            0.3576          100%

The same version has been profiled on a Core2 2.6 GHz desktop, reading 12 years of WFJ hourly data (but not writing it out), either on an hourly basis (thus not requiring any resampling) or on a half-hourly basis (requiring resampling for all parameters).

Stage          no resampling      %    resampling      %
reading               3.389     43%       3.349      31%
filtering             2.092     27%       2.089      20%
resampling            1.911     24%       4.277      40%
generating            0.473      6%       0.941       9%
total                 7.865    100%      10.656     100%