-->
Recently, I needed to calculate geometric means after a break of a number of years and ended up racking a weary brain until the method came back to me. So that I am not in the same situation again, I am recording it here, and sharing it is always good too.
The first step is to take the natural log (to base e, the irrational mathematical constant that is approximately 2.718281828) of the actual values in the data. Since the log of zero is undefined, the solution is either to exclude those values or to substitute a small positive value, as is done in the data step below. Note that the substitute does influence the result, since the log of a very small number is a large negative number, so it needs to be chosen with care. In SAS, the LOG function uses the number e as its base, and you need to use the LOG10 equivalent when base 10 logs are needed.
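data temp;
    set temp;
    /* log(0) is undefined, so swap zeros for a small positive value */
    if result=0 then _result=0.000001;
    else _result=result;
    /* LOG is the natural (base e) logarithm in SAS */
    ln_result=log(_result);
run;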
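If excluding the zero values suits the analysis better than substituting for them, the other option mentioned above could be sketched as follows. This is only an illustration: the output data set name temp_pos is made up for the example, and the WHERE condition also drops negative and missing values, since none of those have a usable log.

```sas
/* Hypothetical alternative: drop non-positive values instead of substituting */
data temp_pos;              /* temp_pos is an illustrative name */
    set temp;
    where result > 0;       /* excludes zeros, negatives and missing values */
    ln_result = log(result);
run;
```

If this route is taken, the later steps would need to read from temp_pos rather than temp.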
Next, the mean of the log values is determined, and you can use any method of doing that so long as it gives the correct values. PROC MEANS is used here, but PROC SUMMARY, PROC UNIVARIATE or even the MEAN function in PROC SQL would do as well. PROC SUMMARY is identical to PROC MEANS except that it defaults to data set creation while PROC MEANS generates printed output by default, which is why the NOPRINT option is needed here.
proc means data=temp noprint;
    var ln_result;
    /* write the mean of the logged values to data set MEAN */
    output out=mean mean=mean;
run;
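For comparison, the PROC SQL route mentioned above might look like the following. It is a minimal sketch that assumes the same temp data set and ln_result variable created earlier, and it writes to the same mean data set so that the next step works unchanged.

```sas
/* Sketch: mean of the logged values using PROC SQL instead of PROC MEANS */
proc sql;
    create table mean as
    select mean(ln_result) as mean   /* one argument, so MEAN aggregates over rows */
    from temp;
quit;
```

Unlike the PROC MEANS output, this data set holds only the mean variable, without the automatic _TYPE_ and _FREQ_ variables.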
With the mean of the log values obtained, we need to take the exponential of the obtained value(s) using the EXP function. This returns values on the same scale as the original data, using the formula e^mean, that is, e raised to the power of the mean of the logs.
data gmean;
    set mean;
    /* back-transform the mean of the logs to the original scale */
    gmean=exp(mean);
run;
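For reference, the three steps amount to the identity below, where x_1 to x_n stand for the positive data values and n for their count (symbols introduced here purely for illustration). As a quick check with the values 2 and 8: the natural logs are about 0.6931 and 2.0794, their mean is about 1.3863, and the exponential of that is 4, which matches the square root of 2 × 8 = 16.

```latex
\[
\mathrm{GM}(x_1,\dots,x_n)
  = \exp\!\left(\frac{1}{n}\sum_{i=1}^{n}\ln x_i\right)
  = \left(\prod_{i=1}^{n} x_i\right)^{1/n}
\]
```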