I am interested in calculating various entropy measures of time series, and was surprised to find that none of them are included with Mathematica. Sample Entropy (SampEn) is a fairly common technique for this purpose, so I’ve written a module to perform the calculation, but I’m surprised at how slow it is (especially compared to my Python implementation).
SampEn[TheData_,TheWindowSize_,TheThreshold_]:= Module[{Num=Length[TheData],NumWindows,TheValueArray1,TheValueArray2,TheDifferences1,,TheDifferences2}, NumWindows=Num-TheWindowSize+1; TheValueArray1=Table[Table[TheData[[j]],{j,i,i+(TheWindowSize-1)}],{i,NumWindows}]; TheValueArray2=Table[Table[TheData[[j]],{j,i,i+(TheWindowSize)}],{i,NumWindows-1}]; TheDifferences1=Total[Table[Total[Table[If[ChessboardDistance[i,j]<TheThreshold,1,0],{j,TheValueArray1}]]-1,{i,TheValueArray1}]]; TheDifferences2=Total[Table[Total[Table[If[ChessboardDistance[i,j]<TheThreshold,1,0],{j,TheValueArray2}]]-1,{i,TheValueArray2}]]; -Log[N@(TheDifferences2/TheDifferences1)]]; RandomWalkingData=Accumulate[Table[1.0*RandomVariate[NormalDistribution[0,.4]],{t,1,1000,1}]]; m=2; r=0.2*StandardDeviation[RandomWalkingData]; Timing[SampEn[RandomWalkingData,m,r]]
This takes 40.04s
for 1000 datapoints, but with 5000 points it already takes 989.608s
. I also tried a variant that combined the Table
and Total
operations, but it wasn’t any faster.
TheDifferences1=Total[Table[If[ChessboardDistance[i,j]<TheThreshold,1,0],{j,TheValueArray1},{i,TheValueArray1}],2]-Length[TheValueArray1];
Are there any shortcuts or built-in functions that can be used to speed up this calculation? How can this be made more efficient?