For a similar problem, I am trying to modify the code here (https://www.wolfram.com/language/11/image-and-signal-processing/image-recognition-using-deep-learning.html?product=mathematica) so that the distance between the prediction and the label matters, not merely right or wrong. That is, in that example, predicting a 7 instead of 9 is much better than predicting a 1.
Edit: Rather then using a “Class” for the NetDecoder, I am using a “Scalar.”
Edit 2: To make my question more clear, I think I have modified the code by essentially switching from “Class” to “Scalar” and predicting a continuum) and removing the softmax layer. I just was wondering the most elegant way to predict a use something like the digit recognition architecture to predict a continuous quantity (so mistaking a 9 for a 7 is better then for a 1).