I’m writing a neural network and am having trouble computing the gradient of the cost with respect to the weights. Supposedly the gradient is given by $\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j$, but I’m having problems with my matrix dimensions. If $a^{l-1}$ has dimensions $[n^{l-1}, 1]$, $\delta^l$ has dimensions $[n^{l}, 1]$, and ...
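
A minimal sketch of how the shapes can work out, assuming the common convention that $w^l$ has dimensions $[n^{l}, n^{l-1}]$ (an assumption, since the question is cut off before stating it). Under that convention the index formula says entry $(j, k)$ of the gradient is $a^{l-1}_k \delta^l_j$, which is the outer product $\delta^l (a^{l-1})^T$ rather than an ordinary product of the two column vectors. The function name `weight_gradient` and the sample values are hypothetical.

```python
def weight_gradient(delta, a_prev):
    """Return dC/dw as a nested list with shape [n_l, n_prev].

    Assumes w has shape [n_l, n_prev] (hypothetical convention).
    Entry [j][k] is a_prev[k] * delta[j], matching the index formula.
    """
    return [[d_j * a_k for a_k in a_prev] for d_j in delta]


# Hypothetical sample values: n_l = 3 neurons in layer l, n_prev = 2 in layer l-1.
delta = [1.0, 2.0, 3.0]   # error term for layer l, length n_l
a_prev = [0.5, -1.0]      # activations of layer l-1, length n_prev

grad = weight_gradient(delta, a_prev)
rows, cols = len(grad), len(grad[0])
print(rows, cols)  # prints "3 2", i.e. [n_l, n_prev]
```

With NumPy column vectors of shapes `(n_l, 1)` and `(n_prev, 1)`, the same result would come from `delta @ a_prev.T`, or from `np.outer(delta, a_prev)` for 1-D arrays. Multiplying the two column vectors directly fails because the inner dimensions do not match.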