I tried to implement a propagate() function that computes the cost function and its gradient, knowing that:
Forward propagation:

- You get X.
- You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \ldots, a^{(m)})$.
- You calculate the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right)$ (a small numeric sketch of this follows below).
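As a quick illustration of the forward pass and cost above, here is a minimal sketch with tiny made-up values (one feature, three examples; none of these numbers come from the assignment):

```python
import numpy as np

# Toy sketch of the forward pass; values are made up, not the assignment's data.
w = np.array([[0.5]])              # one feature, so w has shape (1, 1)
b = 0.0
X = np.array([[1.0, -2.0, 3.0]])   # shape (1, m) with m = 3 examples
Y = np.array([[1, 0, 1]])

m = X.shape[1]
A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))   # A = sigma(w^T X + b), shape (1, m)
cost = -1 / m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
print(A)      # activations, each in (0, 1)
print(cost)   # a single scalar
```

The elementwise product `Y * np.log(A)` in this sketch mirrors the per-example sum in the cost formula.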
Here are the two formulas I was using:
$$\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}$$

$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}$$
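One quick way to sanity-check formulas (7) and (8) is to confirm the shapes they produce with random placeholder arrays (a sketch only; the sizes below are arbitrary, not the assignment's data):

```python
import numpy as np

# Shape check only: random placeholders, not the assignment's data.
n, m = 4, 5                          # n features, m examples (arbitrary)
w = np.zeros((n, 1))
X = np.random.randn(n, m)
A = np.random.rand(1, m)             # stand-in for sigmoid activations
Y = np.random.randint(0, 2, size=(1, m))

dw = 1 / m * np.dot(X, (A - Y).T)    # formula (7): shape (n, 1), same as w
db = 1 / m * np.sum(A - Y)           # formula (8): a single number
print(dw.shape, w.shape)             # both (4, 1)
print(float(db))                     # a plain float
```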
Which I coded as:
```python
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """

    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T, X) + b)                                                    # compute activation
    cost = -1/m * np.sum(np.dot(Y.T, np.log(A)) + np.dot((1 - Y).T, np.log(1 - A)))    # compute cost
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = 1/m * np.dot(X, (A - Y).T)
    db = 1/m * np.sum(A - Y)
    ### END CODE HERE ###

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
```
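For completeness, propagate() calls a sigmoid() helper that is defined earlier in the assignment notebook and not shown here; a minimal stand-in (my assumption of its usual form, the element-wise logistic function) so the test call below can be run on its own would be:

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1 / (1 + np.exp(-z))
```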
But with the following example:
```python
w, b, X, Y = np.array([[1.], [2.]]), 2., np.array([[1., 2., -1.], [3., 4., -3.2]]), np.array([[1, 0, 1]])
grads, cost = propagate(w, b, X, Y)
```
The output was:
```
dw = [[ 0.99845601]
 [ 2.39507239]]
db = 0.00145557813678
cost = 10.6046359582
```
Whereas the expected output was