Backpropagation error: conceptual or programming?

I wrote the following backpropagation algorithm to model a two-input identity function:

clc
% clear

nh = 3;              % neurons in hidden layer
ni = 2;              % neurons in input layer

eta = .001;          % the learning rate
traningSize =100;

for x=1:traningSize 
    input   = rand(traningSize,ni);
    test    = input;
end





nk = size(test,2);    % neurons in output layer

b1 = rand(1,nh);%+ .5;
b2 = rand(1,nk);%- .5;
w1 = rand(nh,ni) + .5;
w2 = rand(nk,nh) - .5;

figure
hold on;

for iter = 1 :5000
    errSq = 0;
    for x = 1: traningSize

        a0      = input(x,:);
        ex      = test(x,:);

        [a1, a2]= feedForward(a0,w1,w2,b1,b2);

        del2    = (a2-ex) .* (1-a2) .* (a2);
        del1    = (del2 * w2) .* (1-a1) .* (a1);

        delB2   = del2;
        delB1   = del1;

        delW2   = zeros(nk,nh);
        for i = 1:nh
            for j = 1:nk
                delW2   = a1(i) * del2(j);
            end
        end
        for i = 1:ni
            for j = 1:nh
                delW1   = a0(i) * del1(j);
            end
        end

        b2 = b2 - eta * delB2;
        b1 = b1 - eta * delB1;

        w2 = w2 - eta * delW2;
        w1 = w1 - eta * delW1;

        errSq = errSq + sum(a2-ex) .* sum(a2-ex);

    end

cost = errSq /(2 * traningSize);
plot(iter,cost,'o');

if cost < 0.005
    cost
    break
end

end    
cost

The feedForward function:

function [a1, a2]  = feedForward(a0,w1,w2,b1,b2)

    z1      = a0 * w1' + b1;
    a1      = sig(z1);
    z2      = a1 * w2' + b2;
    a2      = sig(z2);


end
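
For completeness, here is a minimal version of the sig helper, assumed here to be the element-wise logistic sigmoid (consistent with the a .* (1-a) derivative factors used in the deltas above):

function y = sig(z)
    % element-wise logistic sigmoid
    y = 1 ./ (1 + exp(-z));
end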

(Figure 1: plot of the cost function)

So what have I messed up?

Is there a programming error that I am overlooking, or have I implemented the algorithm incorrectly?

When I test with the weights obtained from training, the computed cost matches the trained cost, but the results are completely wrong:

(Test results: blue = expected output, red = output of the neural network)

Also, why does the cost sometimes rise before it starts to decrease (as seen in Figure 1)?

1 Answer

It turns out I needed to correct a programming error:

    delW2   = zeros(nk,nh);
    for i = 1:nh
        for j = 1:nk
            delW2(j,i)   = a1(i) * del2(j);   % forgot the index into delW2
        end
    end
    delW1   = zeros(nh,ni);                   % initialize delW1 (although this is optional)
    for i = 1:ni
        for j = 1:nh
            delW1(j,i)   = a0(i) * del1(j);   % forgot the index into delW1
        end
    end
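
For reference, each of those corrected double loops just builds an outer product, so the weight gradients can also be written in vectorized MATLAB, which should produce the same matrices as the loops above:

    delW2 = del2' * a1;   % nk-by-nh outer product
    delW1 = del1' * a0;   % nh-by-ni outer product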
    

In addition, I had to eliminate the sigmoid function from the output layer, i.e. make the output layer linear, to get acceptable results.
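
A minimal sketch of that change, keeping everything else in the training loop the same: the output activation becomes the identity, so its derivative is 1 and the a2 .* (1-a2) factor drops out of the output delta.

    % in feedForward: linear output layer
    z2   = a1 * w2' + b2;
    a2   = z2;                 % no sigmoid on the output

    % in the training loop: output delta without the sigmoid-derivative factor
    del2 = a2 - ex;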

I am not sure why a linear output layer is strictly required here, and would appreciate comments on that.
