Backpropagation error: conceptual or programming?

I wrote the following backpropagation algorithm to model a two-input identity function:

clc
% clear

nh = 3;              % neurons in hidden layer
ni = 2;              % neurons in input layer

eta = .001;          % the learning rate
traningSize =100;

for x=1:traningSize 
    input   = rand(traningSize,ni);
    test    = input;
end





nk = size(test,2);    % neurons in output layer

b1 = rand(1,nh);%+ .5;
b2 = rand(1,nk);%- .5;
w1 = rand(nh,ni) + .5;
w2 = rand(nk,nh) - .5;

figure
hold on;

for iter = 1 :5000
    errSq = 0;
    for x = 1: traningSize

        a0      = input(x,:);
        ex      = test(x,:);

        [a1, a2]= feedForward(a0,w1,w2,b1,b2);

        del2    = (a2-ex) .* (1-a2) .* (a2);
        del1    = (del2 * w2) .* (1-a1) .* (a1);

        delB2   = del2;
        delB1   = del1;

        delW2   = zeros(nk,nh);
        for i = 1:nh
            for j = 1:nk
                delW2   = a1(i) * del2(j);
            end
        end
        for i = 1:ni
            for j = 1:nh
                delW1   = a0(i) * del1(j);
            end
        end

        b2 = b2 - eta * delB2;
        b1 = b1 - eta * delB1;

        w2 = w2 - eta * delW2;
        w1 = w1 - eta * delW1;

        errSq = errSq + sum(a2-ex) .* sum(a2-ex);

    end

cost = errSq /(2 * traningSize);
plot(iter,cost,'o');

if cost < 0.005
    cost
    break
end

end    
cost

The feedForward function:

function [a1, a2]  = feedForward(a0,w1,w2,b1,b2)

    z1      = a0 * w1' + b1;
    a1      = sig(z1);
    z2      = a1 * w2' + b2;
    a2      = sig(z2);


end
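
For completeness, here is a minimal version of the sig helper, assumed here to be the element-wise logistic sigmoid (consistent with the a .* (1-a) derivative factors used in the deltas above):

function y = sig(z)
    % element-wise logistic sigmoid
    y = 1 ./ (1 + exp(-z));
end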

(Figure 1: plot of the cost function)

So what have I messed up?

Is there a programming error that I am overlooking, or have I implemented the algorithm incorrectly?

When I test with the weights obtained from training, the computed cost matches the trained cost, but the results are completely wrong:

(Test results: blue = expected output, red = output of the neural network)

Also, why does the cost sometimes rise before it starts to decrease (as seen in Figure 1)?

1 Answer

It turns out I needed to correct a programming error:

    delW2   = zeros(nk,nh);
    for i = 1:nh
        for j = 1:nk
            delW2(j,i)   = a1(i) * del2(j);   % forgot the index into delW2
        end
    end
    delW1   = zeros(nh,ni);                   % initialize delW1 (although this is optional)
    for i = 1:ni
        for j = 1:nh
            delW1(j,i)   = a0(i) * del1(j);   % forgot the index into delW1
        end
    end
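
For reference, each of those corrected double loops just builds an outer product, so the weight gradients can also be written in vectorized MATLAB, which should produce the same matrices as the loops above:

    delW2 = del2' * a1;   % nk-by-nh outer product
    delW1 = del1' * a0;   % nh-by-ni outer product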
    

In addition, I had to eliminate the sigmoid function from the output layer, i.e. make the output layer linear, to get acceptable results.
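
A minimal sketch of that change, keeping everything else in the training loop the same: the output activation becomes the identity, so its derivative is 1 and the a2 .* (1-a2) factor drops out of the output delta.

    % in feedForward: linear output layer
    z2   = a1 * w2' + b2;
    a2   = z2;                 % no sigmoid on the output

    % in the training loop: output delta without the sigmoid-derivative factor
    del2 = a2 - ex;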

I am not sure why a linear output layer is strictly required here, and would appreciate comments on that.
