I'm trying to reproduce the results published by the DeepCoder project (see https://arxiv.org/abs/1611.01989), specifically its neural network component.

Brief Overview:

DeepCoder's feed-forward neural network model is a multi-label, multi-class classifier: given a list of inputs and outputs of a black-box function, it predicts a vector of the functions that the black box may contain.

For example, given the function set [+, -, *, /] and the input-output set [0,1,2,3] -> [1,2,3,4], a correct prediction might be [1,0,0,0] (i.e., a prediction that the black-box function contains, but is not limited to, +).
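To make that encoding concrete, here is a minimal multi-hot sketch (my own illustration, not DeepCoder's code; the function set and its ordering are assumptions):

FUNCTION_SET = ['+', '-', '*', '/']  # hypothetical ordering

def attribute_vector(functions_used):
    # multi-hot vector marking which functions appear in the program
    return [1 if f in functions_used else 0 for f in FUNCTION_SET]

# [0,1,2,3] -> [1,2,3,4] is consistent with "+", so only '+' is marked:
print(attribute_vector({'+'}))  # [1, 0, 0, 0]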

Here's a graph view of my implemented neural network:

Breakdown of the implemented model:

Given two (N, 1) input tensors (corresponding to the input and output samples, respectively), the NN splits each tensor row-wise, passes each sub-tensor through an embedding layer, concatenates the input and output embeddings, and then stacks each of the N composite embeddings into a single tensor. Note that, per the spec in the DeepCoder paper, N corresponds to the number of samples per black-box function. The tensor then passes through 3 hidden layers of 256 ReLU/softmax units (tested with both), and is finally averaged into a 1D array at the last layer.
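For reference, here is a minimal sketch of that data flow (my own paraphrase of the description above, not the code below and not DeepCoder's released model) using whole-tensor ops instead of per-row splitting; it assumes each sample is a single integer, as in my code:

from keras.layers import (Input, Embedding, Concatenate, Dense,
                          GlobalAveragePooling1D)
from keras.models import Model

N, E, A = 5, 20, 7  # samples per program, embedding dim, attribute count

inp = Input(shape=(N,), dtype='int32')  # N input samples per program
out = Input(shape=(N,), dtype='int32')  # N output samples per program
ie = Embedding(50 + 1, E)(inp)          # -> (batch, N, E)
oe = Embedding(512 + 1, E)(out)         # -> (batch, N, E)
x = Concatenate(axis=-1)([ie, oe])      # -> (batch, N, 2E)
for _ in range(3):                      # Dense acts on the last axis
    x = Dense(256, activation='relu')(x)
x = Dense(A, activation='softmax')(x)   # per-sample attribute scores
y = GlobalAveragePooling1D()(x)         # average over the N samples
model = Model([inp, out], y)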

When training the NN to classify samples as (x + x) or (x - x), the prediction always comes out the same (usually whichever function was trained last). I also tested with (x / x), (x * x), cos(x), sin(x), and sqrt(x), with similarly incorrect results.
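A hypothetical sketch of how such (x + x) / (x - x) training samples could be generated (my actual data comes from the CSV files loaded below):

import numpy as np

def make_samples(fn, n=5, low=0, high=50):
    # n random integer inputs and the black-box function's outputs
    xs = np.random.randint(low, high, size=n)
    return xs, fn(xs)

ins_add, outs_add = make_samples(lambda x: x + x)  # labelled e.g. [1, 0, ...]
ins_sub, outs_sub = make_samples(lambda x: x - x)  # labelled e.g. [0, 1, ...]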

Question: Can anyone with neural network design experience spot any flaws in my NN layout? Perhaps a deviation from DeepCoder's spec?

Here's my code, written in Keras (with the TensorFlow backend):

import numpy as np
import pandas as pd
from keras import backend as K
from keras.layers import (Input, Embedding, Dense, Dropout, Concatenate,
                          Reshape, Lambda)
from keras.models import Model

hidden_size = 256        # units per hidden layer
inputs_per_sample = 1    # integers per input example
n_attribs = 7            # size of the predicted attribute vector
samples_per_iter = 5     # N: input-output examples per program

# custom functions for lambdas
def split(x, i):
    # select the i-th sample row (drops the batch dimension)
    return x[0, i]

def average(x):
    # average the per-sample predictions into one vector
    return K.mean(x, axis=0, keepdims=True)

def get_model(inputs_per_sample, embedding_dimension, samples_per_prog):
    inp_tens = Input(shape=(samples_per_prog, inputs_per_sample,), name="INPUT")
    out_tens = Input(shape=(samples_per_prog, 1,), name="OUTPUT")
    # shapes are (None, samples_per_prog, inputs_per_sample); the leading None is the batch dimension

    # embeddings: input values assumed in [0, 50], output values in [0, 512]
    inp_embedding = Embedding(50 + 1, embedding_dimension, input_length=inputs_per_sample, name='INPUT_EMBEDDER')
    out_embedding = Embedding(512 + 1, embedding_dimension, input_length=1, name='OUTPUT_EMBEDDER')
    l1 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_1')
    l2 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_2')
    l3 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_3')
    decode = Dense(n_attribs, activation='softmax', name="DECODER")
    combine = Concatenate(name='COMBINATOR')
    stack = Concatenate(axis=0, name='STACKER')
    reshape = Reshape((-1,), name='RESHAPER')

    output_tensor = []
    for i in range(samples_per_prog):
        # slice out the i-th input/output sample row
        input_tens_inter = Lambda(split, arguments={'i': i}, name='INP_SPLITTER_' + str(i))(inp_tens)
        outpt_tens_inter = Lambda(split, arguments={'i': i}, name='OUT_SPLITTER_' + str(i))(out_tens)
        # embed, flatten, and concatenate the input/output embeddings
        tens_inter = combine([reshape(inp_embedding(input_tens_inter)), reshape(out_embedding(outpt_tens_inter))])
        if i == 0:
            output_tensor = tens_inter
        else:
            # stack the composite embeddings along axis 0
            output_tensor = stack([tens_inter, output_tensor])

    l1_layers = Dropout(0.2, name='DROPOUT_1')(l1(output_tensor))
    l2_layers = Dropout(0.2, name='DROPOUT_2')(l2(l1_layers))
    l3_layers = Dropout(0.2, name='DROPOUT_3')(l3(l2_layers))

    decoder = decode(l3_layers)
    # average the per-sample predictions into a single attribute vector
    pooled_output = Lambda(average, name='AVERAGE_POOL', output_shape=(n_attribs,))(decoder)
    print(pooled_output.get_shape())

    model = Model(inputs=[inp_tens, out_tens], outputs=pooled_output)
    model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model


model = get_model(inputs_per_sample, 20, samples_per_iter)
print(model.summary())
#plot_model(model, to_file='ml_graph.png', show_shapes=True)

inputs = pd.read_csv("G:\\s_is.csv", header=None)
outputs = pd.read_csv("G:\\s_os.csv", header=None)
avs = pd.read_csv("G:\\s_avs.csv",  header=None)

ins = inputs
out = outputs
attribs = avs

for j in range(len(attribs)):
    print("j=" + str(j))
    # take the N samples belonging to program j and reshape to (1, N, 1)
    _inps = np.array(ins[j * samples_per_iter:(j + 1) * samples_per_iter])
    _outs = np.array(out[j * samples_per_iter:(j + 1) * samples_per_iter])
    _inps = np.reshape(_inps, (1, samples_per_iter, 1))
    _outs = np.reshape(_outs, (1, samples_per_iter, 1))
    _atts = np.array([attribs.T.get(j)])  # attribute vector for program j

    # fit on one program at a time, 1000 epochs each
    model.fit(x=[_inps, _outs], y=_atts, batch_size=10, epochs=1000)

# saving the model
model_json = model.to_json()
with open("C:\\Users\\James-MSI\\Desktop\\model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
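
For completeness, reloading the saved model later follows the standard Keras pattern (same paths as above):

from keras.models import model_from_json

with open("C:\\Users\\James-MSI\\Desktop\\model.json") as json_file:
    loaded_model = model_from_json(json_file.read())
loaded_model.load_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
loaded_model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])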

Thanks :)

James