I am trying to reproduce the results published by the DeepCoder project (see https://arxiv.org/abs/1611.01989), specifically its neural network component.
Brief Overview:
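DeepCoder's feed-forward neural network model is a multi-label, multi-class classifier: given a list of inputs and outputs of a black-box function, it predicts a vector indicating which functions the black box is likely to contain.
For example, assuming the function set [+ - * /] and the input-output set [0,1,2,3] -> [1,2,3,4], a correct prediction might be [1,0,0,0] (i.e., a prediction that the black-box function includes, but is not limited to, +).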
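As a minimal illustration of this target encoding, the sketch below builds the multi-hot vector for a given set of used functions. The helper name `encode_attributes` and the ordering of `FUNCTIONS` are hypothetical and chosen only for this example; they are not taken from the DeepCoder paper.

```python
# Hypothetical helper: encode which functions a program uses as a multi-hot vector.
FUNCTIONS = ["+", "-", "*", "/"]  # ordering is an assumption made for illustration


def encode_attributes(used, functions=FUNCTIONS):
    """Return a list with 1 where the function is used and 0 elsewhere."""
    used = set(used)
    return [1 if f in used else 0 for f in functions]


print(encode_attributes(["+"]))       # [1, 0, 0, 0]
print(encode_attributes(["+", "*"]))  # [1, 0, 1, 0]
```

Because several entries can be 1 at once, the target is a set of independent yes/no labels rather than a single class index.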
Here's a graph view of my implemented Neural Network
Breakdown of the implemented model:
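Given two (N, 1) input tensors (corresponding to the input and output samples respectively), the NN splits the tensors row-wise, passes each sub-tensor through an embedding layer, concatenates the input and output embeddings, and then stacks the N composite embeddings into a single tensor. Note that N corresponds to the number of samples per black-box function, as specified in the DeepCoder paper. The tensor is then passed through 3 hidden layers of 256 ReLU / softmax units (I tested both) and averaged into a 1D array at the final layer.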
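The data flow described above can be traced with a small pure-Python sketch. It is meant only to show how the shapes change from step to step. The toy embedding tables, `EMBED_DIM = 3`, and the identity stand-in `per_example_net` (in place of the three hidden layers and the decoder) are assumptions for illustration and do not reproduce the Keras model.

```python
import random

EMBED_DIM = 3   # assumed toy size for readability
N_SAMPLES = 5   # N input-output examples per black-box function

rng = random.Random(0)
# Toy embedding tables: one row of EMBED_DIM floats per integer value.
inp_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(51)]
out_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(513)]


def embed_pair(x, y):
    """Look up both embeddings and concatenate them into one row."""
    return inp_table[x] + out_table[y]


def stack_examples(xs, ys):
    """Build the (N, 2 * EMBED_DIM) matrix, one row per example."""
    return [embed_pair(x, y) for x, y in zip(xs, ys)]


def per_example_net(row):
    """Stand-in for the hidden layers and decoder: returns the row unchanged."""
    return list(row)


def average_pool(rows):
    """Element-wise mean over rows: (N, D) -> (D,)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


xs, ys = [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]  # five samples of x + x
stacked = stack_examples(xs, ys)
pooled = average_pool([per_example_net(row) for row in stacked])
print(len(stacked), len(stacked[0]), len(pooled))  # 5 6 6
```

The point to note is that each of the N examples is processed as its own row, and only the final pooling step mixes information across examples.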
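When training the NN to classify samples as (x + x) or (x - x), the prediction always comes out the same (usually the last function it was trained to classify). I got similar (wrong) results when testing with (x / x), (x * x), cos(x), sin(x), and sqrt(x).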
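Question: Can anyone with experience in neural network design spot any flaws in my NN layout? Perhaps a deviation from DeepCoder's specification?
Here is my code, written in Keras (with a TensorFlow backend):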
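from keras import backend as K
from keras.layers import Concatenate, Dense, Dropout, Embedding, Input, Lambda, Reshape
from keras.models import Model
import numpy as np
import pandas as pd

hidden_size = 256
inputs_per_sample = 1
n_attribs = 7
samples_per_iter = 5

# custom functions for lambdas
def split(x, i):
    return x[0, i]

def average(x):
    ave = K.mean(x, axis=0, keepdims=True)
    return ave
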
def get_model(inputs_per_sample, embedding_dimension, samples_per_prog):
    inp_tens = Input(shape=(samples_per_prog, inputs_per_sample,), name="INPUT")
    out_tens = Input(shape=(samples_per_prog, 1,), name="OUTPUT")
    # formats are (None, spp, sps)... The None represents the number of samples that can be fed *incrementally* (i.e. isolated)
    inp_embedding = Embedding(50 + 1, embedding_dimension, input_length=inputs_per_sample, name='INPUT_EMBEDDER')
    out_embedding = Embedding(512 + 1, embedding_dimension, input_length=1, name='OUTPUT_EMBEDDER')
    l1 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_1')
    l2 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_2')
    l3 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_3')
    decode = Dense(n_attribs, activation='softmax', name="DECODER")
    combine = Concatenate(name='COMBINATOR')
    stack = Concatenate(axis=0, name='STACKER')
    reshape = Reshape((-1,), name='RESHAPER')
    output_tensor = []
    for i in range(samples_per_prog):
        input_tens_inter = Lambda(split, arguments={'i': i}, name='INP_SPLITTER_' + str(i))(inp_tens)
        outpt_tens_inter = Lambda(split, arguments={'i': i}, name='OUT_SPLITTER_' + str(i))(out_tens)
        tens_inter = combine([reshape(inp_embedding(input_tens_inter)), reshape(out_embedding(outpt_tens_inter))])
        if i == 0:
            output_tensor = tens_inter
        else:
            output_tensor = stack([tens_inter, output_tensor])
    l1_layers = Dropout(0.2, name='DROPOUT_1')(l1(output_tensor))
    l2_layers = Dropout(0.2, name='DROPOUT_2')(l2(l1_layers))
    l3_layers = Dropout(0.2, name='DROPOUT_3')(l3(l2_layers))
    decoder = decode(l3_layers)
    pooled_output = Lambda(average, name='AVERAGE_POOL', output_shape=(n_attribs,))(decoder)
    print(pooled_output.get_shape())
    model = Model(inputs=[inp_tens, out_tens], outputs=pooled_output)
    model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = get_model(inputs_per_sample, 20, samples_per_iter)
print(model.summary())
#plot_model(model, to_file='ml_graph.png', show_shapes=True)
inputs = pd.read_csv("G:\\s_is.csv", header=None)
outputs = pd.read_csv("G:\\s_os.csv", header=None)
avs = pd.read_csv("G:\\s_avs.csv", header=None)
ins = inputs
out = outputs
attribs = avs
for j in range(len(attribs)):
    print("j=" + str(j))
    _inps = np.array(ins[j * samples_per_iter:(j + 1) * samples_per_iter])
    _outs = np.array(out[j * samples_per_iter:(j + 1) * samples_per_iter])
    _inps = np.reshape(_inps, (1, samples_per_iter, 1))
    _outs = np.reshape(_outs, (1, samples_per_iter, 1))
    _atts = np.array([attribs.T.get(j)])
    model.fit(x=[_inps, _outs], y=_atts, batch_size=10, epochs=1000)
# saving the model
model_json = model.to_json()
with open("C:\\Users\\James-MSI\\Desktop\\model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
Thanks :)
James