我已经开始在python tensorflow库上使用K-Nearest-Neighbors方法开发机器学习项目 . 我没有使用tensorflow工具的经验,所以我在github中找到了一些代码并为我的数据修改了它 .
我的数据集是这样的:
2,2,2,2,0,0,3
2,2,2,2,0,1,0
2,2,2,4,2,2,1
...
2,2,2,4,2,0,0
这是实际工作正常的代码:
import tensorflow as tf
import numpy as np
# Whole dataset => 1428 samples
dataset = 'car-eval-data-1.csv'
# samples for train, remaining for test
samples = 1300
reader = np.loadtxt(open(dataset, "rb"), delimiter=",", skiprows=1, dtype=np.int32)
train_x, train_y = reader[:samples,:5], reader[:samples,6]
test_x, test_y = reader[samples:, :5], reader[samples:, 6]
# Placeholder you can assign values in future. its kind of a variable
# v = ("variable type",[None,4]) -- you can have multidimensional values here
training_values = tf.placeholder("float",[None,len(train_x[0])])
test_values = tf.placeholder("float",[len(train_x[0])])
# MANHATTAN distance
distance = tf.abs(tf.reduce_sum(tf.square(tf.subtract(training_values,test_values)),reduction_indices=1))
prediction = tf.arg_min(distance, 0)
init = tf.global_variables_initializer()
accuracy = 0.0
with tf.Session() as sess:
sess.run(init)
# Looping through the test set to compare against the training set
for i in range (len(test_x)):
# Tensor flow method to get the prediction near to the test parameters in the training set.
index_in_trainingset = sess.run(prediction, feed_dict={training_values:train_x,test_values:test_x[i]})
print("Test %d, and the prediction is %s, the real value is %s"%(i,train_y[index_in_trainingset],test_y[i]))
if train_y[index_in_trainingset] == test_y[i]:
# if prediction is right so accuracy increases.
accuracy += 1. / len(test_x)
print('Accuracy -> ', accuracy * 100, ' %')
我唯一不理解的是,如果它是 KNN 方法,那么必须有一些 K parameter 定义 number of neighbors for predicting the label for each test sample .
我们如何分配K参数来调整代码的最近邻居数?
有没有办法修改此代码以使用K参数?
1 回答
你是对的,上面的例子没有选择K-Nearest邻居的规定 . 在下面的代码中,我添加了添加此类参数(knn_size)以及其他更正的功能