I am trying to build a neural network. I have TensorFlow 1.5 with GPU support and a GTX 950 GPU. The problem is that my model is quite large, about 30 million parameters, and when I try to run the program it gives me an error. (It prints many lines; I will paste only some of the first errors.)
[INFO] training...
Epoch 1/10
2018-12-09 11:15:55.051823: I tensorflow/core/common_runtime /gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 950, pci bus id: 0000:01:00.0, compute capability: 5.2)
2018-12-09 11:16:07.836098: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 225.00MiB. Current allocation summary follows.
2018-12-09 11:16:07.836138: I tensorflow/core/common_runtime/bfc_allocator.cc:628] Bin (256): Total Chunks: 30, Chunks in use: 30. 7.5KiB allocated for chunks. 7.5KiB in use in bin. 1.9KiB client-requested in use in bin.
This is my first time using TensorFlow, so I am not sure exactly what the problem is, but I think it is trying to load the whole model into the GPU at once, so it does not fit and the error is thrown. Am I right? One of the suggestions I have seen is to use a session with allow_growth:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config = config)
but I do not know exactly how to use it, or whether I would be doing it correctly.
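If I understand correctly, with Keras on TF 1.x the configured session has to be registered with the Keras backend before the model is built; a minimal sketch of what I think is meant (the `set_session` call is the part I am unsure about):

```python
# Sketch, assuming TF 1.x with standalone Keras: register a session
# whose GPU allocator grows on demand instead of grabbing all memory upfront.
import tensorflow as tf
import keras.backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed
sess = tf.Session(config=config)
K.set_session(sess)                     # make Keras use this session
```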
My code is this:
import glob
import cv2
import numpy as np
import os
import keras
import tensorflow as tf
from keras.optimizers import SGD
from autoencoder3 import autoencoder

def DataGenerator(trainorval='train', batch_size=1, number_of_images=1000):
    #################################################################
    path_input = '/home/user/Desktop/dataset/images/100k/'
    path_drima = '/home/user/Desktop/dataset/drivable_maps_color_labels/'
    ######################################################################
    path_input = path_input + trainorval
    path_drima = path_drima + trainorval
    files = glob.glob1(path_input, "*.jpg")
    datain = np.empty(shape=[batch_size, 720, 1280, 3])
    seglabel = np.empty(shape=[batch_size, 720, 1280, 2])
    while True:
        for image in files[0:number_of_images]:
            im = cv2.imread(os.path.join(path_input, image))
            im = im.astype(np.float32)
            im[:, :, 0] = im[:, :, 0] - 73.9358
            im[:, :, 1] = im[:, :, 1] - 74.6135
            im[:, :, 2] = im[:, :, 2] - 71.0640
            drima = cv2.imread(os.path.join(path_drima, image[:-4] + '_drivable_color.png'))
            b = drima[:, :, 0]
            b = b // 255
            r = drima[:, :, 2]
            r = r // 255
            br = np.stack((b, r))
            br = br.transpose((1, 2, 0))
            datain[0] = im
            seglabel[0] = br
            yield (datain, seglabel)

if __name__ == "__main__":
    opt = SGD(lr=0.01)
    print(0)
    model = autoencoder.build()
    print(1)
    model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
    print(2)
    tgen = DataGenerator(trainorval='train', number_of_images=700)
    vgen = DataGenerator(trainorval='val', number_of_images=100)
    print("[INFO] training...")
    model.fit_generator(
        tgen,
        epochs=10,
        validation_data=vgen,
        steps_per_epoch=700,
        validation_steps=100,
        verbose=1
    )
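As a side note, while pasting the code I noticed the three per-channel subtractions and the b/r stacking in my generator could each be written as a single broadcasted NumPy operation; a small sketch with random data standing in for a real image and label map:

```python
# Sketch with random arrays in place of real images: subtract the
# per-channel means in one broadcasted step instead of three slice assignments.
import numpy as np

im = np.random.randint(0, 256, size=(720, 1280, 3)).astype(np.float32)
means = np.array([73.9358, 74.6135, 71.0640], dtype=np.float32)
im -= means  # broadcasts over the last (channel) axis

# Stack the blue and red label planes directly along a new last axis.
drima = np.random.randint(0, 2, size=(720, 1280, 3)).astype(np.uint8) * 255
br = np.stack((drima[:, :, 0] // 255, drima[:, :, 2] // 255), axis=-1)
print(br.shape)  # (720, 1280, 2)
```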
I tried the same code on a machine with the CPU-only build of TensorFlow, and after some warnings it started running normally, but it was very slow.
Is it possible to run a large NN model on a low-performance, low-memory GPU via sessions or in some other way?