I am trying to build a neural network. I have TensorFlow 1.5 with GPU support and a GTX 950 GPU. The problem is that my model is very large, about 30 million parameters, and when I try to run the program it gives me an error. (It prints many lines; I will copy-paste only some of the startup errors.)

[INFO] training...
Epoch 1/10
2018-12-09 11:15:55.051823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 950, pci bus id: 0000:01:00.0, compute capability: 5.2)
2018-12-09 11:16:07.836098: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 225.00MiB.  Current allocation summary follows.
2018-12-09 11:16:07.836138: I tensorflow/core/common_runtime/bfc_allocator.cc:628] Bin (256):   Total Chunks: 30, Chunks in use: 30. 7.5KiB allocated for chunks. 7.5KiB in use in bin. 1.9KiB client-requested in use in bin.
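For what it's worth, here is my rough estimate of where the memory goes (my own numbers, assuming float32 tensors and a hypothetical 64-channel first layer, which would happen to match the 225 MiB allocation in the log):

```python
# Back-of-envelope memory estimate (assumptions: float32 everywhere,
# plain SGD, which keeps one gradient buffer per parameter).
params = 30_000_000
bytes_per_float = 4

weights_mib = params * bytes_per_float / 2**20   # weights alone
grads_mib = weights_mib                          # SGD gradient buffer
print(round(weights_mib), round(grads_mib))      # ~114 MiB each

# Activations usually dominate with large inputs: a single float32
# feature map at the 720x1280 input resolution with 64 channels
# (a hypothetical layer width) is already 225 MiB by itself.
activation_mib = 720 * 1280 * 64 * bytes_per_float / 2**20
print(round(activation_mib))
```

So even before activations, weights plus gradients would take a noticeable slice of the GTX 950's memory, and per-layer activations at full 720x1280 resolution add hundreds of MiB more.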

This is my first time using TensorFlow, so I am not sure exactly what the problem is, but I think it tries to load the whole model into the GPU at once, so it does not fit and throws the error. Is that right? One of the suggestions I have seen is to use a session with allow_growth:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

but I don't know exactly how to use it, or whether I am doing it correctly.

My code is this:

import glob
import cv2
import numpy as np
import os
import keras
import tensorflow as tf
from keras.optimizers import SGD
from autoencoder3 import autoencoder

def DataGenerator(trainorval='train', batch_size=1, number_of_images=1000):
    #################################################################
    path_input = '/home/user/Desktop/dataset/images/100k/'
    path_drima = '/home/user/Desktop/dataset/drivable_maps_color_labels/'
    ######################################################################
    path_input = path_input + trainorval
    path_drima = path_drima + trainorval
    files = glob.glob1(path_input, "*.jpg")
    datain = np.empty(shape=[batch_size, 720, 1280, 3])
    seglabel = np.empty(shape=[batch_size, 720, 1280, 2])
    while True:
        for image in files[0:number_of_images]:
            im = cv2.imread(os.path.join(path_input, image))
            im = im.astype(np.float32)
            # per-channel (B, G, R) mean subtraction
            im[:, :, 0] = im[:, :, 0] - 73.9358
            im[:, :, 1] = im[:, :, 1] - 74.6135
            im[:, :, 2] = im[:, :, 2] - 71.0640
            drima = cv2.imread(os.path.join(path_drima, image[:-4] + '_drivable_color.png'))
            # blue and red channels of the color label map -> two 0/1 class masks
            b = drima[:, :, 0] // 255
            r = drima[:, :, 2] // 255
            br = np.stack((b, r)).transpose((1, 2, 0))
            # NOTE: only slot 0 is ever filled, so this only supports batch_size=1
            datain[0] = im
            seglabel[0] = br
            yield (datain, seglabel)

if __name__ == "__main__":
    opt = SGD(lr=0.01) 
    print(0)
    model = autoencoder.build()
    print(1)
    model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
    print(2)
    tgen = DataGenerator(trainorval = 'train',number_of_images = 700)
    vgen = DataGenerator(trainorval = 'val',number_of_images = 100)
    print("[INFO] training...")
    model.fit_generator(
        tgen,
        epochs=10,
        validation_data=vgen,
        steps_per_epoch=700,
        validation_steps=100,
        verbose=1,
    )
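To make sure the problem is not in my data pipeline: the label-encoding step of DataGenerator (blue and red channels of the color map turned into a 2-channel 0/1 mask) can be exercised on its own with plain NumPy, and it seems to do what I intend (a standalone check, not part of the training script):

```python
import numpy as np

# Fake 2x2 BGR "drivable color" map: blue marks one class, red the other
drima = np.zeros((2, 2, 3), dtype=np.uint8)
drima[0, 0, 0] = 255   # blue pixel
drima[1, 1, 2] = 255   # red pixel

b = drima[:, :, 0] // 255                     # 1 where blue, else 0
r = drima[:, :, 2] // 255                     # 1 where red, else 0
br = np.stack((b, r)).transpose((1, 2, 0))    # (H, W, 2) label tensor

print(br.shape)    # (2, 2, 2)
print(br[0, 0])    # [1 0] -> blue class
print(br[1, 1])    # [0 1] -> red class
```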

I tried the same code on a machine with a CPU-only TensorFlow build, and after some warnings it started running normally, but it was very slow.

Is it possible, through a session configuration or some other way, to run a large NN model on a low-performance, low-memory GPU?