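I am currently trying to do transfer learning on a Caffe model using pyCaffe. My environment is a Windows 10 machine running Python 3.5.4 from the Anaconda distribution [MSC v.1900 64 bit (AMD64)]. I built Caffe and pycaffe from source, with a few modifications to the CMake and build batch files so that they target Python 3.5.

I have successfully built a Caffe net object for an existing semantic segmentation model and can run predictions without any problems. I run my code in an IPython console (6.2.1) through the Spyder IDE (3.2.6).

When I try to create a solver in pycaffe, the IPython kernel crashes every time. Here is the Python code I start with: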

import os
import sys

workDir = os.getcwd()
dirSep = os.sep

pyCaffeDir = dirSep.join([workDir, 'caffe', 'python'])

sys.path.insert(0, pyCaffeDir) # Add pycaffe to the PYTHONPATH

import caffe

caffe.set_device(0)
caffe.set_mode_gpu()

solverFile = dirSep.join([workDir, 'solver.prototxt'])

The 'solver.prototxt' file follows the typical format:

net: "\\absolute\\path\\to\\train.prototxt"
test_initialization: false
test_iter: 10
test_interval: 1000000
test_compute_loss: true
type: "SGD"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 10
momentum: 0.9
max_iter: 200000
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "\\absolute\\path\\to\\snapshot\\dir"
solver_mode: GPU

The 'train.prototxt' file is also fairly standard, except for a custom Python data layer:

layer {
  name: "data"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    module: "myLayers"
    layer: "myDataLayer"
    param_str: "{\'split\': \'train\', \'mean\': (131.6291, 134.9651,    125.4718)}"
  }
}

... Additional layers below define network architecture similarly to pretrained neural network.
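Two things about a Python layer like this one can be checked without loading Caffe at all: whether the module named in python_param can be imported from the current path, and whether param_str is a valid Python literal. The sketch below assumes that the layer evaluates param_str as a Python literal, which is common in borrowed data layers but is not confirmed for myDataLayer. The helper name check_python_layer is hypothetical.

```python
import ast
import importlib.util


def check_python_layer(module_name, param_str):
    """Return (module_found, params) for a Python layer definition.

    module_found is True when the module can be located on sys.path.
    params is the literal parsed from param_str; a malformed string
    raises ValueError or SyntaxError.
    """
    module_found = importlib.util.find_spec(module_name) is not None
    params = ast.literal_eval(param_str)
    return module_found, params


# param_str as it reads once the prototxt escapes (\') are resolved.
param_str = "{'split': 'train', 'mean': (131.6291, 134.9651, 125.4718)}"
found, params = check_python_layer("myLayers", param_str)
print("myLayers importable:", found)
print("split:", params["split"], "mean:", params["mean"])
```

If the first line prints False, the directory containing myLayers.py is probably missing from sys.path (or PYTHONPATH) in the process that creates the solver.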

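So far I have been able to produce two different crashes:

  • The first occurs when I call caffe.get_solver() or caffe.SGDSolver() immediately after running the Python code above. I can make the call in either of the following ways: solver = caffe.get_solver('solver.prototxt') or solver = caffe.get_solver(' '). Either way, the IPython kernel crashes with a nondescript failure stack trace: "*** Check failure stack trace: ***".

  • The second occurs when I first create a Caffe net object from the pretrained model with net = caffe.Net('\\absolute\\path\\to\\deploy_model.prototxt', '\\absolute\\path\\to\\model.caffemodel', caffe.TRAIN). I then call the Caffe solver as above and get a more descriptive failure stack trace: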

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0222 22:22:18.210124 41296 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.
W0222 22:22:21.850746 41296 _caffe.cpp:175] DEPRECATION WARNING - deprecated use of Python interface
W0222 22:22:21.850746 41296 _caffe.cpp:176] Use this instead (with the named "weights" parameter):
W0222 22:22:21.850746 41296 _caffe.cpp:178] Net('\path\to\deploy_model.prototxt', 1, weights='\path\to\model.caffemodel')
I0222 22:22:21.855746 41296 net.cpp:51] Initializing net from parameters: 
state {
phase: TEST
level: 0
}
layer {
name: "input"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 3
dim: 384
dim: 384
}
}
}

... Additional layers of the network

I0222 22:22:21.855746 41296 layer_factory.cpp:58] Creating layer input
I0222 22:22:21.855746 41296 net.cpp:84] Creating Layer input
I0222 22:22:21.855746 41296 net.cpp:380] input -> data
I0222 22:22:21.890769 41296 net.cpp:122] Setting up input
I0222 22:22:21.890769 41296 net.cpp:129] Top shape: 1 3 384 384 (442368)
I0222 22:22:21.890769 41296 net.cpp:137] Memory required for data: 1769472

... Additional layer creations and setup 

I0222 22:22:22.759393 41296 net.cpp:200] upsample does not need backward computation.

... Additional specifications on layers that do not require backward computation

I0222 22:22:22.759393 41296 net.cpp:242] This network produces output fc_final_up
I0222 22:22:22.759393 41296 net.cpp:255] Network initialization done.
[libprotobuf WARNING C:\Users\guillaume\work\caffe-builder\build_v140_x64\packages\protobuf\protobuf_download-prefix\src\protobuf_download\src\google\protobuf\io\coded_stream.cc:605] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING C:\Users\guillaume\work\caffe-builder\build_v140_x64\packages\protobuf\protobuf_download-prefix\src\protobuf_download\src\google\protobuf\io\coded_stream.cc:82] The total number of bytes read was 539674280
*** Check failure stack trace: ***
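Because a failed check aborts the whole process, the IPython kernel dies before the message that precedes the stack trace can be read. One way to keep that output is to run the solver construction in a child process and capture its stderr. This is a minimal sketch: run_isolated is a hypothetical helper, and the snippet passed to it assumes that caffe is importable in the child process and that 'solver.prototxt' is in the working directory.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=600):
    """Run a Python snippet in a child process.

    Returns (exit code, captured stderr). A crash in the child leaves
    the calling interpreter (for example the IPython kernel) alive.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.5)
        timeout=timeout,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Assumed snippet: adjust sys.path and the solver path to the actual setup.
    snippet = (
        "import caffe\n"
        "caffe.set_mode_gpu()\n"
        "solver = caffe.get_solver('solver.prototxt')\n"
    )
    code, err = run_isolated(snippet)
    print("exit code:", code)
    print(err[-3000:])  # the tail of stderr usually holds the failing check
```

The lines printed just before "*** Check failure stack trace: ***" normally name the source file and the condition that failed, which narrows down which of the hypotheses applies.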

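I have a few hypotheses about the cause of these problems:

  • A bad build. After rebuilding many times, and given that I can run predictions with the pretrained model, I doubt this is the case.

  • A faulty custom Python data layer. I mostly borrowed an existing Python data layer from here, so I doubt this is the problem.

  • A malformed 'solver.prototxt' file. However, I get the same error as above when I pass an empty string to caffe.get_solver().

These are some of the more plausible hypotheses I have come up with. Any pointers or potential solutions would be greatly appreciated.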
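The 'solver.prototxt' hypothesis can be narrowed down by checking the paths the file refers to before handing it to Caffe. The sketch below pulls the quoted fields out of the solver text and reports whether the net file and the snapshot directory exist. The function names and the sample paths are hypothetical, and the regular expression only handles simple one-line `key: "value"` entries, not the full prototxt grammar.

```python
import os
import re


def quoted_fields(prototxt_text):
    """Map each one-line `key: "value"` entry to its value.

    Doubled backslashes, as written in the prototxt, become single ones.
    """
    pattern = r'^\s*(\w+)\s*:\s*"(.*)"\s*$'
    fields = {}
    for key, value in re.findall(pattern, prototxt_text, re.M):
        fields[key] = value.replace("\\\\", "\\")
    return fields


def check_solver_paths(prototxt_text):
    """Report whether the net file and the snapshot directory exist."""
    fields = quoted_fields(prototxt_text)
    net_path = fields.get("net", "")
    snapshot_dir = os.path.dirname(fields.get("snapshot_prefix", ""))
    return {
        "net_exists": os.path.isfile(net_path),
        "snapshot_dir_exists": os.path.isdir(snapshot_dir),
    }


# Hypothetical solver text; in practice, read it from solver.prototxt.
sample = 'net: "C:\\\\models\\\\train.prototxt"\nsnapshot_prefix: "C:\\\\models\\\\snap\\\\run"\n'
print(quoted_fields(sample))
print(check_solver_paths(sample))
```

A missing net file or an absent snapshot directory would both be worth ruling out before looking deeper at the build or the data layer.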