我们的一些Dataflow作业在读取源数据文件时随机崩溃 .
在作业日志中写入以下错误(工作日志中没有任何内容):
11 févr. 2016 à 08:30:54
(33b59f945cff28ab): Workflow failed.
Causes: (fecf7537c059fece): S02:read-edn-file2/TextIO.Read+read-edn-file2
/ParDo(ff19274a)+ParDo(ff19274a)5+ParDo(ff19274a)6+RemoveDuplicates
/CreateIndex+RemoveDuplicates/Combine.PerKey
/GroupByKey+RemoveDuplicates/Combine.PerKey/Combine.GroupedValues
/Partial+RemoveDuplicates/Combine.PerKey/GroupByKey/Reify+RemoveDuplicates
/Combine.PerKey/GroupByKey/Write faile
我们有时也会遇到这种错误(记录在工作日志中):
2016-02-15T10:27:41.024Z: Basic: S18: (43c8777b75bc373e): Executing operation group-by2/GroupByKey/Read+group-by2/GroupByKey/GroupByWindow+ParDo(ff19274a)19+ParDo(ff19274a)20+ParDo(ff19274a)21+write-edn-file3/ParDo(ff19274a)+write-bq-table-from-clj3/ParDo(ff19274a)+write-bq-table-from-clj3/BigQueryIO.Write+write-edn-file3/TextIO.Write
2016-02-15T10:28:03.994Z: Error: (af73c53187b7243a): java.io.IOException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
"code" : 503,
"errors" : [ {
"domain" : "global",
"message" : "Backend Error",
"reason" : "backendError"
} ],
"message" : "Backend Error"
}
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:431)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:289)
at com.google.cloud.dataflow.sdk.runners.worker.TextSink$TextFileWriter.close(TextSink.java:243)
at com.google.cloud.dataflow.sdk.util.common.worker.WriteOperation.finish(WriteOperation.java:100)
at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:254)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:191)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:144)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:180)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:161)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:148)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
源数据文件存储在Google Cloud 端存储中 .
数据路径是正确的,工作通常在重新启动后工作 . 我们直到1月底才遇到这个问题 .
使用以下参数启动作业: - tryLocation ='gstoragelocation'--stagingLocation ='另一个gstorage位置' - runner = BlockingDataflowPipelineRunner --numWorkers ='几打'--zone = europe-west1-d
SDK版本:1.3.0
谢谢
1 回答
作为一个明确标记的"backend error",这应该在 Cloud 数据流公共问题跟踪器(google-cloud-dataflow)或更一般的Cloud Platform Public Issue Tracker上报告,并且Stack Overflow上的任何人都可以帮助您调试它 .