
Spark Error - Decimal precision 39 exceeds max precision 38

When I try to collect data from a Spark DataFrame, I get an error stating:

"java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38"

All of the data in the Spark DataFrame comes from an Oracle database, and I believe the decimal precision there is less than 38. Is there any way to make this work without modifying the data?
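For anyone debugging the same error, it can help to first confirm which decimal types Spark actually inferred from the Oracle table. The check below is a hypothetical addition for illustration, not part of the original post:

# Hypothetical diagnostic: print the schema SparkR inferred over JDBC.
# A column mapped to a decimal wider than (38, x) is the likely culprit.
printSchema(df)

# dtypes() gives the same information as (column, type) pairs
dtypes(df)

The code from the post that triggers the error: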

# Load required table into memory from Oracle database
df <- loadDF(sqlContext, source = "jdbc", url = "jdbc:oracle:thin:usr/pass@url.com:1521", dbtable = "TBL_NM")

RawData <- df %>%
    filter(DT_Column > DATE('2015-01-01'))

# Collecting to a local R data.frame is the step that fails
RawData <- as.data.frame(RawData)

The last line is what produces the error.

Below is the stack trace:

WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 10...***, executor 0): java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$nullSafeConvert(JdbcUtils.scala:438)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:268)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Any suggestions for a solution would be appreciated. Thanks.
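A common workaround, sketched here under stated assumptions rather than taken from a confirmed answer: if an Oracle NUMBER column was declared without an explicit precision and scale, Oracle can hand back values that do not fit Spark's 38-digit DecimalType, which triggers exactly this requirement failure. Casting the column to a fixed precision inside the query that Spark pushes down avoids touching the stored data. The column name AMT_COLUMN and the NUMBER(38,10) target below are hypothetical:

# Sketch of a workaround: have Oracle cast the suspect column to an
# explicit precision/scale before Spark reads it over JDBC.
# AMT_COLUMN and NUMBER(38,10) are hypothetical; adjust to the real schema.
df <- loadDF(sqlContext,
             source  = "jdbc",
             url     = "jdbc:oracle:thin:usr/pass@url.com:1521",
             dbtable = "(SELECT CAST(AMT_COLUMN AS NUMBER(38,10)) AS AMT_COLUMN,
                                DT_Column
                           FROM TBL_NM) TBL_NM")

Because the cast happens in the pushed-down subquery, the data at rest is never modified; only the values Spark receives are narrowed to a precision it can represent.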
