Spark "error: type mismatch" with Scala 2.11 but not 2.12 [duplicate]

I am getting a strange compilation error with Scala 2.11 that does not occur with 2.12 (using Spark 2.2.1).

Here is my Scala code:

import java.sql.Timestamp
import org.apache.spark.sql.{Row, SparkSession}

val spark = SparkSession.builder
  .master("local")
  .appName("spark rmd connect import")
  .enableHiveSupport()
  .getOrCreate()


    //LOAD
    var time = System.currentTimeMillis()
    val r_log_o = spark.read.format("orc").load("log.orc")
    val r_log = r_log_o.drop(r_log_o.col("id"))
    System.currentTimeMillis() - time


    // MIN TIMESTAMP: this is the call that fails to compile under Scala 2.11
    time = System.currentTimeMillis()
    r_log_o.toJavaRDD.cache().map((x: Row) => { x(4).asInstanceOf[Timestamp] }).reduce(minTs(_, _))
    System.currentTimeMillis() - time

where

def minTs(x: Timestamp, y: Timestamp): Timestamp = {
    if (x.compareTo(y) < 0) return x;
    else return y;
  }

My pom.xml is configured as follows:

<build>
    <plugins>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.3.1</version>
            <configuration>
                <scalaVersion>2.11</scalaVersion>
            </configuration>
        </plugin>

        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
    </plugins>
</build>

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.11.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
</dependencies>

If I compile with <scalaVersion>2.12</scalaVersion> it compiles, but with Scala 2.11 I get the following error:

[INFO] /root/project/src/main/java:-1: info: compiling
[INFO] /root/project/src/main/scala:-1: info: compiling
[INFO] Compiling 2 source files to /root/rmd-connect-spark/target/classes at 1515426201592
[ERROR] /root/rmd-connect-spark/src/main/scala/SparkConnectTest.scala:40: error: type mismatch;
[ERROR]  found   : org.apache.spark.sql.Row => java.sql.Timestamp
[ERROR]  required: org.apache.spark.api.java.function.Function[org.apache.spark.sql.Row,?]
[ERROR]         .map((x: Row) => {x(4).asInstanceOf[Timestamp]})
[ERROR]              ^
[ERROR] one error found
[INFO]
[INFO] BUILD FAILURE
[INFO]

Note: this is not a problem with the Spark runtime; it is a problem with using Scala 2.11 with the Spark API.

1 Answer

You have a JavaRDD, so you need to use the Java API with org.apache.spark.api.java.function.Function rather than a Scala function. Scala 2.12 added support for automatically converting Scala functions to Java SAM interfaces, which is why this code works with Scala 2.12.
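
For illustration, here is a minimal sketch (not part of the original answer) of what the Java API expects under Scala 2.11: explicit implementations of the org.apache.spark.api.java.function interfaces instead of Scala lambdas, reusing r_log_o and minTs from the question.

import java.sql.Timestamp
import org.apache.spark.sql.Row
import org.apache.spark.api.java.function.{Function, Function2}

// Scala 2.11 does not auto-convert Scala lambdas to these Java SAM types,
// so the functional interfaces have to be implemented explicitly.
val extractTs = new Function[Row, Timestamp] {
  override def call(row: Row): Timestamp = row(4).asInstanceOf[Timestamp]
}
val minOfTwo = new Function2[Timestamp, Timestamp, Timestamp] {
  override def call(a: Timestamp, b: Timestamp): Timestamp = minTs(a, b)
}

r_log_o.toJavaRDD.cache().map(extractTs).reduce(minOfTwo)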

If you want to code in Scala, use the Scala API instead of the Java one.
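
As a sketch of that suggestion (again assuming r_log_o and minTs from the question), calling .rdd instead of .toJavaRDD yields an org.apache.spark.rdd.RDD[Row], whose map and reduce take plain Scala functions under both Scala 2.11 and 2.12:

import java.sql.Timestamp
import org.apache.spark.sql.Row

// Dataset.rdd returns a Scala RDD, so ordinary Scala functions
// work here without any SAM conversion.
val minLogTs: Timestamp = r_log_o.rdd
  .cache()
  .map((x: Row) => x(4).asInstanceOf[Timestamp])
  .reduce(minTs(_, _))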
