I am running my Spark application on a YARN cluster. No matter what I do, I cannot print logs from inside my RDD functions. Below is a sample of the code I wrote for an RDD processing function; I have simplified it to illustrate the syntax I am using. When I run it locally I can see the logs, but not in cluster mode. Neither System.err.println nor Logger seems to work. I can, however, see all of my driver logs. I even tried logging with the root logger, but it does not work inside the RDD processing function at all. Since I badly want to see the log messages, I finally found a guide that declares the logger as transient (https://www.mapr.com/blog/how-log-apache-spark), but even that did not help.

import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.spark.api.java.function.PairFlatMapFunction;

import scala.Tuple2;

class SampleFlatMapFunction implements PairFlatMapFunction<Tuple2<String, String>, String, String> {

    private static final long serialVersionUID = 6565656322667L;

    // Marked transient so the logger is not serialized with the closure;
    // it is re-created on deserialization in readObject() below.
    private transient Logger executorLogger = LogManager.getLogger("sparkExecutor");

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        executorLogger = LogManager.getLogger("sparkExecutor");
    }

    @Override
    public Iterable<Tuple2<String, String>> call(Tuple2<String, String> tuple) throws Exception {

        executorLogger.info("log testing from executorLogger ::");
        System.err.println("log testing from executorLogger system error stream");

        List<Tuple2<String, String>> updates = new ArrayList<>();
        // Process the tuple, expand it, and add the results to the list.
        return updates;
    }
}
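For context, this is roughly how the function is applied from the driver (a simplified sketch; the SparkConf setup and the input data here are placeholders, not my actual job):

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SampleDriver {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("logging-test");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Placeholder input; the real job reads its pairs from an external source.
        JavaPairRDD<String, String> input = sc.parallelizePairs(
                Arrays.asList(new Tuple2<>("key1", "value1"), new Tuple2<>("key2", "value2")));

        // The flatMap runs on the executors, which is where the logging never shows up.
        JavaPairRDD<String, String> expanded = input.flatMapToPair(new SampleFlatMapFunction());
        expanded.count(); // force evaluation

        sc.stop();
    }
}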

My log4j configuration is as follows:

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/var/log/spark/spark.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n

log4j.appender.RollingAppenderU=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppenderU.File=${spark.yarn.app.container.log.dir}/spark-app.log
log4j.appender.RollingAppenderU.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppenderU.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppenderU.layout.ConversionPattern=[%p] %d %c %M - %m%n

# By default, everything goes to the console and the rolling file
log4j.rootLogger=INFO, RollingAppender, console

# My custom logging goes to another file
log4j.logger.sparkExecutor=INFO, stdout, RollingAppenderU

I have checked the YARN logs and the Spark UI logs; nowhere can I find the log statements from the RDD processing function. I tried the command below, but it did not work:

yarn logs -applicationId 

I even checked the HDFS path below as well:

/tmp/logs/

I run my spark-submit command with the parameters below, and even then it does not work:

--master yarn --deploy-mode cluster   --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties"  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
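Put together, the full command looks roughly like this (the main class and jar name are placeholders for my actual values):

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.SampleDriver \
  my-spark-app.jar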

Can someone guide me on logging from Spark RDD and map functions? What am I missing in the steps above?