I am pushing data from a log file to HDFS using the following configuration:
agent.channels.memory-channel.type = memory
agent.channels.memory-channel.capacity = 5000
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt
agent.sources.tail-source.channels = memory-channel
agent.sinks.log-sink.channel = memory-channel
agent.sinks.log-sink.type = logger
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.batchSize = 10
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data/log.txt
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.channels = memory-channel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink
I am not getting any error message, but I still cannot find the output in HDFS. When I interrupt the agent, I can see a sink interruption exception and some of the data from that log file. I am running the following command: flume-ng agent --conf /etc/flume-ng/conf/ --conf-file /etc/flume-ng/conf/flume.conf -Dflume.root.logger=DEBUG,console -n agent
3 Answers
I had a similar issue. In my case it is now working; below is the conf file:

I hope this helps.
I suggest using a file-prefix configuration when placing files in HDFS:
agent.sinks.hdfs-sink.hdfs.filePrefix = log.out
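A minimal sketch of how that prefix fits alongside the sink settings from the question. The `hdfs.fileSuffix` line and the directory-style `hdfs.path` are my assumptions, not part of the original answer (note that `hdfs.path` names the *directory* the sink writes into, so pointing it at `.../log.txt` creates a directory with that name):

```properties
# Hypothetical sketch: prefix/suffix settings for the existing hdfs-sink.
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data/
agent.sinks.hdfs-sink.hdfs.filePrefix = log.out
agent.sinks.hdfs-sink.hdfs.fileSuffix = .txt
# Written files are named from the prefix, a timestamp, and the suffix,
# e.g. something like log.out.1400000000000.txt under the path above.
```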
@bhavesh - Are you sure the log file (agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt) is continuously being appended to? Since you are using the tail command with -F, only data that has changed (i.e., been appended to the file) will be dumped into HDFS.
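The point above can be demonstrated outside of Flume. This sketch uses temporary files (hypothetical stand-ins for the real log and the sink) and `-n 0` to isolate the follow behavior; note that a plain `tail -F`, as in the Flume command, additionally replays the last 10 existing lines once at startup:

```shell
# Sketch: `tail -F` only ships data appended AFTER it starts following,
# which is why a file that never grows produces no new events.
LOG=$(mktemp)                  # stand-in for /home/training/Downloads/log.txt
OUT=$(mktemp)                  # stand-in for what the source hands to the channel
echo "old line" > "$LOG"       # written before tail starts
tail -n 0 -F "$LOG" > "$OUT" & # -n 0: skip existing content, then follow
TAIL_PID=$!
sleep 1
echo "new line" >> "$LOG"      # appended while tail is following
sleep 2                        # give tail time to notice the append
kill "$TAIL_PID"
cat "$OUT"                     # only the appended line appears
```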