// Run streaming query1 in scheduler pool1
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "pool1")
df.writeStream.queryName("query1").format("parquet").start(path1)
// Run streaming query2 in scheduler pool2
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "pool2")
df.writeStream.queryName("query2").format("orc").start(path2)
1 Answer
Use Spark scheduler pools. The example above runs multiple queries in separate scheduler pools (copied here from the end of the linked article for convenient copying); the same logic also applies to DStreams: https://docs.databricks.com/spark/latest/structured-streaming/production.html
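Note that scheduler pools only have an effect when Spark's fair scheduler is enabled; by default Spark schedules jobs FIFO. A minimal sketch of enabling it when building the SparkSession (the app name and the `fairscheduler.xml` path are assumptions for illustration; pools referenced via `setLocalProperty` that are not defined in an allocation file are created with default settings):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: enable the FAIR scheduler so that pool1/pool2 above
// actually share cluster resources fairly instead of running FIFO.
val spark = SparkSession.builder()
  .appName("multi-query-streaming") // hypothetical app name
  .config("spark.scheduler.mode", "FAIR")
  // Optional: define per-pool weight/minShare in an allocation file;
  // the path below is a placeholder.
  .config("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
  .getOrCreate()
```

With FAIR mode on, each `setLocalProperty("spark.scheduler.pool", ...)` call assigns the jobs of the query started afterwards to that pool, so long-running queries cannot starve each other.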