
CSV geo data into Elasticsearch as a geo_point type using Logstash


Below is a reproducible example of a problem I am having with the latest versions of Logstash and Elasticsearch.

I am using Logstash to load geospatial data from a CSV file into Elasticsearch as geo_points.

The CSV looks like this:

$ head simple_base_map.csv 
"lon","lat"
-1.7841,50.7408
-1.7841,50.7408
-1.78411,50.7408
-1.78412,50.7408
-1.78413,50.7408
-1.78414,50.7408
-1.78415,50.7408
-1.78416,50.7408
-1.78416,50.7408

I have created a mapping template that looks like this:

$ cat simple_base_map_template.json 
{
  "template": "base_map_template",
  "order":    1,
  "settings": {
    "number_of_shards": 1
  },

      "mappings": {
        "node_points" : {
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
}

and I have a Logstash config file as follows:

$ cat simple_base_map.conf 
input {
  stdin {}
}

filter {
  csv {
      columns => [
        "lon", "lat"
      ]
  }

  if [lon] == "lon" {
      drop { }
  } else {
      mutate {
          remove_field => [ "message", "host", "@timestamp", "@version"     ]
      }
       mutate {
          convert => { "lon" => "float" }
          convert => { "lat" => "float" }
          }

      mutate {
          rename => {
              "lon" => "[location][lon]"
              "lat" => "[location][lat]"
          }
      }
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
      index => "base_map_simple"
      template => "simple_base_map_template.json"
      document_type => "node_points"
  }
}

I then run the following:

$ cat simple_base_map.csv | logstash-2.1.3/bin/logstash -f simple_base_map.conf 
Settings: Default filter workers: 16
Logstash startup completed
....................................................................................................Logstash shutdown completed

However, when I look at the index base_map_simple, the documents have no location field of type geo_point. Instead, location holds two doubles, lat and lon.

$ curl -XGET 'localhost:9200/base_map_simple?pretty'
{
  "base_map_simple" : {
    "aliases" : { },
    "mappings" : {
      "node_points" : {
        "properties" : {
          "location" : {
            "properties" : {
              "lat" : {
                "type" : "double"
              },
              "lon" : {
                "type" : "double"
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1457355015883",
        "uuid" : "luWGyfB3ToKTObSrbBbcbw",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "2020099"
        }
      }
    },
    "warmers" : { }
  }
}

How should I change any of the above files so that the data goes into Elasticsearch as a geo_point type?

Ultimately, I would like to run a nearest-neighbour search on the geo_points with a command such as the following:

curl -XGET 'localhost:9200/base_map_simple/_search?pretty' -d'
{
    "size": 1,
    "sort": {
   "_geo_distance" : {
       "location" : {
            "lat" : 50,
            "lon" : -1
        },
        "order" : "asc",
        "unit": "m"
   } 
    }
}'

Thanks

1 Answer

  • 3

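    The problem is that in the elasticsearch output you named the index base_map_simple, while in your template the template property is base_map_template. The two do not match, so the template is not applied when the new index is created. The template property must match the name of the index being created for the template to kick in.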

    It will work if you simply change the latter to base_map_*, as shown here:

    {
      "template": "base_map_*",             <--- change this
      "order": 1,
      "settings": {
        "index.number_of_shards": 1
      },
      "mappings": {
        "node_points": {
          "properties": {
            "location": {
              "type": "geo_point"
            }
          }
        }
      }
    }
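
    To illustrate why the original template was skipped, here is a minimal sketch of wildcard matching between a template pattern and an index name. The helper template_matches is hypothetical and uses the shell's own glob matching as a stand-in for Elasticsearch's template matching; only the * wildcard is assumed to behave the same way.

```shell
# Hypothetical helper: succeeds when index name $2 matches wildcard pattern $1.
# Shell glob matching stands in for Elasticsearch's template matching here;
# only the * wildcard is assumed to carry over.
template_matches() {
  case "$2" in
    $1) return 0 ;;   # unquoted $1 is treated as a glob pattern
    *)  return 1 ;;
  esac
}

# The original template value is a literal name, so it cannot match the index.
template_matches 'base_map_template' 'base_map_simple' || echo "no match: template is skipped"

# The wildcard version covers base_map_simple and any other base_map_ index.
template_matches 'base_map_*' 'base_map_simple' && echo "match: template is applied"
```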
    

    UPDATE

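    Make sure to delete the current index as well as the template first. The template is deleted under the name logstash because that is the name Logstash installs it under unless template_name is set in the elasticsearch output.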

    curl -XDELETE localhost:9200/base_map_simple
    curl -XDELETE localhost:9200/_template/logstash
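
    After deleting both and re-running Logstash, the result can be checked by fetching the mapping again. The sketch below is one way to do that, assuming Elasticsearch listens on localhost:9200 as in the question; the helper names has_geo_point and check_index_mapping are hypothetical.

```shell
# Hypothetical helper: reads mapping JSON on stdin and succeeds if any
# field is declared with type geo_point.
has_geo_point() {
  grep -q '"type" *: *"geo_point"'
}

# Fetch the mapping of base_map_simple from a live cluster and report the result.
# Assumes Elasticsearch on localhost:9200 and that Logstash has been re-run.
check_index_mapping() {
  if curl -s 'localhost:9200/base_map_simple/_mapping?pretty' | has_geo_point; then
    echo "location is mapped as geo_point"
  else
    echo "geo_point not found: the template was not applied"
  fi
}
```

    Calling check_index_mapping once the index has been recreated should report the geo_point mapping if the template was picked up.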
    
