How to do source filtering on nested fields


Sample document

{
  "id" : "video1",
  "title" : "Gone with the wind",
  "timedTextLines" : [
    {
      "startTime" : "00:00:02",
      "endTime" : "00:00:05",
      "textLine" : "Frankly my dear I don't give a damn."
    },
    {
      "startTime" : "00:00:07",
      "endTime" : "00:00:09",
      "textLine" : " my amazing country."
    },
    {
      "startTime" : "00:00:17",
      "endTime" : "00:00:29",
      "textLine" : " amazing country."
    }
  ]
}

Index Definition

{
  "mappings": {
    "video_type": {
      "properties": {
        "timedTextLines": {
          "type": "nested" 
        }
      }
    }
  }
}

The response works fine when inner_hits has no source filtering.

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.91737854,
    "hits": [
      {
        "_index": "video_index",
        "_type": "video_type",
        "_id": "1",
        "_score": 0.91737854,
        "_source": {

        },
        "inner_hits": {
          "timedTextLines": {
            "hits": {
              "total": 1,
              "max_score": 0.6296964,
              "hits": [
                {
                  "_nested": {
                    "field": "timedTextLines",
                    "offset": 0
                  },
                  "_score": 0.6296964,
                  "_source": {
                    "startTime": "00:00:02",
                    "endTime": "00:00:05",
                    "textLine": "Frankly my dear I don't give a damn."
                  },
                  "highlight": {
                    "timedTextLines.textLine": [
                      "Frankly my dear I don't give a <em>damn</em>."
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

The response contains every property of the nested object, i.e. startTime, endTime and textLine. How can I return only endTime and startTime in the response?

Failed query

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": "gone"
          }
        },
        {
          "nested": {
            "path": "timedTextLines",
            "query": {
              "match": {
                "timedTextLines.textLine": "damn"
              }
            },
            "inner_hits": {
             "_source":["startTime","endTime"],
              "highlight": {
                "fields": {
                  "timedTextLines.textLine": {

                  }
                }
              }
            }
          }
        }
      ]
    }
  },
  "_source":"false"
}

Error

HTTP/1.1 400 Bad Request
content-type: application/json; charset=UTF-8
content-length: 265

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"[inner_hits] _source doesn't support values of type: START_ARRAY"}],"type":"illegal_argument_exception","reason":"[inner_hits] _source doesn't support values of type: START_ARRAY"},"status":400}

1 Answer

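    The reason is that since ES 5.0, _source inside inner_hits no longer supports the short form (a bare array of field names) and only accepts the full object form with includes and excludes (see this open issue).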

    Your query can be rewritten like this, and it will work:

    {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "title": "gone"
              }
            },
            {
              "nested": {
                "path": "timedTextLines",
                "query": {
                  "match": {
                    "timedTextLines.textLine": "damn"
                  }
                },
                "inner_hits": {
                 "_source": {
                    "includes":[
                      "timedTextLines.startTime",
                      "timedTextLines.endTime"
                    ]
                 },
                  "highlight": {
                    "fields": {
                      "timedTextLines.textLine": {
    
                      }
                    }
                  }
                }
              }
            }
          ]
        }
      },
      "_source":"false"
    }
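
If queries are built in application code, the same rewrite can be applied automatically before the request is sent. The following is a minimal sketch, not part of any Elasticsearch client: the helper name normalize_inner_hits_source is hypothetical, and it assumes (as in the query above) that field names inside inner_hits must carry the nested path as a prefix.

```python
from copy import deepcopy


def normalize_inner_hits_source(query):
    """Return a copy of `query` in which every nested inner_hits._source
    given as a plain list is rewritten to the {"includes": [...]} form.

    Hypothetical helper for illustration only.
    """
    query = deepcopy(query)  # leave the caller's dict untouched

    def fix(nested):
        inner_hits = nested.get("inner_hits")
        if not isinstance(inner_hits, dict):
            return
        source = inner_hits.get("_source")
        if not isinstance(source, list):
            return  # already an object, a boolean, or absent
        path = nested.get("path")
        prefix = f"{path}." if path else ""
        # Assumption: fields are addressed by full path, e.g.
        # "timedTextLines.startTime" rather than "startTime".
        inner_hits["_source"] = {
            "includes": [
                f if f.startswith(prefix) else prefix + f for f in source
            ]
        }

    def walk(node):
        if isinstance(node, dict):
            nested = node.get("nested")
            if isinstance(nested, dict):
                fix(nested)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(query)
    return query


failed_query = {
    "query": {
        "bool": {
            "should": [
                {"match": {"title": "gone"}},
                {
                    "nested": {
                        "path": "timedTextLines",
                        "query": {"match": {"timedTextLines.textLine": "damn"}},
                        "inner_hits": {"_source": ["startTime", "endTime"]},
                    }
                },
            ]
        }
    }
}

fixed_query = normalize_inner_hits_source(failed_query)
```

Here fixed_query carries the object form of _source shown in the answer, while failed_query is left unchanged, so the helper can sit in front of whatever client call actually sends the search.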
    
