Rails Searchkick: indexing and searching has_many associations

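I can search by customer id, name, and lastname, and by kid id, name, lastname, and birthdate.

Searching by id must be exact, and it is. Searching by name or lastname tolerates misspellings up to edit distance 2, which works. But I want searching by kid birthdate to match exactly (misspellings distance 0).

So far, whenever I search by birthdate, the results come back as if the misspellings distance were 2. I can't figure out how to search for an exact date.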

Rails 5.1.0.rc1

elasticsearch-5.0.3

searchkick-2.2.0

class Customer < ActiveRecord::Base
  include Searchable

  def search_data
    attributes.merge avatar_url: avatar.url, kids: kids
  end

  has_many :kids
  ...
end

class Kid < ActiveRecord::Base
  belongs_to :customer

  def reindex_customer
    customer.reindex async: true
  end
  ...
end

module Searchable
  extend ActiveSupport::Concern

  included do
    SEARCH_RESULTS_PER_PAGE = 10

    def self.elastic_search(query, opts = { page: 1 })
      # This regex accepts strings that contain digits or dates
      regexp = /(\d+)|(^(0[1-9]|1\d|2\d|3[01])-(0[1-9]|1[0-2])-(19|20)\d{2}$)/
      # Edit distance for misspellings: 0 for digits and dates, 2 for other strings
      distance = query.match?(regexp) ? 0 : 2
      options = { load: false,
                  match: :word_middle,
                  misspellings: { edit_distance: distance },
                  per_page: SEARCH_RESULTS_PER_PAGE,
                  page: opts[:page] }
      search query, options
    end
  end
end

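My index contains the customer data together with the data of their kids. The kids are nested under their parent customer. How can I force an exact match when searching by date?

For this query: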

curl http://localhost:9200/customers_development/_search?pretty -d '{"query":{"dis_max":{"queries":[{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search"}}},{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search2"}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search2","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}}]}},"size":10,"from":0,"timeout":"11s"}'

This is what the index looks like:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 97.29381,
    "hits": [
      {
        "_index": "customers_development_20170913145033808",
        "_type": "customer",
        "_id": "28388",
        "_score": 97.29381,
        "_source": {
          "id": 28388,
          "created_at": "2017-07-10T19:49:43.856Z",
          "updated_at": "2017-09-13T03:01:51.727Z",
          "name": "Linda",
          "lastname": "Schott",
          "email": "linda.schott@web.de",
          "avatar": null,
          "phone": null,
          "mobile": null,
          "erster_kontakt": null,
          "memo": null,
          "brief_title": null,
          "newsletter": null,
          "avatar_url": "/no_customer.png",
          "kids": [
            {
              "id": 34229,
              "name": "Jakob",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34229/Jellyfish.png",
                "thumb": {
                  "url": "/avatars/kid/34229/thumb_Jellyfish.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "black",
              "score": 30,
              "current_level": "swimmys"
            },
            {
              "id": 34228,
              "name": "Lilith",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34228/Penguins.png",
                "thumb": {
                  "url": "/avatars/kid/34228/thumb_Penguins.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "green",
              "score": 17,
              "current_level": "beginner"
            },
            {
              "id": 27718,
              "name": "Johanna",
              "lastname": "Plischke",
              "birthdate": "2010-12-29",
              "age": "6,8",
              "avatar": {
                "url": "/avatars/kid/27718/Koala.png",
                "thumb": {
                  "url": "/avatars/kid/27718/thumb_Koala.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.034Z",
              "updated_at": "2017-09-13T04:01:15.261Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "red",
              "score": 27,
              "current_level": ""
            }
          ]
        }
      }
    ]
  }
}

1 Answer


    Let's break down one part of the query:

    "match":{
        "_all":{
            "query":"28388",
            "boost":1,
            "operator":"and",
            "analyzer":"searchkick_search",
            "fuzziness":0,
            "prefix_length":0,
            "max_expansions":3,
            "fuzzy_transpositions":true
        }
    }
    

    _all

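    You said kids is a nested field, but you are only searching _all, so the first thing to establish is whether the kids fields are included in _all.

    As the documentation says:

    Sets the default include_in_all value for all the properties within the nested object. Nested documents do not have their own _all field. Instead, values are added to the _all field of the main "root" document.

    So the first question is whether the nested type in your index has include_in_all set to false, which would make the nested fields unsearchable through _all.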
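One way to check this is to look at the index mapping. Below is a minimal Ruby sketch, assuming the mapping has already been fetched (for example from GET /customers_development/_mapping) and parsed into a hash with the Elasticsearch 5.x layout. The helper name kids_mapping_info and the sample_mapping data are made up for illustration and are not the asker's real mapping.

```ruby
# Hypothetical helper: reports how the "kids" field is mapped.
# `mapping` is the parsed JSON body of GET /<index>/_mapping (Elasticsearch 5.x shape).
def kids_mapping_info(mapping, type: "customer")
  index_name, index_body = mapping.first
  kids = index_body.dig("mappings", type, "properties", "kids") || {}
  {
    index: index_name,
    nested: kids["type"] == "nested",
    # include_in_all only shows up when it was set explicitly;
    # when it is absent, values are copied into _all by default.
    in_all: kids.fetch("include_in_all", true)
  }
end

# Made-up sample mapping, for illustration only.
sample_mapping = {
  "customers_development_20170913145033808" => {
    "mappings" => {
      "customer" => {
        "properties" => {
          "name" => { "type" => "text" },
          "kids" => {
            "type" => "nested",
            "include_in_all" => false,
            "properties" => { "birthdate" => { "type" => "date" } }
          }
        }
      }
    }
  }
}

p kids_mapping_info(sample_mapping)
```

If nested comes back false, kids is a plain object field and the nested query in the next section will not apply to it as is.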

    Nested query

    Alternatively, you can use a nested query to search the nested objects:

    GET /_search
    {
        "query": {
            "nested" : {
                "path" : "kids",
                "score_mode" : "avg",
                "query" : {
                    "query_string": {
                      "fields": ["kids.birthdate"],
                      "query": "xxx"
                    } 
                }
            }
        }
    }
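For the birthdate case from the question, the same nested query can be built from Ruby. This is a minimal sketch, assuming kids is really mapped as a nested type (a nested query fails on a plain object field) and that birthdates are indexed in ISO form, as in the sample document ("2013-03-22"). The helper names are hypothetical, and the term clause is my substitution for the query_string clause above, chosen because a term query does no fuzzy matching.

```ruby
require "date"
require "json"

# Hypothetical helper: turns user input such as "22-03-2013" into the
# ISO 8601 form ("2013-03-22") used by the indexed birthdate values.
# Returns nil when the input is not a valid dd-mm-yyyy date.
def iso_birthdate(input)
  Date.strptime(input, "%d-%m-%Y").iso8601
rescue ArgumentError
  nil
end

# Builds a nested query that matches customers having at least one kid
# with exactly this birthdate.
def kids_birthdate_query(iso_date)
  {
    query: {
      nested: {
        path: "kids",
        score_mode: "avg",
        query: { term: { "kids.birthdate" => iso_date } }
      }
    }
  }
end

puts JSON.pretty_generate(kids_birthdate_query(iso_birthdate("22-03-2013")))
```

Searchkick documents a body option for passing a raw query, so something like Customer.search(body: kids_birthdate_query("2013-03-22")) may be enough to run it, but check that against the Searchkick version in use.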
    

    Fuzzy

    For misspellings, Elasticsearch suggests using a fuzzy query:

    GET /_search
    {
        "query": {
            "fuzzy" : {
                "name" : {
                    "value" : "xxx",
                    "boost" : 1.0,
                    "fuzziness" : 2,
                    "prefix_length" : 0,
                    "max_expansions" : 100
                }
            }
        }
    }
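The fuzziness value does not have to be fixed. Below is a minimal Ruby sketch that picks it per search term and builds the fuzzy query shown above. The helper names are hypothetical. The anchored patterns are an assumption about the asker's intent: the regex in the question matches any query that merely contains a digit, whereas this variant uses distance 0 only when the whole term is a number or a dd-mm-yyyy date.

```ruby
# Hypothetical helper: 0 (exact) for ids and dd-mm-yyyy dates, 2 for anything else.
def fuzziness_for(query)
  id_pattern   = /\A\d+\z/
  date_pattern = /\A(0[1-9]|[12]\d|3[01])-(0[1-9]|1[0-2])-(19|20)\d{2}\z/
  query.match?(id_pattern) || query.match?(date_pattern) ? 0 : 2
end

# Builds a fuzzy query for the given field and search term,
# with the fuzziness chosen by fuzziness_for.
def fuzzy_query(field, value)
  {
    query: {
      fuzzy: {
        field => {
          value: value,
          boost: 1.0,
          fuzziness: fuzziness_for(value),
          prefix_length: 0,
          max_expansions: 100
        }
      }
    }
  }
end

p fuzzy_query("name", "Linda")
```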
    

    Combined query

    Finally, we can combine them with a bool query:

    POST _search
    {
      "query": {
        "bool" : {
          "must" : [
            {
              "nested" : {
                "path" : "kids",
                "query" : {
                  "query_string": {
                    "fields": ["kids.birthdate"],
                    "query": "xxx"
                  }
                }
              }
            },
            {
              "fuzzy" : {
                "name" : {
                  "value" : "xxx",
                  "boost" : 1.0,
                  "fuzziness" : 2,
                  "prefix_length" : 0,
                  "max_expansions" : 100
                }
              }
            }
          ]
        }
      }
    }
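The bool query above always requires both clauses. A minimal Ruby sketch that builds the same structure but only adds a clause for the values actually supplied could look like this. The helper name customer_query is hypothetical, and kids is again assumed to be mapped as a nested type.

```ruby
require "json"

# Hypothetical helper: builds the bool query above, adding a clause
# only for the search values that were supplied.
def customer_query(name: nil, birthdate: nil)
  must = []
  if birthdate
    must << {
      nested: {
        path: "kids",
        query: { query_string: { fields: ["kids.birthdate"], query: birthdate } }
      }
    }
  end
  if name
    must << {
      fuzzy: {
        "name" => {
          value: name,
          boost: 1.0,
          fuzziness: 2,
          prefix_length: 0,
          max_expansions: 100
        }
      }
    }
  end
  { query: { bool: { must: must } } }
end

puts JSON.pretty_generate(customer_query(name: "linda", birthdate: "2013-03-22"))
```

Because every entry in must has to match, passing both values narrows the result to customers whose name is close to the given one and who have a kid with that birthdate.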
    

    I'm not familiar with Ruby, so I can only help on the Elasticsearch side. Hope this helps.
