Elasticsearch 部分匹配
时间:2017-12-16 20:59:55
参考:
环境:
- 系统版本:Win10 64位
- Elasticsearch版本:6.0.0
部分匹配#
- 前缀匹配
- 实时搜索,根据输入信息实时搜索
- 长组合词
构造数据#
- 
指定字段数据类型以及分析器 PUT /my_index/ { "settings": { "number_of_shards": 1, "analysis": { "filter": { "autocomplete_filter": { "type": "edge_ngram", "min_gram": 1, "max_gram": 20 } }, "analyzer": { "autocomplete": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "autocomplete_filter" ] } } } }, "mappings": { "address": { "properties": { "postcode": { "type": "keyword" }, "name": { "type": "text", "analyzer": "autocomplete" } } } } }注: autocomplete分析器会把单词按如下形式分割quick会分解成qququiquicquick。
- 
插入数据 POST /my_index/address/_bulk { "index": { "_id": 1 }} { "postcode": "W1V 3DG","title": "Sue ate the alligator", "name": "Brown foxes"} { "index": { "_id": 2 }} { "postcode": "W2F 8HW" ,"title": "The alligator ate Sue", "name":"Kobe Briant" } { "index": { "_id": 3 }} { "postcode": "WC1N 1LZ","title" : "Sue never goes anywhere without her alligator skin purse", "name": "Yellow furballs"} { "index": { "_id": 3 }} { "postcode": "SW5 0BE","title":"The alligator eat Sue", "name":"Sun Feilong" }1数据含义:这些数据是美国邮编,以 `W1V 3DG` 为例:W1V: 这是邮编的外部,它定义了邮件的区域和行政区: W 代表区域( 1 或 2 个字母) 
 1V 代表行政区( 1 或 2 个数字,可能跟着一个字符)3DG: 内部定义了街道或建筑: 3 代表街区区块( 1 个数字) 
 DG 代表单元( 2 个字母)
查询示例#
- 
前缀查询 prefix。 根据前缀进行匹配。
- 
模糊查询 wildcard。下面查询匹配以W开头,后面一个任意字符,后面是F,再后面是任意个任意字符后面跟着 HW 的数据。
- 
正则查询 regexp。下面查询匹配以W开头,后面跟一个数字,后面至少有一个任意字符的数据,W1不会匹配。
- 
短语前缀匹配 match_phrase_prefix。 下面查询会匹配 title 包含the alligator at*的内容。
- 
短语前缀匹配 match_phrase_prefix指定slop,。只有最后一个词才能当前缀使用。GET /my_index/address/_search { "query": { "match_phrase_prefix": { "title": { "query": "The alligator at", "slop" :10 } } } }查询结果: "hits": [ { "_index": "my_index", "_type": "address", "_id": "2", "_score": 0.8630463, "_source": { "postcode": "W2F 8HW", "title": "The alligator ate Sue" } }, { "_index": "my_index", "_type": "address", "_id": "1", "_score": 0.3273624, "_source": { "postcode": "W1V 3DG", "title": "Sue ate the alligator" } } ]
- 
短语前缀匹配 match_phrase_prefix指定slop,同时限制前缀查询查询数据量。
- 
n-grams边界查询。对单词进行特殊的切分,如quick会分解成qququiquicquick。查询结果: "hits": [ { "_index": "my_index", "_type": "address", "_id": "1", "_score": 2.0325346, "_source": { "postcode": "W1V 3DG", "title": "Sue ate the alligator", "name": "Brown foxes" } }, { "_index": "my_index", "_type": "address", "_id": "2", "_score": 1.2379456, "_source": { "postcode": "W2F 8HW", "title": "The alligator ate Sue", "name": "Kobe Briant" } }, { "_index": "my_index", "_type": "address", "_id": "3", "_score": 0.5361201, "_source": { "postcode": "SW5 0BE", "title": "The alligator eat Sue", "name": "Sun Feilong" } } ]匹配结果和预期的不一样,是因为我们查询参数也是用了 autocomplete分析器,使用下面的语句进行验证:json GET /my_index/address/_validate/query?explain { "query": { "match": { "name": "brown fo" } } }验证结果: "explanations": [ { "index": "my_index", "valid": true, "explanation": "+(Synonym(name:b name:br name:bro name:brow name:brown) Synonym(name:f name:fo)) #*:*" } ]为了查找到想要的结果,我们可以在查询时指定其他分析器或者再定义索引的时候,配置相应的查询解析器 方案一:查询时,指定查询参数的分析器 GET /my_index/address/_search { "query": { "match": { "name": { "query": "brown fo", "analyzer": "standard" } } } }方案二(推荐):创建索引时指定对应的查询分析器。 
- 
解决长组合词问题,长组合词时多个单词包含连接起来组成的一个单词,例如 MyNameIsSunFeiLong在分词的时候,会被分割成一个完整的单词,如果查询name或其包含的任意一个单词,并不会与之匹配,我们可以通过分词器把长词组合分割成一段段的单词,如mynynanamame... ,我们查询的词也被分割成namame,这样就能和结果匹配,为了达到这种效果,我们需要自定义的分析器。定义分析器: PUT /my_index { "settings": { "analysis": { "filter": { "trigrams_filter": { "type": "ngram", "min_gram": 3, "max_gram": 3 } }, "analyzer": { "trigrams": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "trigrams_filter" ] } } } }, "mappings": { "my_type": { "properties": { "text": { "type": "text", "analyzer": "trigrams" } } } } }插入数据 POST /my_index/my_type/_bulk { "index": { "_id": 1 }} { "text": "Aussprachewörterbuch" } { "index": { "_id": 2 }} { "text": "Militärgeschichte" } { "index": { "_id": 3 }} { "text": "Weißkopfseeadler" } { "index": { "_id": 4 }} { "text": "Weltgesundheitsorganisation" } { "index": { "_id": 5 }} { "text": "Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz" }测试查询: 查询结果: