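I'm 开朗过客, a blogger on 靠谱客. This article presents 23 useful Elasticsearch query examples, shared here for reference.

Elasticsearch is a search server built on Lucene. It is written in Java, released as open source under the Apache License, and is a popular enterprise search engine. This article walks through several commonly used Elasticsearch query types with an example of each. (Note: this article is a translation of Tim Ojo's "23 Useful Elasticsearch Example Queries". Corrections to the translation are welcome.)

To illustrate the different query types in Elasticsearch, we will search a collection of book documents with the following fields: title, authors, summary, publish_date (release date), num_reviews (number of reviews), and publisher.
First, let's create a new index and index some documents using the bulk API: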

    PUT /bookdb_index
    { "settings": { "number_of_shards": 1 }}

    POST /bookdb_index/book/_bulk
    { "index": { "_id": 1 }}
    { "title": "Elasticsearch: The Definitive Guide", "authors": ["clinton gormley", "zachary tong"], "summary" : "A distibuted real-time search and analytics engine", "publish_date" : "2015-02-07", "num_reviews": 20, "publisher": "oreilly" }
    { "index": { "_id": 2 }}
    { "title": "Taming Text: How to Find, Organize, and Manipulate It", "authors": ["grant ingersoll", "thomas morton", "drew farris"], "summary" : "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization", "publish_date" : "2013-01-24", "num_reviews": 12, "publisher": "manning" }
    { "index": { "_id": 3 }}
    { "title": "Elasticsearch in Action", "authors": ["radu gheorge", "matthew lee hinman", "roy russo"], "summary" : "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms", "publish_date" : "2015-12-03", "num_reviews": 18, "publisher": "manning" }
    { "index": { "_id": 4 }}
    { "title": "Solr in Action", "authors": ["trey grainger", "timothy potter"], "summary" : "Comprehensive guide to implementing a scalable search engine using Apache Solr", "publish_date" : "2014-04-05", "num_reviews": 23, "publisher": "manning" }
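
The bulk body above is newline-delimited JSON: one action line followed by one source line per document. As a minimal sketch, assuming you assemble the body in Python before sending it with an HTTP client of your choice, it could be built like this (the helper name `build_bulk_body` and the shortened sample documents are illustrative, not part of the original article):

```python
import json

def build_bulk_body(docs):
    """Return an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects every line, including the last one, to end with a newline.
    return "\n".join(lines) + "\n"

# Shortened versions of two of the sample books (hypothetical subset of fields).
books = [
    (1, {"title": "Elasticsearch: The Definitive Guide", "num_reviews": 20, "publisher": "oreilly"}),
    (4, {"title": "Solr in Action", "num_reviews": 23, "publisher": "manning"}),
]
body = build_bulk_body(books)
print(body)
```

Keeping the action and source lines strictly paired is the main thing to get right; a missing trailing newline is a common cause of rejected bulk requests.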

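Examples

Basic Match Query

There are two ways to run a basic full-text (match) query: the Search Lite API, which takes all search parameters as part of the URL, or the full JSON request body, which lets you use the complete Elasticsearch DSL.

Here is a basic match query that searches all fields for the string "guide":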

    GET /bookdb_index/book/_search?q=guide

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.28168046,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": ["clinton gormley", "zachary tong"],
          "summary": "A distibuted real-time search and analytics engine",
          "publish_date": "2015-02-07",
          "num_reviews": 20,
          "publisher": "oreilly"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.24144039,
        "_source": {
          "title": "Solr in Action",
          "authors": ["trey grainger", "timothy potter"],
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "publish_date": "2014-04-05",
          "num_reviews": 23,
          "publisher": "manning"
        }
      }
    ]

The full request-body version of this query is shown below. It produces the same results as the search lite query above:

    {
      "query": {
        "multi_match": {
          "query": "guide",
          "fields": ["_all"]
        }
      }
    }

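The multi_match keyword is used in place of the match keyword as a convenient shorthand for running the same query against multiple fields. The fields property specifies which fields to query; in this case, we want to query all fields in the document.

Both APIs also let you specify which fields to search. For example, to search for books with "in Action" in the title field: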

    GET /bookdb_index/book/_search?q=title:in action

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.6259885,
        "_source": {
          "title": "Solr in Action",
          "authors": ["trey grainger", "timothy potter"],
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "publish_date": "2014-04-05",
          "num_reviews": 23,
          "publisher": "manning"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.5975345,
        "_source": {
          "title": "Elasticsearch in Action",
          "authors": ["radu gheorge", "matthew lee hinman", "roy russo"],
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "publish_date": "2015-12-03",
          "num_reviews": 18,
          "publisher": "manning"
        }
      }
    ]

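However, the full-body DSL gives you more flexibility in creating more complicated queries (as we will see later) and in specifying how you want the results returned. In the example below, we specify the number of results to return, the offset to start from (useful for pagination), the document fields to return, and term highlighting.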

    POST /bookdb_index/book/_search
    {
      "query": {
        "match": {
          "title": "in action"
        }
      },
      "size": 2,
      "from": 0,
      "_source": ["title", "summary", "publish_date"],
      "highlight": {
        "fields": {
          "title": {}
        }
      }
    }

    [Results]
    "hits": {
      "total": 2,
      "max_score": 0.9105287,
      "hits": [
        {
          "_index": "bookdb_index",
          "_type": "book",
          "_id": "3",
          "_score": 0.9105287,
          "_source": {
            "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
            "title": "Elasticsearch in Action",
            "publish_date": "2015-12-03"
          },
          "highlight": {
            "title": ["Elasticsearch <em>in</em> <em>Action</em>"]
          }
        },
        {
          "_index": "bookdb_index",
          "_type": "book",
          "_id": "4",
          "_score": 0.9105287,
          "_source": {
            "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
            "title": "Solr in Action",
            "publish_date": "2014-04-05"
          },
          "highlight": {
            "title": ["Solr <em>in</em> <em>Action</em>"]
          }
        }
      ]
    }

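Note: For multi-word queries, the match query lets you specify whether to use the and operator instead of the default or operator. You can also set the minimum_should_match option to tune the relevance of the returned results. Details can be found in the Elasticsearch guide.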
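
To make the two options concrete, here is a small sketch that builds the corresponding request bodies as Python dictionaries. The helper `match_query` is hypothetical and exists only for illustration; `operator` and `minimum_should_match` are the match query parameters mentioned in the note.

```python
def match_query(field, text, operator="or", minimum_should_match=None):
    """Build a match query body; the helper name and signature are illustrative only."""
    clause = {"query": text, "operator": operator}
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"query": {"match": {field: clause}}}

# Every word must appear in the title.
strict = match_query("title", "in action", operator="and")
# At least two of the three words must appear in the summary.
loose = match_query("summary", "scalable search engine", minimum_should_match="2")
print(strict)
print(loose)
```

Either dictionary can be serialized with `json.dumps` and sent as the body of a `_search` request.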

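Multi-field Search

As we have already seen, to search more than one field in a single query (for example, the same string in both title and summary), use the multi_match query: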

    POST /bookdb_index/book/_search
    {
      "query": {
        "multi_match": {
          "query": "elasticsearch guide",
          "fields": ["title", "summary"]
        }
      }
    }

    [Results]
    "hits": {
      "total": 3,
      "max_score": 0.9448582,
      "hits": [
        {
          "_index": "bookdb_index",
          "_type": "book",
          "_id": "1",
          "_score": 0.9448582,
          "_source": {
            "title": "Elasticsearch: The Definitive Guide",
            "authors": ["clinton gormley", "zachary tong"],
            "summary": "A distibuted real-time search and analytics engine",
            "publish_date": "2015-02-07",
            "num_reviews": 20,
            "publisher": "oreilly"
          }
        },
        {
          "_index": "bookdb_index",
          "_type": "book",
          "_id": "3",
          "_score": 0.17312013,
          "_source": {
            "title": "Elasticsearch in Action",
            "authors": ["radu gheorge", "matthew lee hinman", "roy russo"],
            "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
            "publish_date": "2015-12-03",
            "num_reviews": 18,
            "publisher": "manning"
          }
        },
        {
          "_index": "bookdb_index",
          "_type": "book",
          "_id": "4",
          "_score": 0.14965448,
          "_source": {
            "title": "Solr in Action",
            "authors": ["trey grainger", "timothy potter"],
            "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
            "publish_date": "2014-04-05",
            "num_reviews": 23,
            "publisher": "manning"
          }
        }
      ]
    }

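Note: The third hit (Solr in Action) matches because the word "guide" appears in its summary.

Boosting

Since we are searching across multiple fields, we may want to boost the scores from a certain field. In the contrived example below, we boost scores from the summary field by a factor of 3 to increase the importance of that field, which in turn raises the relevance of document _id 4.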

    POST /bookdb_index/book/_search
    {
      "query": {
        "multi_match": {
          "query": "elasticsearch guide",
          "fields": ["title", "summary^3"]
        }
      },
      "_source": ["title", "summary", "publish_date"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.31495273,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.14965448,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.13094766,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]

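Note: Boosting does not simply mean that the calculated score is multiplied by the boost factor. The boost value that is actually applied goes through normalization and some internal optimization. For more on how boosting works, see the Elasticsearch guide.

Bool Query

The AND / OR / NOT operators can be used to fine-tune our search queries in order to get more relevant or more specific results. This is implemented in the search API as the bool query. The bool query accepts a must parameter (equivalent to AND), a must_not parameter (equivalent to NOT), and a should parameter (equivalent to OR). For example, to search for a book with "Elasticsearch" OR "Solr" in the title, AND authored by "clinton gormley" but NOT by "radu gheorge":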

    POST /bookdb_index/book/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  { "match": { "title": "Elasticsearch" } },
                  { "match": { "title": "Solr" } }
                ]
              }
            },
            {
              "match": { "authors": "clinton gormley" }
            }
          ],
          "must_not": {
            "match": { "authors": "radu gheorge" }
          }
        }
      }
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.3672021,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": ["clinton gormley", "zachary tong"],
          "summary": "A distibuted real-time search and analytics engine",
          "publish_date": "2015-02-07",
          "num_reviews": 20,
          "publisher": "oreilly"
        }
      }
    ]

Note: As you can see, a bool query can wrap any other query type, including other bool queries, to build arbitrarily complex or deeply nested queries.
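
The boolean logic of the query above can be checked by hand against the four sample books. The sketch below is a rough stand-in, not how Elasticsearch evaluates queries: `tokens`, `match`, and `bool_query` are hypothetical helpers, and `match` only imitates a match query with the default or operator by testing for shared lowercase words.

```python
import re

BOOKS = {
    1: {"title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"]},
    2: {"title": "Taming Text: How to Find, Organize, and Manipulate It",
        "authors": ["grant ingersoll", "thomas morton", "drew farris"]},
    3: {"title": "Elasticsearch in Action",
        "authors": ["radu gheorge", "matthew lee hinman", "roy russo"]},
    4: {"title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"]},
}

def tokens(value):
    """Lowercase a string (or list of strings) and split it into a set of words."""
    text = " ".join(value) if isinstance(value, list) else value
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match(doc, field, query):
    """Rough stand-in for a match query with the default 'or' operator."""
    return bool(tokens(doc[field]) & tokens(query))

def bool_query(doc):
    """(title: Elasticsearch OR Solr) AND authors: clinton gormley AND NOT authors: radu gheorge."""
    should = match(doc, "title", "Elasticsearch") or match(doc, "title", "Solr")
    must = match(doc, "authors", "clinton gormley")
    must_not = match(doc, "authors", "radu gheorge")
    return should and must and not must_not

hits = [doc_id for doc_id, doc in BOOKS.items() if bool_query(doc)]
print(hits)
```

Only document 1 satisfies all three conditions, which agrees with the single hit in the results above.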

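Fuzzy Queries

Fuzzy matching can be enabled on match and multi_match queries to catch spelling errors. The degree of fuzziness is specified as a Levenshtein distance from the original word.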

    POST /bookdb_index/book/_search
    {
      "query": {
        "multi_match": {
          "query": "comprihensiv guide",
          "fields": ["title", "summary"],
          "fuzziness": "AUTO"
        }
      },
      "_source": ["title", "summary", "publish_date"],
      "size": 1
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.5961596,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      }
    ]

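Note: When the term length is greater than 5 characters, a fuzziness value of "AUTO" is equivalent to specifying a value of 2. However, about 80% of human misspellings have an edit distance of 1, so setting fuzziness to 1 may improve your overall search performance. See the "Typos and Misspellings" chapter of the Elasticsearch guide for details.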
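
To see why the misspelled "comprihensiv" still finds "comprehensive", it helps to compute the edit distance directly. The sketch below implements plain Levenshtein distance; as far as I know Elasticsearch also counts a transposition of adjacent characters as a single edit by default, which this simplified version does not. `auto_fuzziness` is a hypothetical helper that mirrors the default "AUTO" length thresholds.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def auto_fuzziness(term):
    """Edits allowed by "AUTO" with its default thresholds: 0 for 1-2 chars, 1 for 3-5, 2 above."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

distance = levenshtein("comprihensiv", "comprehensive")
print(distance, auto_fuzziness("comprihensiv"))
```

The misspelling needs one substitution (i to e) and one insertion (the trailing e), so it sits exactly at the limit that "AUTO" allows for a term of this length.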

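Wildcard Query

Wildcard queries let you specify a pattern to match instead of an entire term. ? matches any single character and * matches zero or more characters. For example, to find all records that have an author whose name begins with the letter 't':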

    POST /bookdb_index/book/_search
    {
      "query": {
        "wildcard": {
          "authors": "t*"
        }
      },
      "_source": ["title", "authors"],
      "highlight": {
        "fields": {
          "authors": {}
        }
      }
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "Elasticsearch: The Definitive Guide",
          "authors": ["clinton gormley", "zachary tong"]
        },
        "highlight": {
          "authors": ["zachary <em>tong</em>"]
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "authors": ["grant ingersoll", "thomas morton", "drew farris"]
        },
        "highlight": {
          "authors": ["<em>thomas</em> morton"]
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Solr in Action",
          "authors": ["trey grainger", "timothy potter"]
        },
        "highlight": {
          "authors": ["<em>trey</em> grainger", "<em>timothy</em> potter"]
        }
      }
    ]
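
The highlights above show that the pattern is compared against individual terms rather than whole author names. A minimal way to reproduce that locally is Python's `fnmatch`, which uses the same ? and * wildcards (it additionally supports bracket sets, which the wildcard query does not, so treat this only as an approximation). The author list is copied from the sample data.

```python
from fnmatch import fnmatchcase

authors = ["clinton gormley", "zachary tong", "grant ingersoll", "thomas morton",
           "drew farris", "trey grainger", "timothy potter"]

# On an analyzed text field the pattern is tested against single terms,
# so each name is split into lowercase words first.
terms = sorted({word for name in authors for word in name.lower().split()})
matches = [term for term in terms if fnmatchcase(term, "t*")]
print(matches)
```

The four matching terms (thomas, timothy, tong, trey) are exactly the ones highlighted in the results.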

Regexp Query

Regexp queries let you specify more complex patterns than wildcard queries.

    POST /bookdb_index/book/_search
    {
      "query": {
        "regexp": {
          "authors": "t[a-z]*y"
        }
      },
      "_source": ["title", "authors"],
      "highlight": {
        "fields": {
          "authors": {}
        }
      }
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Solr in Action",
          "authors": ["trey grainger", "timothy potter"]
        },
        "highlight": {
          "authors": ["<em>trey</em> grainger", "<em>timothy</em> potter"]
        }
      }
    ]
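
A regexp query has to match the whole term, not just part of it. The sketch below imitates that with `re.fullmatch` over the author terms from the sample data. Lucene's regular expression syntax is not identical to Python's, but this particular pattern means the same thing in both.

```python
import re

author_terms = ["clinton", "gormley", "zachary", "tong", "grant", "ingersoll",
                "thomas", "morton", "drew", "farris", "trey", "grainger",
                "timothy", "potter"]

# fullmatch anchors the pattern at both ends, like a regexp query does per term.
pattern = re.compile(r"t[a-z]*y")
regexp_matches = [term for term in author_terms if pattern.fullmatch(term)]
print(regexp_matches)
```

Only trey and timothy start with t and end with y, which is why document 4 is the single hit.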

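Match Phrase Query

The match phrase query requires that all the terms in the query string be present in the document, in the order specified in the query string, and close to each other. By default the terms must be exactly adjacent, but you can specify a slop value that indicates how far apart the terms are allowed to be while still counting the document as a match.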

    POST /bookdb_index/book/_search
    {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title", "summary"],
          "type": "phrase",
          "slop": 3
        }
      },
      "_source": ["title", "summary", "publish_date"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.22327082,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.16113183,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

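Note: In the example above, for a non-phrase query, document _id 1 would normally score higher and appear ahead of document _id 4 because its field length is shorter. As a phrase query, however, the proximity of the terms is factored in, so document _id 4 scores better.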
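
For a two-word phrase whose words appear in query order, the slop a document needs is simply the number of words sitting between them. The sketch below covers only that simplified case (it ignores reordered terms and longer phrases); `tokenize` and `required_slop` are hypothetical helpers, and the tokenizer is a crude approximation of the standard analyzer.

```python
import re

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def required_slop(text, first, second):
    """Fewest words between `first` and a later `second`; None if they never occur in that order."""
    words = tokenize(text)
    gaps = [j - i - 1
            for i, w1 in enumerate(words) if w1 == first
            for j, w2 in enumerate(words) if w2 == second and j > i]
    return min(gaps) if gaps else None

doc4 = "Comprehensive guide to implementing a scalable search engine using Apache Solr"
doc1 = "A distibuted real-time search and analytics engine"
doc3 = "build scalable search applications using Elasticsearch"

print(required_slop(doc4, "search", "engine"))  # adjacent words
print(required_slop(doc1, "search", "engine"))  # "and analytics" sits in between
print(required_slop(doc3, "search", "engine"))  # "engine" is absent
```

Both document 4 (gap 0) and document 1 (gap 2) fall within the slop of 3 used in the query, while the tighter match in document 4 is consistent with its higher score.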

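Match Phrase Prefix

Match phrase prefix queries provide search-as-you-type, or a poor man's version of autocomplete, at query time without needing to prepare your data in any way. Like the match_phrase query, it accepts a slop parameter to make the word order and relative positions somewhat less rigid. It also accepts the max_expansions parameter to limit the number of terms matched, which reduces resource usage.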

    POST /bookdb_index/book/_search
    {
      "query": {
        "match_phrase_prefix": {
          "summary": {
            "query": "search en",
            "slop": 3,
            "max_expansions": 10
          }
        }
      },
      "_source": ["title", "summary", "publish_date"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.5161346,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.37248808,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

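Note: Using the match phrase prefix query for query-time search-as-you-type has a performance cost. A better solution may be index-time search-as-you-type. See the Completion Suggester API or the use of Edge-Ngram filters for more information.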
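
The index-time alternative works by storing every prefix of each term, so a partial word such as "en" becomes an ordinary exact lookup. The sketch below shows what an edge n-gram filter produces for one term; `edge_ngrams` is a hypothetical helper whose `min_gram` and `max_gram` arguments are named after the filter's settings.

```python
def edge_ngrams(term, min_gram=1, max_gram=10):
    """Return the prefixes of `term` whose length lies between min_gram and max_gram."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]

grams = edge_ngrams("engine", min_gram=2, max_gram=4)
print(grams)
```

The trade-off is a larger index in exchange for cheaper queries, which is usually the better deal for autocomplete boxes that fire a request on every keystroke.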

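Query String

The query_string query provides a way to execute multi_match queries, bool queries, boosting, fuzzy matching, wildcards, regexp, and range queries in a concise shorthand syntax. In the following example, we execute a fuzzy search for the terms "search algorithm" in which one of the book authors is "grant ingersoll" or "tom morton". We search all fields but apply a boost of 2 to the summary field.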

    POST /bookdb_index/book/_search
    {
      "query": {
        "query_string": {
          "query": "(saerch~1 algorithm~1) AND (grant ingersoll)  OR (tom morton)",
          "fields": ["_all", "summary^2"]
        }
      },
      "_source": ["title", "summary", "authors"],
      "highlight": {
        "fields": {
          "summary": {}
        }
      }
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.14558059,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "authors": ["grant ingersoll", "thomas morton", "drew farris"]
        },
        "highlight": {
          "summary": [
            "organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging, information extraction, and summarization"
          ]
        }
      }
    ]

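Simple Query String

The simple_query_string query is a version of the query_string query that is better suited to a single search box exposed to users. It replaces AND / OR / NOT with + / | / - respectively, and it discards invalid parts of a query instead of throwing an exception when a user makes a mistake.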

    POST /bookdb_index/book/_search
    {
      "query": {
        "simple_query_string": {
          "query": "(saerch~1 algorithm~1) + (grant ingersoll)  | (tom morton)",
          "fields": ["_all", "summary^2"]
        }
      },
      "_source": ["title", "summary", "authors"],
      "highlight": {
        "fields": {
          "summary": {}
        }
      }
    }

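Term/Terms Query

The examples above have all been full-text searches. Sometimes we are more interested in a structured search, in which we want to find an exact match and return the results. The term and terms queries help us here. In the example below, we search for all books in our index published by Manning Publications.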

    POST /bookdb_index/book/_search
    {
      "query": {
        "term": {
          "publisher": "manning"
        }
      },
      "_source": ["title", "publish_date", "publisher"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 1.2231436,
        "_source": {
          "publisher": "manning",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      }
    ]

Multiple terms can be specified by using the terms keyword instead and passing in an array of search terms.

    {
      "query": {
        "terms": {
          "publisher": ["oreilly", "packt"]
        }
      }
    }

Term Query: Sorted

Term query results (like any other query results) can easily be sorted. Multi-level sorting is also allowed:

    POST /bookdb_index/book/_search
    {
      "query": {
        "term": {
          "publisher": "manning"
        }
      },
      "_source": ["title", "publish_date", "publisher"],
      "sort": [
        { "publish_date": { "order": "desc" } },
        { "title": { "order": "desc" } }
      ]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        },
        "sort": [1449100800000, "in"]
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        },
        "sort": [1396656000000, "solr"]
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": null,
        "_source": {
          "publisher": "manning",
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        },
        "sort": [1358985600000, "to"]
      }
    ]
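
The first value in each sort array above is the publish_date expressed as milliseconds since the Unix epoch, which is how the date is compared when sorting. A quick sketch to confirm the conversion, assuming the dates are interpreted as midnight UTC (`epoch_millis` is a hypothetical helper):

```python
from datetime import datetime, timezone

def epoch_millis(date_string):
    """Convert a yyyy-MM-dd date to milliseconds since the Unix epoch (midnight UTC)."""
    moment = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)

for published in ("2015-12-03", "2014-04-05", "2013-01-24"):
    print(published, epoch_millis(published))
```

The second sort value is a single term from the analyzed title (for example "solr"), not the full title, which is why sorting on analyzed text fields often gives surprising orderings.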

Range Query

Another structured query example is the range query. In this example, we search for books published in 2015:

    POST /bookdb_index/book/_search
    {
      "query": {
        "range": {
          "publish_date": {
            "gte": "2015-01-01",
            "lte": "2015-12-31"
          }
        }
      },
      "_source": ["title", "publish_date", "publisher"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 1,
        "_source": {
          "publisher": "oreilly",
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 1,
        "_source": {
          "publisher": "manning",
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]

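Note: Range queries work on date, number, and string type fields.

Filtered Query

Filtered queries allow you to filter the results of a query. For our example, we query for books with the term "Elasticsearch" in the title or summary, but we want to filter the results to only those with 20 or more reviews.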

    POST /bookdb_index/book/_search
    {
      "query": {
        "filtered": {
          "query": {
            "multi_match": {
              "query": "elasticsearch",
              "fields": ["title", "summary"]
            }
          },
          "filter": {
            "range": {
              "num_reviews": {
                "gte": 20
              }
            }
          }
        }
      },
      "_source": ["title", "summary", "publisher", "num_reviews"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide"
        }
      }
    ]

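Note: Filtered queries do not require a query to filter on. If no query is specified, a match_all query is run, which basically returns all documents in the index, and these are then filtered. In practice, the filter is run first, which reduces the surface area that needs to be queried. Additionally, the filter is cached after its first use, which makes it very efficient. The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0 in favor of the bool query; here is the same example rewritten as a bool query with a filter clause: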

    POST /bookdb_index/book/_search
    {
      "query": {
        "bool": {
          "must": {
            "multi_match": {
              "query": "elasticsearch",
              "fields": ["title", "summary"]
            }
          },
          "filter": {
            "range": {
              "num_reviews": {
                "gte": 20
              }
            }
          }
        }
      },
      "_source": ["title", "summary", "publisher", "num_reviews"]
    }

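The same applies to the filters in the examples below.

Multiple Filters

Multiple filters can be combined through the use of the bool filter. In the next example, the filter determines that the returned results must have at least 20 reviews, must not have been published before 2015, and should be published by oreilly.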

    POST /bookdb_index/book/_search
    {
      "query": {
        "filtered": {
          "query": {
            "multi_match": {
              "query": "elasticsearch",
              "fields": ["title", "summary"]
            }
          },
          "filter": {
            "bool": {
              "must": {
                "range": { "num_reviews": { "gte": 20 } }
              },
              "must_not": {
                "range": { "publish_date": { "lte": "2014-12-31" } }
              },
              "should": {
                "term": { "publisher": "oreilly" }
              }
            }
          }
        }
      },
      "_source": ["title", "summary", "publisher", "num_reviews", "publish_date"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.5955761,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "publisher": "oreilly",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      }
    ]

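Function Score: Field Value Factor

There may be cases where you want to factor the value of a particular field in your document into the calculation of the relevance score. This is typical in scenarios where you want to boost the relevance of a document based on its popularity. In our example, we would like the more popular books (as judged by the number of reviews) to be boosted. This is possible using the field_value_factor function score: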

    POST /bookdb_index/book/_search
    {
      "query": {
        "function_score": {
          "query": {
            "multi_match": {
              "query": "search engine",
              "fields": ["title", "summary"]
            }
          },
          "field_value_factor": {
            "field": "num_reviews",
            "modifier": "log1p",
            "factor": 2
          }
        }
      },
      "_source": ["title", "summary", "publish_date", "num_reviews"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.44831306,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.3718407,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "num_reviews": 23,
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.046479136,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "num_reviews": 18,
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.041432835,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "num_reviews": 12,
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      }
    ]

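Note 1: We could have just run a regular multi_match query and sorted by the num_reviews field, but then we would lose the benefits of relevance scoring.

Note 2: There are a number of additional parameters that tweak the extent of the boosting effect on the original relevance score, such as "modifier", "factor", and "boost_mode". These are explored in detail in the Elasticsearch guide.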
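
As a rough illustration of what the function contributes, the sketch below computes the field_value_factor value for the settings used in the query above. It reflects my reading of the documented behavior, in which the modifier is applied to factor times the field value and log1p means the common (base 10) logarithm of one plus that value; treat the helper and its formula as an assumption to verify against the guide for your version.

```python
import math

def field_value_factor(value, factor=1.0, modifier="none"):
    """Value the function contributes for one document (assumed formula, see lead-in)."""
    modifiers = {
        "none": lambda x: x,
        "log1p": lambda x: math.log10(x + 1),
        "sqrt": math.sqrt,
    }
    return modifiers[modifier](factor * value)

for num_reviews in (12, 18, 20, 23):
    print(num_reviews, round(field_value_factor(num_reviews, factor=2, modifier="log1p"), 4))
```

With the default boost_mode, this value is multiplied with the query's own relevance score, so the logarithm keeps heavily reviewed books from swamping textual relevance.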

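Function Score: Decay Functions

Suppose that instead of boosting incrementally by the value of a field, you have an ideal value you want to target, and you want the boost to decay the further a document's value is from it. This is typically useful for boosts based on lat/long, numeric fields such as price, or dates. In our contrived example, we search for books on "search engines" that were ideally published around June 2014.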

    POST /bookdb_index/book/_search
    {
      "query": {
        "function_score": {
          "query": {
            "multi_match": {
              "query": "search engine",
              "fields": ["title", "summary"]
            }
          },
          "functions": [
            {
              "exp": {
                "publish_date": {
                  "origin": "2014-06-15",
                  "offset": "7d",
                  "scale": "30d"
                }
              }
            }
          ],
          "boost_mode": "replace"
        }
      },
      "_source": ["title", "summary", "publish_date", "num_reviews"]
    }

    [Results]
    "hits": [
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "4",
        "_score": 0.27420625,
        "_source": {
          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
          "num_reviews": 23,
          "title": "Solr in Action",
          "publish_date": "2014-04-05"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "1",
        "_score": 0.005920768,
        "_source": {
          "summary": "A distibuted real-time search and analytics engine",
          "num_reviews": 20,
          "title": "Elasticsearch: The Definitive Guide",
          "publish_date": "2015-02-07"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "2",
        "_score": 0.000011564,
        "_source": {
          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
          "num_reviews": 12,
          "title": "Taming Text: How to Find, Organize, and Manipulate It",
          "publish_date": "2013-01-24"
        }
      },
      {
        "_index": "bookdb_index",
        "_type": "book",
        "_id": "3",
        "_score": 0.0000059171475,
        "_source": {
          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
          "num_reviews": 18,
          "title": "Elasticsearch in Action",
          "publish_date": "2015-12-03"
        }
      }
    ]

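Function Score: Script Scoring

When the built-in scoring functions do not meet your needs, you can supply a Groovy script to compute the score. In our example, we want a script that looks at publish_date before deciding how much weight to give the number of reviews: newer books may not have accumulated many reviews yet, so they should not be penalized for that.

The scoring script looks like this: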

publish_date = doc['publish_date'].value
num_reviews = doc['num_reviews'].value

if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) {
  my_score = Math.log(2.5 + num_reviews)
} else {
  my_score = Math.log(1 + num_reviews)
}
return my_score
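To see what this branch does to the four sample books, the rule can be sketched outside Elasticsearch. The Python below is only an illustration, not how Elasticsearch evaluates the script: the function name `script_score`, the `THRESHOLD` constant, and the `BOOKS` list are made up for this sketch, and the log is assumed to be the natural logarithm.

```python
import math
from datetime import date


def script_score(publish_date: date, num_reviews: int, threshold: date) -> float:
    """Mirror the if/else branch of the scoring script (natural log assumed)."""
    if publish_date > threshold:
        # Books published after the threshold get a small head start.
        return math.log(2.5 + num_reviews)
    return math.log(1 + num_reviews)


# Same threshold value that the query passes in via "params".
THRESHOLD = date(2015, 7, 30)

# (title, publish_date, num_reviews) taken from the sample documents.
BOOKS = [
    ("Elasticsearch: The Definitive Guide", date(2015, 2, 7), 20),
    ("Taming Text: How to Find, Organize, and Manipulate It", date(2013, 1, 24), 12),
    ("Elasticsearch in Action", date(2015, 12, 3), 18),
    ("Solr in Action", date(2014, 4, 5), 23),
]

if __name__ == "__main__":
    for title, published, reviews in BOOKS:
        value = script_score(published, reviews, THRESHOLD)
        print(f"{title}: {value:.4f}")
```

Only "Elasticsearch in Action" was published after the threshold, so it is the only book that takes the `2.5 + num_reviews` branch. These values are just the function part of the score. function_score then combines them with the relevance score of the multi_match query (multiplying by default), so they will not equal the `_score` values returned by a search.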

To use the scoring script dynamically, we pass it through the script_score parameter:

POST /bookdb_index/book/_search
{
    "query": {
        "function_score": {
            "query": {
                "multi_match": {
                    "query": "search engine",
                    "fields": ["title", "summary"]
                }
            },
            "functions": [
                {
                    "script_score": {
                        "params": {
                            "threshold": "2015-07-30"
                        },
                        "script": "publish_date = doc['publish_date'].value; num_reviews = doc['num_reviews'].value; if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5 + num_reviews) }; return log(1 + num_reviews);"
                    }
                }
            ]
        }
    },
    "_source": ["title", "summary", "publish_date", "num_reviews"]
}

[Results]
"hits": {
    "total": 4,
    "max_score": 0.8463001,
    "hits": [
        {
            "_index": "bookdb_index",
            "_type": "book",
            "_id": "1",
            "_score": 0.8463001,
            "_source": {
                "summary": "A distibuted real-time search and analytics engine",
                "num_reviews": 20,
                "title": "Elasticsearch: The Definitive Guide",
                "publish_date": "2015-02-07"
            }
        },
        {
            "_index": "bookdb_index",
            "_type": "book",
            "_id": "4",
            "_score": 0.7067348,
            "_source": {
                "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
                "num_reviews": 23,
                "title": "Solr in Action",
                "publish_date": "2014-04-05"
            }
        },
        {
            "_index": "bookdb_index",
            "_type": "book",
            "_id": "3",
            "_score": 0.08952084,
            "_source": {
                "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",
                "num_reviews": 18,
                "title": "Elasticsearch in Action",
                "publish_date": "2015-12-03"
            }
        },
        {
            "_index": "bookdb_index",
            "_type": "book",
            "_id": "2",
            "_score": 0.07602123,
            "_source": {
                "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
                "num_reviews": 12,
                "title": "Taming Text: How to Find, Organize, and Manipulate It",
                "publish_date": "2013-01-24"
            }
        }
    ]
}

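Note 1: To use dynamic scripting, it must be enabled for your Elasticsearch instance in the config/elasticsearch.yml file. It is also possible to use scripts that are stored on the Elasticsearch server. See the Elasticsearch reference documentation for more information.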

Note 2: A JSON string cannot contain embedded newline characters, so semicolons are used to separate the script's statements.
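The point in Note 2 can be checked with a short Python sketch that uses only the standard `json` module. It hand-writes a JSON object with a "script" string, once with raw newlines between statements and once with semicolons. The helper names `script_field` and `is_valid_json` are invented for this illustration, and the snippet does not talk to Elasticsearch.

```python
import json

# The statements of the scoring script, one per entry.
STATEMENTS = [
    "publish_date = doc['publish_date'].value",
    "num_reviews = doc['num_reviews'].value",
    "if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) "
    "{ return log(2.5 + num_reviews) }",
    "return log(1 + num_reviews)",
]


def script_field(separator: str) -> str:
    """Hand-write a JSON object whose "script" string joins the statements."""
    return '{"script": "' + separator.join(STATEMENTS) + '"}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON under Python's default strict rules."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


with_newlines = script_field("\n")     # raw line breaks inside the string
with_semicolons = script_field("; ")   # single-line script, as in the query

if __name__ == "__main__":
    print("raw newlines parse:", is_valid_json(with_newlines))
    print("semicolons parse:  ", is_valid_json(with_semicolons))
```

A raw line break inside a string is a control character, which the JSON grammar does not allow unescaped, so the first variant is rejected while the semicolon-separated one parses. When the request body is built with a JSON library rather than by hand, the library escapes the newline as `\n`, which is also valid JSON.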

