失眠网,内容丰富有趣,生活中的好帮手!
失眠网 > 白话Elasticsearch08-深度探秘搜索技术之基于boost的细粒度搜索条件权重控制

白话Elasticsearch08-深度探秘搜索技术之基于boost的细粒度搜索条件权重控制

时间:2022-11-21 00:46:19

相关推荐

白话Elasticsearch08-深度探秘搜索技术之基于boost的细粒度搜索条件权重控制

文章目录

概述boost示例

概述

继续跟中华石杉老师学习ES,第八篇

课程地址: /view/55

boost

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-boost.html

知识点:

如果给某个字段设置boost 为2 ,则意味着改字段的权重比其他的值的权重大一倍 。权重值默认为1

The boost is applied only for term queries (prefix, range and fuzzy queries are not boosted).

示例

数据如下:

{"_index": "forum","_type": "article","_id": "5","_score": 1,"_source": {"articleID": "DHJK-B-1395-#Ky5","userID": 3,"hidden": false,"postDate": "-05-01","tag": ["elasticsearch"],"tag_cnt": 1,"view_cnt": 10,"title": "this is spark blog"}},{"_index": "forum","_type": "article","_id": "2","_score": 1,"_source": {"articleID": "KDKE-B-9947-#kL5","userID": 1,"hidden": false,"postDate": "-01-02","tag": ["java"],"tag_cnt": 1,"view_cnt": 50,"title": "this is java blog"}},{"_index": "forum","_type": "article","_id": "4","_score": 1,"_source": {"articleID": "QQPX-R-3956-#aD8","userID": 2,"hidden": true,"postDate": "-01-02","tag": ["java","elasticsearch"],"tag_cnt": 2,"view_cnt": 80,"title": "this is java, elasticsearch, hadoop blog"}},{"_index": "forum","_type": "article","_id": "1","_score": 1,"_source": {"articleID": "XHDK-A-1293-#fJ3","userID": 1,"hidden": false,"postDate": "-01-01","tag": ["java","hadoop"],"tag_cnt": 2,"view_cnt": 30,"title": "this is java and elasticsearch blog"}},{"_index": "forum","_type": "article","_id": "3","_score": 1,"_source": {"articleID": "JODL-X-1937-#pV7","userID": 2,"hidden": false,"postDate": "-01-01","tag": ["hadoop"],"tag_cnt": 1,"view_cnt": 100,"title": "this is elasticsearch blog"}}

需求: 搜索标题中必须包含blog的帖子,同时如果标题中包含java或elasticsearch或hadoop或spark也要搜索出来,同时如果一个帖子包含spark,包含spark的帖子要优先其他帖子搜索出来

需求实现DSL如下:

GET /forum/article/_search{"query": {"bool": {"must": {"match": {"title": "blog"}},"should": [{"match": {"title": {"query": "java"}}},{"match": {"title": {"query": "elasticsearch"}}},{"match": {"title": {"query": "hadoop"}}},{"match": {"title": {"query": "spark","boost": 5}}}]}}}

返回结果 :

{"took": 5,"timed_out": false,"_shards": {"total": 5,"successful": 5,"skipped": 0,"failed": 0},"hits": {"total": 5,"max_score": 1.7260925,"hits": [{"_index": "forum","_type": "article","_id": "5","_score": 1.7260925,"_source": {"articleID": "DHJK-B-1395-#Ky5","userID": 3,"hidden": false,"postDate": "-05-01","tag": ["elasticsearch"],"tag_cnt": 1,"view_cnt": 10,"title": "this is spark blog"}},{"_index": "forum","_type": "article","_id": "4","_score": 1.6185135,"_source": {"articleID": "QQPX-R-3956-#aD8","userID": 2,"hidden": true,"postDate": "-01-02","tag": ["java","elasticsearch"],"tag_cnt": 2,"view_cnt": 80,"title": "this is java, elasticsearch, hadoop blog"}},{"_index": "forum","_type": "article","_id": "1","_score": 0.8630463,"_source": {"articleID": "XHDK-A-1293-#fJ3","userID": 1,"hidden": false,"postDate": "-01-01","tag": ["java","hadoop"],"tag_cnt": 2,"view_cnt": 30,"title": "this is java and elasticsearch blog"}},{"_index": "forum","_type": "article","_id": "3","_score": 0.5753642,"_source": {"articleID": "JODL-X-1937-#pV7","userID": 2,"hidden": false,"postDate": "-01-01","tag": ["hadoop"],"tag_cnt": 1,"view_cnt": 100,"title": "this is elasticsearch blog"}},{"_index": "forum","_type": "article","_id": "2","_score": 0.3971361,"_source": {"articleID": "KDKE-B-9947-#kL5","userID": 1,"hidden": false,"postDate": "-01-02","tag": ["java"],"tag_cnt": 1,"view_cnt": 50,"title": "this is java blog"}}]}}

可以看到spark的帖子,相关度得分最高,排在了第一位。

搜索条件的权重,boost,可以将某个搜索条件的权重加大,此时当匹配这个搜索条件和匹配另一个搜索条件的document,计算relevance score时,匹配权重更大的搜索条件的document,relevance score会更高,当然也就会优先被返回回来

我们如果把boost去掉会怎样呢? 来看下

GET /forum/article/_search{"query": {"bool": {"must": {"match": {"title": "blog"}},"should": [{"match": {"title": {"query": "java"}}},{"match": {"title": {"query": "elasticsearch"}}},{"match": {"title": {"query": "hadoop"}}},{"match": {"title": {"query": "spark"}}}]}}}

返回:

{"took": 11,"timed_out": false,"_shards": {"total": 5,"successful": 5,"skipped": 0,"failed": 0},"hits": {"total": 5,"max_score": 1.6185135,"hits": [{"_index": "forum","_type": "article","_id": "4","_score": 1.6185135,"_source": {"articleID": "QQPX-R-3956-#aD8","userID": 2,"hidden": true,"postDate": "-01-02","tag": ["java","elasticsearch"],"tag_cnt": 2,"view_cnt": 80,"title": "this is java, elasticsearch, hadoop blog"}},{"_index": "forum","_type": "article","_id": "1","_score": 0.8630463,"_source": {"articleID": "XHDK-A-1293-#fJ3","userID": 1,"hidden": false,"postDate": "-01-01","tag": ["java","hadoop"],"tag_cnt": 2,"view_cnt": 30,"title": "this is java and elasticsearch blog"}},{"_index": "forum","_type": "article","_id": "5","_score": 0.5753642,"_source": {"articleID": "DHJK-B-1395-#Ky5","userID": 3,"hidden": false,"postDate": "-05-01","tag": ["elasticsearch"],"tag_cnt": 1,"view_cnt": 10,"title": "this is spark blog"}},{"_index": "forum","_type": "article","_id": "3","_score": 0.5753642,"_source": {"articleID": "JODL-X-1937-#pV7","userID": 2,"hidden": false,"postDate": "-01-01","tag": ["hadoop"],"tag_cnt": 1,"view_cnt": 100,"title": "this is elasticsearch blog"}},{"_index": "forum","_type": "article","_id": "2","_score": 0.3971361,"_source": {"articleID": "KDKE-B-9947-#kL5","userID": 1,"hidden": false,"postDate": "-01-02","tag": ["java"],"tag_cnt": 1,"view_cnt": 50,"title": "this is java blog"}}]}}

spark的帖子并没有优先展示出来 ,可见boost权重确实起了作用。

如果觉得《白话Elasticsearch08-深度探秘搜索技术之基于boost的细粒度搜索条件权重控制》对你有帮助,请点赞、收藏,并留下你的观点哦!

本内容不代表本网观点和政治立场,如有侵犯你的权益请联系我们处理。
网友评论
网友评论仅供其表达个人看法,并不表明网站立场。