
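Elasticsearch in Plain Language 15: A Deep Dive into Search Techniques, Using copy_to to Build a Custom Combined Field and Work Around the Drawbacks of cross-fields Search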

Date: 2021-03-15 01:58:28


Contents

Overview
Official documentation
Example
Summary

Overview

Continuing to learn ES with the instructor 中华石杉; this is part 15.

Course link: /view/55

Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html

---------

Example

Add new fields for testing

PUT /forum/_mapping/article
{
  "properties": {
    "new_author_first_name": {
      "type": "text",
      "copy_to": "new_author_full_name"
    },
    "new_author_last_name": {
      "type": "text",
      "copy_to": "new_author_full_name"
    },
    "new_author_full_name": {
      "type": "text"
    }
  }
}

Update the data

POST /forum/article/_bulk
{"update": {"_id": "1"} }
{"doc" : {"new_author_first_name" : "Peter", "new_author_last_name" : "Smith"} }
{"update": {"_id": "2"} }
{"doc" : {"new_author_first_name" : "Smith", "new_author_last_name" : "Williams"} }
{"update": {"_id": "3"} }
{"doc" : {"new_author_first_name" : "Jack", "new_author_last_name" : "Ma"} }
{"update": {"_id": "4"} }
{"doc" : {"new_author_first_name" : "Robbin", "new_author_last_name" : "Li"} }
{"update": {"_id": "5"} }
{"doc" : {"new_author_first_name" : "Tonny", "new_author_last_name" : "Peter Smith"} }
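To make the effect of copy_to on this data easier to picture, here is a small conceptual sketch in Python. It is not how Elasticsearch is implemented: the `tokenize` helper is a hypothetical stand-in for the real analyzer (assumed here to lowercase and split on whitespace), and `full_name_terms` is a name made up for this illustration. It only shows which terms end up searchable under new_author_full_name for each document.

```python
# Conceptual sketch only: mimics the effect of copy_to in plain Python.
# The lowercase/whitespace tokenizer is an assumption standing in for
# the real analyzer.

# Field values from the bulk update above, keyed by document _id.
docs = {
    "1": {"new_author_first_name": "Peter", "new_author_last_name": "Smith"},
    "2": {"new_author_first_name": "Smith", "new_author_last_name": "Williams"},
    "3": {"new_author_first_name": "Jack", "new_author_last_name": "Ma"},
    "4": {"new_author_first_name": "Robbin", "new_author_last_name": "Li"},
    "5": {"new_author_first_name": "Tonny", "new_author_last_name": "Peter Smith"},
}

# Source fields whose mapping declares "copy_to": "new_author_full_name".
COPY_SOURCES = ["new_author_first_name", "new_author_last_name"]


def tokenize(text):
    """Hypothetical analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def full_name_terms(source):
    """Collect the terms that would be searchable in the combined field.

    The result is sorted because the order in which copied values arrive
    should not be relied on.
    """
    terms = []
    for field in COPY_SOURCES:
        terms.extend(tokenize(source[field]))
    return sorted(terms)


combined = {doc_id: full_name_terms(src) for doc_id, src in docs.items()}

for doc_id, terms in combined.items():
    print(doc_id, terms)
```

Note that `docs` itself is never modified in the sketch. This mirrors the fact that copy_to affects only what is indexed, which is why new_author_full_name does not show up in the `_source` of the hits in the response.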

Query

GET /forum/article/_search
{
  "query": {
    "match": {
      "new_author_full_name": "Peter Smith"
    }
  }
}

Response

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 2.3258216,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "1",
        "_score": 2.3258216,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3",
          "userID": 1,
          "hidden": false,
          "postDate": "-01-01",
          "tag": [
            "java",
            "hadoop"
          ],
          "tag_cnt": 2,
          "view_cnt": 30,
          "title": "this is java and elasticsearch blog",
          "content": "i like to write best elasticsearch article",
          "sub_title": "learning more courses",
          "author_first_name": "Peter",
          "author_last_name": "Smith",
          "new_author_last_name": "Smith",
          "new_author_first_name": "Peter"
        }
      },
      {
        "_index": "forum",
        "_type": "article",
        "_id": "5",
        "_score": 1.7770995,
        "_source": {
          "articleID": "DHJK-B-1395-#Ky5",
          "userID": 3,
          "hidden": false,
          "postDate": "-05-01",
          "tag": [
            "elasticsearch"
          ],
          "tag_cnt": 1,
          "view_cnt": 10,
          "title": "this is spark blog",
          "content": "spark is best big data solution based on scala ,an programming language similar to java",
          "sub_title": "haha, hello world",
          "author_first_name": "Tonny",
          "author_last_name": "Peter Smith",
          "new_author_last_name": "Peter Smith",
          "new_author_first_name": "Tonny"
        }
      }
    ]
  }
}

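Summary

Does this solve the problems of cross-fields search?

Problem 1: the search only finds documents that match in as many fields as possible, rather than documents in which a single field matches completely.

Answer: solved. The best-matching document is returned first (in the response above, document 1, whose full name is exactly Peter Smith, ranks ahead of document 5).

Problem 2: with most_fields, minimum_should_match cannot be used to trim the long tail, that is, results that match only a small part of the query.

Answer: solved. The search now targets a single field, so minimum_should_match can be used to trim the long tail.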
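As an illustration of the answer to problem 2, the sketch below builds a match query body against the combined field with minimum_should_match set. `full_name_query` is a hypothetical helper written for this article, not part of any client library, and the value "100%" is only an example setting; the helper just assembles the JSON body and does not talk to a cluster.

```python
import json


def full_name_query(text, minimum_should_match=None):
    """Build a match query body against new_author_full_name.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    clause = {"query": text}
    if minimum_should_match is not None:
        # Require this share of the query terms to match, which drops
        # long-tail documents that match only a few of them.
        clause["minimum_should_match"] = minimum_should_match
    return {"query": {"match": {"new_author_full_name": clause}}}


body = full_name_query("Peter Smith", minimum_should_match="100%")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET /forum/article/_search. With "100%", a document has to contain both peter and smith in the combined field to be returned.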

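Problem 3: TF/IDF scoring. Take Peter Smith and Smith Williams as an example: when searching for Peter Smith, Smith rarely appears in first_name, so the term has a very low document frequency in that field and receives a very high score. As a result, Smith Williams may end up ranked ahead of Peter Smith.

Answer: solved. Smith and Peter are now in the same field, so their document frequencies are computed over that one field and are evenly distributed, without extreme skew.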
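The reasoning behind problem 3 can be checked with a little arithmetic on the five test documents. The sketch below is a simplification under stated assumptions: `tokenize` stands in for the real analyzer, the BM25-style IDF formula is used as a stand-in for whatever similarity the cluster is configured with, and all documents are treated as living in one shard. The helper names are made up for this illustration.

```python
import math

# (first name, last name) pairs from the bulk update in the example.
authors = [
    ("Peter", "Smith"),
    ("Smith", "Williams"),
    ("Jack", "Ma"),
    ("Robbin", "Li"),
    ("Tonny", "Peter Smith"),
]


def tokenize(text):
    """Assumed analyzer: lowercase and split on whitespace."""
    return set(text.lower().split())


def doc_freq(term, field_values):
    """Number of documents whose field value contains the term."""
    return sum(1 for value in field_values if term in tokenize(value))


def idf(df, n_docs):
    """BM25-style inverse document frequency (simplified stand-in)."""
    return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))


first_names = [first for first, _ in authors]
last_names = [last for _, last in authors]
full_names = [first + " " + last for first, last in authors]

n = len(authors)
df_first = doc_freq("smith", first_names)  # only "Smith Williams"
df_last = doc_freq("smith", last_names)    # "Smith" and "Peter Smith"
df_full = doc_freq("smith", full_names)    # every author named Smith

print("first_name df:", df_first, "idf:", round(idf(df_first, n), 3))
print("last_name  df:", df_last, "idf:", round(idf(df_last, n), 3))
print("full_name  df:", df_full, "idf:", round(idf(df_full, n), 3))
```

In this toy setup smith is rarest in the first-name field, so that field gives it the largest IDF, which is exactly the skew that can push Smith Williams up. In the combined field every document with smith anywhere in the name counts once, so the term gets a single, lower weight.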
