Understanding Arrays in Elasticsearch

Overview

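While working on similarity scoring for array fields in Elasticsearch, I ran into a question:

How does Elasticsearch's similarity scoring of an array differ from its similarity scoring of a document that is a single string?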

 

The index was created on Elasticsearch 5.4 with the following mapping:

PUT user
{
	"mappings": {
		"app": {
			"properties": {
				"appPackageNameLists": {
					"type": "keyword",
					"index": true
				},
				"gaid": {
					"type": "text",
					"fields": {
						"keyword": {
							"type": "keyword",
							"ignore_above": 256
						}
					}
				},
				"region": {
					"type": "keyword"
				}
			}
		}
	}
}

 

Insert the data:

POST user/app/4ad86eae-4477-40a1-abaf-bf45ba2dbe1c
{
	"gaid": "4ad86eae-4477-40a1-abaf-bf45ba2dbe1c",
	"appPackageNameLists": [
		"com.mi.global.shop",
		"com.nox.mopen.app",
		"com.ludashi.dualspace",
		"info.cloneapp.mochat.in.goast",
		"com.parallel.space.lite.arm64",
		"com.freecharge.android",
		"com.swiftkey.languageprovider"
	],
	"region": "IN"
}



POST user/app/4af0a32b-6995-4c92-8189-79aad0e6fecb
{
	"gaid": "4af0a32b-6995-4c92-8189-79aad0e6fecb",
	"appPackageNameLists": [
		"com.joynow.killplane2",
		"com.google.android.youtube",
		"root.rootchecker",
		"com.mxtech.videoplayer.pro",
		"com.teslacoilsw.launcher",
		"com.whatsapp",
		"com.lbe.parallel.intl"
	],
	"region": "IN"
}


POST user/app/4be2411e-089d-4a6c-b773-6ca31e36c675
{
	"gaid": "4be2411e-089d-4a6c-b773-6ca31e36c675",
	"appPackageNameLists": [
		"com.joynow.killplane2",
		"com.google.android.youtube",
		"root.rootchecker",
		"com.mxtech.videoplayer.pro",
		"com.teslacoilsw.launcher",
		"com.whatsapp",
		"com.lbe.parallel.intl",
		"com.outfit7.mytalkingtomfree",
		"fbs.com",
		"com.tencent.mm"
	],
	"region": "IN"
}

 

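Now inspect the terms produced for the array field of one document.

Because appPackageNameLists is of type keyword, its values are not analyzed. As the result below shows, the app list is simply split into its individual elements, and each element is stored as one term together with its position in the array and its start and end offsets: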

GET /user/app/4be2411e-089d-4a6c-b773-6ca31e36c675/_termvectors?fields=appPackageNameLists

{
  "_index": "user",
  "_type": "app",
  "_id": "4be2411e-089d-4a6c-b773-6ca31e36c675",
  "_version": 1,
  "found": true,
  "took": 1,
  "term_vectors": {
    "appPackageNameLists": {
      "field_statistics": {
        "sum_doc_freq": 10,
        "doc_count": 1,
        "sum_ttf": -1
      },
      "terms": {
        "com.google.android.youtube": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 1,
              "start_offset": 22,
              "end_offset": 48
            }
          ]
        },
        "com.joynow.killplane2": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 21
            }
          ]
        },
        "com.lbe.parallel.intl": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 6,
              "start_offset": 131,
              "end_offset": 152
            }
          ]
        },
        "com.mxtech.videoplayer.pro": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 3,
              "start_offset": 66,
              "end_offset": 92
            }
          ]
        },
        "com.outfit7.mytalkingtomfree": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 7,
              "start_offset": 153,
              "end_offset": 181
            }
          ]
        },
        "com.tencent.mm": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 9,
              "start_offset": 190,
              "end_offset": 204
            }
          ]
        },
        "com.teslacoilsw.launcher": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 4,
              "start_offset": 93,
              "end_offset": 117
            }
          ]
        },
        "com.whatsapp": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 5,
              "start_offset": 118,
              "end_offset": 130
            }
          ]
        },
        "fbs.com": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 8,
              "start_offset": 182,
              "end_offset": 189
            }
          ]
        },
        "root.rootchecker": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 2,
              "start_offset": 49,
              "end_offset": 65
            }
          ]
        }
      }
    }
  }
}
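
The positions and offsets in this response follow a simple pattern: each array element becomes one term, its position is its index in the array, and the offsets run on from one element to the next with a gap of one character. The Python sketch below reproduces those numbers. It is only an illustration of the pattern visible in the output above, not Elasticsearch's actual implementation; the function name keyword_array_tokens and the offset_gap parameter are made up for this example, and it assumes the elements are distinct ASCII strings.

```python
def keyword_array_tokens(values, offset_gap=1):
    """Map each array element to (position, start_offset, end_offset).

    Hypothetical helper: it mimics the pattern seen in the _termvectors
    response, assuming distinct elements and a gap of 1 between offsets.
    """
    tokens = {}
    start = 0
    for position, value in enumerate(values):
        end = start + len(value)          # the whole element is one term
        tokens[value] = (position, start, end)
        start = end + offset_gap          # next element starts after the gap
    return tokens


apps = [
    "com.joynow.killplane2",
    "com.google.android.youtube",
    "root.rootchecker",
    "com.mxtech.videoplayer.pro",
    "com.teslacoilsw.launcher",
    "com.whatsapp",
    "com.lbe.parallel.intl",
    "com.outfit7.mytalkingtomfree",
    "fbs.com",
    "com.tencent.mm",
]

tokens = keyword_array_tokens(apps)
for term in sorted(tokens):
    print(term, tokens[term])
```

Sorting the terms gives the same order as the response, and for example com.tencent.mm comes out as position 9 with offsets 190 to 204.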

 

A match query with one complete package name returns the following:

GET user/app/_search
{
  "from": 0,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "appPackageNameLists": {
              "query": "com.outfit7.mytalkingtomfree",
              "boost": 1
            }
          }
        }
      ]
    }
  },
  "size": 100
}



{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "user",
        "_type": "app",
        "_id": "4be2411e-089d-4a6c-b773-6ca31e36c675",
        "_score": 0.2876821,
        "_source": {
          "gaid": "4be2411e-089d-4a6c-b773-6ca31e36c675",
          "appPackageNameLists": [
            "com.joynow.killplane2",
            "com.google.android.youtube",
            "root.rootchecker",
            "com.mxtech.videoplayer.pro",
            "com.teslacoilsw.launcher",
            "com.whatsapp",
            "com.lbe.parallel.intl",
            "com.outfit7.mytalkingtomfree",
            "fbs.com",
            "com.tencent.mm"
          ],
          "region": "IN"
        }
      }
    ]
  }
}
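
The score of 0.2876821 can be explained with the BM25 formula. The sketch below is a rough reconstruction under several assumptions that the response itself does not confirm: the index uses the default BM25 similarity with k1 = 1.2, norms are disabled on the keyword field (so field length plays no role), and the matching document is the only document on its shard, since scores are computed per shard by default. The function names are made up for this example.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """Inverse document frequency as used by Lucene's BM25 similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_without_norms(term_freq, k1=1.2):
    """Term-frequency factor when norms are disabled (no length normalization)."""
    return term_freq * (k1 + 1) / (term_freq + k1)


# Assumption: one document on the shard, and it contains the term exactly once.
idf = bm25_idf(doc_count=1, doc_freq=1)
tf = bm25_tf_without_norms(term_freq=1)
score = idf * tf

print(round(score, 7))
```

Under these assumptions the idf is ln(4/3) and the term-frequency factor is 1, which matches the reported score. The array length does not enter the calculation at all, so a keyword array with ten elements scores the same as one with a single element.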

 

A match query with only part of a package name, such as com, matches nothing:

GET user/app/_search
{
  "from": 0,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "appPackageNameLists": {
              "query": "com",
              "boost": 1
            }
          }
        }
      ]
    }
  },
  "size": 100
}



{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
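
Both results follow from how a keyword array is indexed: every element is stored as one whole term, and a query on the field is compared against whole terms only. The following Python sketch models this with a toy inverted index over the three documents above. It is a simplified illustration, not how Elasticsearch is implemented; build_index and term_lookup are hypothetical names.

```python
from collections import defaultdict


def build_index(docs):
    """Toy inverted index: each array element is one term pointing to doc ids."""
    index = defaultdict(set)
    for doc_id, values in docs.items():
        for value in values:
            index[value].add(doc_id)   # the element is indexed as-is, unsplit
    return index


def term_lookup(index, query):
    """Return the ids of documents whose array contains exactly this term."""
    return sorted(index.get(query, set()))


shared = [
    "com.joynow.killplane2", "com.google.android.youtube", "root.rootchecker",
    "com.mxtech.videoplayer.pro", "com.teslacoilsw.launcher", "com.whatsapp",
    "com.lbe.parallel.intl",
]
docs = {
    "4ad86eae-4477-40a1-abaf-bf45ba2dbe1c": [
        "com.mi.global.shop", "com.nox.mopen.app", "com.ludashi.dualspace",
        "info.cloneapp.mochat.in.goast", "com.parallel.space.lite.arm64",
        "com.freecharge.android", "com.swiftkey.languageprovider",
    ],
    "4af0a32b-6995-4c92-8189-79aad0e6fecb": list(shared),
    "4be2411e-089d-4a6c-b773-6ca31e36c675": shared + [
        "com.outfit7.mytalkingtomfree", "fbs.com", "com.tencent.mm",
    ],
}

index = build_index(docs)
print(term_lookup(index, "com.outfit7.mytalkingtomfree"))  # one document
print(term_lookup(index, "com"))                           # no documents
print(term_lookup(index, "com.whatsapp"))                  # two documents
```

The lookup for com is empty because no array element is exactly com; the string only ever appears as part of longer terms.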

 

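So an array that does not contain nested objects is really not that complicated: each element of a keyword array is indexed as one whole term, and a query matches only when it equals an entire element.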

 

 

 

 

 
