Python word similarity: computing the similarity between word lists

Overview

Since you haven't shown a clear expected output, here is my best attempt:

list_A = ['email','user','this','email','address','customer']

list_B = ['email','mail','address','netmail']

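For the two lists above, we compute the cosine similarity between every element of list_B and every element of list_A; for example, email from list_B is compared with each element of list_A. (The code listing for this step is not preserved in this copy.)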
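The scores listed in the output are consistent with treating each word as a vector of character counts and taking the cosine between two such vectors: mail and email share four characters, which gives 4 / (2 * sqrt(5)) ≈ 0.894. The following is a minimal sketch under that assumption. The helper name char_cosine is hypothetical, and this is not necessarily the code the answer originally contained.

```python
from collections import Counter
from math import sqrt


def char_cosine(word_a, word_b):
    """Cosine similarity between the character-count vectors of two words."""
    counts_a, counts_b = Counter(word_a), Counter(word_b)
    # Only characters present in both words contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[ch] * counts_b[ch] for ch in shared)
    norm_a = sqrt(sum(n * n for n in counts_a.values()))
    norm_b = sqrt(sum(n * n for n in counts_b.values()))
    return dot / norm_a / norm_b


list_A = ['email', 'user', 'this', 'email', 'address', 'customer']
list_B = ['email', 'mail', 'address', 'netmail']

# Compare every word in list_B with every word in list_A.
for key in list_A:
    for word in list_B:
        score = char_cosine(word, key)
        print("The cosine similarity between : {} and : {} is: {}".format(
            word, key, score * 100))
```

An empty string has a zero-length vector, so this sketch raises ZeroDivisionError for empty words; filter them out first if the lists may contain any.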

Output:

The cosine similarity between : email and : email is: 100.0

The cosine similarity between : mail and : email is: 89.44271909999159

The cosine similarity between : address and : email is: 26.967994498529684

The cosine similarity between : netmail and : email is: 84.51542547285166

The cosine similarity between : email and : user is: 22.360679774997898

The cosine similarity between : mail and : user is: 0.0

The cosine similarity between : address and : user is: 60.30226891555272

The cosine similarity between : netmail and : user is: 18.89822365046136

The cosine similarity between : email and : this is: 22.360679774997898

The cosine similarity between : mail and : this is: 25.0

The cosine similarity between : address and : this is: 30.15113445777636

The cosine similarity between : netmail and : this is: 37.79644730092272

The cosine similarity between : email and : email is: 100.0

The cosine similarity between : mail and : email is: 89.44271909999159

The cosine similarity between : address and : email is: 26.967994498529684

The cosine similarity between : netmail and : email is: 84.51542547285166

The cosine similarity between : email and : address is: 26.967994498529684

The cosine similarity between : mail and : address is: 15.07556722888818

The cosine similarity between : address and : address is: 100.0

The cosine similarity between : netmail and : address is: 22.79211529192759

The cosine similarity between : email and : customer is: 31.62277660168379

The cosine similarity between : mail and : customer is: 17.677669529663685

The cosine similarity between : address and : customer is: 42.640143271122085

The cosine similarity between : netmail and : customer is: 40.08918628686365

Note: I have also commented out the threshold part in the code, in case you only want the words whose similarity exceeds a certain threshold, e.g. 80%.
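The threshold filtering mentioned in the note could look like the sketch below. It assumes the same character-count vectors that the listed scores suggest; char_cosine and pairs_above are hypothetical names, and 0.80 corresponds to the 80% example.

```python
from collections import Counter
from math import sqrt


def char_cosine(word_a, word_b):
    """Cosine similarity between character-count vectors (assumed model)."""
    counts_a, counts_b = Counter(word_a), Counter(word_b)
    dot = sum(counts_a[ch] * counts_b[ch]
              for ch in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(n * n for n in counts_a.values()))
    norm_b = sqrt(sum(n * n for n in counts_b.values()))
    return dot / norm_a / norm_b


def pairs_above(keys, words, threshold=0.80):
    """Return (word, key, score) for every pair scoring above the threshold."""
    matches = []
    for key in keys:
        for word in words:
            score = char_cosine(word, key)
            if score > threshold:
                matches.append((word, key, score))
    return matches


list_A = ['email', 'user', 'this', 'email', 'address', 'customer']
list_B = ['email', 'mail', 'address', 'netmail']

for word, key, score in pairs_above(list_A, list_B):
    print("{} ~ {}: {:.1f}%".format(word, key, score * 100))
```

Because email occurs twice in list_A, its matches are reported twice; deduplicate list_A first (for example with dict.fromkeys) if that is not wanted.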

Edit:

OP: But what I actually want is not to compare word by word, but list by list.

Using Counter and math:

from collections import Counter
import math

# Count how often each word occurs in each list
counterA = Counter(list_A)
counterB = Counter(list_B)

def counter_cosine_similarity(c1, c2):
    # Cosine similarity between two word-count vectors
    terms = set(c1).union(c2)
    dotprod = sum(c1.get(k, 0) * c2.get(k, 0) for k in terms)
    magA = math.sqrt(sum(c1.get(k, 0)**2 for k in terms))
    magB = math.sqrt(sum(c2.get(k, 0)**2 for k in terms))
    return dotprod / (magA * magB)

print(counter_cosine_similarity(counterA, counterB) * 100)

Output: 53.03300858899106
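As a check on that figure, the arithmetic can be written out by hand. In list_A the word email occurs twice and user, this, address and customer once each; in list_B each of email, mail, address and netmail occurs once. Only email and address appear in both lists, so only they contribute to the dot product. The symbols A_k and B_k (the count of word k in each list) are introduced here for illustration.

```latex
\cos(A, B)
  = \frac{\sum_k A_k B_k}{\sqrt{\sum_k A_k^2}\,\sqrt{\sum_k B_k^2}}
  = \frac{2 \cdot 1 + 1 \cdot 1}
         {\sqrt{2^2 + 1^2 + 1^2 + 1^2 + 1^2}\,\sqrt{1^2 + 1^2 + 1^2 + 1^2}}
  = \frac{3}{\sqrt{8} \cdot 2}
  \approx 0.5303
```

Multiplying by 100 gives roughly 53.03, in agreement with the printed result.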
