Posted by 冷酷月饼 on 靠谱客. This article covers the search algorithm over two postings lists in proximity search: the experiment's objective, tasks and requirements, specification, results, and debugging process.

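Overview

Intelligent Information Retrieval: The Search Algorithm over Two Postings Lists in Proximity Search

  • 1. Objective
  • 2. Tasks and requirements
  • 3. Specification
    • ⑴ Functional description
    • ⑵ High-level design
    • ⑶ Detailed design
    • ⑷ Code implementation
  • 4. Results
  • 5. Debugging process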

For a collection of Python implementations of selected experiments from Introduction to Information Retrieval, see this blog.

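1. Objective

Understand proximity search in search systems and implement the search algorithm over two postings lists that proximity search relies on.

2. Tasks and requirements

Fully understand the search algorithm over two postings lists in proximity search and implement it in Python. When the user enters query terms at the prompts, the program runs the proximity search over the two corresponding postings lists.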

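3. Specification

⑴ Functional description

The system reads a preset document and returns all searchable terms. The user enters query terms at the prompts. For every term, the system computes the documents it occurs in and its postings list with positions, then runs the search algorithm over the two postings lists of the query terms and prints the merged result.

⑵ High-level design

The program is divided into two functional modules: a prompt-and-input module, and a module implementing the search over two postings lists in proximity search.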

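⑶ Detailed design

  1. Overall flowchart

Figure 1. Overall flowchart

  2. Flowcharts of the functional modules
  • Prompt-and-input module
    Figure 2. Prompt-and-input module

  • Module for the search over two postings lists in proximity search

Figure 3. Module for the search over two postings lists in proximity search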

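⑷ Code implementation

  • Building the document dictionary

createdict is a utility function that builds the document dictionary. It uses Python's re library to process the preset document and returns a dictionary whose keys are all terms (used to show the user which terms can be queried) and whose values are the postings lists of those terms, with each line of the input treated as one document.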

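def createdict(f0):
    # Split the whole text into tokens and keep the distinct, non-empty terms.
    dl = set(re.split(r'[ \n?!,.;]', f0))
    dl.discard('')
    # Each line of the file is treated as one document.
    d = f0.split('\n')
    dict0 = {}
    for word in dl:
        dict1 = {}
        for i in range(len(d)):
            d0 = re.split(r'[ \n?!,.;]', d[i])
            if word in d0:
                # Document IDs and positions are both 1-based.
                dict1[i + 1] = []
                for j in range(len(d0)):
                    if word == d0[j]:
                        dict1[i + 1].append(j + 1)
        dict0[word] = dict1
    return dict0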
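The shape of the dictionary that createdict returns (term → document ID → list of positions) can be illustrated on a tiny input. The sketch below is not the author's function: build_positional_index and the two-line sample text are hypothetical names introduced only for illustration, and the sketch makes a single pass over the tokens instead of looping over every term. It assumes the same splitting characters and 1-based numbering as createdict.

```python
import re
from collections import defaultdict


def build_positional_index(text):
    """Hypothetical helper: map term -> {doc_id: [positions]}, one document per line."""
    index = defaultdict(dict)
    for doc_id, line in enumerate(text.split("\n"), start=1):
        tokens = re.split(r"[ ?!,.;]", line)
        for pos, tok in enumerate(tokens, start=1):
            if tok:  # empty strings come from adjacent delimiters; they are not terms
                index[tok].setdefault(doc_id, []).append(pos)
    return dict(index)


# Two short "documents", one per line (made-up sample text).
sample = "to be or not to be\nbe quick"
index = build_positional_index(sample)
print(index["be"])  # {1: [2, 6], 2: [1]}
print(index["to"])  # {1: [1, 5]}
```

Because document IDs are inserted in ascending order, each postings list comes out already sorted, which is what a merge-style intersection expects.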
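  • Module for the search over two postings lists in proximity search

PositionalIntersect implements the search over two postings lists in proximity search. It first collects the document IDs of both input postings lists and walks through them. For each document ID that occurs in both lists, it fetches the two position lists for that document, iterates over their elements, and checks whether the distance between two positions is less than or equal to the preset distance k. If it is, the function stores the document ID, the position of term 1, and the position of term 2; if not, it moves on. Finally it returns the list of results.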

def PositionalIntersect(p1, p2, k):
    r = []
    # Document IDs of both postings lists, in ascending order for the merge.
    k1, k2 = sorted(p1), sorted(p2)
    i, j = 0, 0
    while i < len(k1) and j < len(k2):
        if k1[i] == k2[j]:
            l = []  # positions of term 2 within k of the current position of term 1
            pp1, pp2 = p1[k1[i]], p2[k2[j]]
            i1, j1 = 0, 0
            while i1 < len(pp1):
                while j1 < len(pp2):
                    if abs(pp1[i1] - pp2[j1]) <= k:
                        l.append(pp2[j1])
                    elif pp2[j1] > pp1[i1]:
                        break
                    j1 = j1 + 1
                # Drop positions that are now too far behind pp1[i1].
                while l != [] and abs(l[0] - pp1[i1]) > k:
                    del l[0]
                for n in range(0, len(l)):
                    r.append([k1[i], pp1[i1], l[n]])
                i1 = i1 + 1
            i = i + 1
            j = j + 1
        elif k1[i] > k2[j]:
            j = j + 1
        else:
            i = i + 1
    return r
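A simple way to check a merge-based implementation such as PositionalIntersect is to compare it against a brute-force reference that tests every pair of positions in every shared document. The sketch below is such a reference, not part of the original program: proximity_pairs_bruteforce and the small postings lists are hypothetical and chosen only for illustration. It returns results in the same [docID, position of term 1, position of term 2] format.

```python
def proximity_pairs_bruteforce(p1, p2, k):
    """Hypothetical reference: test every position pair in every shared document."""
    results = []
    for doc_id in sorted(set(p1) & set(p2)):
        for a in p1[doc_id]:
            for b in p2[doc_id]:
                if abs(a - b) <= k:
                    results.append([doc_id, a, b])
    return results


# Made-up postings lists: only document 1 contains both terms.
p1 = {1: [1, 5], 2: [3]}
p2 = {1: [2, 6], 3: [1]}
result = proximity_pairs_bruteforce(p1, p2, 1)
print(result)  # [[1, 1, 2], [1, 5, 6]]
```

The brute-force version costs time proportional to the product of the two position-list lengths per document, whereas the merge advances each pointer only forward; it is meant for testing on small inputs rather than for real use.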
  • Remaining code

The p1 and p2 defined below are postings lists used for debugging.

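import re
# Preset postings lists for debugging; they are overwritten by the user's query below.
p1 = {1: [7, 18, 33, 72, 86, 231], 2: [1, 17, 74, 222, 255], 4: [8, 16, 190, 429, 433], 5: [363, 367], 7: [13, 23, 191]}
p2 = {1: [17, 25], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]}
# Read the whole sample file; each line is one document.
with open("document.txt", "r") as f:
    f0 = f.read()
dict0 = createdict(f0)
k = [key for key in dict0]
print("Available query terms:", k, "\n")
print("Enter the first term for the proximity search: ", end='')
p1 = dict0[input()]
print("Enter the second term for the proximity search: ", end='')
p2 = dict0[input()]
# The third argument is the proximity distance k.
print("Proximity search result:\n", PositionalIntersect(p1, p2, 1))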

The sample document.txt is shown below; each line is treated as one document, and any English text should work as well.

There are moments in life when you miss someone so much that you just want to pick them from your dreams and hug them for real! Dream what you want to dream;go where you want to go;be what you want to be,because you have only one life and one chance to do all the things you want to do.
May you have enough happiness to make you sweet,enough trials to make you strong,enough sorrow to keep you human,enough hope to make you happy? Always put yourself in others’shoes.If you feel that it hurts you,it probably hurts the other person, too.
The happiest of people don’t necessarily have the best of everything;they just make the most of everything that comes along their way.Happiness lies for those who cry,those who hurt, those who have searched,and those who have tried,for only they can appreciate the importance of people
Who have touched their lives.Love begins with a smile,grows with a kiss and ends with a tear.The brightest future will always be based on a forgotten past, you can’t go on well in life until you let go of your past failures and heartaches.
When you were born,you were crying and everyone around you was smiling.Live your life so that when you die,you’re the one who is smiling and everyone around you is crying.
Please send this message to those people who mean something to you,to those who have touched your life in one way or another,to those who make you smile when you really need it,to those that make you see the brighter side of things when you are really down,to those who you want to let them know that you appreciate their friendship.And if you don’t, don’t worry,nothing bad will happen to you,you will just miss out on the opportunity to brighten someone’s day with this message.

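4. Results

Enter the two terms at the prompts, with the proximity distance (the k argument of PositionalIntersect(p1, p2, k)) set to 1. The result is shown in the figure below.

Figure 4. Result for a proximity distance of 1

The three numbers in a search result mean, in order: document 6, send at position 2 in that document, and this at position 3 in that document. Setting the distance to 5 and querying again gives the result in the figure below.

Figure 5. Result for a proximity distance of 5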
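The positions reported for send and this can be checked by hand. The sketch below tokenizes only the opening words of the sixth line of document.txt (a shortened fragment, used here as an assumption to keep the example small) with the same delimiter set as createdict, and prints the 1-based positions of the two terms.

```python
import re

# Opening words of the sixth document in document.txt (fragment only).
line = "Please send this message to those people who mean something to you"
tokens = re.split(r"[ \n?!,.;]", line)

# 1-based positions, matching the numbering used by createdict.
positions = {tok: i + 1 for i, tok in enumerate(tokens) if tok in ("send", "this")}
print(positions)  # {'send': 2, 'this': 3}
```

The two positions differ by 1, so the pair qualifies at a proximity distance of 1.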

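5. Debugging process

During debugging, the postings lists were preset to p1 = {1: [7, 18, 33, 72, 86, 231], 2: [1, 17, 74, 222, 255], 4: [8, 16, 190, 429, 433], 5: [363, 367], 7: [13, 23, 191]} and p2 = {1: [17, 25], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]}. Running the program on them gives the result shown in the figure below.
Figure 6. Debugging process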
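The screenshot is not reproduced here, and the text does not say which k was used for the debugging run. As a rough guide to what to expect, the sketch below runs the preset lists through a compact sliding-window variant of the same idea with k = 1, which is an assumption. The function positional_intersect is a hypothetical re-implementation using collections.deque, not the author's PositionalIntersect.

```python
from collections import deque


def positional_intersect(p1, p2, k):
    """Hypothetical sliding-window variant; returns [doc_id, pos1, pos2] triples."""
    results = []
    for doc_id in sorted(p1.keys() & p2.keys()):
        pos1, pos2 = p1[doc_id], p2[doc_id]
        window = deque()  # positions of term 2 that may still be within k
        j = 0
        for a in pos1:
            # Pull in positions of term 2 up to a + k.
            while j < len(pos2) and pos2[j] <= a + k:
                window.append(pos2[j])
                j += 1
            # Discard positions that fall below a - k.
            while window and window[0] < a - k:
                window.popleft()
            results.extend([doc_id, a, b] for b in window)
    return results


p1 = {1: [7, 18, 33, 72, 86, 231], 2: [1, 17, 74, 222, 255],
      4: [8, 16, 190, 429, 433], 5: [363, 367], 7: [13, 23, 191]}
p2 = {1: [17, 25], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]}
result = positional_intersect(p1, p2, 1)  # k = 1 is assumed here
print(result)
# [[1, 18, 17], [4, 16, 17], [4, 190, 191], [4, 429, 430], [4, 433, 434]]
```

Documents 2 and 7 appear only in p1 and are skipped, and document 5 has no pair of positions within distance 1, so only documents 1 and 4 contribute results.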
