Implementing the Backend of a Search Engine Website with Django and Elasticsearch

Reposted article. 2019-08-01 09:22

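1. Search-box autocomplete (Elasticsearch provides an API for this)
Modifying the type
The completion suggester needs a field in the mapping declared as suggest: {"type": "completion"}.
So the type we defined earlier has to change:
Add a new field, suggest, to the type. Because of a problem in the elasticsearch-dsl (es-dsl) source code, declaring the field directly raises an error. The workaround is to define your own CustomAnalyzer subclass, create an instance of it named ik_analyzer, and pass that object to the suggest field of the type: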

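...
from elasticsearch_dsl import Completion, DocType
from elasticsearch_dsl.analysis import CustomAnalyzer as _CustomAnalyzer


class CustomAnalyzer(_CustomAnalyzer):

    def get_analysis_definition(self):
        # Do nothing here: an empty definition is returned only to avoid the error
        return {}


# Create the custom analyzer object: use ik_max_word and lowercase the tokens
ik_analyzer = CustomAnalyzer('ik_max_word', filter=['lowercase'])


class DuowanType(DocType):
    ...
    # The suggest field is what enables autocomplete.
    # Declaring it directly fails because of the es-dsl issue, hence the CustomAnalyzer subclass above
    suggest = Completion(analyzer=ik_analyzer)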
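
For reference, the mapping fragment that such a field corresponds to can be written out as a plain dictionary. This is a minimal sketch, not output captured from the project: the helper name completion_mapping is hypothetical, it only builds a dict and never contacts Elasticsearch, and it assumes the IK plugin (which provides ik_max_word) is installed on the server.

```python
import json


def completion_mapping(field_name="suggest", analyzer="ik_max_word"):
    """Build the mapping fragment for a completion-suggester field.

    Hypothetical helper for illustration only.
    """
    return {
        "properties": {
            field_name: {
                "type": "completion",
                "analyzer": analyzer,
            }
        }
    }


mapping = completion_mapping()
print(json.dumps(mapping, indent=2))
```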

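Generating the suggest values
Search suggestions are generated in save_to_es.
The suggest data has to be built in the structure that the completion suggester expects.
Define a module-level function gen_suggests next to the items classes. It takes the index name and info_tuple, a tuple of (text, weight) pairs. It creates a set for de-duplication and a suggests list for the return value, then iterates over info_tuple: if the text is not empty, it calls the Elasticsearch analyze API to tokenize the string and assembles the structure to return.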

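def gen_suggests(index, info_tuple):  # a tuple can carry several (text, weight) pairs and keeps their order
    # Build the list of search suggestions from the given strings.
    # es is assumed to be an Elasticsearch client created elsewhere in this module.
    used_words = set()  # for de-duplication
    suggests = []  # returned to the caller
    for text, weight in info_tuple:
        if text:  # skip empty strings
            # Call the Elasticsearch analyze API to tokenize the string
            words = es.indices.analyze(index=index, analyzer='ik_max_word', params={'filter': ['lowercase']}, body=text)
            analyzed_words = set(r["token"] for r in words["tokens"] if len(r["token"]) > 1)  # drop single-character tokens
            new_words = analyzed_words - used_words  # de-duplicate against earlier fields
            used_words.update(new_words)  # remember these words so later fields skip them
        else:
            new_words = set()
        if new_words:
            suggests.append({'input': list(new_words), 'weight': weight})
    return suggests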
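
The de-duplication and weighting logic can be tried without a running Elasticsearch server. The sketch below is an assumption-laden stand-in: fake_analyze is a hypothetical replacement for es.indices.analyze that just splits on whitespace and lowercases, whereas the real ik_max_word analyzer segments Chinese text very differently. It records each emitted word in used_words so that a word already suggested by a higher-weight field is not repeated by a later one.

```python
def fake_analyze(text):
    """Hypothetical stand-in for es.indices.analyze: whitespace split + lowercase."""
    return {"tokens": [{"token": word.lower()} for word in text.split()]}


def gen_suggests_offline(info_tuple, analyze=fake_analyze):
    """Build completion-suggester inputs from (text, weight) pairs."""
    used_words = set()   # words already emitted by an earlier field
    suggests = []
    for text, weight in info_tuple:
        if not text:     # skip empty strings
            continue
        tokens = analyze(text)["tokens"]
        words = {t["token"] for t in tokens if len(t["token"]) > 1}  # drop 1-char tokens
        new_words = words - used_words
        if new_words:
            used_words |= new_words
            suggests.append({"input": sorted(new_words), "weight": weight})
    return suggests


result = gen_suggests_offline(
    (("Django Search Engine", 10), ("search team a", 7), ("", 3))
)
print(result)
```

Here "search" appears in both texts but is only attached to the weight-10 entry, and the single-character token "a" is filtered out.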

Then call this function in save_to_es:

info_tuple = ((duowan.title, 10), (duowan.author, 7))
duowan.suggest = gen_suggests(DuowanType._doc_type.index, info_tuple)

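Building the Django search site
Create a new virtual environment.
Activate it and install Django: pip install -i https://pypi.douban.com/simple/ django
Then create a new Django project in PyCharm and run it; the server address appears in the log.
Next, create a static directory and copy the CSS and JS files into it, and copy the HTML files into the templates directory.
Add a URL in the urls file: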

from django.contrib import admin
from django.urls import path
from django.conf.urls import url  # removed in Django 4.0; use django.urls.re_path there
from django.views.generic import TemplateView

urlpatterns = [
    path('admin/', admin.site.urls),
    url(r'^$', TemplateView.as_view(template_name='index.html'), name='index'),
]

Add one setting in settings:

# A tuple also works here; a single-element tuple needs a trailing comma after the path
STATICFILES_DIRS = [
    os.path.join(BASE_DIR, 'static')  # more than one directory can be listed
]
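
The remark about the trailing comma is plain Python behaviour rather than anything Django-specific, and it is easy to check. In this small sketch BASE_DIR is a made-up path used only for illustration; Django expects STATICFILES_DIRS to be a list or tuple, so the third form below would not be accepted as a valid setting.

```python
import os

BASE_DIR = "/srv/project"  # hypothetical project root, for illustration only

as_list = [os.path.join(BASE_DIR, "static")]
as_tuple = (os.path.join(BASE_DIR, "static"),)    # trailing comma: one-element tuple
not_a_tuple = (os.path.join(BASE_DIR, "static"))  # no comma: just a parenthesised str

print(type(as_list).__name__, type(as_tuple).__name__, type(not_a_tuple).__name__)
```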

In index.html, change the statements that load CSS and JS, such as <link href="css/style.css" rel="stylesheet" type="text/css" />, to:

{# "load staticfiles" was removed in Django 3.0; "load static" also works on Django 2.x #}
{% load static %}
<head>
...
<link href="{% static 'css/style.css' %}" rel="stylesheet" type="text/css" />
...</head>

The static tag joins STATIC_URL from settings onto the front of the quoted path, so the page can find its CSS and JS files.
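
What the tag does to the quoted path can be approximated in a few lines of Python. This is a simplified sketch of the idea, not Django's own code: static_url is a hypothetical helper, and STATIC_URL = "/static/" is assumed here because it is the value Django's default project template generates.

```python
from urllib.parse import quote, urljoin

STATIC_URL = "/static/"  # assumed value from settings


def static_url(path, prefix=STATIC_URL):
    """Join the static prefix onto a relative asset path (hypothetical helper)."""
    return urljoin(prefix, quote(path))


print(static_url("css/style.css"))
```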

Search suggestions
Install es-dsl in the virtual environment, in the same version that the crawler project uses.
Fuzzy search
fuzzy:

GET duowan/video/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "軍團騎士",
        "fuzziness": 2,
        "prefix_length": 3
      }
    }
  },
  "_source": ["title"]
}

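fuzziness: the maximum edit distance allowed
prefix_length: the number of leading characters that must match exactly and are not subject to fuzzy changes
"_source": ["title"]: restricts the returned fields to title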
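
To make the two parameters concrete, here is a minimal sketch of how an edit-distance check with a protected prefix behaves. It uses the classic Levenshtein distance (insert, delete, substitute) as a simplification and compares whole strings, whereas Elasticsearch applies fuzzy matching to the indexed terms; fuzzy_match is a hypothetical helper, not an Elasticsearch API.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, term, fuzziness=2, prefix_length=0):
    """True if term shares the first prefix_length characters with query
    and lies within fuzziness edits of it (hypothetical helper)."""
    if query[:prefix_length] != term[:prefix_length]:
        return False
    return edit_distance(query, term) <= fuzziness


print(fuzzy_match("軍團騎士", "軍團騎兵", fuzziness=2, prefix_length=3))
print(fuzzy_match("軍團騎士", "兵團騎士", fuzziness=2, prefix_length=3))
```

The second call is rejected even though only one character differs, because the difference falls inside the three-character prefix that must match exactly.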

suggest:

POST duowan/video/_search
{
  "suggest": {
    "my-suggest": {
      "text": "PVQ",
      "completion": {
        "field": "suggest",
        "fuzzy": {
          "fuzziness": 1
        }
      }
    }
  },
  "_source": ["title"]
}

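The name my-suggest is arbitrary, but field must not change: it has to name the completion field defined in the mapping (suggest).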

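The index.html file embeds a JS script that binds the input event: whenever the content of the search box changes, it sends a request to the server whose parameters are the input content and the selected type.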

$(function(){
    $('.searchInput').bind(' input propertychange ', function(){
        var searchText = $(this).val();
        var tmpHtml = "";
        $.ajax({
            cache: false,
            type: 'get', // fetch with the GET method
            dataType: 'json',
            // encode the typed text so characters such as & do not break the query string
            url: suggest_url + "?s=" + encodeURIComponent(searchText) + "&s_type=" + $(".searchItem.current").attr('data-type'),
            async: true,
            success: function(data) {
                for (var i = 0; i < data.length; i++){
                    tmpHtml += '<li><a href="' + search_url + '?q=' + data[i] + '">' + data[i] + '</a></li>';
                }
                $(".dataList").html("");
                $(".dataList").append(tmpHtml);
                if (data.length == 0){
                    $('.dataList').hide();
                } else {
                    $('.dataList').show();
                }
            }
        });
    });
})
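
The request the script sends is an ordinary GET with two query parameters, s and s_type. As a rough illustration of what ends up on the wire, the sketch below builds the same kind of URL in Python; build_suggest_url and the "/suggest/" base path are assumptions for the example (the page itself takes the base from its suggest_url variable).

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_suggest_url(base_url, text, s_type):
    """Append percent-encoded s and s_type parameters (hypothetical helper)."""
    return base_url + "?" + urlencode({"s": text, "s_type": s_type})


url = build_suggest_url("/suggest/", "PVQ fuzzy", "video")
print(url)

# Decoding the query string recovers the original values.
params = parse_qs(urlsplit(url).query)
print(params)
```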

Add to urls:

url(r'^suggest/$', TemplateView.as_view(template_name='index.html'), name='index')

Then copy the contents of the es-type module from the crawler project into the models of the Django project,
and edit the views file:

import json
from django.shortcuts import render
from django.views.generic.base import View
from search.models import DuowanType
from django.http import HttpResponse

# Create your views here.
# Inherit from View
class SearchSuggest(View):
    def get(self, request):
        key_words = request.GET.get('s', '')  # read the s parameter from the request; defaults to an empty string
        re_dates = []  # titles returned by the suggester
        if key_words:
            s = DuowanType.search()
            # Build the suggest query
            s = s.suggest('my_suggest', key_words, completion={
                "field": "suggest",
                "fuzzy": {
                    "fuzziness": 2
                },
                "size": 10
            })
            # Execute it and collect the results
            suggestions = s.execute_suggest()
            for match in suggestions.my_suggest[0].options:
                source = match._source
                re_dates.append(source['title'])
        # Return the result through HttpResponse, serializing the list as JSON
        return HttpResponse(json.dumps(re_dates), content_type='application/json')
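
The loop over suggestions.my_suggest[0].options walks a nested structure: each named suggestion holds a list of entries, and each entry holds a list of options carrying the matched document's _source. The sketch below does the same extraction on a raw dictionary. The sample_response is hand-written for illustration, not captured from a real cluster, and titles_from_suggest is a hypothetical helper.

```python
def titles_from_suggest(response, name="my_suggest"):
    """Collect unique titles from a completion-suggester response dict."""
    titles = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            title = option.get("_source", {}).get("title")
            if title and title not in titles:  # keep order, skip repeats
                titles.append(title)
    return titles


# Hand-written sample shaped like a completion-suggester response.
sample_response = {
    "suggest": {
        "my_suggest": [
            {
                "text": "PVQ",
                "options": [
                    {"text": "pvp", "_source": {"title": "PVP highlights"}},
                    {"text": "pve", "_source": {"title": "PVE guide"}},
                    {"text": "pvp", "_source": {"title": "PVP highlights"}},
                ],
            }
        ]
    }
}

print(titles_from_suggest(sample_response))
```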

In urls, change

url(r'^suggest/$', TemplateView.as_view(template_name='index.html'), name='index')
to

url(r'^suggest/$', SearchSuggest.as_view(), name='suggest')

Note that it must be SearchSuggest.as_view(), with the parentheses, not SearchSuggest.as_view; otherwise the following error is raised:
TypeError: as_view() takes 1 positional argument but 2 were given
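
The error makes sense once as_view is seen as a factory: it is called once, when the URL patterns are built, and returns the function that later receives each request. The sketch below reproduces the mistake with a tiny stand-in class; MiniView is hypothetical and uses a plain classmethod, which is a simplification of Django's real View.

```python
class MiniView:
    """Tiny stand-in for a class-based view (not Django's real View)."""

    @classmethod
    def as_view(cls, **initkwargs):
        # Called once in urls.py; returns the function that handles requests.
        def view(request):
            return cls().get(request)
        return view

    def get(self, request):
        return "handled " + request


good_view = MiniView.as_view()   # what url() should receive
print(good_view("GET /suggest/"))

bad_view = MiniView.as_view      # parentheses forgotten
try:
    bad_view("GET /suggest/")    # the request is passed as a positional argument
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Without the parentheses, the request itself is handed to as_view as a second positional argument after the class, which is exactly what the TypeError reports.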

2. Search functionality
In urls:

from search.views import SearchSuggest, SearchView
...
url(r'^search/$', SearchView.as_view(), name='search')

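Add a SearchView(View) to views:
It receives the search keywords and the page number from the request parameters,
creates a client connected to the Elasticsearch server, and runs the query with client.search, which accepts raw query bodies. It then reads the returned value, collects the results into a list, and finally passes them to the page with render.
Query time: record the time before and after client.search runs and subtract the two.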
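
The before-and-after timing can be isolated into a small helper. The sketch below is one possible variant, not the article's code: it uses time.perf_counter, a monotonic clock intended for measuring intervals, and fake_search is a hypothetical stand-in for client.search so that the example runs without a server.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_search(index, body):
    """Hypothetical stand-in for client.search."""
    time.sleep(0.01)  # pretend the query takes a moment
    return {"hits": {"total": 0, "hits": []}}


response, last_seconds = timed(fake_search, index="duowan", body={"query": {"match_all": {}}})
print(f"query took {last_seconds:.3f}s")
```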

from elasticsearch import Elasticsearch
from datetime import datetime
client = Elasticsearch(hosts='127.0.0.1')
.......
class SearchView(View):
    def get(self, request):
        key_words = request.GET.get('q', '')
        pagesize = 10
        page = request.GET.get('p', '')
        try:
            page = int(page)
        except ValueError:
            page = 1

        # client.search accepts the query body in its raw form
        body = {
            "query": {
                "multi_match": {
                    "query": key_words,
                    "fields": ["title", "author"]
                }
            },
            "from": (page-1)*pagesize,
            "size": pagesize,
            # Highlighted fragments come back in the highlight field of each hit
            "highlight": {
                # The HTML tags wrapped around each match are set here; the tag can carry any attributes you need
                "pre_tags": ["<span class='keyword'>"],
                "post_tags": ["</span>"],
                "fields": {
                    "title": {},
                    "content": {}
                }
            }
        }
        start_time = datetime.now()
        response = client.search(
            index="duowan",
            body=body
        )
        end_time = datetime.now()
        last_seconds = (end_time-start_time).total_seconds()
        # Total number of matches, regardless of paging
        total_nums = response['hits']['total']
        # Page count: round up when the last page is only partly filled
        if (total_nums % pagesize) > 0:
            page_nums = int(total_nums/pagesize)+1
        else:
            page_nums = int(total_nums/pagesize)
        # Collect the values into a list to hand to the HTML template
        hit_list = []
        for hit in response['hits']['hits']:
            hit_dict = {}
            # A hit has no highlight key when none of its fields produced a fragment
            if 'title' in hit.get('highlight', {}):
                hit_dict['title'] = hit['highlight']['title'][0]
            else:
                # To truncate instead: hit_dict['title'] = hit['_source']['title'][:100]
                hit_dict['title'] = hit['_source']['title']
            hit_dict['len'] = hit['_source']['len']
            hit_dict['tag'] = hit['_source']['tag']
            hit_dict['update_time'] = hit['_source']['update_time']
            hit_dict['author'] = hit['_source']['author']
            hit_dict['playnum_text'] = hit['_source']['playnum_text']
            hit_dict['url'] = hit['_source']['url']
            hit_list.append(hit_dict)
        return render(request, 'result.html', {'page': page,
                                               'total_nums': total_nums,
                                               'all_hits': hit_list,
                                               'key_words': key_words,
                                               'page_nums': page_nums,
                                               'last_seconds': last_seconds})
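
The paging arithmetic in the view (the from offset derived from the page number, and the page count derived from the total) can be checked in isolation. This is a minimal sketch under the same page size of 10; parse_page and paginate are hypothetical helper names, and falling back to page 1 for values below 1 is an added assumption.

```python
import math


def parse_page(raw, default=1):
    """Turn the p query parameter into a page number, falling back to default."""
    try:
        page = int(raw)
    except (TypeError, ValueError):
        return default
    return page if page >= 1 else default


def paginate(total, page, page_size=10):
    """Return (offset, page_count) for a 1-based page number."""
    offset = (page - 1) * page_size
    page_count = math.ceil(total / page_size)  # round up for a partial last page
    return offset, page_count


print(paginate(95, parse_page("3")))
print(paginate(100, parse_page("")))
```

With 95 hits there are 10 pages, and page 3 starts at offset 20; with exactly 100 hits there are still 10 pages, not 11.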

In the page:
Find the div of a result item and wrap it in {% for hit in all_hits %} <div>...</div> {% endfor %} to loop over all_hits, the list of query results passed in, filling the values into the page.

{% for hit in all_hits %}
<div class="resultItem">
    <div class="itemHead">
        <a href="{{ hit.url }}" target="_blank" class="title">{{ hit.title }}</a>
        <span class="divsion">-</span>
        <span class="fileType">
            <span class="label">Category:</span>
            <span class="value">{{ hit.tag }}</span>
        </span>
        <span class="dependValue">
            <span class="label">Plays:</span>
            <span class="value">{{ hit.playnum_text }}</span>
        </span>
    </div>
    <div class="itemBody">

    </div>
    <div class="itemFoot">
        <span class="info">
            <label>Site:</label>
            <span class="value">伯樂在線</span>
        </span>
        <span class="info">
            <label>Published:</label>
            <span class="value">{{ hit.update_time }}</span>
        </span>
    </div>
</div>
{% endfor %}

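Search history with JS:
Clicking the search button triggers add_search(), which reads the keyword, calls KillRepeat() to de-duplicate the search history, stores the de-duplicated array in the browser's localStorage, and then displays the search history.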

// Triggered when the search button is clicked
function add_search(){
    var val = $(".searchInput").val();
    if (val.length >= 2){
        // De-duplicate when the search button is clicked
        KillRepeat(val);
        // Store the de-duplicated array in the browser's localStorage
        localStorage.search = searchArr;
        // Then display the search history
        MapSearchArr();
    }

    window.location.href = search_url + '?q=' + val + "&s_type=" + $(".searchItem.current").attr('data-type')

}
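
The body of KillRepeat() is not shown, so the following is only a guess at the kind of logic involved, written in Python for consistency with the rest of the examples: move a repeated term to the front instead of storing it twice, ignore terms shorter than two characters (mirroring the val.length >= 2 check), and cap the list. The function name add_search_term and the limit of 5 are assumptions.

```python
def add_search_term(history, term, limit=5):
    """Return a new history list with term first and no duplicates."""
    term = term.strip()
    if len(term) < 2:                 # too short to record
        return list(history)
    deduped = [item for item in history if item != term]
    return ([term] + deduped)[:limit]


history = []
for query in ["django", "elasticsearch", "django", "x", "scrapy"]:
    history = add_search_term(history, query)

print(history)
```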
