Example of Integrating the Elasticsearch Search Engine with Django

  • 2021-06-28 09:21:35
  • OfStack

1. Background

When a user enters a keyword in the search box, we need to return relevant search results. One option is a fuzzy query using the SQL LIKE keyword, but LIKE is extremely inefficient. It is also inconvenient when the query has to cover multiple fields, and the results are poor because LIKE performs no word segmentation.
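To make the limitation concrete, here is a minimal sketch of a LIKE-based search over two columns, using an in-memory SQLite database. The sku table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate LIKE-based search.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sku (id INTEGER PRIMARY KEY, name TEXT, caption TEXT)")
conn.executemany(
    "INSERT INTO sku (name, caption) VALUES (?, ?)",
    [
        ("Huawei phone", "Flagship model"),
        ("Phone case", "Fits Huawei devices"),
        ("Laptop", "Lightweight"),
    ],
)


def like_search(keyword):
    """Return names of rows whose name or caption contains the keyword."""
    # The same pattern has to be repeated for every column we want to search.
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT name FROM sku WHERE name LIKE ? OR caption LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [name for (name,) in rows]


print(like_search("huawei"))        # matches on name or caption
print(like_search("phone huawei"))  # no match: the input is treated as one substring
```

The second call returns nothing because LIKE only matches the whole input as a single substring; it has no notion of separate keywords.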

Full Text Retrieval Scheme

Full-text retrieval means searching for keywords across any specified fields. Implementing it requires a search engine.

Search engine principles

When a search engine performs full-text retrieval, it first preprocesses the data in the database and builds a separate index structure. The index works like the lookup pages of a dictionary: it records the mapping between keywords and entries, together with the location of each entry. The search engine then answers a full-text query by looking up the keywords in the index to find where the matching data is actually stored.
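The idea of a keyword-to-entry index can be sketched in a few lines of plain Python. This is only an illustration of the principle, not how Elasticsearch or Lucene is implemented; the sample documents and function names are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical documents keyed by id.
documents = {
    1: "Huawei phone flagship model",
    2: "Phone case for Huawei devices",
    3: "Lightweight laptop",
}
index = build_inverted_index(documents)
print(sorted(search(index, "phone huawei")))  # [1, 2]
```

The index is built once in advance, so a query only needs dictionary lookups and set intersections instead of scanning every record.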

2. Introduction to Elasticsearch

Elasticsearch is a popular choice of search engine for full-text retrieval.

Elasticsearch is an open source search engine written in Java. It can store, search, and analyze huge amounts of data quickly. Wikipedia, Stack Overflow, GitHub, and others use it. Under the hood, Elasticsearch is built on the open source library Lucene. Lucene is only a library, however: to use it directly, you would have to write your own code to call its interfaces.

Word Segmentation

Search engines need to segment text into words when building an index on the data.

Word segmentation breaks a sentence up into multiple words, which serve as the keywords of the sentence.

By default, Elasticsearch does not segment Chinese text into meaningful words. It needs to be combined with the elasticsearch-analysis-ik plugin to achieve Chinese word segmentation.
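A small sketch shows why Chinese needs a dedicated segmenter. The naive tokenizer below is a hypothetical helper, not the analyzer Elasticsearch uses; it simply splits on non-word characters, which works for English because words are separated by spaces and punctuation.

```python
import re


def naive_tokenize(text):
    """Split text into lowercase tokens on non-word characters."""
    return re.findall(r"\w+", text.lower())


# English words are delimited, so each word becomes a keyword.
print(naive_tokenize("Huawei phone, flagship model"))
# ['huawei', 'phone', 'flagship', 'model']

# Chinese has no delimiters between words, so the whole phrase stays one token.
print(naive_tokenize("华为手机旗舰机型"))
# ['华为手机旗舰机型']
```

A dictionary-based segmenter such as the one provided by elasticsearch-analysis-ik is what turns a phrase like this into separate, searchable words.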

3. Integrate Elasticsearch

3.1. Introduction and installation of Haystack

Haystack is a framework for connecting Django to search engines; it acts as the bridge between the application and the search engine. We can call the Elasticsearch search engine from Django by using Haystack. Haystack supports different search backends (such as Elasticsearch, Whoosh, Solr, and so on) without changes to the application code.

Haystack Installation


$ pip install django-haystack
$ pip install elasticsearch==2.4.1

Registering the Haystack Application and Routes

Register Haystack in the Django settings file.


INSTALLED_APPS = [
    'haystack',  # Register full-text retrieval
]

Add a route for Haystack to the project's root URL configuration.


from django.conf.urls import include, url

urlpatterns = [url(r'^search/', include('haystack.urls')),]

Haystack Configuration

Configure Elasticsearch as the Haystack search backend in the settings file.


# Haystack
HAYSTACK_CONNECTIONS = {
 'default': {
  'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
  'URL': 'http://192.168.103.158:9200/', # IP address of the Elasticsearch server; 9200 is the default port
  'INDEX_NAME': 'serach_mall', # Name of the index created in Elasticsearch
 },
}

# Automatically update the index when data is added, modified, or deleted
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
# Number of search results per page
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 3

The HAYSTACK_SIGNAL_PROCESSOR configuration item ensures that when new data is created while Django is running, Haystack still has Elasticsearch index the new data in real time.

3.2 Haystack Indexing Data

1. Create Index Class

By creating an index class, you indicate which fields are indexed by the search engine, that is, the keywords of which fields can be used to retrieve data.

In this project, full-text retrieval is performed on the SKU model class, so a new search_indexes.py file is created in the application that contains the model class (goods) to hold the index class. Index classes must inherit from haystack.indexes.SearchIndex and haystack.indexes.Indexable.


from haystack import indexes

from .models import SKU


class SKUIndex(indexes.SearchIndex, indexes.Indexable):
    """Index class for the SKU model."""
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        """Return the model class to be indexed."""
        return SKU

    def index_queryset(self, using=None):
        """Return the queryset of records to be indexed."""
        return self.get_model().objects.filter(is_launched=True)

Notes on the SKUIndex index class:

Fields declared in SKUIndex can be queried through Haystack by the Elasticsearch search engine. The text field is declared with document=True, which marks it as the primary field for keyword queries. The index value of the text field can be composed of several model class fields; use_template=True indicates that the fields to use are specified later in a template.

2. Create text field index value template file

Create the template file used by the text field in the project's templates directory.

Specifically, it is defined in the file templates/search/indexes/goods/sku_text.txt, where goods is the application name and the sku in sku_text.txt is the lowercase name of the model class.


{{ object.id }}
{{ object.name }}
{{ object.caption }}

Template file description: keyword queries are run against the text field, and this template specifies that the id, name, and caption of the SKU make up the index value of that field.

3. Manually generate the initial index


$ python manage.py rebuild_index

The command above must be run once to build the index for the first time; after that, the index is updated automatically.

3.3 Full Text Retrieval Test

Prepare test form

Request method: GET
Request address: /search/
Request parameter: q

<div class="search_wrap fl">
 <form method="get" action="/search/" class="search_con">
  <input type="text" class="input_text fl" name="q" placeholder="Search for Goods">
  <input type="submit" class="input_btn fr" name="" value="Search">
 </form>
 ...
 ...
</div>
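Submitting the form above sends a GET request such as /search/?q=keyword, and the browser encodes the keyword automatically. When a search URL is built by hand instead, the keyword should be URL-encoded. The sketch below uses the standard library for that; build_search_url is a hypothetical helper, and the page parameter name is taken from the pager code later in this article.

```python
from urllib.parse import urlencode


def build_search_url(keyword, page=1):
    """Build a /search/ URL with the keyword safely encoded."""
    params = {"q": keyword}
    if page > 1:
        params["page"] = page
    return "/search/?" + urlencode(params)


print(build_search_url("huawei phone"))        # /search/?q=huawei+phone
print(build_search_url("tv & audio", page=2))  # /search/?q=tv+%26+audio&page=2
```

Encoding matters because characters such as & or spaces in the keyword would otherwise be misread as part of the query string structure.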

Then create a new search.html in the templates/search/ directory to receive and render the results of the full-text retrieval.

3.4 Rendering search results

The data returned by Haystack includes:

query: the search keywords
paginator: the paginator object
page: the page object for the current page (iterate over the objects in page to get each result object)
result.object: the SKU object for the result currently being iterated

<div class="main_wrap clearfix">
 <div class=" clearfix">
  <ul class="goods_type_list clearfix">
   {% for result in page %}
   <li>
    {# result.object is the SKU object #}
    <a href="/detail/{{ result.object.id }}/" rel="external nofollow"><img src="{{ result.object.default_image.url }}"></a>
    <h4><a href="/detail/{{ result.object.id }}/" rel="external nofollow">{{ result.object.name }}</a></h4>
    <div class="operate">
     <span class="price"> _ {{ result.object.price }}</span>
     <span>{{ result.object.comments }} reviews</span>
    </div>
   </li>
   {% empty %}
    <p> No items were found for your inquiry. </p>
   {% endfor %}
  </ul>
  <div class="pagenation">
   <div id="pagination" class="page"></div>
  </div>
 </div>
</div>

Here Haystack provides the Django view function for us, so we do not need to write it ourselves.

Search Page Pager


<div class="main_wrap clearfix">
 <div class=" clearfix">
  ......
  <div class="pagenation">
   <div id="pagination" class="page"></div>
  </div>
 </div>
</div>

<script type="text/javascript">
 $(function () {
  $('#pagination').pagination({
   currentPage: {{ page.number }},
   totalPage: {{ paginator.num_pages }},
   callback:function (current) {
    window.location.href = '/search/?q={{ query }}&page=' + current;
   }
  })
 });
</script>

The pager here is rendered with the jquery.pagination.js plugin, but a pagination component from another framework or a custom one can be used instead.
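The pager needs only two numbers from the template context: the current page number and the total number of pages. As a rough sketch of how those values relate to the page size configured earlier, the pure-Python function below slices a result list into pages. It is an illustration only, not Django's Paginator; in particular, it clamps out-of-range page numbers as a simplification, whereas Django's Paginator raises an exception for them.

```python
import math

RESULTS_PER_PAGE = 3  # mirrors HAYSTACK_SEARCH_RESULTS_PER_PAGE above


def paginate(results, page_number, per_page=RESULTS_PER_PAGE):
    """Return (items on the page, current page number, total pages)."""
    total_pages = max(1, math.ceil(len(results) / per_page))
    # Simplification: clamp the requested page into the valid range.
    page_number = min(max(page_number, 1), total_pages)
    start = (page_number - 1) * per_page
    return results[start:start + per_page], page_number, total_pages


items = list(range(1, 8))  # seven hypothetical search results
print(paginate(items, 1))  # ([1, 2, 3], 1, 3)
print(paginate(items, 3))  # ([7], 3, 3)
```

With seven results and three per page, the pager would show three pages, and the last page holds a single result.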

