
Aug. 23, 2018


How to use Elasticsearch with Django and REST Framework

Bartosz Płóciennik

In most software projects, there comes a time when an application needs to search through a vast collection of data. A standard relational database can do the job, but it tends to be slow and limited once the search requirements become more advanced. Luckily, Elasticsearch can help.

What is Elasticsearch?

Elasticsearch is a Lucene-based search engine that exposes an HTTP interface and stores schema-free JSON documents. It makes searching, filtering, and sorting easier, so you can return results to your clients quickly. The server also spends less time on each operation, which lowers the overall cost of the project.
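To make the "HTTP and JSON" part more concrete, here is a minimal sketch of how a search request could be assembled with nothing but the Python standard library. The host `http://localhost:9200` and the index name `articles` are assumptions made for illustration, and the request is only built, not sent.

```python
import json
from urllib.request import Request

# Assumed values for illustration only: adjust them to your own server and index.
ES_HOST = "http://localhost:9200"
INDEX = "articles"


def build_search_request(text):
    """Build (but do not send) a full-text search request on the title field."""
    body = {"query": {"match": {"title": text}}}
    return Request(
        url=f"{ES_HOST}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("django")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing such a request to `urllib.request.urlopen` would send it to a running server, which answers with a JSON document as well.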

Can Elasticsearch be used as a database? The answer is no - it wasn’t built for this purpose. It’s best used as an additional service in your project, next to PostgreSQL, MySQL, or another database.

Who uses Elasticsearch?

Plenty of the world’s biggest companies use Elasticsearch to provide search functionality for their users, which is good proof of the tool’s popularity.

First steps of setting up your Elasticsearch project

The first thing you need to do is clone the base project code and install `docker-compose`. If you don’t have Docker Compose, check out how to install it in the official Docker documentation.

Your goal is to create an endpoint that returns a list of articles from an Elasticsearch server. To make the interaction with Elasticsearch easier, I will work with ready-to-use Python packages:

  • elasticsearch-dsl,
  • django-elasticsearch-dsl,
  • django-elasticsearch-dsl-drf.

Haystack vs Elasticsearch DSL

Haystack is a great open-source tool that provides modular search for Django. Unfortunately, it doesn’t fully support the newest version of Elasticsearch. To overcome that, I will use elasticsearch-dsl, which was created by the original Elasticsearch development team and therefore has better support for new releases.

To achieve your goals faster, you will also use django-elasticsearch-dsl-drf. You don’t need any advanced solutions here - it’s efficient enough and offers a lot of basic components that are ready to use in your project.

How to run an Elasticsearch project?

Add the connection config in settings.py:

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'elasticsearch:9200'
    },
}

Django needs to know where your Elasticsearch server is, and this setting provides that information.
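If the server address differs between environments, one option is to read it from an environment variable. This is only a sketch: the variable name `ELASTICSEARCH_HOST` is an assumption and is not defined by the example repository.

```python
import os

# ELASTICSEARCH_HOST is a hypothetical environment variable; when it is not set,
# the address from the configuration above is used as the fallback.
ELASTICSEARCH_DSL = {
    'default': {
        'hosts': os.environ.get('ELASTICSEARCH_HOST', 'elasticsearch:9200')
    },
}
```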

You’ve downloaded the repository - now it’s time to run your project. You can do it with the `docker-compose up --build` command.

Look at the code and you will see that the project has one app, articles, with a single model, Article, which will be the basis of our work.

Using the manage.py command, apply the migrations to your database:

docker exec -it elastic_drf_example_django python manage.py migrate 

Create an articles/documents.py file. First, import the necessary modules and create the article index. At the same time, you can also set the number of shards and replicas:

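from elasticsearch_dsl import analyzer

from django_elasticsearch_dsl import DocType, Index, fields

from articles import models as articles_models

# Name of the Elasticsearch index that will hold the articles
article_index = Index('articles')
# One shard and no replicas are enough for a local, single-node setup
article_index.settings(
    number_of_shards=1,
    number_of_replicas=0
)

# Custom analyzer: strips HTML, then lowercases, removes stop words, and stems.
# Note: the "standard" token filter was removed in Elasticsearch 7.
html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["standard", "lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)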
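To get a feel for what this analyzer does to a piece of text, the sketch below imitates the same chain in plain Python. It is a rough approximation and not how Elasticsearch implements it: the stop-word list is a tiny made-up sample, the tokenizer is a simple regular expression, and the stemming step ("snowball") is left out entirely. The function name `approximate_html_strip` is hypothetical.

```python
import re
from html import unescape

# A tiny stop-word sample for illustration; the built-in English list is longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "with"}


def approximate_html_strip(text):
    """Roughly mimic the analyzer: strip HTML, tokenize, lowercase, drop stop words.

    Stemming (the "snowball" filter) is deliberately left out of this sketch.
    """
    without_tags = re.sub(r"<[^>]+>", " ", text)   # char_filter: html_strip (tags)
    decoded = unescape(without_tags)                # char_filter: html_strip (entities)
    tokens = re.findall(r"\w+", decoded)            # tokenizer: rough stand-in for "standard"
    lowered = [token.lower() for token in tokens]   # filter: lowercase
    return [token for token in lowered if token not in STOP_WORDS]  # filter: stop


print(approximate_html_strip("<p>Searching <b>the</b> Articles with Django</p>"))
# ['searching', 'articles', 'django']
```

This is why a search for "articles" can match a title stored as `<b>Articles</b>`: both the indexed text and the query go through the same normalization.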

After that, you can create your Article document. To do so, decorate the ArticleDocument class with the article index:

@article_index.doc_type
class ArticleDocument(DocType):
    """Article elasticsearch document"""

    id = fields.IntegerField(attr='id')
    title = fields.StringField(
        analyzer=html_strip,
        fields={
            'raw': fields.StringField(analyzer='keyword'),
        }
    )
    body = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    author = fields.IntegerField(attr='author_id')
    created = fields.DateField()
    modified = fields.DateField()
    pub_date = fields.DateField()

    class Meta:
        model = articles_models.Article 
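The `attr` argument is easy to misread: it names the model attribute to read from, while the attribute name on the document class becomes the field name in the index. The standalone sketch below imitates that mapping with a plain dataclass. `FakeArticle`, `FIELD_SOURCES`, and `to_document` are hypothetical names used only for illustration, and the date types are a guess at what the real model uses.

```python
from dataclasses import dataclass
from datetime import date

# Document field name -> model attribute it is read from.
# Only "author" differs, mirroring attr='author_id' in the document above.
FIELD_SOURCES = {
    "id": "id",
    "title": "title",
    "body": "body",
    "author": "author_id",
    "created": "created",
    "modified": "modified",
    "pub_date": "pub_date",
}


@dataclass
class FakeArticle:
    """Stand-in for the Django model, so the sketch runs without Django."""
    id: int
    title: str
    body: str
    author_id: int
    created: date
    modified: date
    pub_date: date


def to_document(article):
    """Build the dictionary that would be sent to the index for one article."""
    return {field: getattr(article, source) for field, source in FIELD_SOURCES.items()}


article = FakeArticle(1, "Hello", "<p>World</p>", 7,
                      date(2018, 8, 1), date(2018, 8, 2), date(2018, 8, 23))
document = to_document(article)
print(document["author"])  # 7
```

In other words, the indexed document contains an `author` key holding the author's id, and no `author_id` key.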

You’re halfway there - now it’s time to move on and create an articles/serializers.py file. Creating a serializer is similar to creating one for a normal DRF endpoint. The only difference is that in this case you need to provide the document.

from django_elasticsearch_dsl_drf.serializers import DocumentSerializer

from articles import documents as articles_documents
 
 
class ArticleDocumentSerializer(DocumentSerializer):
    class Meta:
        document = articles_documents.ArticleDocument
        fields = (
            'id',
            'title',
            'body',
            'author',
            'created',
            'modified',
            'pub_date',
        )   

At this point, you can create your endpoint in articles/views.py. Let’s start by adding imports: 

from django_elasticsearch_dsl_drf.constants import (
    LOOKUP_FILTER_RANGE,
    LOOKUP_QUERY_IN,
    LOOKUP_QUERY_GT,
    LOOKUP_QUERY_GTE,
    LOOKUP_QUERY_LT,
    LOOKUP_QUERY_LTE,
)
from django_elasticsearch_dsl_drf.filter_backends import (
    FilteringFilterBackend,
    OrderingFilterBackend,
    DefaultOrderingFilterBackend,
    SearchFilterBackend,
)
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet
 
from articles import documents as articles_documents
from articles import serializers as articles_serializers  

Below the imports, add your article endpoint:

class ArticleViewSet(DocumentViewSet):
    document = articles_documents.ArticleDocument
    serializer_class = articles_serializers.ArticleDocumentSerializer
 
    lookup_field = 'id'
    filter_backends = [
        FilteringFilterBackend,
        OrderingFilterBackend,
        DefaultOrderingFilterBackend,
        SearchFilterBackend,
    ]
 
    # Define search fields
    search_fields = (
        'title',
        'body',
    )
 
    # Filter fields
    filter_fields = {
        'id': {
            'field': 'id',
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_IN,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },
        'title': 'title.raw',
        'body': 'body.raw',
        'author': {
            'field': 'author',  # document field name; its value is read from author_id
            'lookups': [
                LOOKUP_QUERY_IN,
            ]
        },
        'created': 'created',
        'modified': 'modified',
        'pub_date': 'pub_date',
    }
 
    # Define ordering fields
    ordering_fields = {
        'id': 'id',
        'title': 'title.raw',
        'author': 'author',  # the indexed field is called "author"
        'created': 'created',
        'modified': 'modified',
        'pub_date': 'pub_date',
    }

    # Specify default ordering
    ordering = ('id', 'created',)   

In the code above, there are some attributes that will be necessary in your project:

  • search_fields - the fields that are searched together in a single request,
  • filter_fields - the fields that are available for filtering,
  • ordering_fields - the fields that are available for ordering,
  • ordering - the default order of the results.

In this code, I’ve also introduced id filter lookups: 

        'id': {
            'field': 'id',
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_IN,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },  

The id field supports a variety of lookups, and developers can define their own list to suit their project best. The lookups used here are:

  • LOOKUP_FILTER_RANGE - to match values within a given range,
  • LOOKUP_QUERY_IN - to match any value from a given list,
  • LOOKUP_QUERY_GT - to search for elements greater than the given value,
  • LOOKUP_QUERY_GTE - to search for elements greater than or equal to the given value,
  • LOOKUP_QUERY_LT - to search for elements less than the given value,
  • LOOKUP_QUERY_LTE - to search for elements less than or equal to the given value.
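To show what these lookups mean on the Elasticsearch side, the sketch below translates a lookup name and a value into the kind of query clause Elasticsearch understands (`range` and `terms`). It is a simplified illustration written for this article, not the code that django-elasticsearch-dsl-drf actually runs, and `lookup_to_clause` is a hypothetical helper.

```python
def lookup_to_clause(field, lookup, value):
    """Translate a lookup into an Elasticsearch query clause (simplified illustration)."""
    if lookup in ("gt", "gte", "lt", "lte"):
        # Open-ended comparison, e.g. id > 10
        return {"range": {field: {lookup: value}}}
    if lookup == "range":
        # Closed interval between a lower and an upper bound
        lower, upper = value
        return {"range": {field: {"gte": lower, "lte": upper}}}
    if lookup == "in":
        # Match any of the listed values
        return {"terms": {field: list(value)}}
    raise ValueError(f"Unsupported lookup: {lookup}")


print(lookup_to_clause("id", "gt", 10))
print(lookup_to_clause("id", "range", (1, 10)))
print(lookup_to_clause("id", "in", [1, 2, 3]))
```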

The final step is to add a router and URL to your project by creating articles/urls.py:

from rest_framework.routers import SimpleRouter

from articles import views
 

app_name = 'articles' 

router = SimpleRouter()
router.register(
    prefix=r'',
    base_name='articles',  # renamed to `basename` in DRF 3.9 and later
    viewset=views.ArticleViewSet
)
urlpatterns = router.urls
  

Now include these URLs in django_elastic_drf_example/urls.py:

path('articles/', include('articles.urls')),  

Good news! You’ve just finished writing the endpoint for the list of your articles! However, don’t start celebrating yet - you still need to create a user, some articles, and the index in your application.

How does Elasticsearch store data?

Let’s start by creating a new superuser with the Django manage.py createsuperuser command:

docker exec -it elastic_drf_example_django python manage.py createsuperuser

After that, open the admin panel and log in with the credentials you set in the previous command. Now you can create new articles:

Django admin panel - list of articles


The last step is to create the index and populate it with the articles’ data, so that searches are handled by the Elasticsearch server instead of the backend side of the project:

docker exec -it django_elastic_drf_example_django python manage.py search_index --rebuild

What can Elasticsearch do in your project? 

With the project finished, you can now list, search, filter, and order your articles through the new endpoint.

Those are, of course, only examples of the possible searches - check out the complete list in the django-elasticsearch-dsl-drf documentation.
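As a sketch of what such requests could look like, the snippet below assembles a few URLs for the articles endpoint. The base address `http://localhost:8000/articles/` is an assumption about where the project runs locally, and the query parameters follow the syntax described in the package documentation as I understand it (for example, `__` as the separator for multiple values). `articles_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed local address of the endpoint; adjust the host and port to your setup.
BASE_URL = "http://localhost:8000/articles/"


def articles_url(**params):
    """Return the endpoint URL with the given query parameters appended."""
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


examples = {
    "list all articles": articles_url(),
    "full-text search": articles_url(search="django"),
    "ids greater than 10": articles_url(id__gt=10),
    "ids from a list": articles_url(id__in="1__2__3"),
    "ids within a range": articles_url(id__range="1__10"),
    "newest first": articles_url(ordering="-created"),
}

for description, url in examples.items():
    print(f"{description}: {url}")
```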

What are the key features of django-elasticsearch-dsl-drf?

Django-elasticsearch-dsl-drf provides plenty of features, such as:

  • suggestions - suggests similar-looking terms based on the text provided in the query,
  • highlighting - returns highlighted snippets in your search results, so you can show users the best match for their queries.
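Highlighting is ultimately a feature of the Elasticsearch search API. A minimal sketch of a request body that asks for highlighted snippets could look like the following. It only builds the JSON, it assumes the `title` field from the document defined earlier, and `highlight_query` is a hypothetical helper rather than part of any of the packages used here.

```python
import json


def highlight_query(text, field="title"):
    """Build a search body that matches `text` and asks for highlighted snippets."""
    return {
        "query": {"match": {field: text}},
        # An empty options object means "use the default highlighter settings"
        "highlight": {"fields": {field: {}}},
    }


print(json.dumps(highlight_query("django"), indent=2))
```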

It is possible to build this project without django-elasticsearch-dsl-drf and use only django-elasticsearch-dsl. However, the former works great for basic usage and saves you a lot of time.

Why should you use Elasticsearch?

In this article, I’ve shown you how to create a simple endpoint that lists articles. Thanks to Elasticsearch, you were also able to add search functionality. As I wrote at the beginning, Elasticsearch provides a lot of full-text search features and is great for bigger applications with many users, where response time is of great importance.

Thanks for reading - don’t hesitate to comment and share your views on working with Django and Elasticsearch! 
