A Quick Guide to Elasticsearch

Stretch your understanding of Elasticsearch with Suvda Myagmar of Expect Labs! The video above will give you all the basics to start using this valuable tool.

TRANSCRIPT:

Hi, I’m Suvda Myagmar. In this talk, I’ll give an overview of ElasticSearch, and briefly describe how to configure and scale it, how to index your documents and optimize your search.

What is ElasticSearch? It’s an open-source search engine that lets you index your documents and perform full-text search. It’s built on top of Apache Lucene. ElasticSearch can be used not only as a search engine, but also as a NoSQL store for your data.

Here are some features that make ElasticSearch so great:

You can interface with it through a REST API, which makes it easy to integrate with any backend system in any language.
It accepts documents for indexing in JSON format.
It’s schema-free, meaning it can automatically derive document mappings at indexing time.
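
As a minimal sketch of the REST and JSON points above, the snippet below only builds the pieces of an indexing request and does not send it. The index name "news", the sample document, and the helper build_index_request are hypothetical, and the "/{index}/_doc/{id}" path is an assumption based on recent versions; older versions put a type name in the path instead.

```python
import json

def build_index_request(index, doc_id, doc):
    """Return (method, path, body) for indexing one JSON document.

    Assumption: the "/{index}/_doc/{id}" path layout of recent versions.
    """
    path = f"/{index}/_doc/{doc_id}"
    body = json.dumps(doc)  # documents are submitted as JSON
    return "PUT", path, body

# Hypothetical document. No mapping is declared up front, so the field
# types would be derived by the server when the document is indexed.
article = {"title": "A Quick Guide to Elasticsearch", "views": 120, "published": True}

method, path, body = build_index_request("news", 1, article)
print(method, path)
print(body)
```

Any HTTP client in any language could send the resulting method, path, and body to a running node.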

For search, ElasticSearch supports such nifty features as real-time search, faceted search, query suggest, filtered query, highlighting, and custom score functions.

It is fairly easy to install, configure, and scale your own ElasticSearch server. To scale an ElasticSearch cluster, you simply need to launch additional nodes running the server instance; as long as these nodes are configured with the same cluster name, they can automatically discover each other and join the cluster.

You can create several indices for your data. For example, an index for news content and a separate index for a product catalog. Each index is stored in multiple horizontal partitions called shards. Sharding improves performance because the shards can be processed in parallel. For high availability, it’s recommended to configure replicas for index shards.
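
Shard and replica counts are usually set when an index is created. The sketch below builds the request for that step without sending it; the helper name, the index name "news", and the counts are illustrative assumptions, while number_of_shards and number_of_replicas are the index settings that control them.

```python
import json

def build_create_index_request(index, shards, replicas):
    """Return (method, path, body) for creating an index with explicit sharding."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    body = {
        "settings": {
            "number_of_shards": shards,      # primary shards the index is split into
            "number_of_replicas": replicas,  # extra copies of each primary shard
        }
    }
    return "PUT", f"/{index}", json.dumps(body)

# Hypothetical values: 3 primary shards, each with 1 replica.
method, path, body = build_create_index_request("news", shards=3, replicas=1)
print(method, path)
print(body)
```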

In order to achieve good search quality, it’s important that your document mappings are optimal. Mapping can be either explicitly set via the REST API or it can be implicitly derived by ElasticSearch while indexing documents. Besides supporting base types like string, integer, and Boolean, ElasticSearch also supports types like geopoint, geoshape, attachment, IP address, nested objects, and parent-child documents.
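
To make the idea of an explicit mapping concrete, here is a hedged sketch of a mapping body covering a few of the types mentioned above. The field names are hypothetical. The exact layout depends on the version: older versions nest the properties under a type name and call the text type "string", while newer ones use "text" or "keyword", so the string type is left as a parameter.

```python
import json

def build_mapping(string_type="text"):
    """Return a mapping body for a hypothetical news document.

    Assumption: the type-less "mappings" -> "properties" layout of
    recent versions. Pass string_type="string" for the older naming.
    """
    properties = {
        "title": {"type": string_type},     # full-text field
        "views": {"type": "integer"},
        "published": {"type": "boolean"},
        "location": {"type": "geo_point"},  # latitude/longitude pair
        "client_ip": {"type": "ip"},
    }
    return {"mappings": {"properties": properties}}

mapping = build_mapping()
print(json.dumps(mapping, indent=2))
```

Setting the mapping explicitly avoids surprises from implicit derivation, for example a coordinate pair being indexed as plain numbers instead of a geopoint.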

To speed up your operations, you can use bulk requests and multi-get (_mget) operations. Routing stores related documents in the same shard, and it also speeds up data retrieval because ElasticSearch can route a request to only the relevant shards.
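
A bulk request body is newline-delimited JSON: one action line followed by one document line per item, ending with a newline. The sketch below builds such a body and attaches a routing value to each action so that documents sharing it land in the same shard. The helper, the index name, and the sample comments are hypothetical, and the "routing" key is an assumption based on recent versions (older versions spell it "_routing").

```python
import json

def build_bulk_body(index, docs, routing_field):
    """Build a newline-delimited bulk body that indexes docs with routing."""
    lines = []
    for doc in docs:
        action = {
            "index": {
                "_index": index,
                "_id": str(doc["id"]),
                # Documents with the same routing value go to the same shard.
                "routing": str(doc[routing_field]),
            }
        }
        lines.append(json.dumps(action))  # action metadata line
        lines.append(json.dumps(doc))     # document source line
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)

# Hypothetical comments, routed by author so each author's comments stay together.
comments = [
    {"id": 1, "author": "alice", "text": "Great talk"},
    {"id": 2, "author": "alice", "text": "Thanks for the overview"},
]
bulk_body = build_bulk_body("comments", comments, routing_field="author")
print(bulk_body, end="")
```

The body would be sent in a single POST to the _bulk endpoint, replacing one HTTP round trip per document.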

Filtered queries can be used to speed up complex search queries, especially those that use custom score functions. The filter is applied first to narrow down on relevant documents, and then the text matching and custom ranking are applied on the filtered documents.
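
The filter-then-score pattern can be sketched as a search body. This is an illustration under assumptions, not the exact query from the talk: it uses a bool query with a filter clause, which is how newer versions express what older versions call a filtered query, wrapped in function_score with field_value_factor as one example of a custom score function. The field names "title", "category", and "popularity" are hypothetical.

```python
import json

def build_filtered_search(text, category, boost_field):
    """Return a search body that filters first, then matches and re-scores."""
    filtered = {
        "bool": {
            # Text matching contributes to the relevance score.
            "must": [{"match": {"title": text}}],
            # The filter only narrows the candidate set and does not affect the score.
            "filter": [{"term": {"category": category}}],
        }
    }
    return {
        "query": {
            "function_score": {
                "query": filtered,
                # Custom ranking: scale the score by a numeric document field.
                "field_value_factor": {"field": boost_field},
            }
        }
    }

search_body = build_filtered_search("elasticsearch", "news", "popularity")
print(json.dumps(search_body, indent=2))
```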

And this was my quick guide to ElasticSearch. I hope it’s helpful.
