Elasticsearch — Ingest Pipeline — 2

Haydar Külekci
Aug 2, 2022

I previously wrote an article about changing the index name according to the timestamp field of the document with an ingest pipeline.

Thanks to Philipp “xeraa” Krenn, who shared it on Twitter. Through this tweet, I happily got some feedback from Luca and Shaunak, which made me do more research on the subject.

As a reminder, that example assumes monthly indices with monthly index names. When you start from scratch, the indices are created automatically as documents are indexed. At this point, I need to touch on a few subjects:

Date index name processor

In the previous article, I used a date processor to parse the date and store it in a temporary field. Then I used this temporary field as the index name suffix and removed it from the document. Shaunak Kashyap suggested using the “Date Index Name” processor instead of chaining multiple processors. We can use this processor as below:

PUT _ingest/pipeline/change_index_according_to_timestamp
{
  "description": "change index name according to timestamp",
  "processors": [
    {
      "date_index_name": {
        "field": "@timestamp",
        "index_name_prefix": "{{_index}}.",
        "date_formats": ["ISO8601"],
        "date_rounding": "M",
        "index_name_format": "yyyy.MM"
      }
    }
  ]
}

You can check the documentation for more information about this processor.
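
Before indexing real data, the pipeline can be tried out with the simulate API. The request below is only a minimal sketch: the sample document, its timestamp value, and the title field are made up for illustration, and book_events is assumed to be the index name used in the request.

```
POST _ingest/pipeline/change_index_according_to_timestamp/_simulate
{
  "docs": [
    {
      "_index": "book_events",
      "_source": {
        "@timestamp": "2022-08-02T10:15:00Z",
        "title": "hypothetical sample document"
      }
    }
  ]
}
```

In the response, the _index of the document should no longer be book_events but a date math expression built from the prefix and the rounded timestamp, which Elasticsearch resolves to a monthly index name such as book_events.2022.08 at index time.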

Index Template

I did not mention index templates in the previous article. We need to make sure all the indices have the same mapping to avoid errors while searching. Maybe we can dig deeper into this subject in another article, but for now, I just want to share what I used to keep my mappings consistent.

PUT _index_template/book_events-index-template
{
  "index_patterns": ["book_events.*.*"],
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "change_index_according_to_timestamp"
      }
    },
    "aliases": {
      "book_events": {}
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}

I did not define any special mapping here; this is just an example that ensures the @timestamp field is mapped as a date. You can add any other fields to make sure they are mapped exactly as you want. I also put book_events as an alias in the template, which means Elasticsearch attaches this alias to every index created from the template. Finally, I set default_pipeline under template.settings.index. With this setting, the change_index_according_to_timestamp pipeline runs for every index request to a matching index, unless the request specifies a pipeline of its own. See the documentation for more details.
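
To see what a new monthly index would get from this template without actually creating it, the simulate index API can help. This is a small sketch that assumes the template above is already stored; book_events.2022.08 is just an example name that matches the book_events.*.* pattern.

```
POST _index_template/_simulate_index/book_events.2022.08
```

The response lists the settings, mappings, and aliases that would be applied, so you can confirm that default_pipeline, the book_events alias, and the date mapping for @timestamp are all there.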

Index Lifecycle Management

In the tweet, Luca said, “Better to clarify this would not work with ILM. Or better, it will make ILM useless.” I had not mentioned ILM in the article. In fact, I did not even think about it at first, because I only use this approach to separate indices, and I take daily snapshots here. Nothing more. But he is right. Creating indices automatically can have consequences, and we need to discuss them. For example, the number of indices can grow massively over time if you don't have control over the data. This solution can also make the ILM feature useless.

I know ILM is not only about rolling over indices, but rollover is one of its best features. Rollover does not split an index by date alone; it can also split indices by size or by number of documents. In our example, we are already splitting the indices by month, so we are doing the rollover ourselves. Instead of using our ingest pipeline, we could also use rollover with aliases to split the index. But the mindset will be a little bit different.
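
For comparison, a rollover-based setup could start from an ILM policy like the one below. This is only a sketch of the alternative, not something I used in the original example: the policy name book_events_rollover_policy and all threshold values are hypothetical.

```
PUT _ilm/policy/book_events_rollover_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_primary_shard_size": "50gb",
            "max_docs": 100000000
          }
        }
      }
    }
  }
}
```

With such a policy, the index rolls over when any one of the conditions is met, so an index is split by age, size, or document count instead of by the timestamp of each document. The policy would be attached to the indices through the index.lifecycle.name and index.lifecycle.rollover_alias settings, and writes would go through a write alias rather than through the ingest pipeline above.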

Anyhow, as I said before, there are different solutions to the same problem. I won't elaborate further. I just wanted to mention these subjects to enrich my previous article.

Thanks.

