An example of bulk importing data into Elasticsearch with Python

  • 2020-10-23 20:10:57
  • OfStack

Elasticsearch (ES) was introduced in earlier posts and exposes many interfaces. This article shows how to bulk import data into it with Python. The official ES website has extensive documentation, so with careful reading and some help from a search engine it should not be hard to use.

Here is the code first:

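# Python 2 code: the str.decode() calls below do not exist on Python 3 strings.
from elasticsearch import Elasticsearch
from elasticsearch import helpers

es = Elasticsearch()  # no arguments: older clients connect to a local node
actions = []

f = open('index.txt')  # UTF-8 file, one space-separated record per line
i = 1  # running document id (assumed; this part of the listing was lost)
for line in f:
    line = line.strip().split(' ')
    action = {
        "_index": "image",  # index name is an assumption; use your own
        "_id": i,
        "_source": {
            u"picture name": line[0].decode('utf8'),
            u"source": line[1].decode('utf8'),
            u"authority": line[2].decode('utf8'),
            u"size": line[3].decode('utf8'),
            u"quality": line[4].decode('utf8'),
            u"category": line[5].decode('utf8'),
            u"model": line[6].decode('utf8'),
            u"country": line[7].decode('utf8'),
            u"collector": line[8].decode('utf8'),
            u"department": line[9].decode('utf8'),
            u"keywords": line[10].decode('utf8'),
            u"access permissions": line[11].decode('utf8')
        }
    }
    i += 1
    actions.append(action)
    # Send a batch once enough actions have piled up (500 is an assumed size).
    if len(actions) == 500:
        helpers.bulk(es, actions)
        del actions[0:len(actions)]

# Send whatever is left over after the loop.
if len(actions) > 0:
    helpers.bulk(es, actions)
f.close()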
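The batching pattern in the listing (collect actions, send them in groups, then send the remainder) can be sketched on its own in Python 3. This is only an illustration under assumptions: the names FIELDS, build_actions and send_in_batches, the index name "image", and the batch size of 500 are all hypothetical, and the sender is passed in as a function so the sketch runs without a cluster.

```python
# Hypothetical field names, in the column order of the input file.
FIELDS = [
    "picture name", "source", "authority", "size", "quality", "category",
    "model", "country", "collector", "department", "keywords",
    "access permissions",
]


def build_actions(lines, index_name="image"):
    """Yield one bulk action per well-formed line (index name is assumed)."""
    for line_no, line in enumerate(lines, start=1):
        values = line.strip().split(" ")
        if len(values) < len(FIELDS):
            continue  # skip short or empty lines instead of raising IndexError
        yield {
            "_index": index_name,
            "_id": line_no,  # the line number doubles as the document id
            "_source": dict(zip(FIELDS, values)),
        }


def send_in_batches(actions, send, batch_size=500):
    """Call send(batch) for every full batch, then once for the remainder."""
    batch = []
    total = 0
    for action in actions:
        batch.append(action)
        if len(batch) == batch_size:
            send(batch)
            total += len(batch)
            batch = []  # start a fresh list so the sent batch is not mutated
    if batch:
        send(batch)
        total += len(batch)
    return total
```

With a real client, the sender could be something like lambda batch: helpers.bulk(es, batch), and lines could be a file opened with encoding="utf-8".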

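First of all, index.txt is UTF-8 encoded, so in Python 2 each field has to be converted to a unicode object with decode('utf8'), and field names such as "picture name" need the u prefix; otherwise ES reports an error. In Python 3, opening the file with encoding='utf-8' already yields unicode strings, so neither the decode calls nor the u prefix are needed.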
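As a rough illustration of the encoding point, here is a minimal Python 3 sketch. The sample file contents and the helper name read_records are made up for the demonstration; the real index.txt is not shown in this article.

```python
import os
import tempfile


def read_records(path):
    """Read a UTF-8 text file and split each non-empty line on single spaces."""
    with open(path, encoding="utf-8") as f:
        return [line.strip().split(" ") for line in f if line.strip()]


# Write a small UTF-8 sample file (made-up content standing in for index.txt).
sample = "猫.jpg 新华社\n狗.jpg 路透社\n"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    sample_path = tmp.name

# Reading in binary mode yields bytes, which must be decoded explicitly.
with open(sample_path, "rb") as f:
    raw = f.read()
decoded = raw.decode("utf-8")

# Reading in text mode with an explicit encoding does the decoding for us.
records = read_records(sample_path)
os.remove(sample_path)
```

Either way, the values that end up in the documents are unicode text rather than raw bytes, which is what the decode('utf8') calls achieve in the Python 2 listing.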

The import is quite fast: more than 2,000 records per second.
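To check a figure like this on your own data, one option is to time the batches as they are sent. The sketch below is only an assumption about how such a measurement could be done; timed_import and fake_send are hypothetical names, and fake_send merely sleeps in place of a real bulk request, so the rate it prints says nothing about ES itself.

```python
import time


def timed_import(batches, send):
    """Send each batch; return (record_count, seconds, records_per_second)."""
    start = time.perf_counter()
    count = 0
    for batch in batches:
        send(batch)
        count += len(batch)
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


def fake_send(batch):
    """Stand-in for a real bulk request: just wait a little."""
    time.sleep(0.01)


# Three made-up batches of 500 placeholder actions each.
demo_batches = [[{"_id": n} for n in range(500)] for _ in range(3)]
count, elapsed, rate = timed_import(demo_batches, fake_send)
print("%d records in %.3f s (%.0f records/s)" % (count, elapsed, rate))
```

Replacing fake_send with a function that calls helpers.bulk(es, batch) would give the throughput of an actual import.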
