Spring Batch framework

  • 2020-06-19 10:15:23
  • OfStack

An introduction to the Spring Batch framework

Batch processing is an important part of most IT projects. It handles the massive volumes of data in a business system and can perform complex data analysis and processing automatically and efficiently, without human intervention. A batch job periodically reads data in bulk, applies the corresponding business processing, and archives the results. Its defining characteristics are automatic execution, large data volumes, and timed execution. Logically, the whole batch process can be divided into three phases: reading data, processing data, and writing data.

Spring Batch abstracts these characteristics of batch processing: it models a batch job as a job made up of steps, and it divides the work of a step into data reading, data processing, and data writing.
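The read, process, write division can be illustrated with a minimal plain-Java sketch. It does not use any Spring Batch classes: ChunkLoopSketch, its run method, and the chunkSize parameter are hypothetical names chosen for this example, and the reader, processor, and writer are modeled with java.util types. Items are read and processed one at a time and handed to the writer in chunks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-in for a chunk-oriented step; not a Spring Batch class.
final class ChunkLoopSketch {

    // Reads items one at a time, transforms each, and writes them in chunks.
    // Assumes chunkSize >= 1. Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        int chunksWritten = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));   // read, then process
            if (chunk.size() == chunkSize) {
                writer.accept(List.copyOf(chunk));       // write a full chunk
                chunk.clear();
                chunksWritten++;
            }
        }
        if (!chunk.isEmpty()) {                          // write the final partial chunk
            writer.accept(List.copyOf(chunk));
            chunksWritten++;
        }
        return chunksWritten;
    }
}
```

With five input items and a chunk size of 2, run hands the writer three chunks, the last one holding a single item.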

It divides exception handling into three mechanisms: skip, retry, and restart. For scaling, it supports multithreaded steps, parallel steps, remote chunking, and partitioning.
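How retry and skip can interact is sketched below in plain Java. This is an illustration of the idea only, not Spring Batch's implementation: SkipRetrySketch, Result, maxAttempts, and skipLimit are hypothetical names. Each item is retried up to maxAttempts times; if it still fails it is skipped, and the run fails once more than skipLimit items would be skipped. Restart is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of retry-then-skip; not Spring Batch's implementation.
final class SkipRetrySketch {

    // Holds the successfully processed outputs and the inputs that were skipped.
    static final class Result<I, O> {
        final List<O> processed = new ArrayList<>();
        final List<I> skipped = new ArrayList<>();
    }

    static <I, O> Result<I, O> process(List<I> items, Function<I, O> processor,
                                       int maxAttempts, int skipLimit) {
        Result<I, O> result = new Result<>();
        for (I item : items) {
            RuntimeException last = null;
            boolean done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    result.processed.add(processor.apply(item));
                    done = true;
                } catch (RuntimeException e) {
                    last = e;                  // retry until attempts are exhausted
                }
            }
            if (!done) {
                if (result.skipped.size() >= skipLimit) {
                    throw new IllegalStateException("skip limit exceeded", last);
                }
                result.skipped.add(item);      // give up on this item and move on
            }
        }
        return result;
    }
}
```

For example, parsing the strings "1", "x", "3" as integers with two attempts and a skip limit of 1 yields the processed values 1 and 3 and the skipped item "x".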

Spring Batch is not a scheduling framework. It is concerned only with running batch tasks and provides no scheduling function of its own, so it has to be combined with a scheduling framework to launch jobs on a timetable. Quartz is a frequently used scheduling framework that works well with Spring Batch for this purpose.

The Spring Batch architecture is divided into three layers: the infrastructure layer, the core layer, and the application layer. The application layer contains all batch jobs. The core layer mainly provides JobLauncher, Job, and Step. The infrastructure layer mainly provides common reading (ItemReader), writing (ItemWriter), and service components (for example RetryTemplate, the retry template, and RepeatTemplate, the repeat template).

The three-tier architecture of Spring Batch lets the framework be extended at each level while avoiding interference between levels.

Introduction to job

A batch job consists of a set of steps, and the job element is the top-level element of the configuration file. Each job has its own name and can define the order in which its steps are executed and whether the job can be restarted. When a job runs, a job instance (JobInstance) and a job execution (JobExecution) are created. A job instance represents one logical run of the job, identified by the job name and its identifying job parameters, while a job execution records the data and status information of a single attempt to run that instance. One job can correspond to multiple job instances, and one job instance can correspond to multiple job executions.
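The one-to-many relationships between job, job instance, and job execution can be modeled with a small in-memory sketch. This is not the Spring Batch API: JobMetadataSketch, Execution, Status, and the method names are hypothetical, and the sketch assumes that an instance is identified by the job name plus its parameters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of job metadata; not the Spring Batch JobRepository.
final class JobMetadataSketch {

    enum Status { COMPLETED, FAILED }

    // One attempt to run a job instance.
    static final class Execution {
        final long id;
        final Status status;

        Execution(long id, Status status) {
            this.id = id;
            this.status = status;
        }
    }

    // Key = (job name, parameters); value = every execution of that instance.
    private final Map<Map.Entry<String, Map<String, String>>, List<Execution>> executionsByInstance =
            new HashMap<>();
    private long nextExecutionId = 1;

    private static Map.Entry<String, Map<String, String>> key(String jobName,
                                                               Map<String, String> parameters) {
        return Map.entry(jobName, Map.copyOf(parameters));
    }

    // Records a new execution, creating the instance the first time it is seen.
    Execution record(String jobName, Map<String, String> parameters, Status status) {
        Execution execution = new Execution(nextExecutionId++, status);
        executionsByInstance.computeIfAbsent(key(jobName, parameters), k -> new ArrayList<>())
                .add(execution);
        return execution;
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    List<Execution> executionsOf(String jobName, Map<String, String> parameters) {
        return List.copyOf(executionsByInstance.getOrDefault(key(jobName, parameters), List.of()));
    }
}
```

Recording a failed run and then a successful run with the same parameters produces one instance with two executions; running the same job with different parameters produces a second instance.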

The main configuration attributes of a job are id (the unique identifier of the job), job-repository (the job repository to use), incrementer (the job parameters incrementer), restartable (whether the job can be restarted), parent (the parent job definition to inherit from), and abstract (whether the job definition is abstract).

Introduction to step

A step represents one complete phase of a job. A job can consist of one or more steps, and the steps carry out the main business logic of a batch run. Each time a step runs, a step execution (StepExecution) is created. If a job fails and is later re-executed, a new step execution is created for each step that runs again.

A step can be configured with a tasklet, a partition, a job, or a flow.

A chunk-oriented step is mainly configured with an ItemReader, an ItemProcessor, and an ItemWriter, which together implement the business logic of the batch process.

Introduction to job repository

The job repository is mainly used to store the metadata produced while a job runs (job instances, job executions, job parameters, step executions, execution contexts, and so on).
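One reason to persist an execution context is restartability: a failed run can resume from the last recorded position instead of starting over. The plain-Java sketch below illustrates the idea under simplifying assumptions. RestartSketch, runStep, and the key name items.processed are hypothetical, and an ordinary Map stands in for the persisted execution context.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical restart illustration; the Map stands in for a persisted execution context.
final class RestartSketch {

    static final String POSITION_KEY = "items.processed"; // made-up key name

    // Processes items starting at the saved position and saves progress after each one.
    static void runStep(List<String> items,
                        Map<String, Integer> executionContext,
                        Consumer<String> handler) {
        int start = executionContext.getOrDefault(POSITION_KEY, 0);
        for (int i = start; i < items.size(); i++) {
            handler.accept(items.get(i));                // may throw and abort this attempt
            executionContext.put(POSITION_KEY, i + 1);   // checkpoint after each success
        }
    }
}
```

If the handler throws on the third of four items, the context holds position 2, and a second call with the same context processes only the third and fourth items.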

Spring Batch manages this metadata in nine tables, three of which (those with the suffix SEQ) are used to allocate primary keys. The nine tables are:

BATCH_JOB_INSTANCE: Job instance table

BATCH_JOB_EXECUTION: Job execution table

BATCH_JOB_EXECUTION_PARAMS: Job parameter table

BATCH_STEP_EXECUTION: Step execution table

BATCH_JOB_EXECUTION_CONTEXT: Job execution context table

BATCH_STEP_EXECUTION_CONTEXT: Step execution context table

BATCH_JOB_EXECUTION_SEQ: Job execution sequence table

BATCH_STEP_EXECUTION_SEQ: Step execution sequence table

BATCH_JOB_SEQ: Job sequence table

Introduction to ItemReader

An ItemReader handles reading from a resource within a step. Spring Batch provides a large number of ready-to-use reader components, which make it quick to develop and build batch applications. The framework also provides reusable and extensible components that developers can use for custom implementations.

ListItemReader: Reads items from a List, returning each item once.

ItemReaderAdapter: ItemReader adapter that reuses an existing read operation.

FlatFileItemReader: Reads flat files.

StaxEventItemReader: Reads XML files.

JdbcCursorItemReader: Reads from a database using a JDBC cursor.

HibernateCursorItemReader: Reads from a database using a Hibernate cursor.

StoredProcedureItemReader: Reads from a database using a stored procedure.

IbatisPagingItemReader: Reads from a database page by page using iBATIS.

JpaPagingItemReader: Reads from a database page by page using JPA.

JdbcPagingItemReader: Reads from a database page by page using JDBC.

HibernatePagingItemReader: Reads from a database page by page using Hibernate.

JmsItemReader: Reads from a JMS queue.

IteratorItemReader: Reads items from an Iterator.

MultiResourceItemReader: Reads from multiple files.

MongoItemReader: Reads from MongoDB, a distributed document database.

Neo4jItemReader: Reads from the graph database Neo4j.

ResourcesItemReader: Reads a batch of resources, returning each resource as an item.

AmqpItemReader: Reads from an AMQP queue.

RepositoryItemReader: Reads through a Spring Data repository.
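The readers listed above share one contract: read returns the next item, or null when the input is exhausted. The sketch below models that contract in plain Java with hypothetical types (ReaderSketch, SimpleReader, IteratorReader, ChainedReader); it does not use the Spring Batch ItemReader interface itself. IteratorReader is similar in spirit to an iterator-backed reader, and ChainedReader to a reader that drains several sources in turn.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical types modeling the "return null when exhausted" reader contract.
final class ReaderSketch {

    interface SimpleReader<T> {
        T read();
    }

    // Reads from any Iterable, one item per call.
    static final class IteratorReader<T> implements SimpleReader<T> {
        private final Iterator<T> iterator;

        IteratorReader(Iterable<T> source) {
            this.iterator = source.iterator();
        }

        @Override
        public T read() {
            return iterator.hasNext() ? iterator.next() : null; // null signals end of input
        }
    }

    // Drains several readers in order, moving on when one is exhausted.
    static final class ChainedReader<T> implements SimpleReader<T> {
        private final Iterator<SimpleReader<T>> delegates;
        private SimpleReader<T> current;

        ChainedReader(List<SimpleReader<T>> readers) {
            this.delegates = readers.iterator();
        }

        @Override
        public T read() {
            while (true) {
                if (current == null) {
                    if (!delegates.hasNext()) {
                        return null;        // every delegate is exhausted
                    }
                    current = delegates.next();
                }
                T item = current.read();
                if (item != null) {
                    return item;
                }
                current = null;             // this delegate is done; try the next one
            }
        }
    }
}
```

A caller loops on read until it returns null, which is why a reader built this way cannot carry null as a legitimate item.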

Introduction to ItemProcessor

The ItemProcessor phase processes the data that has been read; this is where developers implement their own business operations.

CompositeItemProcessor: A composite processor that encapsulates multiple business processing services.

ItemProcessorAdapter: Adapters that can reuse existing business processing services.

PassThroughItemProcessor: Performs no business processing and returns the data it was given unchanged.

ValidatingItemProcessor: Data validation processor that validates each item and can filter or skip items that fail validation.
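The processor patterns above (pass-through, validation with filtering, and composition) can be sketched in plain Java. The types here are hypothetical stand-ins, not Spring Batch classes: ProcessorSketch, SimpleProcessor, and the helper methods are invented for illustration. The sketch follows the convention that a processor returning null filters the item out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical processor helpers; returning null means "filter this item out".
final class ProcessorSketch {

    interface SimpleProcessor<I, O> {
        O process(I item);
    }

    // Returns the input unchanged.
    static <T> SimpleProcessor<T, T> passThrough() {
        return item -> item;
    }

    // Keeps valid items and filters out the rest by returning null.
    static <T> SimpleProcessor<T, T> validating(Predicate<T> isValid) {
        return item -> isValid.test(item) ? item : null;
    }

    // Chains two processors; a null from the first short-circuits the chain.
    static <A, B, C> SimpleProcessor<A, C> compose(SimpleProcessor<A, B> first,
                                                   SimpleProcessor<B, C> second) {
        return item -> {
            B intermediate = first.process(item);
            return intermediate == null ? null : second.process(intermediate);
        };
    }

    // Applies a processor to every item and drops filtered (null) results.
    static <I, O> List<O> processAll(List<I> items, SimpleProcessor<I, O> processor) {
        List<O> out = new ArrayList<>();
        for (I item : items) {
            O result = processor.process(item);
            if (result != null) {
                out.add(result);
            }
        }
        return out;
    }
}
```

Composing a non-blank validator with a length calculation, for instance, turns the inputs "ab", " ", and "cde " into the outputs 2 and 3, with the blank string filtered out.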

Introduction to ItemWriter

An ItemWriter handles writing to a resource within a step. Spring Batch provides a large number of ready-to-use writer components, which make it quick to develop and build batch applications. The framework also provides reusable and extensible components that developers can use for custom implementations.

FlatFileItemWriter: Writes flat files.

MultiResourceItemWriter: Writes to multiple files.

StaxEventItemWriter: Writes XML files.

AmqpItemWriter: Writes AMQP messages.

ClassifierCompositeItemWriter: Uses a Classifier to route each item to a specific ItemWriter.

HibernateItemWriter: Writes to a database using Hibernate.

IbatisBatchItemWriter: Writes to a database using iBATIS.

ItemWriterAdapter: Adapter that reuses an existing write service.

JdbcBatchItemWriter: Writes to a database using JDBC.

JmsItemWriter: Writes to a JMS queue.

JpaItemWriter: Writes to a database using JPA.

GemfireItemWriter: Writes to the distributed database GemFire.

SpELMappingGemfireItemWriter: Writes to the distributed database GemFire, using the Spring Expression Language (SpEL).

MimeMessageItemWriter: Sends MIME mail messages.

MongoItemWriter: Writes to MongoDB, a distributed document database.

Neo4jItemWriter: Writes to the graph database Neo4j.

PropertyExtractingDelegatingItemWriter: Delegating writer that extracts properties from each item and passes them to a delegate.

RepositoryItemWriter: Writes through a Spring Data repository.

SimpleMailMessageItemWriter: Sends simple mail messages.

CompositeItemWriter: Composite writer that passes items to multiple ItemWriters.
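The two composition styles in this list, routing by classifier and fanning out to several writers, can be sketched in plain Java. The names below (WriterSketch, SimpleWriter, composite, classifying) are hypothetical and only model the idea; they are not the Spring Batch classes of similar name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical writer helpers modeling composite and classifier-based routing.
final class WriterSketch {

    interface SimpleWriter<T> {
        void write(List<T> items);
    }

    // Sends every chunk to each delegate in order.
    static <T> SimpleWriter<T> composite(List<SimpleWriter<T>> delegates) {
        return items -> delegates.forEach(delegate -> delegate.write(items));
    }

    // Groups a chunk by the writer the classifier picks, then writes each group once.
    // Assumes the classifier returns the same writer instance for the same category.
    static <T> SimpleWriter<T> classifying(Function<T, SimpleWriter<T>> classifier) {
        return items -> {
            Map<SimpleWriter<T>, List<T>> groups = new LinkedHashMap<>();
            for (T item : items) {
                groups.computeIfAbsent(classifier.apply(item), w -> new ArrayList<>()).add(item);
            }
            groups.forEach((writer, group) -> writer.write(group));
        };
    }
}
```

A router built with classifying can itself be one of the delegates passed to composite, so that items are routed by category and also sent to a shared writer such as an audit log.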

