Tuesday, May 28, 2013

Spring Batch in Action 1st edition, Arnaud Cogoluegnes



Summary: Spring Batch in Action is a flawlessly written book. Batch processing in Java often gets second-class coverage, and this book puts an end to that. The book is very visual: filled with logical diagrams (UML and other formats), it instantly becomes a reference manual for anyone willing to tackle batch processing with (or even without) Spring Batch.
Written sequentially, with every chapter building on the previous one, the book aims to tackle batch processing with Spring Batch. Although it is aimed squarely at Spring Batch, a reader who is less enthused about Spring in general would still benefit greatly from reading about batch processing in the context of separation of concerns (SoC).

Chapter 1: The book begins with a thorough definition of batch applications. Data volume, automation, reliability, robustness, and performance are all explained in a manner that any developer with at least an intermediate understanding of data processing should have no problem following. Once batch applications have been introduced, the book quickly moves on to the title framework, Spring Batch. It runs through the architecture of Spring Batch and how it extends the Spring Framework to perform batch processing, while Spring Batch itself relies on the Spring Framework for its internal tasks. This symbiotic relationship is indeed fascinating and the book does not miss this point. Once the formal introduction to Spring Batch is complete, a quick example is shown.

Chapter 2: The second chapter is the formal "getting started" in the world of Spring Batch. As part of reading and writing product data, chunk processing is formally introduced with diagrams, along with a forward-looking note that later chapters cover chunk processing in more detail. If you have never performed batch processing in this manner, this chapter alone is mind-bending, as it introduces a simple concept that will forever change your view of how batch processing should be done. I personally believe that if the book stopped right here I would still consider it a major success. An honorable mention in this chapter is the quick overview of ETL (extract, transform, load); this is a topic worth covering in a book of this caliber and the authors did not overlook it. Overall, Chapter 2 is a quick overview of what Spring Batch can do and how a developer can quickly start building an application with it.
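
To make the chunk idea concrete, here is a minimal sketch in plain Java. It is my own illustration, not code from the book and not Spring Batch's API: ChunkSketch, runChunks and the chunk size of 2 are made-up names and values. Items are read and processed one at a time, then written a chunk at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration of chunk-oriented processing; not Spring Batch's API.
class ChunkSketch {

    // Reads items one at a time, transforms each, and hands them to the writer
    // in groups of at most chunkSize. Returns the number of chunks written.
    static <I, O> int runChunks(Iterator<I> reader,
                                Function<I, O> processor,
                                Consumer<List<O>> writer,
                                int chunkSize) {
        int chunks = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // one write per full chunk
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) { // final, possibly smaller chunk
            writer.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        Function<String, String> upper = String::toUpperCase;
        Consumer<List<String>> writer = written::add;
        int chunks = runChunks(List.of("a", "b", "c", "d", "e").iterator(), upper, writer, 2);
        System.out.println(chunks + " chunks: " + written); // 3 chunks: [[A, B], [C, D], [E]]
    }
}
```

In Spring Batch a chunk is also the unit of a transaction; the sketch leaves that part out and only shows the read/process/write rhythm.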

Chapter 3 starts by defining the domain language for batch processing. It quickly shows the developer how to configure the Spring Batch infrastructure. At this point the job repository is introduced, and Spring Batch's database support is shown through an example using the H2 database. To those familiar with Spring, this section should look very familiar, as the configuration is nearly identical to a typical Spring database configuration in an application context XML file. The repository configuration will be new to anyone not familiar with Spring Batch, but the XML file remains recognizable despite the Spring Batch namespace. Logically, Spring Batch Admin would be a nice addition at this point, and the book does exactly that: it shows the admin interface. It then continues to explore the anatomy of a job. Linear and non-linear job flows are explained and demonstrated through XML configuration via the job/step/tasklet tags. This chapter also delves into job instances and job executions. The formula for a job instance shown in the book is simple (JobInstance = Job + JobParameters). The lifecycle of a job and its executions is briefly explained, followed by the topic of running multiple instances of the same job.
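
The JobInstance formula can be modelled in a few lines of plain Java. This is a hypothetical sketch of the identity rule only, not Spring Batch's classes: JobInstanceSketch, InstanceKey, launch, the job name importProducts and the date parameter are all invented for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of "JobInstance = Job + JobParameters"; not Spring Batch classes.
class JobInstanceSketch {

    // The identity of an instance is the job name plus its parameters.
    static final class InstanceKey {
        final String jobName;
        final Map<String, String> parameters;

        InstanceKey(String jobName, Map<String, String> parameters) {
            this.jobName = jobName;
            this.parameters = new TreeMap<>(parameters); // defensive copy
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof InstanceKey)) {
                return false;
            }
            InstanceKey other = (InstanceKey) o;
            return jobName.equals(other.jobName) && parameters.equals(other.parameters);
        }

        @Override
        public int hashCode() {
            return Objects.hash(jobName, parameters);
        }
    }

    private final Set<InstanceKey> known = new HashSet<>();

    // Returns true when the launch creates a new instance, false when the same
    // job and parameters were seen before (another execution of that instance).
    boolean launch(String jobName, Map<String, String> parameters) {
        return known.add(new InstanceKey(jobName, parameters));
    }

    public static void main(String[] args) {
        JobInstanceSketch repository = new JobInstanceSketch();
        System.out.println(repository.launch("importProducts", Map.of("date", "2013-05-27"))); // true
        System.out.println(repository.launch("importProducts", Map.of("date", "2013-05-27"))); // false
        System.out.println(repository.launch("importProducts", Map.of("date", "2013-05-28"))); // true
    }
}
```

Changing a parameter (here the date) yields a new instance, which is how the same job can be run once per day.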

Chapter 4 delves into batch configuration by exploring the Spring Batch XML vocabulary; first the Spring Batch XML namespace and Spring Batch XML features are introduced; then the chapter naturally delves into details on how to configure the job and the steps and how the configuration hierarchy flows in Spring Batch. Nested job configurations are also demonstrated. In this chapter you will also learn about configuring streams as well as transactions and the job repository. Advanced topics are introduced at the end of the chapter with a quick introduction to various Spring topics such as scopes and Spring Expression Language (SpEL). If you have never delved into those topics before, they are worth a look as they apply well beyond Spring Batch.

Chapter 5 demystifies the launching process of Spring Batch jobs. At this point it is worth mentioning that you are really stepping into new territory. The book is now into what I would call "Part II": you have learned the basics, created a few jobs, configured them and launched them. From here on the book refines topics already covered. This chapter looks at the launching process from different angles: via schedulers, from the command line, or by embedding jobs in a web application. As part of the launching concepts, the Spring Batch launching API is revisited in greater detail, followed by an overview of launching solutions. Each solution is then explored in turn, starting with command line execution and the CommandLineJobRunner that ships with Spring Batch. Job schedulers are nicely introduced, and the topic is kept to the three most popular options: the quintessential cron first, then the Spring scheduler (which in Spring 3 supports cron syntax) and, as the last and far more advanced option, the Quartz Scheduler. This section explains the cron syntax in great detail, and the reader would be well advised to use this book as one of the main references on the topic. The chapter continues with integration into a web application. For a reader who has never embedded a Spring-based application in a web application, this part is well worth it; taking Spring Batch out of the picture, it applies to any Spring application. To end on a high note, Chapter 5 describes how to stop Spring Batch jobs gracefully, using the JobOperator via JMX or via StepExecution.

Chapter 6 continues the refinement process by delving deeper into reading data. The data reading concepts are introduced first. The Spring Batch reader interfaces are shown, along with an introduction to the ItemStream interface (an interface well worth reading about). Once the concepts are out of the way, each approach is covered in turn. File reading comes first: the supported formats are demonstrated, along with how they are handled internally and how each type of reader is configured. XML formats are also explained, with a word of caution on DOM. Once file readers are out of the way, databases follow, starting with JDBC item readers such as the JdbcCursorItemReader. In this section the JDBC support classes built into the Spring Framework come in handy and they are demonstrated as well. ORM item readers are introduced too, with examples using HibernateCursorItemReader and HibernatePagingItemReader. The chapter then turns to other input sources via ItemReaderAdapter, JmsItemReader or custom readers.

Chapter 7 refines the concept of writing data. It starts with the concepts and moves on to file writers, using the FlatFileItemWriter and StaxEventItemWriter; writing to file sets is introduced at the end of the file writer section. The chapter continues with a section on writing to databases which, just like its "reading data" counterpart, introduces JDBC classes such as the JdbcBatchItemWriter as well as ORM classes such as the HibernateItemWriter. The section on adapting existing services for reuse introduces the ItemWriterAdapter class, which allows delegation to another service. PropertyExtractingDelegatingItemWriter is another useful class and it is demonstrated in this chapter with examples. JMS logically has its own writer; JmsItemWriter is introduced in a short and powerful example. The SimpleMailMessageItemWriter is demonstrated to show that you can also send emails using Spring Batch, which is a very useful feature. Chapter 7 continues with custom item writers, which a developer has to write when the built-in writers do not suffice. The book demonstrates how one might implement such a writer using the Spring Batch ItemWriter interface. Advanced writing techniques are reserved for the close of Chapter 7; here the concept of chaining writers is introduced, along with how a developer could use the CompositeItemWriter to implement complex writers.
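
The chaining idea behind a composite writer can be sketched in plain Java. This is my own illustration of the delegation pattern, not Spring Batch's CompositeItemWriter; the Writer interface, the CompositeWriter class and the "file"/"db" delegates are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of a composite (chained) writer; not Spring Batch's classes.
class CompositeWriterSketch {

    // Hypothetical writer contract: receives a whole chunk of items at once.
    interface Writer<T> {
        void write(List<? extends T> items);
    }

    // Passes every chunk to each delegate, in the order the delegates were given.
    static final class CompositeWriter<T> implements Writer<T> {
        private final List<Writer<T>> delegates;

        CompositeWriter(List<Writer<T>> delegates) {
            this.delegates = new ArrayList<>(delegates);
        }

        @Override
        public void write(List<? extends T> items) {
            for (Writer<T> delegate : delegates) {
                delegate.write(items);
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Stand-ins for a file writer and a database writer.
        Writer<String> fileWriter = items -> log.add("file " + items);
        Writer<String> dbWriter = items -> log.add("db " + items);

        Writer<String> composite = new CompositeWriter<>(List.of(fileWriter, dbWriter));
        composite.write(List.of("p1", "p2"));
        System.out.println(log); // [file [p1, p2], db [p1, p2]]
    }
}
```

The caller sees a single writer, while each delegate stays small and reusable.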

Chapter 8 focuses on processing data and on placing business logic in the processing phase. The "in-between" item processor concept is introduced by revisiting code from previous chapters. Here the transformation and filtering concepts are introduced and applied to data. A short example implementation of the ItemProcessor interface comes first, and the processors built into Spring Batch follow later in the chapter (ItemProcessorAdapter, ValidatingItemProcessor and CompositeItemProcessor). Filtering and validation receive a good overview and the example code is very easy to follow. To simplify validation, Valang, a validation language, is used; this is yet another nice addition and once again shows how to build a project by combining various open source Java projects.
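
Transformation, filtering and chaining fit in a short plain-Java sketch. It borrows one convention from Spring Batch, where an ItemProcessor that returns null filters the item out, but everything else (ProcessorSketch, Processor, processAll, compose and the sample data) is hypothetical and not the framework's API.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of transforming, filtering and chaining processors.
class ProcessorSketch {

    // Hypothetical processor contract: returning null means "filter this item out".
    interface Processor<I, O> {
        O process(I item);
    }

    // Applies the processor to every item and keeps only non-null results.
    static <I, O> List<O> processAll(List<I> items, Processor<I, O> processor) {
        List<O> out = new ArrayList<>();
        for (I item : items) {
            O result = processor.process(item);
            if (result != null) {
                out.add(result);
            }
        }
        return out;
    }

    // Chains two processors; an item filtered by the first never reaches the second.
    static <A, B, C> Processor<A, C> compose(Processor<A, B> first, Processor<B, C> second) {
        return item -> {
            B mid = first.process(item);
            return mid == null ? null : second.process(mid);
        };
    }

    public static void main(String[] args) {
        // Filter: drop blank names, trim the rest.
        Processor<String, String> trimOrDrop = s -> s.trim().isEmpty() ? null : s.trim();
        // Transformation: upper-case the name.
        Processor<String, String> upper = String::toUpperCase;

        Processor<String, String> chain = compose(trimOrDrop, upper);
        System.out.println(processAll(List.of(" apple ", "  ", "pear"), chain)); // [APPLE, PEAR]
    }
}
```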

Chapter 9 - bulletproof jobs gives a great introduction to robustness, traceability and control in batch processing.

Chapter 10 - although transaction management is built into Spring Batch, it is well worth the effort to learn what goes into it and how to override the defaults when a requirement calls for it. This chapter introduces transactions via a primer; it then moves on to Spring Batch and transaction management. Common pitfalls are illustrated with typical applications involving JMS as well as databases. A worthy mention is the idempotency example in the context of batch jobs.

Chapter 11 - handling/controlling execution is about controlling the job flow in case of failure or unexpected events. As this is part of Spring Batch's step-based architecture, the book takes the reader through various paths of control while explaining subtleties such as batch status vs. exit status. The chapter then moves to sharing data between steps and the techniques that make it possible. As the chapter winds down, the book explores the externalization of flow definitions to enable reuse. The final topic of this chapter is choosing how to end a job after execution, based on various conditions.

Chapter 12 - Enterprise integration is a complex matter, and in this chapter the book first explains what the topic means, what the challenges are and what different styles of integration exist, in accordance with the "EIP book" (Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, ISBN-10: 0321200683). Once the introduction is out of the way, the chapter zooms in on Spring Batch and enterprise integration. At this point another Spring project is introduced: Spring Integration. The project gets a quick overview and then the combination of Spring Batch and Spring Integration is demonstrated in great detail. It is worth noting that Dave Syer is mentioned by name at this point; Dave is a great asset to Spring Batch and he has demonstrated numerous times how Spring Batch can be integrated with Spring Integration (this is also mentioned in the book). The RESTful submission of jobs brings in Spring MVC as well as the RestTemplate introduced in Spring 3.

Chapter 13 - Monitoring is a part of Spring Batch. This chapter introduces the monitoring concepts, shows how to access batch execution data, and then demonstrates monitoring through various examples using Spring Batch Admin and JMX.

Chapter 14 - scalability is important and the book dedicates a special chapter to it. Before tackling scalability, the chapter first explores performance improvements. Then the scale-up and scale-out concepts are introduced and the built-in features of Spring Batch are explored. This is a demanding chapter which requires the reader to pay close attention to the advice given. Multithreading examples are provided via the Spring task executor, which is a very powerful piece of functionality provided by Spring. As threading is often a confusing topic, the chapter dedicates a good part to it. It then moves on to remote chunking using channels with Spring Integration. Partitioning is introduced to demonstrate parallelism in data processing. The chapter ends with a comparison of patterns and a bit of advice on how to choose the best data processing pattern.
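
As a rough picture of what partitioning means, the sketch below splits a data set into contiguous ranges and processes each range on its own thread using only the JDK. It is an assumption-laden illustration, not Spring Batch's partitioning API or the Spring task executor: PartitionSketch, partition, sumInParallel and the grid size of 3 are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-JDK illustration of partitioned, multithreaded processing.
class PartitionSketch {

    // Splits [0, total) into at most gridSize contiguous ranges {from, to}, to exclusive.
    static List<int[]> partition(int total, int gridSize) {
        List<int[]> ranges = new ArrayList<>();
        int size = (total + gridSize - 1) / gridSize; // ceiling division
        for (int from = 0; from < total; from += size) {
            ranges.add(new int[] {from, Math.min(from + size, total)});
        }
        return ranges;
    }

    // Sums each partition on its own worker thread, then combines the partial sums.
    static long sumInParallel(int[] data, int gridSize) {
        ExecutorService pool = Executors.newFixedThreadPool(gridSize);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int[] range : partition(data.length, gridSize)) {
                partials.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = range[0]; i < range[1]; i++) {
                        sum += data[i];
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(sumInParallel(data, 3)); // 55
    }
}
```

Each worker touches only its own range, so no locking is needed; that independence is what makes partitioning attractive compared with sharing one reader across threads.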

Chapter 15 - testing is an important part of any development process. This chapter first focuses on the basics of what testing is and how it can be performed using JUnit and Mockito. Every phase introduced in the previous chapters is given ample coverage.

Appendix A: Setting up - this is about getting you up to speed to develop your projects using Spring Batch. The appendix does a brilliant job of quickly introducing Maven and the plug-ins relevant to Spring Batch. Once that part is out of the way, SpringSource Tool Suite (STS) is introduced and a quick tour is given. At the end of this appendix one should feel ready to develop Spring Batch projects using Maven and STS.

Appendix B: Spring Batch Admin - this tool is an essential piece of the puzzle in Spring Batch development and as a consequence it gets an entire and thorough appendix dedicated to it.

Conclusion: I would highly recommend this book to anyone willing to explore the topic of batch processing; the book is filled with meaningful examples relevant to the real world. Better still, the code is properly divided by chapter and built with Maven. This allows the projects to be easily imported into any modern Java IDE (Eclipse via m2eclipse, IntelliJ and NetBeans). By the end of this book, I had become more aware of what it takes to develop batch applications and how to develop them using Spring Batch.

Product Details:
Paperback: 504 pages
Publisher: Manning Publications; 1 edition (October 7, 2011)
Language: English
ISBN-10: 1935182951
ISBN-13: 978-1935182955
Product Dimensions: 7.4 x 1 x 9.2 inches
