On IBM mainframes, BatchPipes is a batch job processing utility which runs under the MVS/ESA operating system and later versions—OS/390 and z/OS.[1]
Core function
In traditional processing, if data records are written out to a sequential (QSAM or BSAM) data set on disk or tape, they cannot be read concurrently by another job. The "writer" and the "reader" cannot run at the same time. This is termed file-level interlock or data-set-level interlock.
With BatchPipes an installation can arrange for the data to be "piped" between the two jobs. The advantage is that the jobs can run concurrently and that it is possible, and indeed usual, to avoid the time spent writing the data to secondary storage and reading it back. The combination of these two characteristics, if used judiciously, leads to a reduction in the combined elapsed time of the two jobs, as measured from the start of the writer job to the end of the reader job.
BatchPipes maintains a short queue of records being passed between the writer and the reader. The writer adds records to the back of the queue and the reader takes them from the front. This is termed record-level interlock and allows the reader and the writer to run concurrently.
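The record-level interlock described above behaves much like a bounded producer/consumer queue. The following is a minimal Python sketch of that idea only, not of BatchPipes itself: the two threads stand in for the writer and reader jobs, and the names `PIPE_DEPTH`, `END_OF_DATA` and `run_piped` are hypothetical, as is the queue depth of 4.

```python
import queue
import threading

PIPE_DEPTH = 4          # assumed queue depth; the article only says the queue is "short"
END_OF_DATA = object()  # hypothetical marker meaning the writer has finished


def writer(pipe, records):
    """Add records to the back of the queue, waiting whenever it is full."""
    for record in records:
        pipe.put(record)   # blocks until the reader has made room
    pipe.put(END_OF_DATA)


def reader(pipe, received):
    """Take records from the front of the queue until the end marker arrives."""
    while True:
        record = pipe.get()  # blocks until the writer has supplied a record
        if record is END_OF_DATA:
            break
        received.append(record)


def run_piped(records):
    """Run writer and reader concurrently; no intermediate file is written."""
    pipe = queue.Queue(maxsize=PIPE_DEPTH)
    received = []
    w = threading.Thread(target=writer, args=(pipe, records))
    r = threading.Thread(target=reader, args=(pipe, received))
    w.start()
    r.start()
    w.join()
    r.join()
    return received
```

Because the queue is bounded, whichever side is faster ends up waiting for the other, which is the pacing effect the interlock produces.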
A sort is a special case: all the input records must be read before the first output record can be written. Hence there can be no overlap between the input and output phases of a sort. But the input phase can be overlapped with the previous job's output phase. Similarly, the output phase of sort can be overlapped with a downstream job that reads the sorted data.
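The constraint on sorting can be illustrated with a small Python sketch. It is an assumption-laden model rather than a description of any real sort product: a generator stands in for the sort step, and the hypothetical `log` list records when each record is read and written.

```python
def sort_stage(records, log):
    """Model a sort: consume every input record before emitting any output."""
    buffered = []
    for record in records:         # input phase: may overlap with an upstream writer
        log.append(("in", record))
        buffered.append(record)
    buffered.sort()                # cannot start until the input is exhausted
    for record in buffered:        # output phase: may overlap with a downstream reader
        log.append(("out", record))
        yield record


def run_sort_example(records):
    """Return the sorted records and the order in which reads and writes occurred."""
    log = []
    result = list(sort_stage(iter(records), log))
    return result, log
```

In the resulting log every "in" entry precedes every "out" entry, which is why the two phases of the sort cannot overlap with each other even though each can overlap with a neighbouring job.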
Advanced pipe topologies
More complex topologies than "one reader one writer" are possible.
- "Two readers one writer" is a good example of an attempt to balance the readers' speed against the writer's speed. Because the queue is short, a faster writer will often be forced to wait for a slower reader to take records off the queue before the writer can continue processing. Using two readers helps to make fuller use of the writer's capacity.
- "One job as a reader from one pipe and a writer to another" is often seen where the job edits the records. While traditional batch streams often contain such jobs, this kind of processing can also be introduced using, for example, IBM's DFSORT product or BatchPipeWorks (part of BatchPipes).
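The "two readers one writer" topology can be sketched in Python as one producer feeding a shared bounded queue that two consumers drain. This is an illustration only: it assumes each record is delivered to exactly one reader, and the names `fan_out_writer`, `fan_out_reader` and `run_fan_out`, as well as the default depth of 4, are hypothetical.

```python
import queue
import threading

END_OF_DATA = object()  # hypothetical end marker, one is sent per reader


def fan_out_writer(pipe, records, reader_count):
    """Write all records, then tell every reader that the data has ended."""
    for record in records:
        pipe.put(record)            # blocks while the short queue is full
    for _ in range(reader_count):
        pipe.put(END_OF_DATA)


def fan_out_reader(pipe, name, handled, lock):
    """Take whichever record is next; each record goes to exactly one reader."""
    while True:
        record = pipe.get()
        if record is END_OF_DATA:
            break
        with lock:
            handled.append((name, record))


def run_fan_out(records, reader_count=2, depth=4):
    """Run one writer and several readers against the same queue."""
    pipe = queue.Queue(maxsize=depth)
    handled = []
    lock = threading.Lock()
    readers = [
        threading.Thread(target=fan_out_reader,
                         args=(pipe, f"reader{i}", handled, lock))
        for i in range(reader_count)
    ]
    for t in readers:
        t.start()
    w = threading.Thread(target=fan_out_writer,
                         args=(pipe, records, reader_count))
    w.start()
    w.join()
    for t in readers:
        t.join()
    return handled
```

With two readers draining the queue, the writer spends less time blocked on a full queue than it would with a single slow reader.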
Criticism
One of the key implementation considerations is scheduling the reader and writer jobs to run together. In practical batch schedules this might not be feasible. Furthermore, if any job in the pipeline fails, recovery actions will be wider than just recovering this single job. For these reasons some installations have found it difficult to implement BatchPipes.
BatchPipePlex
BatchPipes can use the IBM mainframe Coupling Facility to pipe data between different members of a Parallel Sysplex, using the BatchPipePlex facility.
BatchPipeWorks
BatchPipes includes a set of pipeline stages based on IBM's CMS Pipelines product developed for the VM/ESA operating system. These stages provide additional processing, without the need for additional batch jobs in the pipeline.
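The idea of applying stages to records as they flow through a pipe, rather than adding jobs, can be modelled with chained Python generators. The sketch below is purely illustrative: the stage names `keep_matching` and `to_upper` are hypothetical and do not correspond to actual BatchPipeWorks or CMS Pipelines stages.

```python
def keep_matching(records, text):
    """Hypothetical filter stage: pass on only records containing `text`."""
    for record in records:
        if text in record:
            yield record


def to_upper(records):
    """Hypothetical edit stage: convert each record to upper case."""
    for record in records:
        yield record.upper()


def run_stages(records):
    """Chain the stages; each record flows through both without an extra job."""
    return list(to_upper(keep_matching(records, "ERR")))
```

Each record passes through every stage in turn as it is read, so no intermediate data set or additional job step is needed between the stages.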
History
BatchPipes Version 1 was developed in the late 1980s and early 1990s simply as a technique to speed up MVS/ESA batch processing. In 1997 the functionality of BatchPipes was integrated into a larger IBM product, SmartBatch, which incorporated two BMC Corporation product features: DataAccelerator and BatchAccelerator. However, SmartBatch was discontinued in April 2000.
APT International, based in Monaco, produced a competing product trademarked as WARP. A few months after the launch of this product, IBM renamed their OS/2 product OS/2 Warp 4, conflicting with the marketing of the performance product that was the only competitor to BatchPipes. This resulted in 7 years of litigation at the Tribunal de grande instance de Paris.[2][3]
Subsequently, BatchPipes Version 2 was released, incorporating BatchPipes Version 1 and some additional features from SmartBatch: BatchPipePlex and BatchPipeWorks. BatchPipes Version 2 is still a marketed IBM product.
References