What is Pipelinization in Technology/Computing?

Question
In I.T., what is pipelinization?

Answer / Disambiguation
Pipelinization is the configuration, creation, or execution of a repeatable process that consists of a series of stages with a defined start and finish. A second definition is turning a procedure into a controlled stream, either for reproducibility (so that work can be developed in parallel and independently) or to run the original process in parallel and increase its throughput. A third definition is adapting a manual or automated process by developing a sequence of substeps for a [batch] job to pass through incrementally. The pipeline can be a cyclic graph (a circular process) or an acyclic graph (uni-directional).

The three kinds of pipelinization below all refer to pipelines, but none of those pipelines are operating system pipes (which convey output from one process to another). The pipelines below are uni-directional.

Pipelinization for the CI/CD pipeline:
Pipelinization here is the transformation of a process into a pipeline. A pipeline is a sequence of operations, usually spread across multiple servers or pods, that builds, compiles, tests, and releases software code. It can be managed by a tool such as Jenkins, GitLab, Travis CI, CircleCI, Bamboo, Azure DevOps, AWS tools such as CodeBuild and CodePipeline, or others. A pipeline can be triggered automatically by an event (such as a certain time of day or code being checked into a repository) or run manually.

Pipelines can create infrastructure and virtual networks (e.g., with Terraform), send automated messages, perform QA tests, and release software to multiple environments, including production.
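To make the idea of a uni-directional sequence of stages concrete, here is a minimal Python sketch. It is not how Jenkins, GitLab, or any other tool listed above works internally; the stage functions `build`, `test`, and `release` and the runner `run_pipeline` are hypothetical names invented for illustration. The only point is that stages run in order and the pipeline stops at the first failure.

```python
from typing import Callable, List, Tuple

# Hypothetical stages. A real pipeline would call build tools, test
# runners, and deployment scripts here instead of returning constants.
def build() -> bool:
    return True

def test() -> bool:
    return True

def release() -> bool:
    return True

def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run the stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage():
            break  # later stages never run after a failure
        completed.append(name)
    return completed

# A run where every stage succeeds.
completed = run_pipeline([("build", build), ("test", test), ("release", release)])

# A run where the test stage fails, so release is never attempted.
failed_run = run_pipeline(
    [("build", build), ("test", lambda: False), ("release", release)]
)
```

In this sketch `completed` holds all three stage names, while `failed_run` holds only `"build"`, which mirrors how a failed test stage keeps code from being released.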

Pipelinization for an ETL process:
Pipelinization here is the implementation of a repeatable extract-transform-load (ETL) process. An ETL pipeline is a process in which data is taken from one format (e.g., a .csv file), cleansed or modified so that it can be inserted into a database table, and finally loaded into that table. Some database loads happen in an ad hoc way. Creating a configuration or a platform that integrates this loading of data into a database, so that it happens on a regular basis or from a manual trigger, is the pipelinization of an ETL process.
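The extract, transform, and load steps described above can be sketched with Python's standard library. This is an illustration under stated assumptions, not a production design: the `people` table, the `name` and `age` columns, the sample data, and the function names are all made up for the example, and an in-memory SQLite database stands in for a real one.

```python
import csv
import io
import sqlite3

def extract(csv_text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: trim whitespace, drop rows with no name, cast age to int."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # skip rows that cannot be loaded meaningfully
        cleaned.append((name, int(row["age"])))
    return cleaned

def load(conn, records):
    """Load: insert the cleaned records into the (hypothetical) people table."""
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people (name, age) VALUES (?, ?)", records)
    conn.commit()

def run_etl(csv_text, conn):
    """Run the three stages in order; this call is the repeatable unit."""
    load(conn, transform(extract(csv_text)))

# Made-up sample input: one row needs trimming, one row has no name.
SAMPLE_CSV = "name,age\n Ada ,36\n,50\nGrace,45\n"

conn = sqlite3.connect(":memory:")
run_etl(SAMPLE_CSV, conn)
rows = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Because `run_etl` is a single function, it could be invoked by a scheduler on a regular basis or by hand, which corresponds to the two triggers mentioned above.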

Pipelinization for programming a given processor:
Pipelinization here is a form of parallel processing: instructions are sent to a CPU in a way that maximizes the efficiency and overall capabilities of the CPU itself. The processor splits the handling of each instruction into stages (such as fetch, decode, and execute) and works on a different stage of several instructions at once, so their execution overlaps. Pipelined code also benefits from keeping its data in the fastest memory available. CPU registers are faster than the CPU's cache, which is faster than RAM (1). The slowest memory is virtual memory that has been saved to disk (1). Swapping or paging is the I/O activity that moves data between RAM and disk.
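The throughput gain from overlapping instruction stages can be shown with a small calculation. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and there are no stalls or hazards; real processors do not behave this cleanly. The five stage names are the classic textbook ones and, like the function names, are assumptions made for illustration.

```python
# Assumed textbook stage names; real CPUs have different and often deeper pipelines.
STAGES = ["fetch", "decode", "execute", "memory", "write-back"]

def cycles_without_pipelining(instructions, stages):
    """Each instruction finishes all of its stages before the next one starts."""
    return instructions * stages

def cycles_with_pipelining(instructions, stages):
    """The first instruction fills the pipeline, then one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)

def stage_in_cycle(instruction, cycle, stage_names=STAGES):
    """Return the stage an instruction occupies in a given cycle, or None.

    Both instruction and cycle are counted from 0; instruction i enters
    the first stage in cycle i.
    """
    index = cycle - instruction
    if 0 <= index < len(stage_names):
        return stage_names[index]
    return None

sequential = cycles_without_pipelining(100, len(STAGES))  # 500 cycles
pipelined = cycles_with_pipelining(100, len(STAGES))      # 104 cycles
speedup = sequential / pipelined
```

For 100 instructions and five stages, the idealized pipeline needs 104 cycles instead of 500. In cycle 1, instruction 0 is in the decode stage while instruction 1 is being fetched, which is the overlap that produces the gain.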

Citation
(1) https://www.elprocus.com/memory-hierarchy-in-computer-architecture/
For processing, "[p]ipelining is a method to obtain high efficiency for processes inside computers." (Page 2 of The Art of High Performance Computing for Computational Science.)


See also this Quora question and answer.
