What Is the Difference between Erasure Coding and Negative Coding?

Question
Erasure coding versus negative coding: what is the difference?

Answer
In the verbose output of a Hadoop or mapred job, you may have seen "INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path:"

Erasure coding is a fault-tolerant storage mechanism.  Like replication, it gives up some raw disk capacity in exchange for redundancy, so the usable capacity is less than the combined capacity of the participating disks.  (By contrast, RAID 0 has no redundancy: its usable capacity is the sum of the individual disks' capacities.)  Also like replication, one disk, and potentially more, can fail without losing the data written to the storage system.  To learn more about erasure coding, see this link from the Storage Networking Industry Association.
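
To make the idea concrete, here is a minimal sketch in Python of the simplest possible erasure code: a single XOR parity block that can rebuild any one lost block.  It is an illustration only.  Production systems, Hadoop included, typically use Reed-Solomon codes, which tolerate more than one failure.  The function names and sample data below are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def usable_fraction(data_units, redundancy_units):
    """Fraction of raw capacity that holds actual data."""
    return data_units / (data_units + redundancy_units)


# Three data blocks plus one parity block (hypothetical sample data).
stripe = encode([b"hado", b"op__", b"data"])

# Pretend the disk holding block 1 failed, then rebuild it from the rest.
rebuilt = recover(stripe, 1)
```

The capacity trade-off can be checked with usable_fraction: keeping three full copies of the data (one original plus two extra) leaves 1/3 of the raw capacity usable, while a scheme with 6 data units and 3 parity units leaves 2/3 usable.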

Negative coding is a programming technique in which code is rewritten to keep the same functionality in fewer lines.  A benefit to programmers is that the code base is more concise.  With fewer lines of code, the program is potentially more readable and less intimidating to scrutinize.  A benefit from a computing hardware perspective is that the code base has a smaller footprint in storage (as it resides in a code versioning repository) and may use less RAM when loaded into memory.  For more information on negative coding, see this external link.
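
As a small illustration, the Python sketch below shows the same word-counting behavior written twice: once verbosely, and once after a "negative coding" rewrite that relies on the standard library.  Both function names are hypothetical and exist only for this example.

```python
from collections import Counter


def count_words_verbose(text):
    """Count how often each word appears, spelling out every step."""
    counts = {}
    words = text.split()
    for word in words:
        if word in counts:
            counts[word] = counts[word] + 1
        else:
            counts[word] = 1
    return counts


def count_words_concise(text):
    """Same result as count_words_verbose, in a single line of logic."""
    return dict(Counter(text.split()))


sample = "to be or not to be"
same_result = count_words_verbose(sample) == count_words_concise(sample)
```

The rewrite removes lines without changing what the caller gets back, which is the point of the technique.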

"The real hero of programming is the one who writes negative code." -Douglas McIlroy
This quote was taken from here. McIlroy advanced the use of pipes at Bell Labs (the origin of Unix); to read more about this, see this link. In the 1960s, he also championed the Unix philosophy of building software from modular components (page 79 of this pdf).
