EFM[2] belongs to the class of DC-free run-length limited (RLL) codes; these have the following two properties:
the spectrum (power density function) of the encoded sequence vanishes at the low-frequency end, and
both the minimum and maximum number of consecutive bits of the same kind are within specified bounds.[3][4]
In optical recording systems, servo mechanisms accurately follow the track in three dimensions: radial, focus, and rotational speed. Everyday handling damage, such as dust, fingerprints, and tiny scratches, not only affects retrieved data, but also disrupts the servo functions. In some cases, the servos may skip tracks or get stuck. Specific sequences of pits and lands are particularly susceptible to disc defects, and disc playability can be improved if such sequences are barred from recording. EFM bars these sequences, which makes the disc highly resilient to handling damage at a modest cost in storage efficiency.
How it works
Under EFM rules, the data to be stored is first broken into eight-bit blocks (bytes). Each eight-bit block is translated into a corresponding fourteen-bit codeword using a lookup table.
The 14-bit words are chosen such that binary ones are always separated by a minimum of two and a maximum of ten binary zeros. The constraint is stated in terms of zeros between ones because the bits are written with NRZI encoding, or modulo-2 integration: a binary one is stored on the disc as a change from a land to a pit or a pit to a land, while a binary zero is indicated by no change. A sequence 0011 would be changed into 1101 or its inverse 0010, depending on the previous pit written. If there are two consecutive zeros between two ones, then the written sequence will have three consecutive zeros (or ones); for example, 010010 will translate into 100011 (or 011100). The EFM sequence 000100010010000100 will translate into 111000011100000111 (or its inverse).
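The NRZI step described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any player or standard; the function name `nrzi_encode` and the `initial_level` parameter (the level written just before the sequence starts) are assumptions made for the example.

```python
def nrzi_encode(bits: str, initial_level: int = 0) -> str:
    """Modulo-2 integrate a string of '0'/'1' channel bits.

    A '1' toggles the written level (pit <-> land); a '0' keeps it.
    `initial_level` is the hypothetical level written just before `bits`.
    """
    level = initial_level
    written = []
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        level ^= int(bit)          # toggle on one, hold on zero
        written.append(str(level))
    return "".join(written)


# The examples from the text, starting from either level:
print(nrzi_encode("0011", initial_level=1))    # 1101
print(nrzi_encode("0011", initial_level=0))    # 0010
print(nrzi_encode("010010", initial_level=1))  # 100011
print(nrzi_encode("000100010010000100", initial_level=1))  # 111000011100000111
```

Starting from the opposite level yields the bitwise inverse of each output, which is why the text gives two possible written sequences for every input.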
Because EFM ensures that there are at least two zeros between every two ones, it is guaranteed that every pit and land is at least three bit-clock cycles long. This property is very useful, since it reduces the demands on the optical pickup used in the playback mechanism. The maximum of ten consecutive zeros guarantees that a transition occurs at least once every eleven bit-clock cycles, which bounds the worst case for clock recovery in the player.
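The run-length constraint can be checked mechanically. The sketch below is illustrative only; the helper names `zero_runs_between_ones`, `satisfies_rll`, and `written_run_lengths` are invented for this example, and the defaults `d=2`, `k=10` are the minimum and maximum zero-run lengths stated above.

```python
def zero_runs_between_ones(bits: str) -> list:
    """Lengths of the zero runs lying between consecutive ones."""
    first, last = bits.find("1"), bits.rfind("1")
    if first == last:              # fewer than two ones: no interior runs
        return []
    return [len(run) for run in bits[first + 1:last].split("1")]


def satisfies_rll(bits: str, d: int = 2, k: int = 10) -> bool:
    """True if every interior zero run has length between d and k."""
    return all(d <= n <= k for n in zero_runs_between_ones(bits))


def written_run_lengths(bits: str) -> list:
    """Pit/land lengths after NRZI: each zero run of n gives a run of n + 1."""
    return [n + 1 for n in zero_runs_between_ones(bits)]


print(satisfies_rll("000100010010000100"))        # True
print(written_run_lengths("000100010010000100"))  # [4, 3, 5]
print(satisfies_rll("0011"))                      # False: adjacent ones
```

Only runs enclosed by two ones are examined here; the leading and trailing zeros of a fragment depend on its neighbours and are left unchecked.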
EFM requires three merging bits between adjacent fourteen-bit codewords. Although they are not needed for decoding, they ensure that consecutive codewords can be concatenated without violating the specified minimum and maximum run-length constraints. They are also selected to maintain DC balance of the encoded sequence. In total, seventeen bits of disc space are therefore needed to encode eight bits of data.[5]
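One plausible way to pick the merging bits is a greedy search: try each three-bit pattern that is itself legal, discard those that break the run-length constraints at the junction, and keep the one that leaves the running digital sum (pits counted as +1, lands as -1, or vice versa) closest to zero. The sketch below assumes this greedy strategy; it is not taken from the source, real encoders may use look-ahead or other criteria, and all names (`choose_merging_bits`, `digital_sum`, `MERGE_CANDIDATES`) and the two sample words are illustrative.

```python
# Three-bit patterns containing at most a single one.
MERGE_CANDIDATES = ("000", "001", "010", "100")


def interior_runs_ok(bits: str, d: int = 2, k: int = 10) -> bool:
    """Check every zero run enclosed by two ones against the d/k limits."""
    first, last = bits.find("1"), bits.rfind("1")
    if first == last:
        return True
    return all(d <= len(run) <= k for run in bits[first + 1:last].split("1"))


def digital_sum(bits: str, level: int, rds: int):
    """NRZI-write `bits` from `level`; return (new_level, new_running_sum)."""
    for bit in bits:
        level ^= int(bit)
        rds += 1 if level else -1
    return level, rds


def choose_merging_bits(prev_word: str, next_word: str, level: int, rds: int):
    """Greedy choice; `level` and `rds` are the state after `prev_word`."""
    best = None
    for merge in MERGE_CANDIDATES:
        if not interior_runs_ok(prev_word + merge + next_word):
            continue
        new_level, new_rds = digital_sum(merge + next_word, level, rds)
        if best is None or abs(new_rds) < abs(best[2]):
            best = (merge, new_level, new_rds)
    if best is None:
        raise ValueError("no merging pattern satisfies the constraints")
    return best


# Two sample 14-bit words that satisfy the constraints (illustrative only).
prev_word, next_word = "01001000100000", "10000100000000"
level, rds = digital_sum(prev_word, level=0, rds=0)
print(choose_merging_bits(prev_word, next_word, level, rds))
```

For this pair, the patterns 001 and 010 would place two ones too close together at the junction, so only 000 and 100 remain; the search keeps whichever of the two gives the smaller absolute running sum.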
EFMPlus
The EFMPlus encoder is based on a deterministic finite automaton having four states, which translates eight-bit input words into sixteen-bit codewords. The binary sequence generated by the finite state machine encoder has at least two and at most ten zeros between consecutive ones, which is the same as in classic EFM. There are no packing (merging) bits as in classic EFM.
EFMPlus effectively reduces storage requirements by one channel bit per user byte, increasing storage capacity by 1/16 = 6.25%. Decoding of EFMPlus-generated sequences is accomplished by a sliding-block decoder of length two, that is, two consecutive codewords are required to uniquely reconstitute the sequence of input words.
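The 6.25% figure follows from comparing the two code rates. The short calculation below is only a worked check of that arithmetic; the variable names are chosen for this example.

```python
from fractions import Fraction

EFM_BITS_PER_BYTE = 14 + 3      # codeword plus three merging bits
EFMPLUS_BITS_PER_BYTE = 16      # codeword only, no merging bits

efm_rate = Fraction(8, EFM_BITS_PER_BYTE)          # 8/17
efmplus_rate = Fraction(8, EFMPLUS_BITS_PER_BYTE)  # 1/2

# Relative gain in user data stored per channel bit.
capacity_gain = efmplus_rate / efm_rate - 1

print(efm_rate, efmplus_rate, capacity_gain)  # 8/17 1/2 1/16
print(float(capacity_gain))                   # 0.0625
```

The gain is 17/16 - 1 = 1/16: the same number of channel bits holds 17 user bytes under EFMPlus for every 16 under classic EFM.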