In coding theory, expander codes form a class of error-correcting codes that are constructed from bipartite expander graphs. Along with Justesen codes, expander codes are of particular interest since they have a constant positive rate, a constant positive relative distance, and a constant alphabet size. In fact, the alphabet contains only two elements, so expander codes belong to the class of binary codes. Furthermore, expander codes can be both encoded and decoded in time proportional to the block length of the code.
In coding theory, an expander code is a $[n,\,n-m]_2$ linear block code whose parity check matrix is the adjacency matrix of a bipartite expander graph. These codes have good relative distance $2(1-\varepsilon)\gamma$, where $\varepsilon$ and $\gamma$ are properties of the expander graph as defined later, rate $1-\tfrac{m}{n}$, and decodability (algorithms of running time $O(n)$ exist).
Let $B$ be a $(c,d)$-biregular graph between a set of $n$ nodes $\{v_1,\ldots,v_n\}$, called variables, and a set of $cn/d$ nodes $\{C_1,\ldots,C_{cn/d}\}$, called constraints.
Let $b(i,j)$ be a function such that, for each constraint $C_i$, the variables neighboring $C_i$ are $v_{b(i,1)},\ldots,v_{b(i,d)}$.
Let $\mathcal{S}$ be an error-correcting code of block length $d$. The expander code $\mathcal{C}(B,\mathcal{S})$ is the code of block length $n$ whose codewords are the words $(x_1,\ldots,x_n)$ such that, for $1 \le i \le cn/d$, $(x_{b(i,1)},\ldots,x_{b(i,d)})$ is a codeword of $\mathcal{S}$.[1]
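The definition above can be sketched in code. The instance below is a hypothetical toy example (not a real expander): $n = 4$ variables of degree $c = 2$, two constraints of degree $d = 4$, and the even-weight (single parity check) code of length 4 as the inner code $\mathcal{S}$.

```python
from itertools import product

# Hypothetical tiny instance: n = 4 variables, c = 2, d = 4, so there
# are c*n/d = 2 constraints. b[i] lists, for constraint C_i, the
# (0-based) indices of its d neighboring variables.
b = [
    [0, 1, 2, 3],  # neighbors of C_0
    [0, 1, 2, 3],  # neighbors of C_1
]

def in_S(word):
    """Inner code S: the even-weight (parity check) code of length d."""
    return sum(word) % 2 == 0

def is_codeword(x):
    """x is in C(B, S) iff every constraint sees a codeword of S."""
    return all(in_S([x[j] for j in nbrs]) for nbrs in b)

codewords = [x for x in product([0, 1], repeat=4) if is_codeword(x)]
# Here C(B, S) is just the even-weight code of length 4: 8 codewords.
```

For a genuine expander code one would substitute a $(c,d)$-biregular expander graph for `b`; the membership test itself is unchanged.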
It has been shown that nontrivial lossless expander graphs exist. Moreover, we can explicitly construct them.[2]
The rate of $C$ is its dimension divided by its block length. In this case, the parity check matrix has size $m \times n$, and hence $C$ has rate at least $(n-m)/n = 1 - m/n$.
Suppose $\varepsilon < \tfrac{1}{2}$. Then the distance of an $(n, m, d, \gamma, 1-\varepsilon)$ expander code $C$ is at least $2(1-\varepsilon)\gamma n$. Here an $(n, m, d, \gamma, 1-\varepsilon)$ expander is a bipartite graph with $n$ left vertices $L$, $m$ right vertices $R$, and left degree $d$, in which every set $S \subset L$ with $|S| \le \gamma n$ has at least $(1-\varepsilon)d|S|$ neighbors.
Note that we can consider every codeword $c$ in $C$ as a subset of the left vertices $S \subset L$, by saying that $v_i \in S$ if and only if the $i$th index of the codeword is a 1. Then $c$ is a codeword if and only if every vertex $v \in R$ is adjacent to an even number of vertices in $S$. (In order for $c$ to be a codeword, we need $cP = 0$, where $P$ is the parity check matrix; each vertex in $R$ corresponds to a column of $P$, and matrix multiplication over $\mathrm{GF}(2) = \{0,1\}$ then gives the desired result.) So, if a vertex $v \in R$ is adjacent to a single vertex in $S$, we know immediately that $c$ is not a codeword. Let $N(S)$ denote the neighbors in $R$ of $S$, and let $U(S)$ denote those neighbors of $S$ which are unique, i.e., adjacent to a single vertex of $S$.
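The equivalence above can be checked on a small example. The parity check matrix `P` below is illustrative (its rows index left vertices, its columns index right vertices) and does not come from a real expander; the point is only that the syndrome test $cP = 0$ over $\mathrm{GF}(2)$ and the even-neighbor test agree.

```python
# Illustrative parity check matrix: P[i][j] = 1 iff left vertex v_i is
# adjacent to right vertex j. Not derived from an actual expander.
P = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]

def syndrome(c):
    """c P over GF(2); c is a codeword iff this is the all-zeros vector."""
    return [sum(c[i] * P[i][j] for i in range(len(c))) % 2
            for j in range(len(P[0]))]

def even_neighbor_check(c):
    """True iff every right vertex meets S = support(c) an even number of times."""
    S = [i for i, bit in enumerate(c) if bit == 1]
    return all(sum(P[i][j] for i in S) % 2 == 0
               for j in range(len(P[0])))

# The two tests agree on every word:
for c in [(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 1, 0)]:
    assert (not any(syndrome(c))) == even_neighbor_check(c)
```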
Lemma 1: For every $S \subset L$ of size $|S| \le \gamma n$, $d|S| \ge |N(S)| \ge |U(S)| \ge d(1-2\varepsilon)|S|$.
Trivially, $|N(S)| \ge |U(S)|$, since $v \in U(S)$ implies $v \in N(S)$. $|N(S)| \le d|S|$ follows since the degree of every vertex in $S$ is $d$. By the expansion property of the graph, there must be a set of $d(1-\varepsilon)|S|$ edges which go to distinct vertices. The remaining $d\varepsilon|S|$ edges can make at most $d\varepsilon|S|$ of those neighbors non-unique, so $|U(S)| \ge d(1-\varepsilon)|S| - d\varepsilon|S| = d(1-2\varepsilon)|S|$.
In particular, every nonempty $S$ with $|S| \le \gamma n$ has a unique neighbor: since $\varepsilon < \tfrac{1}{2}$, the bound $|U(S)| \ge d(1-2\varepsilon)|S|$ is positive.
Lemma 2: Every subset $T \subset L$ with $|T| < 2(1-\varepsilon)\gamma n$ has a unique neighbor.
Lemma 1 proves the case $|T| \le \gamma n$, so suppose $\gamma n < |T| < 2(1-\varepsilon)\gamma n$. Let $S \subset T$ be such that $|S| = \gamma n$. By Lemma 1, we know that $|U(S)| \ge d(1-2\varepsilon)|S| = d(1-2\varepsilon)\gamma n$. A vertex $v \in U(S)$ is in $U(T)$ if $v \notin N(T \setminus S)$, and we know that $|T \setminus S| < 2(1-\varepsilon)\gamma n - \gamma n = (1-2\varepsilon)\gamma n$, so by the first part of Lemma 1, $|N(T \setminus S)| \le d|T \setminus S| < d(1-2\varepsilon)\gamma n$. Hence $|U(T)| \ge |U(S) \setminus N(T \setminus S)| \ge |U(S)| - |N(T \setminus S)| > 0$, and so $U(T)$ is not empty.
Note that if a subset $T \subset L$ has at least one unique neighbor, i.e. $|U(T)| > 0$, then the word $c$ corresponding to $T$ cannot be a codeword, as it does not multiply to the all-zeros vector by the parity check matrix. By the previous argument, $c \in C \implies \mathrm{wt}(c) \ge 2(1-\varepsilon)\gamma n$. Since $C$ is linear, we conclude that $C$ has distance at least $2(1-\varepsilon)\gamma n$.
The encoding time for an expander code is upper bounded by that of a general linear code: $O(n^2)$ by matrix multiplication. A result due to Spielman shows that encoding is possible in $O(n)$ time.[3]
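The generic $O(n^2)$ encoding mentioned above is just a vector-matrix product over $\mathrm{GF}(2)$. A minimal sketch, using an illustrative $[4,2]$ generator matrix `G` (a made-up example, not derived from an expander graph):

```python
# Generic linear-code encoding by matrix multiplication over GF(2).
# G is an illustrative [4, 2] generator matrix; for a k-bit message
# and block length n, computing msg @ G takes O(k * n) = O(n^2) time.
G = [
    [1, 0, 1, 1],
    [0, 1, 0, 1],
]

def encode(msg):
    """Encode a k-bit message as msg @ G over GF(2)."""
    n = len(G[0])
    return [sum(msg[i] * G[i][j] for i in range(len(G))) % 2
            for j in range(n)]

# encode([1, 1]) -> [1, 1, 1, 0]
```

Spielman's linear-time construction avoids this dense product entirely; the sketch only illustrates the generic upper bound.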
Decoding of expander codes is possible in $O(n)$ time when $\varepsilon < \tfrac{1}{4}$, using the following algorithm.
Let $v_i$ be the vertex of $L$ that corresponds to the $i$th index in the codewords of $C$. Let $y \in \{0,1\}^n$ be a received word, and let $V(y) = \{v_i \mid \text{the } i\text{th position of } y \text{ is a } 1\}$. Let $e(i) = |\{v \in R \mid v_i \in N(v) \text{ and } |N(v) \cap V(y)| \text{ is even}\}|$, and let $o(i) = |\{v \in R \mid v_i \in N(v) \text{ and } |N(v) \cap V(y)| \text{ is odd}\}|$. Then consider the greedy algorithm:
Input: received word $y$.
    initialize y' to y
    while there is a v in R adjacent to an odd number of vertices in V(y')
        if there is an i such that o(i) > e(i)
            flip entry i in y'
        else
            fail
Output: fail, or modified codeword $y'$.
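The pseudocode above can be made runnable. In the sketch below the graph is given as `checks` (for each right vertex, the list of bit positions it touches); the tiny instance used for illustration is not a true expander, so the decoding guarantee proved later does not apply to it, but the flipping logic is the same.

```python
def flip_decode(y, checks, max_steps=None):
    """Greedy bit-flipping decoder: flip any bit i with o(i) > e(i)."""
    y = list(y)
    n = len(y)
    if max_steps is None:
        max_steps = len(checks)  # at most one flip per unsatisfied check
    # touching[i] = indices of the checks adjacent to bit i
    touching = [[] for _ in range(n)]
    for j, nbrs in enumerate(checks):
        for i in nbrs:
            touching[i].append(j)
    for _ in range(max_steps + 1):
        unsat = {j for j, nbrs in enumerate(checks)
                 if sum(y[i] for i in nbrs) % 2 == 1}
        if not unsat:
            return y  # all checks satisfied: y is a codeword
        # find a bit in more unsatisfied than satisfied checks
        for i in range(n):
            o = sum(1 for j in touching[i] if j in unsat)
            e = len(touching[i]) - o
            if o > e:
                y[i] ^= 1  # flip entry i
                break
        else:
            return None  # fail: no bit with o(i) > e(i)
    return None

# Toy graph (a 3-cycle of checks, codewords 000 and 111):
# flip_decode([1, 0, 0], [[0, 1], [1, 2], [0, 2]]) -> [0, 0, 0]
```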
We show first the correctness of the algorithm, and then examine its running time.
We must show that the algorithm terminates with the correct codeword when the received word is within half the code's distance of the original codeword. Let the set of corrupt variables be $S$, with $s = |S|$, and consider the set of unsatisfied vertices in $R$ (those adjacent to an odd number of vertices of $V(y)$). The following lemma will prove useful.
Lemma 3: If $0 < s < \gamma n$, then there is a $v_i$ with $o(i) > e(i)$.
By Lemma 1, we know that $|U(S)| \ge d(1-2\varepsilon)s$. So an average vertex of $S$ has at least $d(1-2\varepsilon) > d/2$ unique neighbors (recall that unique neighbors are unsatisfied and hence contribute to $o(i)$), since $\varepsilon < \tfrac{1}{4}$. Because each $v_i$ has exactly $d$ neighbors, a vertex with more than $d/2$ unsatisfied neighbors satisfies $o(i) > e(i)$, and such a vertex must exist.
So, if we have not yet reached a codeword, then there will always be some vertex to flip. Next, we show that the number of errors can never increase beyond $\gamma n$.
Lemma 4: If we start with $s < \gamma(1-2\varepsilon)n$, then we never reach $s = \gamma n$ at any point in the algorithm.
When we flip a vertex $v_i$, $o(i)$ and $e(i)$ are interchanged, and since we had $o(i) > e(i)$, the number of unsatisfied vertices on the right decreases by at least one after each flip. Since $s < \gamma(1-2\varepsilon)n$, the initial number of unsatisfied vertices is at most $d\gamma(1-2\varepsilon)n$, by the graph's left $d$-regularity. If we reached a string with $\gamma n$ errors, then by Lemma 1 there would be at least $d(1-2\varepsilon)\gamma n$ unique neighbors, which means there would be at least $d(1-2\varepsilon)\gamma n$ unsatisfied vertices, a contradiction.
Lemmas 3 and 4 show us that if we start with $s < \gamma(1-2\varepsilon)n$ (which is less than half the distance of $C$), then we will always find a vertex $v_i$ to flip. Each flip reduces the number of unsatisfied vertices in $R$ by at least 1, and hence the algorithm terminates in at most $m$ steps, and it terminates at some codeword, by Lemma 3. (Were it not at a codeword, there would be some vertex to flip.) Lemma 4 shows that we can never be farther than $\gamma n$ away from the correct codeword. Since the code has distance $2(1-\varepsilon)\gamma n > \gamma n$ (as $\varepsilon < \tfrac{1}{2}$), the codeword the algorithm terminates on must be the correct one, since the number of bit flips is less than half the distance (so we could not have traveled far enough to reach any other codeword).
We now show that the algorithm achieves linear-time decoding. Let $\tfrac{n}{m}$ be constant, and let $r$ be the maximum degree of any vertex in $R$. Note that $r$ is also constant for known constructions.
This gives a total runtime of $O(mdr) = O(n)$, where $d$ and $r$ are constants.
This article is based on Dr. Venkatesan Guruswami's course notes.[4]