LSH is a cryptographic hash function designed in 2014 by South Korea to provide integrity in general-purpose software environments such as PCs and smart devices.[1] LSH is one of the cryptographic algorithms approved by the Korean Cryptographic Module Validation Program (KCMVP). And it is the national standard of South Korea (KS X 3262).
The overall structure of the hash function LSH is shown in the following figure.
The hash function LSH has the wide-pipe Merkle-Damgård structure with one-zeros padding. The message hashing process of LSH consists of the following three stages.
{\displaystyle \qquad } One-zeros padding of m {\displaystyle m}
{\displaystyle \qquad } Generation of t {\displaystyle t} message blocks { M ( i ) } i = 0 t − 1 {\displaystyle \{{\textsf {M}}^{(i)}\}_{i=0}^{t-1}} , where t = ⌈ | m | + 1 32 w ⌉ {\displaystyle t={\Big \lceil }{\frac {|m|+1}{32w}}{\Big \rceil }} from the padded bit string
{\displaystyle \qquad } CV ( 0 ) ← IV {\displaystyle {\textsf {CV}}^{(0)}\leftarrow {\textsf {IV}}}
{\displaystyle \qquad } for i = 0 {\displaystyle i=0} to ( t − 1 ) {\displaystyle (t-1)} do
{\displaystyle \qquad } {\displaystyle \qquad } CV ( i + 1 ) ← CF ( CV ( i ) , M ( i ) ) {\displaystyle {\textsf {CV}}^{(i+1)}\leftarrow {\textrm {CF}}({\textsf {CV}}^{(i)},{\textsf {M}}^{(i)})}
{\displaystyle \qquad } end for
{\displaystyle \qquad } h ← FIN n ( CV ( t ) ) {\displaystyle h\leftarrow {\textrm {FIN}}_{n}({\textsf {CV}}^{(t)})}
{\displaystyle \qquad } return h {\displaystyle h}
The specifications of the hash function LSH are as follows.
Let m {\displaystyle m} be a given bit string message. The given m {\displaystyle m} is padded by one-zeros, i.e., the bit ‘1’ is appended to the end of m {\displaystyle m} , and the bit ‘0’s are appended until a bit length of a padded message is 32 w t {\displaystyle 32wt} , where t = ⌈ ( | m | + 1 ) / 32 w ⌉ {\displaystyle t=\lceil (|m|+1)/32w\rceil } and ⌈ x ⌉ {\displaystyle \lceil x\rceil } is the smallest integer not less than x {\displaystyle x} .
Let m p = m 0 ‖ m 1 ‖ … ‖ m ( 32 w t − 1 ) {\displaystyle m_{p}=m_{0}\|m_{1}\|\ldots \|m_{(32wt-1)}} be the one-zeros-padded 32 w t {\displaystyle 32wt} -bit string of m {\displaystyle m} . Then m p {\displaystyle m_{p}} is considered as a 4 w t {\displaystyle 4wt} -byte array m a = ( m [ 0 ] , … , m [ 4 w t − 1 ] ) {\displaystyle m_{a}=(m[0],\ldots ,m[4wt-1])} , where m [ k ] = m 8 k ‖ m ( 8 k + 1 ) ‖ … ‖ m ( 8 k + 7 ) {\displaystyle m[k]=m_{8k}\|m_{(8k+1)}\|\ldots \|m_{(8k+7)}} for all 0 ≤ k ≤ ( 4 w t − 1 ) {\displaystyle 0\leq k\leq (4wt-1)} . The 4 w t {\displaystyle 4wt} -byte array m a {\displaystyle m_{a}} converts into a 32 t {\displaystyle 32t} -word array M = ( M [ 0 ] , … , M [ 32 t − 1 ] ) {\displaystyle {\textsf {M}}=(M[0],\ldots ,M[32t-1])} as follows.
M [ s ] ← m [ w s / 8 + ( w / 8 − 1 ) ] ‖ … ‖ m [ w s / 8 + 1 ] ‖ m [ w s / 8 ] {\displaystyle M[s]\leftarrow m[ws/8+(w/8-1)]\|\ldots \|m[ws/8+1]\|m[ws/8]} ( 0 ≤ s ≤ ( 32 t − 1 ) ) {\displaystyle (0\leq s\leq (32t-1))}
From the word array M {\displaystyle {\textsf {M}}} , we define the t {\displaystyle t} 32-word array message blocks { M ( i ) } i = 0 t − 1 {\displaystyle \{{\textsf {M}}^{(i)}\}_{i=0}^{t-1}} as follows.
M ( i ) ← ( M [ 32 i ] , M [ 32 i + 1 ] , … , M [ 32 i + 31 ] ) {\displaystyle {\textsf {M}}^{(i)}\leftarrow (M[32i],M[32i+1],\ldots ,M[32i+31])} ( 0 ≤ i ≤ ( t − 1 ) ) {\displaystyle (0\leq i\leq (t-1))}
The 16-word array chaining variable CV ( 0 ) {\displaystyle {\textsf {CV}}^{(0)}} is initialized to the initialization vector IV {\displaystyle {\textsf {IV}}} .
CV ( 0 ) [ l ] ← IV [ l ] {\displaystyle {\textsf {CV}}^{(0)}[l]\leftarrow {\textsf {IV}}[l]} ( 0 ≤ l ≤ 15 ) {\displaystyle (0\leq l\leq 15)}
The initialization vector IV {\displaystyle {\textsf {IV}}} is as follows. In the following tables, all values are expressed in hexadecimal form.
In this stage, the t {\displaystyle t} 32-word array message blocks { M ( i ) } i = 0 t − 1 {\displaystyle \{{\textsf {M}}^{(i)}\}_{i=0}^{t-1}} , which are generated from a message m {\displaystyle m} in the initialization stage, are compressed by iteration of compression functions. The compression function CF : W 16 × W 32 → W 16 {\displaystyle {\textrm {CF}}:{\mathcal {W}}^{16}\times {\mathcal {W}}^{32}\rightarrow {\mathcal {W}}^{16}} has two inputs; the i {\displaystyle i} -th 16-word chaining variable CV ( i ) {\displaystyle {\textsf {CV}}^{(i)}} and the i {\displaystyle i} -th 32-word message block M ( i ) {\displaystyle {\textsf {M}}^{(i)}} . And it returns the ( i + 1 ) {\displaystyle (i+1)} -th 16-word chaining variable CV ( i + 1 ) {\displaystyle {\textsf {CV}}^{(i+1)}} . Here and subsequently, W t {\displaystyle {\mathcal {W}}^{t}} denotes the set of all t {\displaystyle t} -word arrays for t ≥ 1 {\displaystyle t\geq 1} .
The following four functions are used in a compression function:
The overall structure of the compression function is shown in the following figure.
In a compression function, the message expansion function MsgExp {\displaystyle {\textrm {MsgExp}}} generates ( N s + 1 ) {\displaystyle (N_{s}+1)} 16-word array sub-messages { M j ( i ) } j = 0 N s {\displaystyle \{{\textsf {M}}_{j}^{(i)}\}_{j=0}^{N_{s}}} from given M ( i ) {\displaystyle {\textsf {M}}^{(i)}} . Let T = ( T [ 0 ] , … , T [ 15 ] ) {\displaystyle {\textsf {T}}=(T[0],\ldots ,T[15])} be a temporary 16-word array set to the i {\displaystyle i} -th chaining variable CV ( i ) {\displaystyle {\textsf {CV}}^{(i)}} . The j {\displaystyle j} -th step function Step j {\displaystyle {\textrm {Step}}_{j}} having two inputs T {\displaystyle {\textsf {T}}} and M j ( i ) {\displaystyle {\textsf {M}}_{j}^{(i)}} updates T {\displaystyle {\textsf {T}}} , i.e., T ← Step j ( T , M j ( i ) ) {\displaystyle {\textsf {T}}\leftarrow {\textrm {Step}}_{j}\left({\textsf {T}},{\textsf {M}}_{j}^{(i)}\right)} . All step functions are proceeded in order j = 0 , … , N s − 1 {\displaystyle j=0,\ldots ,N_{s}-1} . Then one more MsgAdd {\displaystyle {\textrm {MsgAdd}}} operation by M N s ( i ) {\displaystyle {\textsf {M}}_{N_{s}}^{(i)}} is proceeded, and the ( i + 1 ) {\displaystyle (i+1)} -th chaining variable CV ( i + 1 ) {\displaystyle {\textsf {CV}}^{(i+1)}} is set to T {\displaystyle {\textsf {T}}} . The process of a compression function in detail is as follows.
{\displaystyle \qquad } { M j ( i ) } j = 0 N s ← MsgExp ( M ( i ) ) {\displaystyle \{{\textsf {M}}_{j}^{(i)}\}_{j=0}^{N_{s}}\leftarrow {\textrm {MsgExp}}\left({\textsf {M}}^{(i)}\right)}
{\displaystyle \qquad } T ← CV ( i ) {\displaystyle {\textsf {T}}\leftarrow {\textsf {CV}}^{(i)}}
{\displaystyle \qquad } for j = 0 {\displaystyle j=0} to ( N s − 1 ) {\displaystyle (N_{s}-1)} do
{\displaystyle \qquad } {\displaystyle \qquad } T ← Step j ( T , M j ( i ) ) {\displaystyle {\textsf {T}}\leftarrow {\textrm {Step}}_{j}\left({\textsf {T}},{\textsf {M}}_{j}^{(i)}\right)}
{\displaystyle \qquad } CV ( i + 1 ) ← MsgAdd ( T , M N s ( i ) ) {\displaystyle {\textsf {CV}}^{(i+1)}\leftarrow {\textrm {MsgAdd}}\left({\textsf {T}},{\textsf {M}}_{N_{s}}^{(i)}\right)}
{\displaystyle \qquad } return CV ( i + 1 ) {\displaystyle {\textsf {CV}}^{(i+1)}}
Here the j {\displaystyle j} -th step function Step j : W 16 × W 16 → W 16 {\displaystyle {\textrm {Step}}_{j}:{\mathcal {W}}^{16}\times {\mathcal {W}}^{16}\rightarrow {\mathcal {W}}^{16}} is as follows.
Step j := WordPerm ∘ Mix j ∘ MsgAdd {\displaystyle {\textrm {Step}}_{j}:={\textrm {WordPerm}}\circ {\textrm {Mix}}_{j}\circ {\textrm {MsgAdd}}} ( 0 ≤ j ≤ ( N s − 1 ) ) {\displaystyle (0\leq j\leq (N_{s}-1))}
The following figure shows the j {\displaystyle j} -th step function Step j {\displaystyle {\textrm {Step}}_{j}} of a compression function.
Let M ( i ) = ( M ( i ) [ 0 ] , … , M ( i ) [ 31 ] ) {\displaystyle {\textsf {M}}^{(i)}=(M^{(i)}[0],\ldots ,M^{(i)}[31])} be the i {\displaystyle i} -th 32-word array message block. The message expansion function MsgExp {\displaystyle {\textrm {MsgExp}}} generates ( N s + 1 ) {\displaystyle (N_{s}+1)} 16-word array sub-messages { M j ( i ) } j = 0 N s {\displaystyle \{{\textsf {M}}_{j}^{(i)}\}_{j=0}^{N_{s}}} from a message block M ( i ) {\displaystyle {\textsf {M}}^{(i)}} . The first two sub-messages M 0 ( i ) = ( M 0 ( i ) [ 0 ] , … , M 0 ( i ) [ 15 ] ) {\displaystyle {\textsf {M}}_{0}^{(i)}=(M_{0}^{(i)}[0],\ldots ,M_{0}^{(i)}[15])} and M 1 ( i ) = ( M 1 ( i ) [ 0 ] , … , M 1 ( i ) [ 15 ] ) {\displaystyle {\textsf {M}}_{1}^{(i)}=(M_{1}^{(i)}[0],\ldots ,M_{1}^{(i)}[15])} are defined as follows.
The next sub-messages { M j ( i ) = ( M j ( i ) [ 0 ] , … , M j ( i ) [ 15 ] ) } j = 2 N s {\displaystyle \{{\textsf {M}}_{j}^{(i)}=(M_{j}^{(i)}[0],\ldots ,M_{j}^{(i)}[15])\}_{j=2}^{N_{s}}} are generated as follows.
Here τ {\displaystyle \tau } is the permutation over Z 16 {\displaystyle \mathbb {Z} _{16}} defined as follows.
For two 16-word arrays X = ( X [ 0 ] , … , X [ 15 ] ) {\displaystyle {\textsf {X}}=(X[0],\ldots ,X[15])} and Y = ( Y [ 0 ] , … , Y [ 15 ] ) {\displaystyle {\textsf {Y}}=(Y[0],\ldots ,Y[15])} , the message addition function MsgAdd : W 16 × W 16 → W 16 {\displaystyle {\textrm {MsgAdd}}:{\mathcal {W}}^{16}\times {\mathcal {W}}^{16}\rightarrow {\mathcal {W}}^{16}} is defined as follows.
MsgAdd ( X , Y ) := ( X [ 0 ] ⊕ Y [ 0 ] , … , X [ 15 ] ⊕ Y [ 15 ] ) {\displaystyle {\textrm {MsgAdd}}({\textsf {X}},{\textsf {Y}}):=(X[0]\oplus Y[0],\ldots ,X[15]\oplus Y[15])}
The j {\displaystyle j} -th mix function Mix j : W 16 → W 16 {\displaystyle {\textrm {Mix}}_{j}:{\mathcal {W}}^{16}\rightarrow {\mathcal {W}}^{16}} updates the 16-word array T = ( T [ 0 ] , … , T [ 15 ] ) {\displaystyle {\textsf {T}}=(T[0],\ldots ,T[15])} by mixing every two-word pair; T [ l ] {\displaystyle T[l]} and T [ l + 8 ] {\displaystyle T[l+8]} for ( 0 ≤ l < 8 ) {\displaystyle (0\leq l<8)} . For 0 ≤ j < N s {\displaystyle 0\leq j<N_{s}} , the mix function Mix j {\displaystyle {\textrm {Mix}}_{j}} proceeds as follows.
( T [ l ] , T [ l + 8 ] ) ← Mix j , l ( T [ l ] , T [ l + 8 ] ) {\displaystyle (T[l],T[l+8])\leftarrow {\textrm {Mix}}_{j,l}(T[l],T[l+8])} ( 0 ≤ l < 8 ) {\displaystyle (0\leq l<8)}
Here Mix j , l {\displaystyle {\textrm {Mix}}_{j,l}} is a two-word mix function. Let X {\displaystyle X} and Y {\displaystyle Y} be words. The two-word mix function Mix j , l : W 2 → W 2 {\displaystyle {\textrm {Mix}}_{j,l}:{\mathcal {W}}^{2}\rightarrow {\mathcal {W}}^{2}} is defined as follows.
{\displaystyle \qquad } X ← X ⊞ Y {\displaystyle X\leftarrow X\boxplus Y} ; X ← X ⋘ α j {\displaystyle \qquad X\leftarrow X^{\lll \alpha _{j}}} ;
{\displaystyle \qquad } X ← X ⊕ S C j [ l ] {\displaystyle X\leftarrow X\oplus SC_{j}[l]} ;
{\displaystyle \qquad } Y ← X ⊞ Y {\displaystyle Y\leftarrow X\boxplus Y} ; Y ← Y ⋘ β j {\displaystyle \qquad Y\leftarrow Y^{\lll \beta _{j}}} ;
{\displaystyle \qquad } X ← X ⊞ Y {\displaystyle X\leftarrow X\boxplus Y} ; Y ← Y ⋘ γ l {\displaystyle \qquad Y\leftarrow Y^{\lll \gamma _{l}}} ;
{\displaystyle \qquad } return X {\displaystyle X} , Y {\displaystyle Y} ;
The two-word mix function Mix j , l {\displaystyle {\textrm {Mix}}_{j,l}} is shown in the following figure.
The bit rotation amounts α j {\displaystyle \alpha _{j}} , β j {\displaystyle \beta _{j}} , γ l {\displaystyle \gamma _{l}} used in Mix j , l {\displaystyle {\textrm {Mix}}_{j,l}} are shown in the following table.
The j {\displaystyle j} -th 8-word array constant SC j = ( S C j [ 0 ] , … , S C j [ 7 ] ) {\displaystyle {\textsf {SC}}_{j}=(SC_{j}[0],\ldots ,SC_{j}[7])} used in Mix j , l {\displaystyle {\textrm {Mix}}_{j,l}} for 0 ≤ l < 8 {\displaystyle 0\leq l<8} is defined as follows. The initial 8-word array constant SC 0 = ( S C 0 [ 0 ] , … , S C 0 [ 7 ] ) {\displaystyle {\textsf {SC}}_{0}=(SC_{0}[0],\ldots ,SC_{0}[7])} is defined in the following table. For 1 ≤ j < N s {\displaystyle 1\leq j<N_{s}} , the j {\displaystyle j} -th constant SC j = ( S C j [ 0 ] , … , S C j [ 7 ] ) {\displaystyle {\textsf {SC}}_{j}=(SC_{j}[0],\ldots ,SC_{j}[7])} is generated by S C j [ l ] ← S C j − 1 [ l ] ⊞ S C j − 1 [ l ] ⋘ 8 {\displaystyle SC_{j}[l]\leftarrow SC_{j-1}[l]\boxplus SC_{j-1}[l]^{\lll 8}} for 0 ≤ l < 8 {\displaystyle 0\leq l<8} .
Let X = ( X [ 0 ] , … , X [ 15 ] ) {\displaystyle {\textsf {X}}=(X[0],\ldots ,X[15])} be a 16-word array. The word-permutation function WordPerm : W 16 → W 16 {\displaystyle {\textrm {WordPerm}}:{\mathcal {W}}^{16}\rightarrow {\mathcal {W}}^{16}} is defined as follows.
WordPerm ( X ) = ( X [ σ ( 0 ) ] , … , X [ σ ( 15 ) ] ) {\displaystyle {\textrm {WordPerm}}({\textsf {X}})=(X[\sigma (0)],\ldots ,X[\sigma (15)])}
Here σ {\displaystyle \sigma } is the permutation over Z 16 {\displaystyle \mathbb {Z} _{16}} defined by the following table.
The finalization function FIN n : W 16 → { 0 , 1 } n {\displaystyle {\textrm {FIN}}_{n}:{\mathcal {W}}^{16}\rightarrow \{0,1\}^{n}} returns n {\displaystyle n} -bit hash value h {\displaystyle h} from the final chaining variable CV ( t ) = ( C V ( t ) [ 0 ] , … , C V ( t ) [ 15 ] ) {\displaystyle {\textsf {CV}}^{(t)}=(CV^{(t)}[0],\ldots ,CV^{(t)}[15])} . When H = ( H [ 0 ] , … , H [ 7 ] ) {\displaystyle {\textsf {H}}=(H[0],\ldots ,H[7])} is an 8-word variable and h b = ( h b [ 0 ] , … , h b [ w − 1 ] ) {\displaystyle {\textsf {h}}_{\textsf {b}}=(h_{b}[0],\ldots ,h_{b}[w-1])} is a w {\displaystyle w} -byte variable, the finalization function FIN n {\displaystyle {\textrm {FIN}}_{n}} performs the following procedure.
Here, X [ i : j ] {\displaystyle X_{[i:j]}} denotes x i ‖ x i − 1 ‖ … ‖ x j {\displaystyle x_{i}\|x_{i-1}\|\ldots \|x_{j}} , the sub-bit string of a word X {\displaystyle X} for i ≥ j {\displaystyle i\geq j} . And x [ i : j ] {\displaystyle x_{[i:j]}} denotes x i ‖ x i + 1 ‖ … ‖ x j {\displaystyle x_{i}\|x_{i+1}\|\ldots \|x_{j}} , the sub-bit string of a l {\displaystyle l} -bit string x = x 0 ‖ x 1 ‖ … ‖ x l − 1 {\displaystyle x=x_{0}\|x_{1}\|\ldots \|x_{l-1}} for i ≤ j {\displaystyle i\leq j} .
LSH is secure against known attacks on hash functions up to now. LSH is collision-resistant for q < 2 n / 2 {\displaystyle q<2^{n/2}} and preimage-resistant and second-preimage-resistant for q < 2 n {\displaystyle q<2^{n}} in the ideal cipher model, where q {\displaystyle q} is a number of queries for LSH structure.[1] LSH-256 is secure against all the existing hash function attacks when the number of steps is 13 or more, while LSH-512 is secure if the number of steps is 14 or more. Note that the steps which work as security margin are 50% of the compression function.[1]
LSH outperforms SHA-2/3 on various software platforms. The following table shows the speed performance of 1MB message hashing of LSH on several platforms.
The following table is the comparison at the platform based on Haswell, LSH is measured on Intel Core i7-4770k @ 3.5 GHz quad core platform, and others are measured on Intel Core i5-4570S @ 2.9 GHz quad core platform.
The following table is measured on Samsung Exynos 5250 ARM Cortex-A15 @ 1.7 GHz dual core platform.
Test vectors for LSH for each digest length are as follows. All values are expressed in hexadecimal form.
LSH-256-224("abc") = F7 C5 3B A4 03 4E 70 8E 74 FB A4 2E 55 99 7C A5 12 6B B7 62 36 88 F8 53 42 F7 37 32
LSH-256-256("abc") = 5F BF 36 5D AE A5 44 6A 70 53 C5 2B 57 40 4D 77 A0 7A 5F 48 A1 F7 C1 96 3A 08 98 BA 1B 71 47 41
LSH-512-224("abc") = D1 68 32 34 51 3E C5 69 83 94 57 1E AD 12 8A 8C D5 37 3E 97 66 1B A2 0D CF 89 E4 89
LSH-512-256("abc") = CD 89 23 10 53 26 02 33 2B 61 3F 1E C1 1A 69 62 FC A6 1E A0 9E CF FC D4 BC F7 58 58 D8 02 ED EC
LSH-512-384("abc") = 5F 34 4E FA A0 E4 3C CD 2E 5E 19 4D 60 39 79 4B 4F B4 31 F1 0F B4 B6 5F D4 5E 9D A4 EC DE 0F 27 B6 6E 8D BD FA 47 25 2E 0D 0B 74 1B FD 91 F9 FE
LSH-512-512("abc") = A3 D9 3C FE 60 DC 1A AC DD 3B D4 BE F0 A6 98 53 81 A3 96 C7 D4 9D 9F D1 77 79 56 97 C3 53 52 08 B5 C5 72 24 BE F2 10 84 D4 20 83 E9 5A 4B D8 EB 33 E8 69 81 2B 65 03 1C 42 88 19 A1 E7 CE 59 6D
LSH is free for any use public or private, commercial or non-commercial. The source code for distribution of LSH implemented in C, Java, and Python can be downloaded from KISA's cryptography use activation webpage.[2]
LSH is one of the cryptographic algorithms approved by the Korean Cryptographic Module Validation Program (KCMVP).[3]
LSH is included in the following standard.