The previous post gave a bird's-eye view of encryption algorithms; this one walks through an implementation of the MD5 algorithm.
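MD5 stands for Message-Digest Algorithm 5 and is used to check the integrity of a message. The algorithm was published in 1992, when I was only a few years old. Since most readers know MD5 well and use it regularly in their programs, I will skip the introduction and just say a few words about its designer. He is Ronald Rivest, an American cryptographer who had earlier co-invented the RSA public-key algorithm (1977, with Adi Shamir and Leonard Adleman). For that breakthrough and its importance to information security, the three of them shared the 2002 Turing Award.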
Now let's go through the steps of the algorithm together with the source code:
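1. Padding
In MD5, the message is first padded so that its bit length is congruent to 448 modulo 512. Padding is always performed, even if the bit length is already congruent to 448 modulo 512. The bit length of the message is therefore extended to N*512+448, where N is a non-negative integer (N may be zero).
To clarify: the bit length is simply the number of bits. For example, the string "wbq" is stored in three bytes (one byte per character in ASCII or UTF-8), and a byte is 8 bits, so its bit length is 24.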
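The bit-length calculation can be checked with a few lines of C#. This is only a minimal sketch: the class name BitLengthSketch and the method BitLength are made up for illustration, and it assumes the string is encoded as UTF-8.

```csharp
using System;
using System.Text;

public static class BitLengthSketch
{
    // Hypothetical helper: bit length of a string once encoded as UTF-8.
    public static long BitLength(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text); // "wbq" -> 3 bytes
        return (long)bytes.Length * 8;               // 8 bits per byte
    }
}
```

BitLengthSketch.BitLength("wbq") gives 24. Non-ASCII characters take more than one byte in UTF-8, so the bit length depends on the encoding and not only on the number of characters.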
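In mathematical terms this is more concise: let M be the bit length after padding; processing can continue only when M % 512 == 448. Equivalently, M = N*512+448 with N >= 0.
The padding works as follows:
1) Append a single 1 bit to the message, followed by as many 0 bits as needed, and stop adding 0s as soon as the condition above is met.
2) Append to that result the length of the message before padding (in bits), written as a 64-bit binary number. If that length needs more than 64 bits, only the low 64 bits are used.
After these two steps, M = N*512+448+64 = (N+1)*512, so the length is exactly a multiple of 512. This is needed because the later processing works on 512-bit blocks.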
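The two padding steps can be sketched compactly as follows. This is an illustration under stated assumptions, not the implementation used later in this post: the names PaddingSketch, PaddedLength and Pad are hypothetical, and everything is counted in bytes (64 bytes = 512 bits, 56 bytes = 448 bits).

```csharp
using System;

public static class PaddingSketch
{
    // Hypothetical helper: total length in bytes after MD5 padding.
    public static long PaddedLength(long n)
    {
        long m = n % 64;
        // The 0x80 byte plus the zero bytes bring the length to 56 (mod 64).
        // At least one byte is always added, so m >= 56 spills into a new block.
        long pad = (m < 56) ? (56 - m) : (120 - m);
        return n + pad + 8; // 8 more bytes hold the original bit length
    }

    // Hypothetical helper: builds the padded message.
    public static byte[] Pad(byte[] input)
    {
        int n = input.Length;
        byte[] padded = new byte[PaddedLength(n)]; // new arrays are zero-filled
        Array.Copy(input, padded, n);
        padded[n] = 0x80;                          // a single 1 bit, then zeros
        ulong bits = (ulong)n * 8;
        for (int i = 0; i < 8; i++)
        {
            // Original bit length, low byte first.
            padded[padded.Length - 8 + i] = (byte)((bits >> (8 * i)) & 0xFF);
        }
        return padded;
    }
}
```

A 3-byte message pads to 64 bytes, while a 56-byte message pads to 128 bytes, because padding is applied even when the length is already 56 modulo 64.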
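After the two steps, the message looks like this, as shown in the figure below:
The last 64 bits (8 bytes) hold the bit length of the original message.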
1  private static UInt32[] MD5_Append(byte[] input)
2  {
3      int zeros = 0;
4      int ones = 1;
5      int size = 0;
6      int n = input.Length;
7      int m = n % 64;
8      if (m < 56)
9      {
10         zeros = 55 - m;
11         size = n - m + 64;
12     }
13     else if (m == 56)
14     {
15         zeros = 63;     // padding is mandatory: 0x80 plus 63 zero bytes
16         ones = 1;
17         size = n + 72;  // n + 1 + 63 + 8
18     }
19     else
20     {
21         zeros = 63 - m + 56;
22         size = n + 64 - m + 64;
23     }
24
25     ArrayList bs = new ArrayList(input);
26     if (ones == 1)
27     {
28         bs.Add((byte)0x80); // 0x80 = binary 10000000
29     }
30     for (int i = 0; i < zeros; i++)
31     {
32         bs.Add((byte)0);
33     }
34
35     UInt64 N = (UInt64)n * 8;
36     byte h1 = (byte)(N & 0xFF);
37     byte h2 = (byte)((N >> 8) & 0xFF);
38
39     byte h3 = (byte)((N >> 16) & 0xFF);
40     byte h4 = (byte)((N >> 24) & 0xFF);
41     byte h5 = (byte)((N >> 32) & 0xFF);
42     byte h6 = (byte)((N >> 40) & 0xFF);
43     byte h7 = (byte)((N >> 48) & 0xFF);
44     byte h8 = (byte)(N >> 56);
45     bs.Add(h1);
46     bs.Add(h2);
47     bs.Add(h3);
48     bs.Add(h4);
49     bs.Add(h5);
50     bs.Add(h6);
51     bs.Add(h7);
52     bs.Add(h8);
53     byte[] ts = (byte[])bs.ToArray(typeof(byte));
54
55     /* Decodes input (byte[]) into output (UInt32[]). Assumes len is
56      * a multiple of 4.
57      */
58     UInt32[] output = new UInt32[size / 4];
59     for (Int64 i = 0, j = 0; i < size; j++, i += 4)
60     {
61         output[j] = (UInt32)(ts[i] | ts[i + 1] << 8 | ts[i + 2] << 16 | ts[i + 3] << 24);
62     }
63     return output;
64 }
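Notes: how many zeros are appended, and how? Line 7 takes the remainder. On line 10, why is it 55 - m rather than 56 - m? Here m < 56, and 56 - m is the number of bytes still missing. One of those bytes is the 0x80 byte that carries the 1 bit, so the number of zero bytes is 56 - m - 1 = 55 - m. How is the new length, size, computed? It should be: new length = original length + length of the 1 padding + length of the 0 padding + length of the trailing 64 bits. Line 11, size = n - m + 64, follows from:
size = n + 1 + (55 - m) + 8 = n - m + 64
Note: all of these calculations are in bytes.
The other two branches follow the same reasoning. Lines 35-44 split the bit length of the original message into 8 bytes, low byte first, and lines 45-52 append them to the array. From line 58 onward, the message is grouped into words of 4 bytes each. Each word is a UInt32, an unsigned 32-bit integer, and the operation on line 61 combines four bytes into one UInt32.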
2. Initializing the variables
private static void MD5_Init()
{
    A = 0x67452301; //in memory, this is 0x01234567
    B = 0xefcdab89; //in memory, this is 0x89abcdef
    C = 0x98badcfe; //in memory, this is 0xfedcba98
    D = 0x10325476; //in memory, this is 0x76543210
}
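Note: the values here are in little-endian order. What are big-endian and little-endian?
Take the number 0x12 34 56 78 as an example and look at how it is laid out in memory.
1) Big-endian: the most significant byte is stored at the lowest memory address and the least significant byte at the highest. (Big-endian is actually the more intuitive of the two.)
low address --------------------> high address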
0x12 | 0x34 | 0x56 | 0x78
2) Little-endian: the least significant byte is stored at the lowest memory address and the most significant byte at the highest.
low address --------------------> high address
0x78 | 0x56 | 0x34 | 0x12
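The two layouts can be reproduced with plain shifts, which behave the same on any machine. This is a minimal sketch; EndianSketch and its three methods are hypothetical names introduced only for this example.

```csharp
using System;

public static class EndianSketch
{
    // Bytes of value from low address to high address, little-endian.
    public static byte[] ToLittleEndian(uint value)
    {
        return new byte[]
        {
            (byte)(value & 0xFF),          // least significant byte first
            (byte)((value >> 8) & 0xFF),
            (byte)((value >> 16) & 0xFF),
            (byte)((value >> 24) & 0xFF)   // most significant byte last
        };
    }

    // Big-endian is the same bytes in reverse order.
    public static byte[] ToBigEndian(uint value)
    {
        byte[] bytes = ToLittleEndian(value);
        Array.Reverse(bytes);
        return bytes;
    }

    // Reads four bytes starting at offset as a little-endian UInt32.
    public static uint FromLittleEndian(byte[] b, int offset)
    {
        return (uint)b[offset]
             | ((uint)b[offset + 1] << 8)
             | ((uint)b[offset + 2] << 16)
             | ((uint)b[offset + 3] << 24);
    }
}
```

For 0x12345678, ToLittleEndian yields 78 56 34 12 and ToBigEndian yields 12 34 56 78. Reading the bytes 01 23 45 67 with FromLittleEndian gives 0x67452301, which is how the initial value of A relates to its in-memory form.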
3. Processing the block data
private static UInt32[] MD5_Trasform(UInt32[] x)
{
    UInt32 a, b, c, d;

    for (int k = 0; k < x.Length; k += 16)
    {
        a = A;
        b = B;
        c = C;
        d = D;

        /* Round 1 */
        FF(ref a, b, c, d, x[k + 0], S11, 0xd76aa478); /* 1 */
        FF(ref d, a, b, c, x[k + 1], S12, 0xe8c7b756); /* 2 */
        FF(ref c, d, a, b, x[k + 2], S13, 0x242070db); /* 3 */
        FF(ref b, c, d, a, x[k + 3], S14, 0xc1bdceee); /* 4 */
        FF(ref a, b, c, d, x[k + 4], S11, 0xf57c0faf); /* 5 */
        FF(ref d, a, b, c, x[k + 5], S12, 0x4787c62a); /* 6 */
        FF(ref c, d, a, b, x[k + 6], S13, 0xa8304613); /* 7 */
        FF(ref b, c, d, a, x[k + 7], S14, 0xfd469501); /* 8 */
        FF(ref a, b, c, d, x[k + 8], S11, 0x698098d8); /* 9 */
        FF(ref d, a, b, c, x[k + 9], S12, 0x8b44f7af); /* 10 */
        FF(ref c, d, a, b, x[k + 10], S13, 0xffff5bb1); /* 11 */
        FF(ref b, c, d, a, x[k + 11], S14, 0x895cd7be); /* 12 */
        FF(ref a, b, c, d, x[k + 12], S11, 0x6b901122); /* 13 */
        FF(ref d, a, b, c, x[k + 13], S12, 0xfd987193); /* 14 */
        FF(ref c, d, a, b, x[k + 14], S13, 0xa679438e); /* 15 */
        FF(ref b, c, d, a, x[k + 15], S14, 0x49b40821); /* 16 */

        /* Round 2 */
        GG(ref a, b, c, d, x[k + 1], S21, 0xf61e2562); /* 17 */
        GG(ref d, a, b, c, x[k + 6], S22, 0xc040b340); /* 18 */
        GG(ref c, d, a, b, x[k + 11], S23, 0x265e5a51); /* 19 */
        GG(ref b, c, d, a, x[k + 0], S24, 0xe9b6c7aa); /* 20 */
        GG(ref a, b, c, d, x[k + 5], S21, 0xd62f105d); /* 21 */
        GG(ref d, a, b, c, x[k + 10], S22, 0x2441453); /* 22 */
        GG(ref c, d, a, b, x[k + 15], S23, 0xd8a1e681); /* 23 */
        GG(ref b, c, d, a, x[k + 4], S24, 0xe7d3fbc8); /* 24 */
        GG(ref a, b, c, d, x[k + 9], S21, 0x21e1cde6); /* 25 */
        GG(ref d, a, b, c, x[k + 14], S22, 0xc33707d6); /* 26 */
        GG(ref c, d, a, b, x[k + 3], S23, 0xf4d50d87); /* 27 */
        GG(ref b, c, d, a, x[k + 8], S24, 0x455a14ed); /* 28 */
        GG(ref a, b, c, d, x[k + 13], S21, 0xa9e3e905); /* 29 */
        GG(ref d, a, b, c, x[k + 2], S22, 0xfcefa3f8); /* 30 */
        GG(ref c, d, a, b, x[k + 7], S23, 0x676f02d9); /* 31 */
        GG(ref b, c, d, a, x[k + 12], S24, 0x8d2a4c8a); /* 32 */

        /* Round 3 */
        HH(ref a, b, c, d, x[k + 5], S31, 0xfffa3942); /* 33 */
        HH(ref d, a, b, c, x[k + 8], S32, 0x8771f681); /* 34 */
        HH(ref c, d, a, b, x[k + 11], S33, 0x6d9d6122); /* 35 */
        HH(ref b, c, d, a, x[k + 14], S34, 0xfde5380c); /* 36 */
        HH(ref a, b, c, d, x[k + 1], S31, 0xa4beea44); /* 37 */
        HH(ref d, a, b, c, x[k + 4], S32, 0x4bdecfa9); /* 38 */
        HH(ref c, d, a, b, x[k + 7], S33, 0xf6bb4b60); /* 39 */
        HH(ref b, c, d, a, x[k + 10], S34, 0xbebfbc70); /* 40 */
        HH(ref a, b, c, d, x[k + 13], S31, 0x289b7ec6); /* 41 */
        HH(ref d, a, b, c, x[k + 0], S32, 0xeaa127fa); /* 42 */
        HH(ref c, d, a, b, x[k + 3], S33, 0xd4ef3085); /* 43 */
        HH(ref b, c, d, a, x[k + 6], S34, 0x4881d05); /* 44 */
        HH(ref a, b, c, d, x[k + 9], S31, 0xd9d4d039); /* 45 */
        HH(ref d, a, b, c, x[k + 12], S32, 0xe6db99e5); /* 46 */
        HH(ref c, d, a, b, x[k + 15], S33, 0x1fa27cf8); /* 47 */
        HH(ref b, c, d, a, x[k + 2], S34, 0xc4ac5665); /* 48 */

        /* Round 4 */
        II(ref a, b, c, d, x[k + 0], S41, 0xf4292244); /* 49 */
        II(ref d, a, b, c, x[k + 7], S42, 0x432aff97); /* 50 */
        II(ref c, d, a, b, x[k + 14], S43, 0xab9423a7); /* 51 */
        II(ref b, c, d, a, x[k + 5], S44, 0xfc93a039); /* 52 */
        II(ref a, b, c, d, x[k + 12], S41, 0x655b59c3); /* 53 */
        II(ref d, a, b, c, x[k + 3], S42, 0x8f0ccc92); /* 54 */
        II(ref c, d, a, b, x[k + 10], S43, 0xffeff47d); /* 55 */
        II(ref b, c, d, a, x[k + 1], S44, 0x85845dd1); /* 56 */
        II(ref a, b, c, d, x[k + 8], S41, 0x6fa87e4f); /* 57 */
        II(ref d, a, b, c, x[k + 15], S42, 0xfe2ce6e0); /* 58 */
        II(ref c, d, a, b, x[k + 6], S43, 0xa3014314); /* 59 */
        II(ref b, c, d, a, x[k + 13], S44, 0x4e0811a1); /* 60 */
        II(ref a, b, c, d, x[k + 4], S41, 0xf7537e82); /* 61 */
        II(ref d, a, b, c, x[k + 11], S42, 0xbd3af235); /* 62 */
        II(ref c, d, a, b, x[k + 2], S43, 0x2ad7d2bb); /* 63 */
        II(ref b, c, d, a, x[k + 9], S44, 0xeb86d391); /* 64 */

        A += a;
        B += b;
        C += c;
        D += d;
    }
    return new UInt32[] { A, B, C, D };
}
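Each block goes through 64 steps (four rounds of 16), with FF, GG, HH and II as the step functions. As the code shows, every 16 UInt32 words (512 bits) form one block. That is the core of the algorithm; the main method of the program follows: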
public static byte[] MD5Array(byte[] input)
{
    MD5_Init();
    UInt32[] block = MD5_Append(input);
    UInt32[] bits = MD5_Trasform(block);

    /* Encodes bits (UInt32[]) into output (byte[]). Assumes len is
     * a multiple of 4.
     */
    byte[] output = new byte[bits.Length * 4];
    for (int i = 0, j = 0; i < bits.Length; i++, j += 4)
    {
        output[j] = (byte)(bits[i] & 0xff);
        output[j + 1] = (byte)((bits[i] >> 8) & 0xff);
        output[j + 2] = (byte)((bits[i] >> 16) & 0xff);
        output[j + 3] = (byte)((bits[i] >> 24) & 0xff);
    }
    return output;
}
Concatenating the bytes of output gives the MD5 value. output is passed to the following method:
public static string ArrayToHexString(byte[] array, bool uppercase)
{
    string hexString = "";
    string format = "x2";
    if (uppercase)
    {
        format = "X2";
    }
    foreach (byte b in array)
    {
        hexString += b.ToString(format);
    }
    return hexString;
}
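One way to check a hand-written MD5 is to compare it with the MD5 class that ships with .NET. The sketch below is only a reference helper: Md5Reference and HexDigest are hypothetical names, and it assumes the input text is encoded as UTF-8.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Reference
{
    // Lowercase hex MD5 digest of a string, computed by the framework's MD5 class.
    public static string HexDigest(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2")); // two hex digits per byte
            }
            return sb.ToString();
        }
    }
}
```

For the input "abc" the digest is 900150983cd24fb0d6963f7d28e17f72, one of the test vectors listed in RFC 1321. Passing the same bytes through MD5Array and ArrayToHexString (with uppercase set to false) should produce the same string.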
Appendix: constants and basic functions:
//static state variables
private static UInt32 A;
private static UInt32 B;
private static UInt32 C;
private static UInt32 D;

#region Constants
//number of bits to rotate in transforming
private const int S11 = 7;
private const int S12 = 12;
private const int S13 = 17;
private const int S14 = 22;
private const int S21 = 5;
private const int S22 = 9;
private const int S23 = 14;
private const int S24 = 20;
private const int S31 = 4;
private const int S32 = 11;
private const int S33 = 16;
private const int S34 = 23;
private const int S41 = 6;
private const int S42 = 10;
private const int S43 = 15;
private const int S44 = 21;
#endregion

#region Basic functions
/* F, G, H and I are basic MD5 functions.
 * Four nonlinear functions:
 *
 * F(X,Y,Z) = (X&Y)|((~X)&Z)
 * G(X,Y,Z) = (X&Z)|(Y&(~Z))
 * H(X,Y,Z) = X^Y^Z
 * I(X,Y,Z) = Y^(X|(~Z))
 *
 * (& AND, | OR, ~ NOT, ^ XOR)
 */
private static uint F(UInt32 x, UInt32 y, UInt32 z)
{
    return (x & y) | ((~x) & z);
}
private static uint G(UInt32 x, UInt32 y, UInt32 z)
{
    return (x & z) | (y & (~z));
}
private static uint H(UInt32 x, UInt32 y, UInt32 z)
{
    return x ^ y ^ z;
}
private static uint I(UInt32 x, UInt32 y, UInt32 z)
{
    return y ^ (x | (~z));
}

/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
 * Rotation is separate from addition to prevent recomputation.
 */
private static void FF(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
{
    a = a + F(b, c, d) + mj + ti;
    a = a << s | a >> (32 - s);
    a += b;
}
private static void GG(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
{
    a = a + G(b, c, d) + mj + ti;
    a = a << s | a >> (32 - s);
    a += b;
}
private static void HH(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
{
    a = a + H(b, c, d) + mj + ti;
    a = a << s | a >> (32 - s);
    a += b;
}
private static void II(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
{
    a = a + I(b, c, d) + mj + ti;
    a = a << s | a >> (32 - s);
    a += b;
}
#endregion
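Notes:
Let Mj denote the j-th sub-block of the message (j from 0 to 15). The constant ti is the integer part of 4294967296*abs(sin(i)), where i runs from 1 to 64 and is measured in radians. (4294967296 = 2 to the 32nd power.)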
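The formula for ti can be evaluated directly instead of being copied from a table. The sketch below assumes double-precision sin is accurate enough for all 64 values, which is the usual way these constants are regenerated; SineTableSketch and T are hypothetical names.

```csharp
using System;

public static class SineTableSketch
{
    // ti = integer part of 2^32 * |sin(i)|, with i in radians, for i = 1..64.
    public static uint T(int i)
    {
        return (uint)Math.Floor(4294967296.0 * Math.Abs(Math.Sin(i)));
    }
}
```

SineTableSketch.T(1) should come out as 0xd76aa478 and SineTableSketch.T(64) as 0xeb86d391, the constants used in step 1 and step 64 of the transform.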
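Now define:
FF(a, b, c, d, Mj, s, ti) means a = b + ((a + F(b,c,d) + Mj + ti) << s)
GG(a, b, c, d, Mj, s, ti) means a = b + ((a + G(b,c,d) + Mj + ti) << s)
HH(a, b, c, d, Mj, s, ti) means a = b + ((a + H(b,c,d) + Mj + ti) << s)
II(a, b, c, d, Mj, s, ti) means a = b + ((a + I(b,c,d) + Mj + ti) << s)
Note: in these definitions "<<" denotes a rotate left (circular shift), not a plain left shift. In C# the operator itself is only a plain left shift, so the functions implement the rotation themselves. The second line of FF is: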
a = a << s | a >> (32 - s);
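It shifts left, shifts right, and then ORs the two results. A left shift fills with 0 on the right, and a right shift of an unsigned value fills with 0 on the left, so together they implement a rotate left. Picture joining the two ends of a line segment into a ring, moving a point along it, and then cutting the ring open somewhere else to get a new head and tail.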
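The rotation can be pulled out into a small helper and tried on values where the wrap-around is easy to see. This is a minimal sketch; RotateSketch and RotateLeft are hypothetical names, and s is assumed to be between 1 and 31, as it is for all the shift constants above.

```csharp
using System;

public static class RotateSketch
{
    // Rotates a 32-bit value left by s bits (1 <= s <= 31).
    public static uint RotateLeft(uint value, int s)
    {
        // Bits pushed out on the left by << come back in on the right via >>.
        return (value << s) | (value >> (32 - s));
    }
}
```

Rotating 0x80000001 left by 1 gives 0x00000003: the top bit wraps around to the bottom instead of being lost. Rotating 0x12345678 left by 8 gives 0x34567812.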
Summary: compared with other cryptographic algorithms, MD5 is a fairly simple one. Every algorithm is worth studying closely.