Encryption and decryption basics: converting between byte arrays and hexadecimal strings

Reposted article. Originally published 2015-07-07 18:18. Tags: Java, encryption/decryption

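Encryption algorithms and hash algorithms generally operate on byte arrays: they apply a series of transformations and computations to the input bytes, and the result is also a byte array. What we usually want to encrypt, however, is a string, so a String first has to be converted to a byte[], which is easy. Decryption likewise involves converting a byte[] back to a String. In addition, after a user's password has been hashed, the result has to be stored in a database, so the resulting byte[] also has to be converted to a String.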

1. Converting a String to a byte[] is easy, because the String class provides methods for it directly:

    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the
     * platform's default charset, storing the result into a new byte array.
     *
     * @return  The resultant byte array
     *
     * @since      JDK1.1
     */
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }
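
As a small illustration of the conversion above, the sketch below passes an explicit charset so that the result does not depend on the platform default. The class and method names (CharsetRoundTrip, toUtf8, fromUtf8) are hypothetical and chosen only for this example.

```java
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Encode with an explicit charset instead of the platform default.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for encoding.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "Hello \u4e16\u754c"; // "Hello" followed by two CJK characters
        byte[] utf8 = toUtf8(text);
        System.out.println(utf8.length);                 // 6 ASCII bytes + 2 * 3 bytes = 12
        System.out.println(fromUtf8(utf8).equals(text)); // true
    }
}
```

Text that started out as a String survives this round trip, which is the easy case the rest of the article contrasts with.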

2. Converting a byte[] to a String, however, is not that simple

The reason is that we cannot simply use the String constructors:

     /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset.  The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.*/
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

   /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.*/
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

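In other words, we can use neither new String(bytes) nor new String(bytes, charset).

Why not?

The reason is simple. Algorithms such as MD5, SHA-256, and SHA-512 apply a series of transformations and computations to a byte[] and produce a new byte[]. That output is effectively arbitrary binary data, so in general it is not a valid byte sequence in a character encoding such as UTF-8 or GBK. Decoding it with a character encoding produces garbled text, and because invalid sequences are replaced during decoding, the original bytes usually cannot be recovered from the resulting String.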
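
The following sketch illustrates the problem by decoding bytes as UTF-8 and encoding them again. It uses the JDK's MessageDigest to produce a SHA-256 digest; the class and method names (LossyDecodeDemo, survivesUtf8RoundTrip, sha256) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class LossyDecodeDemo {
    // True only if decoding as UTF-8 and re-encoding gives back the same bytes.
    static boolean survivesUtf8RoundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, decoded.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha256(String input) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    public static void main(String[] args) {
        // Bytes that came from text survive the round trip.
        System.out.println(survivesUtf8RoundTrip("hello".getBytes(StandardCharsets.UTF_8)));
        // 0xFF never occurs in valid UTF-8, so it is replaced during decoding.
        System.out.println(survivesUtf8RoundTrip(new byte[] {(byte) 0xFF, 0x41}));
        // A digest is arbitrary binary data and typically fails the same way.
        System.out.println(survivesUtf8RoundTrip(sha256("hello")));
    }
}
```

The first check succeeds and the second fails; digests usually fail as well, because they contain byte sequences that are malformed in UTF-8.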

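So how should the resulting byte[] be converted to a String?

First, let us ask why the byte[] needs to be converted to a String at all.

There are two reasons. First, the result has to be stored, for example in a database. Second, when a password is hashed with a one-way, irreversible hash algorithm, checking whether the password entered at login is correct means comparing two hashed byte[] values to see whether they match. Two byte[] values can be compared one byte at a time or several bytes at a time, or they can be converted to Strings and the two Strings compared. Because the result has to be stored anyway, in practice the usual choice is to convert to a String and compare the Strings.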
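
For the byte-by-byte option mentioned above, a minimal sketch using only the JDK might look like this. MessageDigest.isEqual compares two byte arrays directly. The class and method names (DigestCompare, sha256, sameDigest) are hypothetical, and the unsalted, single-pass SHA-256 is a simplification for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestCompare {
    // Hash a string with SHA-256 (no salt, single iteration: illustration only).
    static byte[] sha256(String input) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compare the two digests as raw byte arrays, without converting to String.
    static boolean sameDigest(String a, String b) {
        return MessageDigest.isEqual(sha256(a), sha256(b));
    }

    public static void main(String[] args) {
        System.out.println(sameDigest("secret", "secret")); // true
        System.out.println(sameDigest("secret", "Secret")); // false
    }
}
```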

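For encryption and hashing, the usual way to convert a byte[] to a String is to represent the binary data as a char[] of hexadecimal digits. A byte has 8 bits and every 4 bits map to one hexadecimal character, so each byte corresponds to two hexadecimal characters: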

public class HexUtil {
    private static final char[] DIGITS = {
            '0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
    };

    public static String encodeToString(byte[] bytes) {
        char[] encodedChars = encode(bytes);
        return new String(encodedChars);
    }

    public static char[] encode(byte[] data) {
        int l = data.length;
        char[] out = new char[l << 1];
        // two characters form the hex value.
        for (int i = 0, j = 0; i < l; i++) {
            out[j++] = DIGITS[(0xF0 & data[i]) >>> 4];
            out[j++] = DIGITS[0x0F & data[i]];
        }
        return out;
    }

    // The decode(...) methods of this class are shown in section 3.
}

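Hexadecimal notation uses the 16 symbols 0-9 and a-f to represent the values 0 through 15. When converting to a String, the character '0' stands for the value 0, '1' for 1, and 'f' for 15.

That is why the DIGITS array above holds the 16 characters "0-9abcdef". The main work of the conversion is done in public static char[] encode(byte[] data):

int l = data.length; char[] out = new char[l << 1]; These two statements allocate a char[] twice the size of the byte[] argument, because each byte becomes two hexadecimal characters.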

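(0xF0 & data[i]) >>> 4 first computes 0xF0 & data[i], which clears the low 4 bits and, just as importantly, every bit above bit 7. This mask is required: data[i] is promoted to int with sign extension before the shift, so for a negative byte an unmasked data[i] >>> 4 would produce a very large value and an out-of-range index. Shifting the masked value right by 4 yields the high 4 bits of the i-th byte, and DIGITS[] then gives the character for that value;

DIGITS[0x0F & data[i]] first computes 0x0F & data[i], which clears everything except the low 4 bits and so yields their value; DIGITS[] then gives the character for the low 4 bits.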
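
The effect of the two masks is easiest to see with a byte whose top bit is set. In the sketch below the class and method names (NibbleDemo, highNibble, lowNibble, shiftWithoutMask) are hypothetical; the expressions are the ones used in encode, plus an unmasked shift for comparison.

```java
class NibbleDemo {
    // High 4 bits of the byte, as used in encode().
    static int highNibble(byte b) {
        return (0xF0 & b) >>> 4;
    }

    // Low 4 bits of the byte, as used in encode().
    static int lowNibble(byte b) {
        return 0x0F & b;
    }

    // What happens when the 0xF0 mask is dropped: b is sign-extended to int first.
    static int shiftWithoutMask(byte b) {
        return b >>> 4;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xAB; // -85 as a signed byte
        System.out.println(highNibble(b));                            // 10 ('a')
        System.out.println(lowNibble(b));                             // 11 ('b')
        System.out.println(Integer.toHexString(shiftWithoutMask(b))); // ffffffa
    }
}
```

Without the mask the index would be 0x0FFFFFFA, far outside the 16-element DIGITS array.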

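In this way the byte[] is converted to a char[] of hexadecimal characters. Finally, new String(encodedChars) produces the String result.

The final String therefore consists only of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' and contains no other letters, such as g, h, j, k, l, m, or n.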
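
One way to sanity-check a table-lookup encoder like the one above is to compare it with an independent formatting of each byte. The sketch below is self-contained; HexCrossCheck and encodeWithFormat are hypothetical names, and String.format is used only as a reference, not as the article's method.

```java
class HexCrossCheck {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Same table-lookup technique as the encode method described above.
    static String encode(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0, j = 0; i < data.length; i++) {
            out[j++] = DIGITS[(0xF0 & data[i]) >>> 4];
            out[j++] = DIGITS[0x0F & data[i]];
        }
        return new String(out);
    }

    // Independent reference: format each byte as two lowercase hex digits.
    static String encodeWithFormat(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, 0x10, (byte) 0xae, 0x4f, (byte) 0xff};
        System.out.println(encode(sample));           // 0010ae4fff
        System.out.println(encodeWithFormat(sample)); // 0010ae4fff
    }
}
```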

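3. The reverse conversion: String to byte[]

Above we converted a byte[] to a String using hexadecimal encoding. How do we decode in the other direction, that is, convert a hex-encoded String back to the original byte[]?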

    /**
     * Converts the specified Hex-encoded String into a raw byte array.  This is a
     * convenience method that merely delegates to {@link #decode(char[])} using the
     * argument's hex.toCharArray() value.
     *
     * @param hex a Hex-encoded String.
     * @return A byte array containing binary data decoded from the supplied String's char array.
     */
    public static byte[] decode(String hex) {
        return decode(hex.toCharArray());
    }
    /**
     * Converts an array of characters representing hexadecimal values into an
     * array of bytes of those same values. The returned array will be half the
     * length of the passed array, as it takes two characters to represent any
     * given byte. An exception is thrown if the passed char array has an odd
     * number of elements.
     *
     * @param data An array of characters containing hexadecimal digits
     * @return A byte array containing binary data decoded from
     *         the supplied char array.
     * @throws IllegalArgumentException if an odd number of characters or an
     *                                  illegal character is supplied
     */
    public static byte[] decode(char[] data) throws IllegalArgumentException {
        int len = data.length;
        if ((len & 0x01) != 0) {
            throw new IllegalArgumentException("Odd number of characters.");
        }
        byte[] out = new byte[len >> 1];
        // two characters form the hex value.
        for (int i = 0, j = 0; j < len; i++) {
            int f = toDigit(data[j], j) << 4;
            j++;
            f = f | toDigit(data[j], j);
            j++;
            out[i] = (byte) (f & 0xFF);
        }
        return out;
    }
    protected static int toDigit(char ch, int index) throws IllegalArgumentException {
        int digit = Character.digit(ch, 16);
        if (digit == -1) {
            throw new IllegalArgumentException("Illegal hexadecimal character " + ch + " at index " + index);
        }
        return digit;
    }

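To convert a hex-encoded String back to the original byte[], the first step is to turn the String into a char[], for example "10ae4f" into ['1','0','a','e','4','f'], and then convert each pair of adjacent chars into one byte. The length of the char[] must therefore be even.

byte[] out = new byte[len >> 1]; The resulting byte[] is half the size of the char[].

toDigit(data[j], j) << 4: toDigit() converts a character to its hexadecimal value as an int, so '0' becomes 0 and 'f' becomes 15. The value is then shifted left by 4 bits to become the high 4 bits of the byte;

f = f | toDigit(data[j], j); obtains the value of the next character and uses it as the low 4 bits, merging it with the high 4 bits (using the | operator) into a complete 8-bit byte.

out[i] = (byte) (f & 0xFF); keeps only the low 8 bits and discards any higher bits.

This is simply the reverse of the encoding process above.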
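
The "10ae4f" example can be traced with a compact, self-contained decoder. This is a sketch of the same pairing technique, not the decode method shown above; HexDecodeDemo is a hypothetical name.

```java
import java.util.Arrays;

class HexDecodeDemo {
    // Convert each pair of hex characters into one byte.
    static byte[] decode(String hex) {
        if ((hex.length() & 1) != 0) {
            throw new IllegalArgumentException("Odd number of characters.");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int high = Character.digit(hex.charAt(2 * i), 16);     // value of the first char
            int low = Character.digit(hex.charAt(2 * i + 1), 16);  // value of the second char
            if (high == -1 || low == -1) {
                throw new IllegalArgumentException("Illegal hexadecimal character in pair " + i);
            }
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }

    public static void main(String[] args) {
        // "10" -> 0x10 = 16, "ae" -> 0xae = -82 as a signed byte, "4f" -> 0x4f = 79
        System.out.println(Arrays.toString(decode("10ae4f"))); // [16, -82, 79]
    }
}
```

The middle value prints as -82 because Java bytes are signed; the bit pattern is still 0xae.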

4. Example

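// Base64 and SimpleHash are not JDK classes. The two imports below are an
// assumption: they take these classes to be the ones from Apache Shiro.
import org.apache.shiro.codec.Base64;
import org.apache.shiro.crypto.hash.SimpleHash;

public class EncodeTest {
    public static void main(String[] args){
        String str = "???hello/sasewredfdd>>>. Hello 世界！";
        // Printing a byte[] directly shows its type and identity hash, not its contents.
        System.out.println("str.getBytes()=" + str.getBytes());
        System.out.println("Base64=" + Base64.encodeToString(str.getBytes()));

        // getBytes() uses the platform default charset; an explicit alternative is
        // str.getBytes(Charset.forName("utf-8")).
        String hexStr = HexUtil.encodeToString(str.getBytes());

        System.out.println("hexStr=" + hexStr);
        // Explicit alternative: new String(str.getBytes(), Charset.forName("utf-8")).
        String orignalStr = new String(str.getBytes());
        System.out.println("orignalStr=" + orignalStr);
        // Hex-decode back to the original bytes, then decode those bytes as text.
        String str2 = new String(HexUtil.decode(hexStr));
        System.out.println("str2=" + str2);
        System.out.println(str.equals(str2));

        String sha = new SimpleHash("sha-256", str, "11d23ccf28fc1e8cbab8fea97f101fc1d", 2).toString();
        System.out.println("sha=" + sha);
    }
}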

Output:

str.getBytes()=[B@19e0bfd
Base64=Pz8/aGVsbG8vc2FzZXdyZWRmZGQ+Pj4uIEhlbGxvIOS4lueVjO+8gQ==
hexStr=3f3f3f68656c6c6f2f73617365777265646664643e3e3e2e2048656c6c6f20e4b896e7958cefbc81
orignalStr=???hello/sasewredfdd>>>. Hello 世界！
str2=???hello/sasewredfdd>>>. Hello 世界！
true
sha=37a9715fecb5e2f9812d4a02570636e3d5fe476fc67ac34bc824d6a8f835635d

In the last call, new SimpleHash("sha-256", str, "11d23ccf28fc1e8cbab8fea97f101fc1d", 2).toString(), the .toString() method uses hexadecimal encoding to convert the hashed byte[] into a hexadecimal string.

Look at the result: 37a9715fecb5e2f9812d4a02570636e3d5fe476fc67ac34bc824d6a8f835635d

It consists entirely of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' and contains no other characters.
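
The same hash-then-hex pattern can be sketched with the JDK alone. Note the substitution: this uses a single, unsalted MessageDigest pass rather than SimpleHash with a salt and two iterations, so it will not reproduce the value above. Sha256Hex and sha256Hex are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha256Hex {
    // Hash the input with SHA-256 and hex-encode the 32-byte digest.
    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(Character.forDigit((b >> 4) & 0xF, 16)); // high 4 bits
                sb.append(Character.forDigit(b & 0xF, 16));        // low 4 bits
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hex = sha256Hex("abc");
        System.out.println(hex.length()); // 64: two hex characters per digest byte
        // Standard SHA-256 test vector for "abc":
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(hex);
    }
}
```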

We could also use Base64 encoding instead:

Base64.encodeToString(str.getBytes())

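It encodes using the 64 characters A-Z, a-z, 0-9, + and /, which represent the values 0 through 63 in that order.

A distinguishing feature of the encoded result is that it may end with one or two = characters: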

Pz8/aGVsbG8vc2FzZXdyZWRmZGQ+Pj4uIEhlbGxvIOS4lueVjO+8gQ==

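The reason is that Base64 processes the byte[] three bytes at a time, encoding each group of 3 bytes as 4 characters. The length of the byte[] may not be a multiple of 3, leaving a remainder of 1 or 2 bytes. A remainder of 1 byte is padded with two = characters and a remainder of 2 bytes with one =. The example input above is 40 bytes long (40 = 3 × 13 + 1), so its encoding ends with ==.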
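
The padding behavior can be observed with the JDK's own java.util.Base64 (Java 8 and later), used here in place of the Base64 class from the example. Base64PaddingDemo and encodeAscii are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64PaddingDemo {
    // Base64-encode the ASCII bytes of the given text.
    static String encodeAscii(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(encodeAscii("a"));   // YQ==  (1 leftover byte, two '=')
        System.out.println(encodeAscii("ab"));  // YWI=  (2 leftover bytes, one '=')
        System.out.println(encodeAscii("abc")); // YWJj  (full 3-byte group, no padding)
    }
}
```

Every output length is a multiple of 4, which is what the padding guarantees.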

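To summarize:

Base64 output may end with one or two = characters and may contain the / and + characters.

Hexadecimal output consists entirely of the 16 characters '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', with no other letters.

Encryption algorithms all transform and compute on a byte[].

A byte[] obtained from a String can be converted back to the original String using the same character encoding, as long as that encoding can represent every character in the String,

but the byte[] produced by encryption or hashing cannot be turned into a String with a character encoding. It is usually hex-encoded into a String, which is then stored or compared.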
