Digging into unsigned int and Two's Complement

Reposted article. 2014-02-07 11:14. Tags: automatic type conversion / data structures & algorithms / two's complement

This article covers two topics:

1. Sign-magnitude, ones' complement, and two's complement.

2. How short, unsigned short, int, unsigned int, long, and unsigned long are represented and converted.

1. Sign-magnitude, ones' complement, and two's complement

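Sign-magnitude is the most intuitive representation: the most significant bit is the sign (0 for positive, 1 for negative) and the remaining bits hold the magnitude. For a 1-byte number, sign-magnitude covers [-127, 127], 255 values in total. Eight bits can in principle encode 256 values; only 255 are available here because sign-magnitude lets both 10000000 and 00000000 stand for zero. [1]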

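Computers do not use sign-magnitude for integers; they use two's complement. The reason is simplicity: the machine treats 1-1 as 1+(-1), so only addition has to be implemented. With sign-magnitude, however, adding the bit patterns of 1 and -1 does not give zero (00000001 + 10000001 = 10000010, which reads as -2). Ones' complement gets closer, but 1+(-1) yields 11111111, which is -0, while +0 is 00000000, so zero ends up with two representations. Only two's complement makes 1+(-1) and 1-1 agree on a single result: the all-zero pattern, which is the one representation of 0.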

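Ones' complement: a positive number is encoded as itself; a negative number is encoded by taking its sign-magnitude form and inverting every bit except the most significant one. The sign can therefore still be read from the most significant bit.

Two's complement: a positive number is encoded as itself; a negative number is encoded by taking its sign-magnitude form, keeping the sign bit, inverting the remaining bits, and then adding 1 (that is, ones' complement plus 1). Zero has exactly one encoding: all bits zero.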

Examples:

[+1] = [00000001] sign-magnitude = [00000001] ones' complement = [00000001] two's complement

[-1] = [10000001] sign-magnitude = [11111110] ones' complement = [11111111] two's complement
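The definitions and the 1 + (-1) argument above can be checked mechanically. The following is a minimal sketch, not code from the original article: the helper names (encode_sign_magnitude, encode_ones_complement, encode_twos_complement, add_bits) are made up for illustration, and every helper assumes an 8-bit width and an input between -127 and 127.

```c
#include <stdint.h>

/* Sign-magnitude: sign in bit 7, magnitude in bits 0..6.
   Assumes -127 <= v <= 127 (illustrative helper, not a library API). */
uint8_t encode_sign_magnitude(int v)
{
    return v < 0 ? (uint8_t)(0x80 | (-v)) : (uint8_t)v;
}

/* Ones' complement: for a negative number, invert every bit of the
   sign-magnitude form except the sign bit. */
uint8_t encode_ones_complement(int v)
{
    return v < 0 ? (uint8_t)(encode_sign_magnitude(v) ^ 0x7F) : (uint8_t)v;
}

/* Two's complement: for a negative number, ones' complement plus 1. */
uint8_t encode_twos_complement(int v)
{
    return v < 0 ? (uint8_t)(encode_ones_complement(v) + 1) : (uint8_t)v;
}

/* Plain 8-bit binary addition of two bit patterns; a carry out of
   bit 7 is discarded, as an 8-bit adder would do. */
uint8_t add_bits(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

For example, add_bits(encode_twos_complement(1), encode_twos_complement(-1)) gives 0x00, while the same addition on the sign-magnitude patterns gives 0x82 (read as -2) and on the ones' complement patterns gives 0xFF (read as -0).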

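Because machines use two's complement, the 32-bit int that is so common in programming covers [-2³¹, 2³¹-1]: the first bit is the sign bit, and two's complement can store one extra value at the bottom of the range.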

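To summarize, a few points are worth noting:

(1) At the same width, sign-magnitude and ones' complement share the same lower bound, while the smallest two's complement value is 1 lower.

Take 8 bits as an example. The lower bound of both sign-magnitude and ones' complement is -(2⁷-1), written 11111111 in sign-magnitude and 10000000 in ones' complement. In two's complement, -(2⁷-1) is 10000001 (ones' complement plus 1), and 10000000 is still free to represent -2⁷. The upper bound is a positive number, which all three schemes encode identically, so it is the same for all of them.

(2) In two's complement, the minimum value is encoded with the most significant bit set to 1 and all other bits 0.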
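The two points above can be written down as formulas and checked against the limits in <stdint.h>. This is only a sketch under stated assumptions: the function names are hypothetical, n is assumed to lie between 2 and 32, and bits_of_int8 relies on the fact that converting a signed value to uint8_t reduces it modulo 2⁸.

```c
#include <stdint.h>

/* Lowest value representable in n bits (2 <= n <= 32) under
   sign-magnitude or ones' complement: -(2^(n-1) - 1). */
int64_t min_sign_magnitude(int n)
{
    return -(((int64_t)1 << (n - 1)) - 1);
}

/* Lowest value representable in n bits (2 <= n <= 32) under
   two's complement: -2^(n-1), one lower than the value above. */
int64_t min_twos_complement(int n)
{
    return -((int64_t)1 << (n - 1));
}

/* Bit pattern of an 8-bit two's complement value. The conversion to
   uint8_t reduces the value modulo 2^8, which is exactly that pattern. */
uint8_t bits_of_int8(int8_t v)
{
    return (uint8_t)v;
}
```

With these helpers, min_twos_complement(8) is -128 and bits_of_int8(INT8_MIN) is 0x80, that is, the most significant bit set and every other bit clear.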

2. Representation of short, unsigned short, int, unsigned int, long, and unsigned long, and the results of mixing them

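CPUs, operating systems, and compilers all come in 32-bit and 64-bit variants, but what most directly determines how many bytes a type occupies is the compiler and the target it builds for. (Ultimately the compiler does, but in order for compiled code to play nicely with system libraries, most compilers match the behavior of the compiler[s] used to build the target system.[2])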

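Typical byte sizes of common data types [3]

32-bit compiler:

      char: 1 byte
      char* (any pointer): 4 bytes (a 32-bit address space has 2^32 addresses, so an address takes 32 bits, i.e. 4 bytes; the same reasoning gives 8 bytes for a 64-bit compiler)
      short int: 2 bytes
      int: 4 bytes
      unsigned int: 4 bytes
      float: 4 bytes
      double: 8 bytes
      long: 4 bytes
      long long: 8 bytes
      unsigned long: 4 bytes

64-bit compiler:

      char: 1 byte
      char* (any pointer): 8 bytes
      short int: 2 bytes
      int: 4 bytes
      unsigned int: 4 bytes
      float: 4 bytes
      double: 8 bytes
      long: 8 bytes (on LP64 targets such as 64-bit Linux and macOS; 4 bytes with 64-bit MSVC on Windows)
      long long: 8 bytes
      unsigned long: 8 bytes (same caveat as long)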

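To avoid problems when code has to run on several platforms, fixed-width types are often used instead: __int8, __int16, __int32, and __int64 with MSVC, or the standard int8_t, int16_t, int32_t, and int64_t from <stdint.h> (C99).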
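Rather than trusting a size table, the sizes can be queried from the compiler actually in use. The sketch below is an illustration only: data_model_name and print_type_sizes are hypothetical helper names, the model labels (ILP32, LP64, LLP64) are the conventional names for these size combinations, and the code assumes a C11 compiler that provides <stdint.h>.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* The fixed-width types have the same width wherever they exist. */
_Static_assert(sizeof(int8_t)  * CHAR_BIT == 8,  "int8_t is 8 bits");
_Static_assert(sizeof(int16_t) * CHAR_BIT == 16, "int16_t is 16 bits");
_Static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/* Guess the data model from the sizes of long and of a pointer.
   Hypothetical helper; anything unexpected is reported as "other". */
const char *data_model_name(void)
{
    if (sizeof(void *) == 4 && sizeof(long) == 4) return "ILP32";
    if (sizeof(void *) == 8 && sizeof(long) == 8) return "LP64";
    if (sizeof(void *) == 8 && sizeof(long) == 4) return "LLP64";
    return "other";
}

/* Print the sizes that differ between the two tables above. */
void print_type_sizes(void)
{
    printf("data model: %s\n", data_model_name());
    printf("int: %zu, long: %zu, long long: %zu, pointer: %zu\n",
           sizeof(int), sizeof(long), sizeof(long long), sizeof(void *));
}
```

Calling print_type_sizes() from main shows which column of the table applies to the current build; the output differs from platform to platform, which is exactly why the fixed-width types are useful.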

Results of mixing types

Suppose the following appears: unsigned int a = 3; return a * -1; What is the result?

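First, when operands of different types meet in one operation, the compiler has to bring them to a common type before computing. The rules for this automatic conversion are called the usual arithmetic conversions. Here is how MSDN describes them [4]: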

If either operand is of type long double, the other operand is converted to type long double.
If the above condition is not met and either operand is of type double, the other operand is converted to type double.
If the above two conditions are not met and either operand is of type float, the other operand is converted to type float.
If the above three conditions are not met (none of the operands are of floating types), then integral conversions are performed on the operands as follows:
- If either operand is of type unsigned long, the other operand is converted to type unsigned long.
- If the above condition is not met and either operand is of type long and the other of type unsigned int, both operands are converted to type unsigned long.
- If the above two conditions are not met, and either operand is of type long, the other operand is converted to type long.
- If the above three conditions are not met, and either operand is of type unsigned int, the other operand is converted to type unsigned int.
- If none of the above conditions are met, both operands are converted to type int.
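One way to watch these rules in action is to ask the compiler which type an expression ends up with. The sketch below assumes a C11 compiler, since it relies on _Generic; TYPE_NAME, HAS_TYPE, and show_conversions are names invented for this illustration.

```c
#include <stdio.h>

/* Name of the type an expression has once the usual arithmetic
   conversions have been applied (C11 _Generic selection). */
#define TYPE_NAME(expr) _Generic((expr),            \
    int: "int", unsigned int: "unsigned int",       \
    long: "long", unsigned long: "unsigned long",   \
    float: "float", double: "double",               \
    long double: "long double",                     \
    default: "other")

/* Evaluates to 1 if expr has exactly type T, otherwise 0. */
#define HAS_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

/* Print the result type of a few mixed-type expressions. */
void show_conversions(void)
{
    unsigned int a = 3;
    short s = 1;

    printf("a * -1   : %s\n", TYPE_NAME(a * -1));    /* int meets unsigned int */
    printf("s + s    : %s\n", TYPE_NAME(s + s));     /* both promoted to int */
    printf("1 + 2.0f : %s\n", TYPE_NAME(1 + 2.0f));  /* int meets float */
    printf("a + 1L   : %s\n", TYPE_NAME(a + 1L));    /* depends on sizeof(long) */
}
```

The first three lines report unsigned int, int, and float. The last one depends on the platform: where long is wider than unsigned int the result is long, otherwise it is unsigned long, which matches the second integral rule quoted above.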

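With that, the question is easy to answer: -1 has type int by default, and when an int and an unsigned int meet in an operation, the int is automatically converted to unsigned int.

So what does -1 become as an unsigned int?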

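Given the discussion in section 1, the rest follows directly: the machine uses two's complement and int is 4 bytes, so -1 is stored as 0xFFFFFFFF.

Now read that pattern as an unsigned int, whose defining property is that the most significant bit is not a sign bit: every bit contributes to the value.

So with a 4-byte int, unsigned int covers [0, 2³²-1], while int covers [-2³¹, 2³¹-1] (two's complement stores one extra minimum value).

With all bits of 0xFFFFFFFF counted as value bits, its decimal value is 2³²-1. Multiplying by 3 gives 3·2³² - 3, which no longer fits in 32 bits. Unsigned arithmetic keeps only the low 32 bits (the result modulo 2³²), so the answer is 2³² - 3 = 0xFFFFFFFD = 4294967293, a positive integer close to the upper limit of unsigned int.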
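The arithmetic can be reproduced in a few lines. This is a sketch rather than the original interview code: times_minus_one wraps the expression from the question, and low32_of_product (a hypothetical helper) redoes the multiplication in 64 bits to show which 32 bits survive.

```c
#include <stdint.h>

/* The interview expression: -1 is an int, so it is converted to
   unsigned int before the multiplication takes place. */
unsigned int times_minus_one(unsigned int a)
{
    return a * -1;
}

/* The same product computed exactly in 64 bits, then cut down
   to the low 32 bits, i.e. the product modulo 2^32. */
uint32_t low32_of_product(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * b;   /* exact, no wraparound */
    return (uint32_t)wide;             /* keep bits 0..31 only */
}
```

With a 32-bit unsigned int, times_minus_one(3) returns 0xFFFFFFFD, the same value as low32_of_product(3, 0xFFFFFFFF): the exact product is 0x2FFFFFFFD, and dropping everything above bit 31 leaves 0xFFFFFFFD.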

This example comes from http://blog.sina.com.cn/s/blog_4c7fa77b01000a3m.html and is said to be a Microsoft interview question :)

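The analysis above also shows one more thing:

The two's complement bit pattern in the machine never changes; declaring it with a different type (int, unsigned) only changes the value the compiler reads out of it.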
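To see that only the interpretation changes, the same 32 bits can be copied between a signed and an unsigned variable without any arithmetic conversion. The sketch below uses memcpy for the copy and the exact-width types from <stdint.h>, which are defined to be two's complement; bits_as_signed and bits_as_unsigned are illustrative names.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit pattern as a signed value. memcpy copies the bytes
   unchanged, so no value conversion is involved. */
int32_t bits_as_signed(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Read the bytes of a signed value as an unsigned 32-bit pattern. */
uint32_t bits_as_unsigned(int32_t value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

For instance, bits_as_signed(0xFFFFFFFF) is -1 and bits_as_unsigned(-10) is 0xFFFFFFF6: one pattern, two readings.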

Here is an example, excerpted from [5]:

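#include <stdio.h>

int main(void)
{
    unsigned b = -10;   /* converted modulo 2^32: b holds 0xFFFFFFF6 */

    if (b) printf("yes\n"); else printf("no\n");   /* b is nonzero */

    int c = b;          /* same bits read back as int on typical compilers */
    printf("%d\n", c);

    unsigned a = 10;
    int d = -20;
    int e = a + d;      /* d is converted to unsigned before the addition */
    printf("%d\n", e);

    return 0;
}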

The output is:

yes

-10

-10

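The reason is the one given above: what gets passed around is the two's complement bit pattern in the machine, which never changes (overflow aside); unsigned int and int only define how the compiler interprets it.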
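The last part of the example, a + d with a = 10 and d = -20, is worth isolating, because the intermediate result is a large unsigned number even though the final int looks harmless. The functions below are an illustrative sketch with made-up names. Note the assumption in sum_as_int: converting an out-of-range unsigned value to int is implementation-defined in C, and the comment describes what mainstream two's complement compilers do.

```c
/* a + d with mixed types: d is converted to unsigned int first,
   so the sum is computed modulo 2^N (N = width of unsigned int). */
unsigned int sum_as_unsigned(unsigned int a, int d)
{
    return a + d;
}

/* Storing that sum in an int. The conversion is implementation-defined
   when the value does not fit; GCC, Clang, and MSVC keep the bits,
   so the two's complement reading of the pattern comes back. */
int sum_as_int(unsigned int a, int d)
{
    return (int)(a + d);
}

/* The comparison is done in unsigned int as well, so this returns 1
   whenever the wrapped sum is nonzero, even if a + d "should" be negative. */
int sum_is_positive(unsigned int a, int d)
{
    return a + d > 0;
}
```

With a 32-bit unsigned int, sum_as_unsigned(10, -20) is 4294967286 (0xFFFFFFF6), sum_as_int(10, -20) is -10 on the compilers named above, and sum_is_positive(10, -20) is 1. That last result is the practical trap: a check such as a + d > 0 does not behave like the signed arithmetic it resembles.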

[5] contains another very interesting problem that is not reproduced here; interested readers can head over and take a look :)

References:

[1] http://www.cnblogs.com/zhangziqiu/archive/2011/03/30/ComputerCode.html (a really well-written article, thorough yet accessible; most of section 1 comes from it)

[2] http://stackoverflow.com/questions/13764892/what-determines-the-size-of-integer-in-c

[3] http://www.cnblogs.com/augellis/archive/2009/09/29/1576501.html

[4] http://msdn.microsoft.com/en-us/library/3t4w2bkb.aspx

[5] http://www.cnblogs.com/krythur/archive/2012/10/29/2744398.html
