map, hash_map, and unordered_map: an implementation comparison

Reposted article, 2016-03-28 10:42, STL

Introduction to map

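map is an associative container in the STL[1]. It stores one-to-one data: the first item is the key, which can appear only once in a map, and the second is the value associated with that key. This makes map a convenient shortcut whenever a program has to handle one-to-one data. As for how the data is organized internally, map builds a red-black tree (a self-balancing binary search tree that is not strictly balanced). The tree keeps its data sorted automatically, so all elements in a map are always ordered by key. We will see the benefit of that ordering later.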

Introduction to hash_map

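hash_map is based on a hash table. The biggest advantage of a hash table is that it greatly reduces the time spent storing and looking up data, to the point where both can be treated as nearly constant time on average. The price is higher memory use. With available memory growing all the time, trading space for time is usually worthwhile. Hash tables are also relatively easy to code.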

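The basic principle is to store the elements in an array with a fairly large index range. A function (the hash function) maps each element's key to a function value (an array index, the hash value), and the element is stored in that array slot. Put simply, each element is "classified" by its key and then stored in the place that corresponds to its class, which is called a bucket.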

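However, there is no guarantee that keys and function values correspond one to one, so different elements may well produce the same function value. This is a "collision": different elements end up in the same "class". In short, "direct addressing" and "collision resolution" are the two defining features of a hash table.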

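hash_map first allocates a large block of memory that forms many buckets, then uses the hash function to map each key to one of those regions (buckets) for storage. Insertion works as follows:

1. Get the key.
2. Apply the hash function to get the hash value.
3. Compute the bucket number (usually the hash value modulo the number of buckets).
4. Store the key and value in that bucket.
Lookup works as follows:
1. Get the key.
2. Apply the hash function to get the hash value.
3. Compute the bucket number (usually the hash value modulo the number of buckets).
4. Compare the key against the elements in the bucket. If none is equal, the key was not found.
5. Return the value of the matching record.
In hash_map, the direct address is produced by the hash function, and collisions are resolved by the comparison function. It follows that if every bucket holds only one element, a lookup needs only a single comparison. When many buckets are empty, many queries get faster still (specifically unsuccessful lookups, which can stop as soon as they land in an empty bucket).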

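So the two user-facing pieces of a hash table are the hash function and the comparison function. These are exactly the two template parameters you supply to hash_map (both have defaults for common key types).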

Introduction to unordered_map

Unordered maps are associative containers that store elements formed by the combination of a key value and a mapped value, and which allow for fast retrieval of individual elements based on their keys.

In an unordered_map, the key value is generally used to uniquely identify the element, while the mapped value is an object with the content associated to this key. Types of key and mapped value may differ.

Internally, the elements in the unordered_map are not sorted in any particular order with respect to either their key or mapped values, but organized into buckets depending on their hash values to allow for fast access to individual elements directly by their key values (with constant time complexity on average).

unordered_map containers are faster than map containers to access individual elements by their key, although they are generally less efficient for range iteration through a subset of their elements.

Unordered maps implement the direct access operator (operator[]), which allows for direct access of the mapped value using its key value as argument.

Differences between unordered_map and map

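The difference between boost::unordered_map and std::map is this: std::map uses operator< both to decide whether two elements are equivalent and to compare their order, then inserts each element at the appropriate position in the tree. So traversing a map (an in-order traversal of the tree) yields sorted output, ordered as defined by operator<.
boost::unordered_map instead computes each element's hash value to pick a bucket, and then uses operator== to decide whether two elements are the same. So traversing an unordered_map yields the elements in no particular order.
In terms of usage, the key type of std::map must define operator<, whereas the key type of boost::unordered_map must provide a hash_value function and overload operator==. For built-in types and standard library types such as std::string, none of this is a concern. For a user-defined key type, you must define operator<, or hash_value() together with operator==, yourself.
Finally, when the results do not need to be sorted, unordered_map is usually the better choice.
In fact, std::map corresponds to TreeMap in Java, and boost::unordered_map corresponds to HashMap in Java.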

 
/**
Compare the execution speed and memory usage of map, hash_map, and unordered_map
**/
  
#include <sys/types.h>  
#include <unistd.h>  
#include <sys/time.h>   
#include <cstdio>     // snprintf, sscanf
#include <cstring>    // strncmp
#include <iostream>
#include <fstream>  
#include <string>  
#include <map>  
#include <ext/hash_map>  
#include <tr1/unordered_map>  
  
using namespace std;  
  
using namespace __gnu_cxx;  
  
using namespace std::tr1;  
  
#define N 100000000  // tested with N=100,000, N=1,000,000, N=10,000,000, and N=100,000,000

// Define MapKey as map<int,int>, hash_map<int,int>, or unordered_map<int,int> in turn
//typedef map<int,int> MapKey;          // use map
//typedef hash_map<int,int> MapKey;     // use hash_map
typedef unordered_map<int,int> MapKey;  // use unordered_map
  
  
  
// Read the resident set size (VmRSS) of process pid from /proc/<pid>/status (Linux only).
int GetPidMem(pid_t pid,string& memsize)
{  
    char filename[1024];  
      
    snprintf(filename,sizeof(filename),"/proc/%d/status",pid);  
      
    ifstream fin;  
      
    fin.open(filename,ios::in);  
    if (! fin.is_open())  
    {  
        cout<<"open "<<filename<<" error!"<<endl;  
        return (-1);  
    }  
      
    char buf[1024];  
    char size[100];  
    char unit[100];  
      
    while(fin.getline(buf,sizeof(buf)-1))  
    {  
        if (0 != strncmp(buf,"VmRSS:",6))  
            continue;  
          
        sscanf(buf+6,"%99s%99s",size,unit);  // field widths keep size/unit within their 100-byte buffers
          
        memsize = string(size)+string(unit);  
    }  
      
    fin.close();  
      
    return 0;  
}  
  
int main(int argc, char *argv[])  
{  
    struct timeval begin;  
      
    struct timeval end;  
          
    MapKey MyMap;  
      
    gettimeofday(&begin,NULL);  
      
    for(int i=0;i<N;++i)  
        MyMap.insert(make_pair(i,i));  
      
    gettimeofday(&end,NULL);  
      
    cout<<"insert N="<<N<<",cost="<<end.tv_sec-begin.tv_sec + float(end.tv_usec-begin.tv_usec)/1000000<<" sec"<<endl;  
      
    for(int i=0;i<N;++i)  
        MyMap.find(i);  
  
    gettimeofday(&end,NULL);  
      
    cout<<"insert and getall N="<<N<<",cost="<<end.tv_sec-begin.tv_sec + float(end.tv_usec-begin.tv_usec)/1000000<<" sec"<<endl;  
      
    string memsize;  
      
    GetPidMem(getpid(),memsize);  
      
    cout<<memsize<<endl;  
      
    return 0;  
}

Results

With N=100000 records, the results are:

Map type	Insert time (s)	Insert + find-all time (s)	Memory usage
map	0.110705	0.171859	5,780kB
hash_map	0.079074	0.091751	5,760kB
unordered_map	0.041311	0.050298	5,216kB

With N=1000000 records, the results are:

Map type	Insert time (s)	Insert + find-all time (s)	Memory usage
map	1.22678	1.95435	47,960kB
hash_map	0.684772	0.814646	44,632kB
unordered_map	0.311155	0.386898	40,604kB

With N=10000000 records, the results are:

Map type	Insert time (s)	Insert + find-all time (s)	Memory usage
map	14.9517	23.9928	469,844kB
hash_map	5.93318	7.18117	411,904kB
unordered_map	3.64201	4.43355	453,920kB

With N=100000000 records, the results are:

Map type	Insert time (s)	Insert + find-all time (s)	Memory usage
map	167.941	251.591	4,688,692kB
hash_map	46.3518	57.6972	3,912,632kB
unordered_map	28.359	35.122	4,301,256kB

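Analysis

Speed: unordered_map is fastest, hash_map is second, and map is slowest.

Memory usage: map uses the most memory in every run. For N=10,000,000 and above, hash_map uses the least and unordered_map is second. For N=100,000 and N=1,000,000, unordered_map uses the least.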

std::map


#include<string>
#include<iostream>
#include<map>
#include<utility>   // make_pair
using namespace std;
struct person
{
    string name;
    int age;
    person(string name, int age)
    {
        this->name = name;
        this->age = age;
    }
    // Orders keys by age only, so two persons with the same age count as the same key.
    bool operator < (const person& p) const
    {
        return this->age < p.age;
    }
};
map<person,int> m;
int main()
{
    person p1("Tom1",20);
    person p2("Tom2",22);
    person p3("Tom3",22);
    person p4("Tom4",23);
    person p5("Tom5",24);
    m.insert(make_pair(p3, 100));
    m.insert(make_pair(p4, 100));
    m.insert(make_pair(p5, 100));
    m.insert(make_pair(p1, 100));
    m.insert(make_pair(p2, 100));   // rejected: same age as p3
    for(map<person, int>::iterator iter = m.begin(); iter != m.end(); iter++)
    {
        cout<<iter->first.name<<"\t"<<iter->first.age<<endl;
    }
    return 0;
}

Output:

Tom1 20
Tom3 22
Tom4 23
Tom5 24

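operator< must be declared as a const member function. std::map compares keys through const references (its default comparator is std::less<Key>), so a non-const operator< would not compile.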

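Because operator< compares only age, and Tom2 and Tom3 have the same age, the map treats them as the same key. Tom3 was inserted first, so the later insert of Tom2 is rejected, and the result contains Tom3 but not Tom2.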

boost::unordered_map


#include<string>
#include<iostream>
#include<boost/unordered_map.hpp>
#include<boost/functional/hash.hpp>   // boost::hash_combine, boost::hash_value
using namespace std;
struct person
{
    string name;
    int age;
    person(string name, int age)
    {
        this->name = name;
        this->age = age;
    }
    // Two keys are the same only if both name and age match.
    bool operator== (const person& p) const
    {
        return name==p.name && age==p.age;
    }
};
// Found by boost::hash<person> through argument-dependent lookup.
size_t hash_value(const person& p)
{
    size_t seed = 0;
    boost::hash_combine(seed, boost::hash_value(p.name));
    boost::hash_combine(seed, boost::hash_value(p.age));
    return seed;
}
int main()
{
    typedef boost::unordered_map<person,int> umap;
    umap m;
    person p1("Tom1",20);
    person p2("Tom2",22);
    person p3("Tom3",22);
    person p4("Tom4",23);
    person p5("Tom5",24);
    m.insert(umap::value_type(p3, 100));
    m.insert(umap::value_type(p4, 100));
    m.insert(umap::value_type(p5, 100));
    m.insert(umap::value_type(p1, 100));
    m.insert(umap::value_type(p2, 100));
    for(umap::iterator iter = m.begin(); iter != m.end(); iter++)
    {
        cout<<iter->first.name<<"\t"<<iter->first.age<<endl;
    }
    return 0;
}

Output:

Tom1 20
Tom5 24
Tom4 23
Tom2 22
Tom3 22

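Both operator== and hash_value must be defined. operator== is needed because two elements with the same hash_value are not necessarily the same element, so operator== must be called to confirm. Conversely, elements whose hash values send them to different buckets are never compared with operator==.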
