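SQL databases have long been among the most useful tools in our industry, but their monopoly of more than 15 years is coming to an end. Relational databases are still everywhere, yet they increasingly fail to meet our needs, and NoSQL has become the industry's new favorite.
However, the differences between "NoSQL" databases are much bigger than they ever were between one relational database and another. That makes it harder to choose the right database when building an application.
This roundup compares Cassandra, MongoDB, CouchDB, Redis, Riak and HBase for reference: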
CouchDB
- Written in: Erlang
- Main point: DB consistency, ease of use
- License: Apache
- Protocol: HTTP/REST
- Bi-directional (!) replication,
- continuous or ad-hoc,
- with conflict detection,
- thus, master-master replication. (!)
- MVCC - write operations do not block reads
- Previous versions of documents are available
- Crash-only (reliable) design
- Needs compacting from time to time
- Views: embedded map/reduce
- Formatting views: lists & shows
- Server-side document validation possible
- Authentication possible
- Real-time updates via _changes (!)
- Attachment handling
- thus, CouchApps (standalone js apps)
- jQuery library included
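Best used: for accumulating, occasionally changing data on which predefined queries are run. Places where versioning is important.
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
Tutorial: http://guide.couchdb.org/editions/1/en/index.html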
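The conflict detection mentioned in the list above can be pictured with a small toy model. This is only a sketch of the general idea (a write must name the revision it was based on), not CouchDB's actual implementation: the class `TinyDocStore`, the `Conflict` exception and the integer revision numbers are all made up for illustration, whereas CouchDB itself tracks revisions in a `_rev` field.

```python
class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class TinyDocStore:
    """Hypothetical in-memory store; integers stand in for CouchDB's _rev tokens."""

    def __init__(self):
        # doc_id -> list of (rev, body); older revisions stay in the list
        self._docs = {}

    def get(self, doc_id):
        """Return the latest (rev, body) pair for a document."""
        rev, body = self._docs[doc_id][-1]
        return rev, dict(body)

    def put(self, doc_id, body, base_rev=None):
        """Store a new revision only if base_rev matches the current one."""
        history = self._docs.setdefault(doc_id, [])
        current_rev = history[-1][0] if history else None
        if base_rev != current_rev:
            raise Conflict(f"{doc_id}: expected rev {current_rev}, got {base_rev}")
        new_rev = (current_rev or 0) + 1
        history.append((new_rev, dict(body)))
        return new_rev


store = TinyDocStore()
rev1 = store.put("contact-1", {"name": "Ada"})
rev2 = store.put("contact-1", {"name": "Ada L."}, base_rev=rev1)

try:
    # A second writer still holds rev1, so this update is rejected.
    store.put("contact-1", {"name": "A."}, base_rev=rev1)
    conflict_detected = False
except Conflict:
    conflict_detected = True
```

Because readers only ever look at the last committed revision, a writer preparing a new one never blocks them, which is the intuition behind the MVCC bullet.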
Redis
- Written in: C
- Main point: Blazing fast
- License: BSD
- Protocol: Telnet-like
- Disk-backed in-memory database,
- but since 2.0, it can swap to disk.
- Master-slave replication
- Simple keys and values,
- but complex operations like ZREVRANGEBYSCORE
- INCR & co (good for rate limiting or statistics)
- Has sets (also union/diff/inter)
- Has lists (also a queue; blocking pop)
- Has hashes (objects of multiple fields)
- Of all these databases, only Redis does transactions (!)
- Values can be set to expire (as in a cache)
- Sorted sets (high score table, good for range queries)
- Pub/Sub and WATCH on data changes (!)
Best used: for rapidly changing data with a foreseeable database size (it should fit entirely in memory).
For example: stock prices, analytics, real-time data collection, real-time communication.
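The "INCR & co (good for rate limiting)" and "values can be set to expire" bullets combine into a common pattern: count requests under a key and let the key expire at the end of the time window. The sketch below imitates only those two commands with a hypothetical in-memory class, `TinyCounterStore`, so that it runs without a server; the key naming scheme and the `allow_request` helper are assumptions for illustration, and a real deployment would issue INCR and EXPIRE through a Redis client instead.

```python
class TinyCounterStore:
    """Hypothetical in-memory stand-in for two Redis commands: INCR and EXPIRE."""

    def __init__(self, clock):
        self._clock = clock   # callable returning the current time in seconds
        self._values = {}     # key -> integer counter
        self._expires = {}    # key -> absolute expiry time

    def _purge(self, key):
        """Drop the key if its expiry time has passed."""
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key):
        """Increment the counter, creating it at 1 if it does not exist."""
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        """Schedule an existing key for removal after the given number of seconds."""
        if key in self._values:
            self._expires[key] = self._clock() + seconds


def allow_request(store, client_id, limit, window_seconds):
    """Fixed-window limiter: at most `limit` requests per window per client."""
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window starts the countdown.
        store.expire(key, window_seconds)
    return count <= limit


now = [0.0]  # fake clock so the example is deterministic
store = TinyCounterStore(clock=lambda: now[0])
results = [allow_request(store, "alice", limit=3, window_seconds=60) for _ in range(4)]
now[0] = 61.0  # the window has passed, so the counter key has expired
after_window = allow_request(store, "alice", limit=3, window_seconds=60)
```

The fourth request inside the window is refused, and the first request after the window is accepted again because the counter key has disappeared.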
MongoDB
- Written in: C++
- Main point: Retains some friendly properties of SQL. (Query, index)
- License: AGPL (Drivers: Apache)
- Protocol: Custom, binary (BSON)
- Master/slave replication
- Queries are javascript expressions
- Run arbitrary javascript functions server-side
- Better update-in-place than CouchDB
- Sharding built-in
- Uses memory mapped files for data storage
- Performance over features
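- After crash, it needs to repair tables
Best used: if you need dynamic queries. If you prefer to define indexes up front rather than write map/reduce functions. If you need good performance on a very large database. If you wanted CouchDB, but your data changes too often and is write-heavy.
For example: most things you would otherwise do with MySQL or PostgreSQL.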
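To make "dynamic queries plus predefined indexes" a little more concrete, here is a toy sketch of the trade-off: a query on an indexed field becomes a dictionary lookup, while a query on any other field still works but has to scan every document. Nothing here is MongoDB's API; `build_index`, `find` and the sample documents are hypothetical names and data chosen for illustration.

```python
from collections import defaultdict

# Hypothetical sample collection
documents = [
    {"_id": 1, "city": "Budapest", "age": 31},
    {"_id": 2, "city": "Vienna", "age": 45},
    {"_id": 3, "city": "Budapest", "age": 52},
]


def build_index(docs, field):
    """Map each value of `field` to the documents that carry it."""
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc)
    return index


def find(docs, field, value, index=None):
    """Equality query: use the index when one is supplied, else scan everything."""
    if index is not None:
        return list(index.get(value, []))
    return [doc for doc in docs if doc.get(field) == value]


city_index = build_index(documents, "city")
indexed_ids = [doc["_id"] for doc in find(documents, "city", "Budapest", city_index)]
scanned_ids = [doc["_id"] for doc in find(documents, "age", 45)]
```

Both queries return the right documents; the index only changes how much work the lookup costs, which is why it pays to decide on indexes for the fields you expect to query often.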
Cassandra
- Written in: Java
- Main point: Best of BigTable and Dynamo
- License: Apache
- Protocol: Custom, binary (Thrift)
- Tunable trade-offs for distribution and replication (N, R, W)
- Querying by column, range of keys
- BigTable-like features: columns, column families
- Writes are much faster than reads (!)
- Map/reduce possible with Apache Hadoop
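- Some of the complexity may come from Java itself (configuration, seeing exceptions, etc)
Best used: when you write more than you read (logging).
For example: banking, the financial industry, anywhere writes must be faster than reads, real-time data analysis.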
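The "(N, R, W)" trade-off in the list above follows the usual Dynamo-style reading: N replicas hold each value, a read waits for R of them and a write waits for W of them. When R + W > N, every read set must share at least one replica with every write set, so a read always touches a replica that saw the latest acknowledged write. The sketch below checks that rule by brute force; the function names and the sample settings are chosen for illustration and are not taken from any database's API.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """Rule of thumb: read and write quorums always intersect when r + w > n."""
    return r + w > n


def overlap_by_enumeration(n, r, w):
    """Check the same property by trying every possible read and write set."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


# (N, R, W) settings picked as examples
settings = [(3, 2, 2), (3, 1, 1), (3, 1, 3), (5, 2, 3)]
report = {s: (quorums_overlap(*s), overlap_by_enumeration(*s)) for s in settings}
```

Lowering W makes writes return sooner at the cost of possibly stale reads, which is one way to tune a cluster for write-heavy work such as logging.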
Riak
- Written in: Erlang & C, some Javascript
- Main point: Fault tolerance
- License: Apache
- Protocol: HTTP/REST
- Tunable trade-offs for distribution and replication (N, R, W)
- Pre- and post-commit hooks,
- for validation and security.
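- Built-in full-text search
- Map/reduce in Javascript or Erlang
- Comes in "open source" and "enterprise" editions
Best used: if you want something Cassandra-like (Dynamo-like), but do not want to deal with its bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, and are ready to pay for multi-site replication.
For example: point-of-sale data collection, factory control systems, places where even a few seconds of downtime are unacceptable.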
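The pre- and post-commit hooks in the list above can be pictured as functions that run around every write: a pre-commit hook may veto the write (validation, security), and a post-commit hook runs once the write has succeeded. The following is a toy sketch of that flow, not Riak's hook interface; `HookedBucket`, `Rejected` and `require_positive_amount` are hypothetical names.

```python
class Rejected(Exception):
    """Raised by a pre-commit hook to abort a write."""


class HookedBucket:
    """Hypothetical key-value bucket that runs hooks around each put."""

    def __init__(self, precommit=(), postcommit=()):
        self._precommit = list(precommit)
        self._postcommit = list(postcommit)
        self._data = {}

    def put(self, key, value):
        for hook in self._precommit:
            hook(key, value)        # a hook raises Rejected to stop the write
        self._data[key] = value     # reached only if every pre-commit hook passed
        for hook in self._postcommit:
            hook(key, value)

    def get(self, key):
        return self._data.get(key)


def require_positive_amount(key, value):
    """Example validation rule: a sale must have a positive amount."""
    if value.get("amount", 0) <= 0:
        raise Rejected(f"{key}: amount must be positive")


audit_log = []
bucket = HookedBucket(
    precommit=[require_positive_amount],
    postcommit=[lambda key, value: audit_log.append(key)],
)

bucket.put("sale-1", {"amount": 12.5})
try:
    bucket.put("sale-2", {"amount": -3})   # vetoed before it is stored
except Rejected:
    pass
```

Only the valid sale is stored and logged; the rejected one never reaches the data or the post-commit hook.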
HBase
(With the help of ghshephard)
- Written in: Java
- Main point: Billions of rows X millions of columns
- License: Apache
- Protocol: HTTP/REST (also Thrift)
- Modeled after BigTable
- Map/reduce with Hadoop
- Query predicate push down via server side scan and get filters
- Optimizations for real time queries
- A high performance Thrift gateway
- HTTP supports XML, Protobuf, and binary
- Cascading, hive, and pig source and sink modules
- Jruby-based (JIRB) shell
- No single point of failure
- Rolling restart for configuration changes and minor upgrades
- Random access performance is like MySQL
Best used: if you like the BigTable model. :) And when you need random, real-time read/write access.
For example: Facebook's messaging database
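A rough way to picture "query predicate push down via server side scan and get filters" is a table whose rows are kept sorted by key: a scan walks one contiguous key range, and the filter is applied where the data lives, so only matching rows travel back to the client. The sketch below is a single-process toy, not the HBase client API; `TinySortedTable`, the `user#message` row-key layout and the `msg:read` column are assumptions made for illustration.

```python
from bisect import bisect_left


class TinySortedTable:
    """Hypothetical table that keeps row keys sorted, BigTable-style."""

    def __init__(self):
        self._keys = []   # row keys in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, columns):
        """Insert or update a row, keeping the key list sorted."""
        if row_key not in self._rows:
            self._keys.insert(bisect_left(self._keys, row_key), row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def scan(self, start_row, stop_row, row_filter=None):
        """Yield rows with start_row <= key < stop_row that pass the filter."""
        lo = bisect_left(self._keys, start_row)
        hi = bisect_left(self._keys, stop_row)
        for key in self._keys[lo:hi]:
            row = self._rows[key]
            if row_filter is None or row_filter(key, row):
                yield key, dict(row)


table = TinySortedTable()
table.put("user1#msg003", {"msg:read": "no"})
table.put("user1#msg001", {"msg:read": "yes"})
table.put("user2#msg001", {"msg:read": "no"})
table.put("user1#msg002", {"msg:read": "no"})

# "$" is the character right after "#", so this range covers every "user1#" key.
unread = [
    key
    for key, _ in table.scan(
        "user1#", "user1$", row_filter=lambda key, row: row["msg:read"] == "no"
    )
]
```

Choosing row keys so that related rows sort next to each other is what keeps such range scans cheap, even when the table as a whole is very large.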
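Of course, all of these systems have many more features than are listed here. I have only listed the key points as I understand them. Development of all these projects is very active, and I will do my best to keep this list updated.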
-- Kristof
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Project | Language | Fault tolerance | Persistent storage | Client protocol | Data model | Docs | Sponsor/Community |
Project Voldemort | Java | partitioned, replicated, read-repair | Pluggable: BerkeleyDB, MySQL | Java API | Structured / blob / text | A | LinkedIn, no |
Ringo | Erlang | partitioned, replicated, immutable | Custom on-disk (append-only log) | HTTP | blob | B | Nokia, no |
Scalaris | Erlang | partitioned, replicated, Paxos | In-memory only | Erlang, Java, HTTP | blob | B | OnScale, no |
Kai | Erlang | partitioned, replicated? | On-disk Dets file | Memcached | blob | C | no |
Dynomite | Erlang | partitioned, replicated | Pluggable: couch, dets | Custom ASCII, Thrift | blob | D+ | Powerset, no |
MemcacheDB | C | replicated | BerkeleyDB | Memcached | blob | B | Sina, some |
ThruDB | C++ | replicated | Pluggable: BerkeleyDB, Custom, MySQL, S3 | Thrift | Document oriented | C+ | Third rail, unsure |
CouchDB | Erlang | replicated, partitioned? | Custom on-disk | HTTP, JSON | Document oriented (JSON) | A | Apache, yes |
Cassandra | Java | replicated, partitioned | Custom on-disk | Thrift | Bigtable meets Dynamo | F | Facebook, no |
HBase | Java | replicated, partitioned | Custom on-disk | Custom API, Thrift, REST | Bigtable | A | Apache, yes |
Hypertable | C++ | replicated, partitioned | Custom on-disk (HDFS, KFS) | Thrift, other | Bigtable | A | Zvents, Baidu, yes |
Tokyo Tyrant | C | replicated | Tokyo Cabinet | Memcached, HTTP, other | blob | A | mixi.jp, no |