1.4 Building Hadoop with CDH: Before You Install (Recommended Cluster Hosts and Role Distribution)


Recommended Cluster Hosts and Role Distribution

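Important: This topic describes suggested role assignments for a CDH cluster managed by Cloudera Manager. The actual assignments you choose for your deployment can vary depending on the types and volume of your workloads, the services deployed in the cluster, hardware resources, configuration, and other factors.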

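When you install CDH using the Cloudera Manager installation wizard, Cloudera Manager attempts to spread roles among the cluster hosts (except for roles assigned to gateway hosts) based on the resources available on each host. You can change these assignments on the Customize Role Assignments page that appears in the wizard. You can also change and add roles later using Cloudera Manager. See Role Instances.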

If your cluster uses data-at-rest encryption, see Allocating Hosts for Key Trustee Server and Key Trustee KMS.

For information about where to locate the various databases required by Cloudera Manager and other services, see Step 4: Install and Configure Databases.

CDH Cluster Hosts and Role Assignments

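Cluster hosts can be broadly described as the following types:
  • Master hosts run Hadoop master processes such as the HDFS NameNode and YARN ResourceManager.
  • Utility hosts run other cluster processes that are not master processes, such as Cloudera Manager and the Hive Metastore.
  • Gateway hosts are client access points for launching jobs in the cluster. The number of gateway hosts required depends on the type and size of the workloads.
  • Worker hosts primarily run DataNodes and other distributed processes such as Impalad.
Important: Cloudera recommends that you always enable high availability when CDH is used in a production environment.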

3 - 10 Worker Hosts without High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • YARN ResourceManager
  • JobHistory Server
  • ZooKeeper
  • Kudu master
  • Spark History Server
One host for all Utility and Gateway roles:
  • Secondary NameNode
  • Cloudera Manager
  • Cloudera Manager Management Service
  • Hive Metastore
  • HiveServer2
  • Impala Catalog Server
  • Impala StateStore
  • Hue
  • Oozie
  • Flume
  • Gateway configuration
3 - 10 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server

3 - 20 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • JobHistory Server
  • Spark History Server
  • Kudu master
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
Master Host 3:
  • Kudu master (Kudu requires an odd number of masters for HA.)
Utility Host 1:
  • Cloudera Manager
  • Cloudera Manager Management Service
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
  • ZooKeeper (requires dedicated disk)
  • JournalNode (requires dedicated disk)
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Flume
  • Gateway configuration
3 - 20 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server
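
The layout above places ZooKeeper, JournalNode, and Kudu master on three hosts each. The following is a minimal Python sketch, not a Cloudera Manager API, that transcribes the master and utility hosts from the list into a dictionary, counts role instances, and flags an even count for roles that rely on majority voting. The host keys (`master1`, `utility1`), the `QUORUM_ROLES` selection, and both function names are assumptions made for illustration.

```python
from collections import Counter

# Master and utility hosts transcribed from the
# "3 - 20 Worker Hosts with High Availability" list above.
# The host keys are hypothetical labels, not real hostnames.
LAYOUT_3_20_HA = {
    "master1": ["NameNode", "JournalNode", "FailoverController",
                "YARN ResourceManager", "ZooKeeper", "JobHistory Server",
                "Spark History Server", "Kudu master"],
    "master2": ["NameNode", "JournalNode", "FailoverController",
                "YARN ResourceManager", "ZooKeeper", "Kudu master"],
    "master3": ["Kudu master"],
    "utility1": ["Cloudera Manager", "Cloudera Manager Management Service",
                 "Hive Metastore", "Impala Catalog Server", "Impala StateStore",
                 "Oozie", "ZooKeeper", "JournalNode"],
}

# Assumption: these are the majority-voting roles worth checking for an odd count.
QUORUM_ROLES = ("ZooKeeper", "JournalNode", "Kudu master")


def count_roles(layout):
    """Count how many hosts carry each role."""
    return Counter(role for roles in layout.values() for role in roles)


def quorum_warnings(layout):
    """Return one warning per quorum role that has an even, non-zero instance count."""
    counts = count_roles(layout)
    return [
        f"{role}: {counts[role]} instances (even count)"
        for role in QUORUM_ROLES
        if counts[role] > 0 and counts[role] % 2 == 0
    ]
```

Calling `quorum_warnings(LAYOUT_3_20_HA)` returns an empty list because each of the three roles appears on three hosts. Removing the Kudu master from `master3` would produce a warning for that role.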

20 - 80 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
Master Host 3:
  • ZooKeeper
  • JournalNode
  • JobHistory Server
  • Spark History Server
  • Kudu master
Utility Host 1:
  • Cloudera Manager
Utility Host 2:
  • Cloudera Manager Management Service
  • Hive Metastore
  • Impala Catalog Server
  • Oozie
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Flume
  • Gateway configuration
20 - 80 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server

80 - 200 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
Master Host 3:
  • ZooKeeper
  • JournalNode
  • JobHistory Server
  • Spark History Server
  • Kudu master
Utility Host 1:
  • Cloudera Manager
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Activity Monitor
Utility Host 4:
  • Host Monitor
Utility Host 5:
  • Navigator Audit Server
Utility Host 6:
  • Navigator Metadata Server
Utility Host 7:
  • Reports Manager
Utility Host 8:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Flume
  • Gateway configuration
80 - 200 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommended maximum number of tablet servers is 100.)

200 - 500 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
Master Host 3:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Kudu master
Master Host 4:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
Master Host 5:
  • JobHistory Server
  • Spark History Server
  • ZooKeeper
  • JournalNode

We recommend no more than three Kudu masters.

Utility Host 1:
  • Cloudera Manager
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Activity Monitor
Utility Host 4:
  • Host Monitor
Utility Host 5:
  • Navigator Audit Server
Utility Host 6:
  • Navigator Metadata Server
Utility Host 7:
  • Reports Manager
Utility Host 8:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Flume
  • Gateway configuration
200 - 500 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommended maximum number of tablet servers is 100.)

500 - 1000 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
Master Host 3:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Kudu master
Master Host 4:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
Master Host 5:
  • JobHistory Server
  • Spark History Server
  • ZooKeeper
  • JournalNode

We recommend no more than three Kudu masters.

Utility Host 1:
  • Cloudera Manager
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Activity Monitor
Utility Host 4:
  • Host Monitor
Utility Host 5:
  • Navigator Audit Server
Utility Host 6:
  • Navigator Metadata Server
Utility Host 7:
  • Reports Manager
Utility Host 8:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Flume
  • Gateway configuration
500 - 1000 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommended maximum number of tablet servers is 100.)
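
The layouts above are keyed by the number of worker hosts, so choosing one amounts to a range lookup. The sketch below, in Python, shows one way to do that under stated assumptions: the ranges come from the section headings, but the function names and the rule that a count sitting on a shared boundary (20, 80, 200, 500) maps to the smaller layout are illustrative choices, not Cloudera guidance.

```python
# Worker-host ranges taken from the section headings above.
HA_TIERS = [(3, 20), (20, 80), (80, 200), (200, 500), (500, 1000)]
NON_HA_TIER = (3, 10)

# Recommended maximum number of Kudu tablet servers quoted in the larger layouts.
KUDU_TABLET_SERVER_MAX = 100


def pick_layout(workers, high_availability=True):
    """Return the heading of the layout that covers `workers` worker hosts."""
    if not high_availability:
        low, high = NON_HA_TIER
        if low <= workers <= high:
            return f"{low} - {high} Worker Hosts without High Availability"
        raise ValueError(f"no non-HA layout listed for {workers} worker hosts")
    for low, high in HA_TIERS:
        # Assumption: tiers are checked in ascending order, so a count on a
        # shared boundary is assigned to the smaller tier.
        if low <= workers <= high:
            return f"{low} - {high} Worker Hosts with High Availability"
    raise ValueError(f"no HA layout listed for {workers} worker hosts")


def kudu_tablet_servers_within_limit(tablet_servers):
    """Check a tablet server count against the recommended maximum."""
    return tablet_servers <= KUDU_TABLET_SERVER_MAX
```

For example, `pick_layout(150)` selects the 80 - 200 layout, and `kudu_tablet_servers_within_limit(150)` is false, which signals that running a tablet server on every worker in that cluster would exceed the recommended maximum.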

Allocating Hosts for Key Trustee Server and Key Trustee KMS

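If you are enabling data-at-rest encryption for a CDH cluster, Cloudera recommends that you isolate Key Trustee Server from other Enterprise Data Hub (EDH) services by deploying it on dedicated hosts in a separate cluster managed by Cloudera Manager. Cloudera also recommends deploying Key Trustee KMS on dedicated hosts in the same cluster as the EDH services that need access to Key Trustee Server. This architecture lets multiple clusters share the same Key Trustee Server and avoids restarting Key Trustee Server when a cluster is restarted.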

For more information about encrypting data at rest in EDH, see Encrypting Data at Rest.

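For production environments in general, or if you have enabled high availability for HDFS and are using data-at-rest encryption, Cloudera recommends that you enable high availability for Key Trustee Server and Key Trustee KMS.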
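
The high availability recommendation for Key Trustee Server and Key Trustee KMS reduces to a simple condition. It is written out here as a small Python sketch. The function and parameter names are hypothetical and only restate the rule in the paragraph above.

```python
def recommend_key_trustee_ha(production, hdfs_ha, encryption_at_rest):
    """Return True when HA for Key Trustee Server and Key Trustee KMS is recommended.

    Restates the rule above: any production environment, or a cluster that
    combines HDFS high availability with data-at-rest encryption.
    """
    return bool(production or (hdfs_ha and encryption_at_rest))
```

A development cluster with HDFS high availability but no encryption would return `False`, while any production cluster returns `True`.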

