For this build, the machines run CentOS 7.4 (Red Hat family). What follows is a running log of the setup process and the things to watch out for.
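Step1:
Configure hostname resolution. Since the cluster has only three machines, I simply used /etc/hosts:
Set the hostname:
    hostnamectl set-hostname ryze-1.bigdata.com
Then add the FQDN and short name of every machine to /etc/hosts, and test that the machines can reach each other:
    x.x.x.x ryze-1.bigdata.com ryze-1
    x.x.x.x zed-1.bigdata.com zed-1
    x.x.x.x zed-2.bigdata.com zed-2
Configure /etc/sysconfig/network (foo-1.example.com is a sample value; substitute each host's own FQDN):
    HOSTNAME=foo-1.example.com
Verify the configuration: uname -a must report the same name as hostname.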
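The /etc/hosts entries above all have the same shape (IP, FQDN, short name), so they can be generated instead of typed. A minimal sketch, assuming the short name is simply the first label of the FQDN; the helper name `hosts_line` and the 10.0.0.x addresses are made up for illustration:

```shell
#!/bin/sh
# Print one /etc/hosts line: "<ip> <fqdn> <short name>".
# Assumption: the short name is the FQDN up to its first dot.
hosts_line() {
    printf '%s %s %s\n' "$1" "$2" "${2%%.*}"
}

# Placeholder addresses; substitute the real ones.
hosts_line 10.0.0.1 ryze-1.bigdata.com
hosts_line 10.0.0.2 zed-1.bigdata.com
hosts_line 10.0.0.3 zed-2.bigdata.com
```

The output can be reviewed first and then appended to /etc/hosts on each machine, for example by piping it through `sudo tee -a /etc/hosts`.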
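Step2:
Disable the firewall
    # Save the current rules, then disable and stop the firewall
    iptables-save > /root/firewall.rules
    sudo chkconfig iptables off
    sudo service iptables stop
On a stock CentOS 7 install the firewall service is firewalld rather than iptables; in that case the equivalents are sudo systemctl stop firewalld and sudo systemctl disable firewalld.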
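Step3:
Enable the NTP service
    yum install ntp
    # Set the time servers to sync against in /etc/ntp.conf.
    # I am on Aliyun, where this edit can probably be skipped: after installing ntp
    # I checked the file and it already listed many Aliyun NTP nodes.
    server 0.pool.ntp.org
    server 1.pool.ntp.org
    server 2.pool.ntp.org
    # Start the NTP service
    sudo systemctl start ntpd
    # Enable the NTP service at boot
    sudo systemctl enable ntpd
    # Sync the time against a given server
    ntpdate -u <ntp_server>
    # Write the system time to the hardware clock
    hwclock --systohc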
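To confirm that ntpd has actually locked onto a server, one option is to look for the peer that `ntpq -p` marks with a leading `*` (the currently selected peer). A small sketch; `synced_peer` is a hypothetical helper that reads `ntpq -p` output on stdin:

```shell
#!/bin/sh
# Print the peer that ntpd has selected for synchronisation.
# In `ntpq -p` output that peer's line starts with '*'.
# Exits non-zero when no peer has been selected yet.
synced_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# On a real host:
#   ntpq -p | synced_peer
```

Right after ntpd starts there is usually no selected peer yet, so a non-zero exit during the first few minutes is not necessarily a problem.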
Once every machine has been configured as above, we can move on to the installation itself.
Step1:
Cloudera provides a dedicated package repository; first download its repo file:
    sudo wget https://archive.cloudera.com/cm6/6.0.1/redhat7/yum/cloudera-manager.repo -P /etc/yum.repos.d/
    # Import the repository signing GPG key
    sudo rpm --import https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/RPM-GPG-KEY-cloudera
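The two URLs above mix release numbers (6.0.1 for the repo file, 6.0.0 for the key). One way to keep them in step is to derive both from a single variable. This is only a sketch: it assumes the key is published under the same versioned path as the repo file, and `cm_url` is a made-up helper name.

```shell
#!/bin/sh
# Build a URL under the versioned Cloudera Manager 6 yum directory.
# Assumption: every release uses the layout cm6/<version>/redhat7/yum/<file>.
cm_url() {
    printf 'https://archive.cloudera.com/cm6/%s/redhat7/yum/%s\n' "$1" "$2"
}

CM_VERSION=6.0.1
cm_url "$CM_VERSION" cloudera-manager.repo
cm_url "$CM_VERSION" RPM-GPG-KEY-cloudera
```

The printed URLs can then be passed to `wget` and `rpm --import` respectively.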
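Step2: Install JDK 1.8 for CDH 6.0.x
    # Install with yum
    sudo yum install oracle-j2sdk1.8
    # Or download the package yourself and install it manually
    tar xvfz /path/to/jdk-8u<update_version>-linux-x64.tar.gz -C /usr/java/
Note that the install path must not be changed.
Installing with yum from the Cloudera repository gives you 1.8u141, together with one supplementary file (the JCE policy file, as the documentation excerpt explains):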
- The RHEL-compatible and Ubuntu operating systems supported by Cloudera Enterprise 6 all use AES-256 encryption by default for tickets. To support AES-256 bit encryption in JDK versions lower than 1.8u161, you must install the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy File on all cluster and Hadoop user machines. Cloudera Manager can automatically install the policy files, or you can install them manually. For JCE Policy File installation instructions, see the README.txt file included in the jce_policy-x.zip file. JDK 1.8u161 and higher enable unlimited strength encryption by default, and do not require policy files.
- On SLES platforms, do not install or try to use the IBM Java version bundled with the SLES distribution. CDH does not run correctly with that version.
Note that JDK 1.8u161 and later are not affected, since they enable unlimited-strength encryption by default.
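The 1.8u161 cut-off from the excerpt above is easy to check mechanically. A sketch, assuming JDK 8 version strings of the form `1.8.0_<update>` with an optional suffix such as `-cloudera`; `needs_jce_policy` is a hypothetical helper, not part of any Cloudera tooling:

```shell
#!/bin/sh
# Succeed (exit 0) when a JDK 8 version string such as "1.8.0_141" or
# "1.8.0_141-cloudera" is older than update 161 and therefore needs the
# JCE policy files for AES-256.
needs_jce_policy() {
    local update
    update="${1##*_}"            # "141-cloudera"
    update="${update%%[!0-9]*}"  # "141"
    [ -n "$update" ] && [ "$update" -lt 161 ]
}

if needs_jce_policy "1.8.0_141-cloudera"; then
    echo "JCE policy files required"
fi
```

The version string could be taken from the JDK directory name under /usr/java/ on each host.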
Step3:
Install Cloudera Manager on the target machine and configure TLS
    sudo yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
Warning: if you do not have certificates, do not configure this; otherwise the agents will later be unable to report back to the server.
Then configure Auto-TLS:
    sudo JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera /opt/cloudera/cm-agent/bin/certmanager setup --configure-services
That's it! When you start Cloudera Manager Server, it will have TLS enabled, and all hosts that you add to the cluster, as well as any supported services, will automatically have TLS configured and enabled.
Step4: Install the database used by CM
    # Install the MySQL packages
    wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
    sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
    sudo yum update
    sudo yum install mysql-server
    sudo systemctl start mysqld
    # Then begin configuring the database
    sudo systemctl stop mysqld
    # Back up the log files: move the old InnoDB log files /var/lib/mysql/ib_logfile0 and
    # /var/lib/mysql/ib_logfile1 out of /var/lib/mysql/ to a backup location
Edit the database configuration file /etc/my.cnf:
    [mysqld]
    datadir=/var/lib/mysql
    socket=/var/lib/mysql/mysql.sock
    transaction-isolation = READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    symbolic-links = 0
    key_buffer_size = 32M
    max_allowed_packet = 32M
    thread_stack = 256K
    thread_cache_size = 64
    query_cache_limit = 8M
    query_cache_size = 64M
    query_cache_type = 1
    max_connections = 550
    #expire_logs_days = 10
    #max_binlog_size = 100M
    #log_bin should be on a disk with enough free space.
    #Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
    #system and chown the specified folder to the mysql user.
    log_bin=/var/lib/mysql/mysql_binary_log
    #In later versions of MySQL, if you enable the binary log and do not set
    #a server_id, MySQL will not start. The server_id must be unique within
    #the replicating group.
    server_id=1
    binlog_format = mixed
    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M
    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit = 2
    innodb_log_buffer_size = 64M
    innodb_buffer_pool_size = 4G
    innodb_thread_concurrency = 8
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 512M
    [mysqld_safe]
    log-error=/var/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid
    sql_mode=STRICT_ALL_TABLES
Then enable and restart the service:
    # Enable MySQL at boot
    sudo systemctl enable mysqld
    # Start MySQL again
    sudo systemctl start mysqld
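The "move the old InnoDB log files" step above is described but not spelled out as a command. A minimal sketch of it follows; `backup_innodb_logs` is a hypothetical helper and the backup directory in the usage comment is a placeholder:

```shell
#!/bin/sh
# Move ib_logfile0 and ib_logfile1 from a MySQL data directory to a backup
# directory. Run it only while mysqld is stopped.
backup_innodb_logs() {
    local datadir backup f
    datadir="$1"
    backup="$2"
    mkdir -p "$backup" || return 1
    for f in ib_logfile0 ib_logfile1; do
        if [ -f "$datadir/$f" ]; then
            mv "$datadir/$f" "$backup/" || return 1
        fi
    done
}

# Example, as root (the backup path is a placeholder):
#   backup_innodb_logs /var/lib/mysql /root/innodb-log-backup
```

Only the two log files are touched; everything else in the data directory stays where it is.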
Install the JDBC driver
    # Download the MySQL JDBC driver from http://www.mysql.com/downloads/connector/j/5.1.html (in .tar.gz format).
    # As of the time of writing, you can download version 5.1.46 using wget as follows:
    wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
    # Extract the JDBC driver JAR file from the downloaded file. For example:
    tar zxvf mysql-connector-java-5.1.46.tar.gz
    # Copy the JDBC driver, renamed, to /usr/share/java/. If the target directory does not yet exist, create it. For example:
    sudo mkdir -p /usr/share/java/
    cd mysql-connector-java-5.1.46
    sudo cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
Create a database, with utf8 encoding, for each service that needs one:
Service | Database | User |
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
The services we will actually use here:
Cloudera Manager Server
Reports Manager
Hue
Hive Metastore Server
Cloudera Navigator Audit Server
Cloudera Navigator Metadata Server
That said, I plan to create all of them now and revisit later.
    CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
    GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
    GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
    GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
    -- The table above lists hive as the metastore user; this statement creates a user
    -- named metastore instead, so enter whichever name you created when the wizard asks.
    GRANT ALL ON metastore.* TO 'metastore'@'%' IDENTIFIED BY 'metastore';
    GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
    GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
    GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
    GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
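Because every service gets the same pair of statements, the SQL can also be generated from the database/user pairs in the table above. A sketch under two assumptions: the `db_sql` helper is made up for illustration, and the passwords are set equal to the user names only as placeholders. It follows the table, so the metastore database is granted to the hive user.

```shell
#!/bin/sh
# Emit the CREATE DATABASE / GRANT pair for one service.
# Arguments: database name, user name, password.
db_sql() {
    printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# database:user pairs taken from the table above.
# Passwords equal to the user name are placeholders; choose real ones.
for pair in scm:scm amon:amon rman:rman hue:hue metastore:hive \
            sentry:sentry nav:nav navms:navms oozie:oozie; do
    db="${pair%%:*}"
    user="${pair#*:}"
    db_sql "$db" "$user" "$user"
done
```

The output can be reviewed and then piped into `mysql -u root -p`. The `GRANT ... IDENTIFIED BY` form works on the MySQL 5.x series; MySQL 8.0 requires a separate `CREATE USER` statement.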
Step5: Set up the database
Since our database lives on a remote host, we follow this example from the documentation:
Example 2: Running the script when MySQL or MariaDB is installed on another host
This example demonstrates how to run the script on the Cloudera Manager Server host (cm01.example.com) and connect to a remote MySQL or MariaDB host (db01.example.com):
    sudo /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h db01.example.com --scm-host cm01.example.com scm scm
    Enter database password:
    JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera
    Verifying that we can write to /etc/cloudera-scm-server
    Creating SCM configuration file in /etc/cloudera-scm-server
    Executing: /usr/java/jdk1.8.0_141-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
    [ main] DbCommandExecutor INFO Successfully connected to database.
    All done, your SCM database is configured correctly!
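The output above shows the script writing /etc/cloudera-scm-server/db.properties. To double-check what it recorded, individual keys can be read back. A sketch: `get_prop` is a hypothetical helper, and the key name in the usage comment (`com.cloudera.cmf.db.host`) is an assumption inferred from the `com.cloudera.cmf.db.` prefix in the log, so check the file for the exact names.

```shell
#!/bin/sh
# Print the value of one key from a "key=value" properties file.
# Arguments: file, key. Exits non-zero if the key is absent.
get_prop() {
    awk -F= -v key="$2" '
        $1 == key { sub(/^[^=]*=/, ""); print; found = 1; exit }
        END { exit !found }
    ' "$1"
}

# On the CM host (key name is an assumption):
#   sudo sh -c '. ./get_prop.sh; get_prop /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.host'
```

Anything after the first `=` is treated as the value, so values that themselves contain `=` come back intact.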
Step6: Install CDH and the other components
    # Start cloudera-scm-server
    sudo systemctl start cloudera-scm-server
    # Follow the startup progress with this command
    sudo tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
When you see
    INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
the Cloudera Manager Admin Console is ready, and you can open it at <ip>:7180 (the default Admin Console port).
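Instead of watching `tail -f` by hand, a script can poll the log for the "Started Jetty server." line. A minimal sketch; `wait_for_line` is a made-up helper and the 300-second timeout in the usage comment is an arbitrary choice:

```shell
#!/bin/sh
# Poll a file until a fixed string appears or the timeout (seconds) expires.
# Arguments: file, text, timeout. Returns 0 if the text was seen, 1 on timeout.
wait_for_line() {
    local file text remaining
    file="$1"
    text="$2"
    remaining="$3"
    while :; do
        if grep -qF -- "$text" "$file" 2>/dev/null; then
            return 0
        fi
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
}

# On the CM host:
#   wait_for_line /var/log/cloudera-scm-server/cloudera-scm-server.log "Started Jetty server." 300
```

Note that the log file keeps lines from earlier runs, so on a restart the text may already be present from a previous startup.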
Next, choose the edition; after that you arrive at the cluster installation screen.
Specify the hosts to scan by FQDN, then run the scan.
Range Definition | Matching Hosts |
---|---|
10.1.1.[1-4] | 10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4 |
host[1-3].example.com | host1.example.com, host2.example.com, host3.example.com |
host[07-10].example.com | host07.example.com, host08.example.com, host09.example.com, host10.example.com |
Note: Unqualified hostnames (short names) must be unique in a Cloudera Manager instance. For example, you cannot have both host01.example.com and host01.standby.example.com managed by the same Cloudera Manager Server.
You can specify multiple addresses and address ranges by separating them with commas, semicolons, tabs, or blank spaces, or by placing them on separate lines. Use this technique to make more specific searches instead of searching overly wide ranges. Only scans that reach hosts running SSH will be selected for inclusion in your cluster by default. You can enter an address range that spans over unused addresses and then clear the nonexistent hosts later in the procedure, but wider ranges require more time to scan.
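To make the range syntax concrete, here is a sketch of how a single bracketed range expands into host names. It is not Cloudera Manager's implementation, only an illustration: it handles one `[start-end]` group per pattern and keeps the zero padding of the start value. Both function names are made up.

```shell
#!/bin/sh
# Drop leading zeros so that a value like "08" is not misread as octal.
strip_zeros() {
    local n
    n="$1"
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n="${n#0}"
    done
    printf '%s\n' "$n"
}

# Expand one "[start-end]" group, e.g. host[07-10].example.com.
# Patterns without such a group are printed unchanged.
expand_range() {
    local pattern prefix rest range suffix start width i last
    pattern="$1"
    case "$pattern" in
        *"["*-*"]"*) ;;
        *) printf '%s\n' "$pattern"; return 0 ;;
    esac
    prefix="${pattern%%\[*}"   # text before the first '['
    rest="${pattern#*\[}"
    range="${rest%%\]*}"       # "07-10"
    suffix="${rest#*\]}"       # text after the first ']'
    start="${range%%-*}"
    width="${#start}"          # pad every number to the width of the start value
    i=$(strip_zeros "$start")
    last=$(strip_zeros "${range#*-}")
    while [ "$i" -le "$last" ]; do
        printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}

expand_range 'host[07-10].example.com'
expand_range '10.1.1.[1-4]'
```

For this three-node cluster the explicit FQDNs are just as easy to type, but the expansion shows why a wide range takes longer to scan: every generated address is probed.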
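Click Search to start the scan; once the nodes are found, continue to the next step.
Select Repository: keep the default options and click Next.
Accept JDK License: read the license, tick the box ☑️, and click Next.
SSH settings: pick the options that fit your setup, then click Next.
Next the agents are installed. If the download is slow, use proxychains, or fetch the packages from https://archive.cloudera.com/cm6/6.0.1/redhat7/yum/RPMS/x86_64/ somewhere with a fast connection and upload them.
Next the parcels are installed. Downloads are also extremely slow from inside China, so I recommend fetching the matching packages from https://archive.cloudera.com/cdh6/6.0.1/parcels/ over a fast connection.
Then upload them to the /opt/cloudera/parcel-repo/ directory on the CM machine. You need the parcel itself, its sha1 file, and the manifest file.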
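Since the parcel is downloaded elsewhere and uploaded by hand, it is worth checking it against the sha1 file before handing it to Cloudera Manager. A sketch, assuming the sha1 file's first whitespace-separated token is the hex digest; `verify_sha1` is a hypothetical helper and the file names in the usage comment are placeholders:

```shell
#!/bin/sh
# Compare a file's SHA-1 digest with the digest stored in a companion file.
# Arguments: data file, digest file.
# Assumption: the first token on the first line of the digest file is the hex digest.
verify_sha1() {
    local expected actual
    expected=$(awk 'NR == 1 { print $1 }' "$2" | tr 'A-F' 'a-f') || return 1
    actual=$(sha1sum "$1" | awk '{ print $1 }') || return 1
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# In /opt/cloudera/parcel-repo/ (file names are placeholders):
#   verify_sha1 <parcel file> <parcel file>.sha1 && echo "checksum OK"
```

A mismatch usually means the transfer was truncated, in which case re-uploading is quicker than debugging a failed parcel distribution later.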
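Resolve the warnings as prompted:
Transparent hugepage compaction is enabled and may cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and
"echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable this setting, then add the same commands to an init script such as /etc/rc.local so that they are applied when the system reboots. The following hosts are affected: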
Starting with CDH 6, PostgreSQL-backed Hue requires the Psycopg2 version to be at least 2.5.4, see the documentation for more information.
This warning can be ignored if hosts will not run CDH 6, or will not run Hue with PostgreSQL. The following hosts have an incompatible Psycopg2 version of '2.5.1':
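For the transparent hugepage warning, the current setting can be read back after running the two `echo never` commands. The kernel shows all available modes in those files and puts the active one in square brackets, for example `always madvise [never]`. A small sketch; `thp_mode` is a made-up helper:

```shell
#!/bin/sh
# Print the active mode from a transparent hugepage sysfs file, whose content
# looks like "always madvise [never]"; the bracketed word is the active mode.
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# On each host:
#   thp_mode /sys/kernel/mm/transparent_hugepage/enabled
#   thp_mode /sys/kernel/mm/transparent_hugepage/defrag
```

Both files should report `never` once the warning has been addressed, including after a reboot if the commands were added to an init script.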
Step7: Set Up a Cluster Using the Wizard
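Select the services you want to install.
Assign roles. If the master roles are concentrated on one machine, that machine should be a relatively powerful one.
Our ryze-1, for example, is the stronger machine, so it takes on more of the master roles.
Set the databases for the services being installed. We prepared these earlier, so just enter the same details here.
Review: check whether any configuration parameters need changing.
Command Details: runs the required operations and starts the services.
Summary: the message reads "the services are installed, configured, and running on the cluster" XD.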
Reference:
https://www.cloudera.com/documentation/enterprise/6/6.0/topics/installation_reqts.html Before You Install
https://www.cloudera.com/documentation/enterprise/6/6.0/topics/install_cm_cdh.html Installing Cloudera Manager, CDH, and Managed Services
http://www.uuboku.com/444.html The difference between STRICT_TRANS_TABLES and STRICT_ALL_TABLES in MySQL's sql_mode
http://www.cnblogs.com/zhoujinyi/p/3179279.html Notes on the InnoDB O_DIRECT option, part 1 (repost)
https://www.cloudera.com/documentation/enterprise/6/6.0/topics/cm_ig_installing_configuring_dbs.html#cmig_topic_5_1 Required Databases
https://www.oschina.net/question/3964891_2287725 CDH6 agent installation fails with "unable to receive the heartbeat from the Agent"
https://archive.cloudera.com/cm6/6.0.1/redhat7/yum/RPMS/x86_64/ yum package download location
https://archive.cloudera.com/cdh6/6.0.1/parcels/ parcels download location