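The previous two articles covered the basic Docker commands and how to install Hadoop, so you should now be familiar with Docker's basic syntax and the installation process. Today, let's install Hive together.
Installation
1. Download the project from GitHub: https://github.com/prasanthj/docker-hive-on-tez. If cloning is blocked by the firewall, you can download the zip archive instead. After entering the directory you will see the following contents: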
@~/git/github/docker-hive-on-tez-master $ ls
Dockerfile  datagen.py         hive-log4j.properties  store_sales.sql
LICENSE     hive-0.14          hive-site.xml          store_sales.txt
README.md   hive-bootstrap.sh  postgresql.conf
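Step 1 can be sketched as a small shell script: try `git clone` first and fall back to the zip archive if cloning fails. The helper names `zip_url` and `fetch_repo` are hypothetical, and the branch name `master` is an assumption inferred from the directory name `docker-hive-on-tez-master` shown above.

```shell
#!/bin/sh
# Hypothetical helper: build the GitHub archive URL for a repo and branch.
zip_url() {
  printf 'https://github.com/%s/archive/%s.zip\n' "$1" "$2"
}

# Hypothetical helper: clone the repo, or download and unpack the zip
# archive if the clone fails (for example, behind a firewall).
fetch_repo() {
  repo=$1
  branch=$2
  name=${repo##*/}
  if git clone "https://github.com/$repo.git" "$name-$branch"; then
    return 0
  fi
  curl -fL -o "$name.zip" "$(zip_url "$repo" "$branch")" && unzip -q "$name.zip"
}

# Usage (needs network access, so it is left commented out):
# fetch_repo prasanthj/docker-hive-on-tez master && cd docker-hive-on-tez-master
```

Either path should leave you with a `docker-hive-on-tez-master` directory, assuming the archive unpacks under that name.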
2. Build the image:
docker build --no-cache=true -t local-hive-on-tez .
This takes a long time. Have a cup of coffee, get on with something else, and come back in a few hours...
3. Enter the container
docker --tls run -i -t -P local-hive-on-tez /etc/hive-bootstrap.sh -bash
Starting postgresql server...
/
2014-12-15 23:12:56 GMT LOG: database system was interrupted; last known up at 2014-12-15 23:10:11 GMT
2014-12-15 23:12:56 GMT LOG: database system was not properly shut down; automatic recovery in progress
2014-12-15 23:12:56 GMT LOG: redo starts at 0/1782A58
4. Check Hive
root@2c1282c522bf:/# hive -f /opt/files/store_sales.sql
Logging initialized using configuration in file:/usr/local/hive-dist/apache-hive-0.15.0-SNAPSHOT-bin/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.5.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
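5. Port mapping
If you only want to log in to the container and play around, the steps above are enough. If you also want to see how Hadoop is running, you need to open its web page, which means finding out which host port it was mapped to: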
@~/VirtualBox VMs/boot2docker-vm $ docker ps -a
CONTAINER ID   IMAGE                      COMMAND                CREATED        STATUS       PORTS   NAMES
2c1282c522bf   local-hive-on-tez:latest   "/etc/hive-bootstrap   39 hours ago   Up 6 hours   0.0.0.0:49181->8032/tcp, 0.0.0.0:49182->50075/tcp, 0.0.0.0:49183->50010/tcp, 0.0.0.0:49184->50090/tcp, 0.0.0.0:49185->8031/tcp, 0.0.0.0:49186->8040/tcp, 0.0.0.0:49187->8088/tcp, 0.0.0.0:49188->22/tcp, 0.0.0.0:49189->50020/tcp, 0.0.0.0:49190->8030/tcp, 0.0.0.0:49191->49707/tcp, 0.0.0.0:49192->8033/tcp, 0.0.0.0:49193->8042/tcp, 0.0.0.0:49194->50070/tcp   kickass_galileo
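In the output above, host port 49194 maps to port 50070 inside the container.
That is not quite enough: you also need to know the IP address of your boot2docker VM. The default is 192.168.59.103.
Then open http://192.168.59.103:49194/ in your browser and you will see Hadoop's status page.
Notes:
Why download the project from GitHub?
Because you need its Dockerfile. It tells Docker that the image is based on docker-tez, and then downloads and installs Hive inside the image. Its contents are as follows: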
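The lookup in step 5 can be sketched as a small shell helper that pulls the host port for a given container port out of the PORTS column. The function name `host_port_for` is hypothetical, and the parsing assumes the `ip:hostport->containerport/tcp` format shown in the `docker ps` output above.

```shell
#!/bin/sh
# Hypothetical helper: given the PORTS column from `docker ps` and a
# container port, print the host port it is mapped to.
host_port_for() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -n "s/^ *[0-9.]*:\([0-9]*\)->$2\/tcp\$/\1/p"
}

# Two of the mappings from the listing above.
ports='0.0.0.0:49193->8042/tcp, 0.0.0.0:49194->50070/tcp'
host_port_for "$ports" 50070   # prints 49194

# With Docker and boot2docker available, the same URL could be built as
# (left commented out because it needs a running container):
# port=$(docker port 2c1282c522bf 50070 | sed 's/.*://')
# echo "http://$(boot2docker ip):$port/"
```

Because `-P` assigns host ports dynamically, the number will likely differ on each run, so looking it up is safer than hard-coding 49194.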
# declares the base image this build depends on; the install commands follow
FROM prasanthj/docker-tez:tez-0.5.2
MAINTAINER Prasanth Jayachandran

# to configure postgres as hive metastore backend
RUN apt-get update
RUN apt-get -yq install vim postgresql-9.3 libpostgresql-jdbc-java

# having ADD commands will invalidate the cache forcing hive build from trunk everytime
# copy config, sql, data files to /opt/files
RUN mkdir /opt/files
ADD hive-site.xml /opt/files/
ADD hive-log4j.properties /opt/files/
ADD store_sales.* /opt/files/
ADD datagen.py /opt/files/

# clone and compile hive
ENV HIVE_VERSION 0.15.0-SNAPSHOT
# in China this clone may be blocked by the firewall, so it can be very slow
RUN cd /usr/local && git clone https://github.com/apache/hive.git
RUN cd /usr/local/hive && /usr/local/maven/bin/mvn clean install -DskipTests -Phadoop-2,dist
RUN mkdir /usr/local/hive-dist && tar -xf /usr/local/hive/packaging/target/apache-hive-${HIVE_VERSION}-bin.tar.gz -C /usr/local/hive-dist

# set hive environment
ENV HIVE_HOME /usr/local/hive-dist/apache-hive-${HIVE_VERSION}-bin
ENV HIVE_CONF $HIVE_HOME/conf
ENV PATH $HIVE_HOME/bin:$PATH
ADD hive-site.xml $HIVE_CONF/hive-site.xml
ADD hive-log4j.properties $HIVE_CONF/hive-log4j.properties

# zookeeper pulls jline 0.9.94 and hive pulls jline2. This workaround is from HIVE-8609
RUN rm $HADOOP_PREFIX/share/hadoop/yarn/lib/jline-0.9.94.jar

# add postgresql jdbc jar to classpath
RUN ln -s /usr/share/java/postgresql-jdbc4.jar $HIVE_HOME/lib/postgresql-jdbc4.jar

# set permissions for hive bootstrap file
ADD hive-bootstrap.sh /etc/hive-bootstrap.sh
RUN chown root:root /etc/hive-bootstrap.sh
RUN chmod 700 /etc/hive-bootstrap.sh

# to avoid psql asking password, set PGPASSWORD
ENV PGPASSWORD hive

# the following sets up postgresql, which is very popular outside China
# To overcome the bug in AUFS that denies postgres permission to read /etc/ssl/private/ssl-cert-snakeoil.key file.
# https://github.com/Painted-Fox/docker-postgresql/issues/30
# https://github.com/docker/docker/issues/783
# To avoid this issue lets disable ssl in postgres.conf.
# If we really need ssl to encrypt postgres connections we have to fix permissions to /etc/ssl/private directory everytime until AUFS fixes the issue
ENV POSTGRESQL_MAIN /var/lib/postgresql/9.3/main/
ENV POSTGRESQL_CONFIG_FILE $POSTGRESQL_MAIN/postgresql.conf
ENV POSTGRESQL_BIN /usr/lib/postgresql/9.3/bin/postgres
ADD postgresql.conf $POSTGRESQL_MAIN
RUN chown postgres:postgres $POSTGRESQL_CONFIG_FILE

USER postgres
# create metastore db, hive user and assign privileges
RUN /etc/init.d/postgresql start &&\
    psql --command "CREATE DATABASE metastore;" &&\
    psql --command "CREATE USER hive WITH PASSWORD 'hive';" && \
    psql --command "ALTER USER hive WITH SUPERUSER;" && \
    psql --command "GRANT ALL PRIVILEGES ON DATABASE metastore TO hive;" && \
    cd $HIVE_HOME/scripts/metastore/upgrade/postgres/ &&\
    psql -h localhost -U hive -d metastore -f hive-schema-0.15.0.postgres.sql

# revert back to root user
USER root
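To see how the `ENV` lines in the Dockerfile fit together, the same variable substitutions can be reproduced in plain shell. This is only an illustrative sketch: it copies the values from the Dockerfile and shows that they resolve to the hive-log4j.properties path that appears in the log output of step 4.

```shell
#!/bin/sh
# Values copied from the ENV lines of the Dockerfile.
HIVE_VERSION=0.15.0-SNAPSHOT
HIVE_HOME=/usr/local/hive-dist/apache-hive-${HIVE_VERSION}-bin
HIVE_CONF=$HIVE_HOME/conf

# Prints the config path that hive reports when it initializes logging:
# /usr/local/hive-dist/apache-hive-0.15.0-SNAPSHOT-bin/conf/hive-log4j.properties
echo "$HIVE_CONF/hive-log4j.properties"
```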
References:
https://github.com/prasanthj/docker-hive-on-tez
https://github.com/prasanthj/docker-hadoop
https://github.com/prasanthj/docker-tez