Setting up a GeoMesa SparkSQL analysis environment
1. Install HBase 1.3.2.1 (standalone) as the GeoMesa store
a. Edit the configuration file hbase-1.3.2.1/conf/hbase-site.xml:
<property>
<name>hbase.rootdir</name>
<value>/home/qingzhi.lzp/hbase-1.3.2.1/data</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/tmp/zookeeper</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
<description>
Disable the stream capability (hflush/hsync) check so HBase can run against a local filesystem rootdir.
</description>
</property>
<property>
<name>hbase.coprocessor.user.region.classes</name>
<value>org.locationtech.geomesa.hbase.coprocessor.GeoMesaCoprocessor</value>
</property>
b. Edit hbase-env.sh and add the JAVA_HOME setting:
export JAVA_HOME=path
c. Copy geomesa-hbase-distributed-runtime_2.11-2.0.2.jar into HBase's lib directory:
cp geomesa-hbase-distributed-runtime_2.11-2.0.2.jar ~/hbase-1.3.2.1/lib/
d. Start HBase:
cd hbase-1.3.2.1/bin
./start-hbase.sh
2. Install ZooKeeper 3.4.10 (standalone)
a. Create the ZooKeeper configuration:
cd zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo.cfg
b. Start ZooKeeper:
cd zookeeper-3.4.10/bin
./zkServer.sh start
c. Verify that HBase has registered itself in ZooKeeper:
./zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hbase]
3. Install the command-line tools from geomesa-hbase_2.11-2.0.2-bin.tar.gz
a. Enter the directory and run the install scripts:
cd geomesa-hbase_2.11-2.0.2
$ bin/install-jai.sh
$ bin/install-jline.sh
b. Ingest GDELT data into HBase (the JAI registry errors in the output below are harmless and do not affect the ingest):
bin/geomesa-hbase ingest --catalog gdeltable --feature-name gdelt --converter gdelt2 --spec gdelt2 /home/qingzhi.lzp/20180101.tsv
Error while parsing JAI registry file "file:/home/qingzhi.lzp/hbase-1.3.2.1/lib/geomesa-hbase-distributed-runtime_2.11-2.0.2.jar!/META-INF/registryFile.jai" :
Error in registry file at line number #31
A descriptor is already registered against the name "org.geotools.ColorReduction" under registry mode "rendered"
Error in registry file at line number #32
A descriptor is already registered against the name "org.geotools.ColorInversion" under registry mode "rendered"
INFO Creating schema 'gdelt'
INFO Running ingestion in local mode
INFO Ingesting 1 file with 1 thread
[============================================================] 100% complete 79119 ingested 0 failed in 00:00:12
INFO Local ingestion complete in 00:00:12
INFO Ingested 79119 features with no failures.
c. Inspect the data in HBase (a programmatic sanity check in Scala follows the shell output):
hbase(main):001:0> list
TABLE
gdeltable
gdeltable_gdelt_id
gdeltable_gdelt_z2_v2
gdeltable_gdelt_z3_v2
4 row(s) in 0.2630 seconds
=> ["gdeltable", "gdeltable_gdelt_id", "gdeltable_gdelt_z2_v2", "gdeltable_gdelt_z3_v2"]
hbase(main):002:0>
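Besides the HBase shell, the catalog can also be opened through the GeoTools DataStore API as a quick sanity check. The following is a minimal Scala sketch, not part of the original walkthrough; it assumes the geomesa-hbase-datastore classes and hbase-site.xml are on the classpath, and that "hbase.catalog" / "hbase.zookeepers" are the connection parameter names in this GeoMesa version:

import scala.collection.JavaConverters._
import org.geotools.data.DataStoreFinder

// Connection parameters for the catalog created by the ingest above
// (parameter names assumed from GeoMesa 2.0.x; adjust if your version differs).
val params = Map[String, java.io.Serializable](
  "hbase.catalog"    -> "gdeltable",
  "hbase.zookeepers" -> "localhost"
).asJava

val ds = DataStoreFinder.getDataStore(params)
println(ds.getTypeNames.mkString(", "))   // expected to include "gdelt"
ds.dispose()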
At this point the data import is complete; the rest of the walkthrough uses Spark to analyze the ingested data.
4. Install Spark
Download spark-2.3.1-bin-hadoop2.7 and extract it.
There is no need to start any Spark server; just create a symlink (the link name should match the SPARK_HOME below): ln -s spark-2.3.1-bin-hadoop2.7 spark
Configure SPARK_HOME in .bashrc (a sketch of a GeoMesa-ready Spark session follows the exports):
export SPARK_HOME=/home/qingzhi.lzp/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
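To use GeoMesa from spark-shell or a standalone job, the GeoMesa Spark runtime jar shipped with the binary distribution (e.g. dist/spark/geomesa-hbase-spark-runtime_2.11-2.0.2.jar) has to be on the Spark classpath, and GeoMesa's Kryo registrator should be enabled. A minimal Scala sketch, assuming that jar has been added via --jars or spark.jars (this step is not spelled out in the original text):

import org.apache.spark.sql.SparkSession

// Local SparkSession prepared for GeoMesa: Kryo serialization plus the
// GeoMesa registrator so SimpleFeature types can be shipped between executors.
val spark = SparkSession.builder()
  .appName("geomesa-gdelt")
  .master("local[*]")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.kryo.registrator", "org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator")
  .getOrCreate()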
5. Install Hadoop
Download hadoop-3.0.3.tar.gz and extract it.
Edit the configuration file hadoop-3.0.3/etc/hadoop/core-site.xml and add:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit the configuration file hadoop-3.0.3/etc/hadoop/hdfs-site.xml and add:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Edit the configuration file hadoop-3.0.3/etc/hadoop/hadoop-env.sh and add:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.1.alios7.x86_64/jre
Configure HADOOP_HOME in .bashrc:
export HADOOP_HOME=/home/qingzhi.lzp/hadoop-3.0.3
export PATH=$PATH:$HADOOP_HOME/bin
6. Install the Zeppelin visualization tool
a. Install zeppelin-0.8.0-bin-all.tgz
Extract it and start it directly:
zeppelin-0.8.0-bin-all/bin/zeppelin-daemon.sh start
b. Open the Zeppelin web UI and configure it
Configure the Spark interpreter:
c. Use Spark for the analysis:
Query the HBase table data:
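A minimal Scala sketch of a notebook paragraph that queries the ingested GDELT data through GeoMesa's Spark SQL data source. It is an illustration, not the original notebook: it assumes geomesa-hbase-spark-runtime_2.11-2.0.2.jar has been added as a dependency of the Zeppelin Spark interpreter (or passed to spark-shell via --jars), that "hbase.catalog" is the connection parameter name for this version, and that the feature type uses the default geometry attribute geom (check with geomesa-hbase describe-schema if unsure):

// In Zeppelin the SparkSession is already available as `spark`.
val dsParams = Map(
  "hbase.catalog"    -> "gdeltable",   // catalog used during ingest
  "hbase.zookeepers" -> "localhost"    // usually optional when hbase-site.xml is on the classpath
)

// Load the 'gdelt' feature type as a DataFrame through the GeoMesa data source.
val gdelt = spark.read
  .format("geomesa")
  .options(dsParams)
  .option("geomesa.feature", "gdelt")
  .load()

gdelt.createOrReplaceTempView("gdelt")

// Row count, then a spatial filter using GeoMesa's SQL functions.
spark.sql("SELECT count(*) FROM gdelt").show()
spark.sql(
  "SELECT * FROM gdelt " +
  "WHERE st_contains(st_makeBBOX(-80.0, 35.0, -70.0, 40.0), geom)"
).show(10)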