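Problem description
- Data is extracted from Hive over JDBC, so the Maven project includes the Hive dependencies.
- The data is loaded into Elasticsearch 2.3.1, which uses Guava 18 or later.
- The Guava versions pulled in by Hive and by Elasticsearch conflict.
- Symptom: java.lang.NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor; (directExecutor() was added in Guava 18, so this error means an older Guava was loaded instead.)
Solution
- Relocate (rename) the conflicting libraries inside Elasticsearch and repackage it.
- Use the repackaged ES library in the new project.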
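To confirm which Guava actually wins on the classpath, a small diagnostic along these lines may help. It is only a sketch: the class name ClassOrigin and its helper methods are hypothetical, and it has to be run with the same classpath as the failing job to be meaningful.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClassOrigin {
    /** Returns the jar or directory a class was loaded from, or "bootstrap" for JDK classes. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null) ? "bootstrap" : src.getLocation().toString();
    }

    /** Returns true if the class exposes a public method with the given name. */
    static boolean hasMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults match the class and method named in the NoSuchMethodError.
        String className = args.length > 0 ? args[0] : "com.google.common.util.concurrent.MoreExecutors";
        String methodName = args.length > 1 ? args[1] : "directExecutor";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println(className + " loaded from " + locationOf(cls));
            System.out.println(methodName + " present: " + hasMethod(cls, methodName));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the reported location is a Guava jar older than 18 and the method is reported as absent, the conflict described above is the likely cause.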
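Method 1: Shade and relocate
Overview
- To keep the libraries Elasticsearch depends on from clashing with other dependencies, relocate the conflicting ones to new package names so that neither copy overrides the other.
- Upgrading libraries in a production Hadoop environment is inconvenient, so relocating them with the Maven shade plugin is the more dependable option.
Shade Elasticsearch
This step shades the Elasticsearch libraries the project depends on: create a new Maven project, add the Elasticsearch dependencies, relocate the conflicting packages, and build a new jar.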
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>my.elasticsearch</groupId>
  <artifactId>es-shaded</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <elasticsearch.version>2.3.1</elasticsearch.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch</artifactId>
      <version>${elasticsearch.version}</version>
    </dependency>
    <dependency>
      <groupId>org.elasticsearch.plugin</groupId>
      <artifactId>shield</artifactId>
      <version>${elasticsearch.version}</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.1</version>
        <configuration>
          <createDependencyReducedPom>false</createDependencyReducedPom>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <relocations>
                <relocation>
                  <pattern>com.google.guava</pattern>
                  <shadedPattern>my.elasticsearch.guava</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>org.joda</pattern>
                  <shadedPattern>my.elasticsearch.joda</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>com.google.common</pattern>
                  <shadedPattern>my.elasticsearch.common</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>com.google.thirdparty</pattern>
                  <shadedPattern>my.elasticsearch.thirdparty</shadedPattern>
                </relocation>
              </relocations>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer" />
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <repositories>
    <repository>
      <id>elasticsearch-releases</id>
      <url>http://maven.elasticsearch.org/releases</url>
      <releases>
        <enabled>true</enabled>
        <updatePolicy>daily</updatePolicy>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
  </repositories>
</project>
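Before using the jar elsewhere, it can be worth checking that the relocation took effect. The sketch below counts jar entries under a given path prefix. The class name RelocationCheck is hypothetical, and the default path target/es-shaded-1.0-SNAPSHOT.jar is an assumption based on Maven's usual naming for the artifactId and version above.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class RelocationCheck {
    /** Counts the entries in the jar whose path starts with the given prefix. */
    static int countEntries(String jarPath, String prefix) throws IOException {
        int count = 0;
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed output location of the shaded build; pass another path as the first argument.
        String jarPath = args.length > 0 ? args[0] : "target/es-shaded-1.0-SNAPSHOT.jar";
        // Entries under the shadedPattern for com.google.common should exist ...
        System.out.println("relocated: " + countEntries(jarPath, "my/elasticsearch/common/"));
        // ... and none should remain under the original Guava package.
        System.out.println("original:  " + countEntries(jarPath, "com/google/common/"));
    }
}
```

A non-zero count for the relocated prefix together with a zero count for com/google/common/ suggests the Guava classes were renamed as intended.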
Use the shaded ES jar
In the new project, add the ES package built in the previous step:
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>
<dependency>
  <groupId>my.elasticsearch</groupId>
  <artifactId>es-shaded</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>
Reference: https://www.elastic.co/blog/to-shade-or-not-to-shade
Method 2: Change the cluster's job classpath loading policy (untested)
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>