背景:
監聽器針對RabbitMQ隊列做業務數據處理
系統問題表現:
業務系統無法正常使用,所有請求均不予相應,報404異常
控制台問題表現:
接收隊列數據的logger日志打印,但是相關sql不打印(之前sql打印)
報錯異常:
dbcp連接池(開始使用)
[WARN ] 19:01:05.762 [SimpleAsyncTaskExecutor-1] o.h.util.JDBCExceptionReporter - SQL Error: 1040, SQLState: 08004
[ERROR] 19:01:05.762 [SimpleAsyncTaskExecutor-1] o.h.util.JDBCExceptionReporter - Data source rejected establishment of connection, message from server: "Too many connections"
Caused by: org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is org.hibernate.exception.JDBCConnectionException: Cannot open connection
Caused by: org.hibernate.exception.JDBCConnectionException: Cannot open connection
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Data source rejected establishment of connection, message from server: "Too many connections"
druid連接池(為方便監控,后來修改采用,后續的測試分析均采用druid連接池后所得)
[WARN ] 17:35:23.659 [SimpleAsyncTaskExecutor-1] o.h.util.JDBCExceptionReporter - SQL Error: 0, SQLState: null
[ERROR] 17:35:23.659 [SimpleAsyncTaskExecutor-1] o.h.util.JDBCExceptionReporter - wait millis 10000, active 152, maxActive 500, creating 0
[ERROR] 17:35:31.106 [QuartzScheduler_scheduler-LAPTOP-B40EFQQR1565686921066_ClusterManager] o.s.s.quartz.LocalDataSourceJobStore - ClusterManager: Error managing cluster: Failed to obtain DB connection from data source 'springNonTxDataSource.scheduler': com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 10000, active 152, maxActive 500, creating 0
org.quartz.JobPersistenceException: Failed to obtain DB connection from data source 'springNonTxDataSource.scheduler': com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 10000, active 152, maxActive 500, creating 0
Caused by: com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 10000, active 152, maxActive 500, creating 0
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Data source rejected establishment of connection, message from server: "Too many connections"
[WARN ] 17:35:43.742 [SimpleAsyncTaskExecutor-1] o.s.a.r.l.SimpleMessageListenerContainer - Execution of Rabbit message listener failed, and no ErrorHandler has been set.
org.springframework.amqp.rabbit.listener.ListenerExecutionFailedException: Listener threw exception
Caused by: org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is org.hibernate.exception.GenericJDBCException: Cannot open connection
Caused by: org.hibernate.exception.GenericJDBCException: Cannot open connection
Caused by: com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 10000, active 152, maxActive 500, creating 0
監聽器核心代碼:
protected void doBusiness(String msg, String queueName) {
List<WeldingDto> list;
list = new LinkedList<>();
//校驗數據合法性
super.validatePattern(msg, list, WeldingDto.class);
if (!list.isEmpty()) {
//保存wip表相關信息
WeldingDto weldingDto = list.get(0);
EquipDto equipDto = this.wipTempService.getEquipDto(weldingDto.getEquipCode(), null);
List<WorkOrder> workingOrderList = this.workOrderDao.getWorkingOrderListByEquip(equipDto.getId());
//注釋部分業務代碼,不影響問題分析
}
}
問題原因分析:
雖然druid"活躍連接數"小於"最大連接數",但是"活躍連接數"已達到mysql數據庫配置的最大連接數(SHOW STATUS LIKE 'Max_used_connections';);
邏輯連接打開次數 > 邏輯連接關閉次數 說明:連接泄露;
最終導致池中連接數被耗盡,池中連接數變為0;
佐證問題測試發現:
將隊列清空,然后每丟一條數據,連接泄露數+N
排查方案:
采用druid連接池的超時回收機制,配置如下:
<!-- 超過時間限制是否回收 -->
<property name="removeAbandoned" value="true" />
<!-- 超時時間,單位為秒 -->
<property name="removeAbandonedTimeout" value="20" />
<!-- 關閉abanded連接時輸出錯誤日志 -->
<property name="logAbandoned" value="true" />
添加以上配置后,可以在druid監視器的"數據源datasource"的"活躍連接堆棧查看ActiveConnection StackTrace"
看到異常日志,異常日志可以看出連接占用未關閉的代碼位置。
注:該配置影響系統性能,建議排查問題時打開,系統運行時關閉。
異常日志如下:
java.lang.Thread.getStackTrace(Thread.java:1559)
com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1209)
com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1135)
com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1125)
com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:104)
org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:83)
org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446)
org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167)
org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:161)
org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1616)
org.hibernate.loader.Loader.doQuery(Loader.java:717)
org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:270)
org.hibernate.loader.Loader.doList(Loader.java:2449)
org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2192)
org.hibernate.loader.Loader.list(Loader.java:2187)
org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:452)
org.hibernate.hql.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:363)
org.hibernate.engine.query.HQLQueryPlan.performList(HQLQueryPlan.java:196)
org.hibernate.impl.SessionImpl.list(SessionImpl.java:1258)
org.hibernate.impl.QueryImpl.list(QueryImpl.java:102)
com.wisdytech.linkmes.plan.dao.impl.WorkOrderDaoImpl.getWorkingOrderListByEquip(WorkOrderDaoImpl.java:331)
com.wisdytech.linkmes.msghandler.listeners.business.WeldingQueueListener.doBusiness(WeldingQueueListener.java:109)
com.wisdytech.linkmes.msghandler.listeners.CommonQueueListener.onMessage(CommonQueueListener.java:70)
org.springframework.amqp.rabbit.listener.adapter.MessageListenerAdapter.onMessage(MessageListenerAdapter.java:334)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:546)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:472)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$001(SimpleMessageListenerContainer.java:61)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$1.invokeListener(SimpleMessageListenerContainer.java:111)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.invokeListener(SimpleMessageListenerContainer.java:627)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:454)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:480)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:464)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$300(SimpleMessageListenerContainer.java:61)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:558)
java.lang.Thread.run(Thread.java:748)
找到產生問題的指定代碼位置后,分析代碼發現getSession()創建后沒有對該創建session關閉。
解決方案1:手動關閉
Session session = this.getSession();
org.hibernate.Query query = session.createQuery(hql.toString());
List<XXX> list = query.list();
session.close();
解決方案2:使用回調機制,通過spring關閉
this.getHibernateTemplate().executeFind(new HibernateCallback<Object>() {
@Override
public Object doInHibernate(Session session) throws HibernateException {
org.hibernate.Query query = session.createQuery(hql.toString());//輸入參數類型final
List<XXX> list1 = query.list();
return list1;
}
});
//下面兩個方法自帶回調
this.getHibernateTemplate().find(hql.toString());
this.getHibernateTemplate().get(XXX.class,"id");
getSession() | 無 | 需手動關閉 |
---|---|---|
getHibernateTemplate() | 經過spring封裝添加相應的聲明式事務管理 | spring管理 |
注:建議使用getHibernateTemplate(),針對getHibernateTemplate()未提供的方法,可以使用HibernateCallback回調方法管理數據庫,代碼如上。
解決方案3:加入事務
本次問題是在監聽器中調用dao層方法,dao層沒有加入事務;可以在監聽器最外層套一層有事務的service方法或者給dao層添加事務。