關於 Mybatis的原生連接池 和 DBCP 連接池


一 遇到的問題:  項目用的play框架,數據庫DB2, 持久化框架是Mybatis, 連接池用的是Mybatis原生的,遇到的問題是:有時候拋出如下異常:

play.api.UnexpectedException: Unexpected exception[CompletionException: org.apache.ibatis.exceptions.PersistenceException:
### Error querying database.  Cause: com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2030][11211][4.23.42] A communication error occurred during operations on the connection's underlying socket, socket input stream,
or socket output stream.  Error location: Reply.fill() - socketInputStream.read (-1).  Message: Connection timed out (Write failed). ERRORCODE=-4499, SQLSTATE=08001
### The error may exist in mapper/mapper.xml
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### SQL: select tmpifro.* from (SELECT         MAX(nvl(ifm.opportunity_id,''))            AS opportunity_name,         MAX(nvl(IFM.restricted, ''))               AS restricted,         MAX(nvl(ifc.client_id, ''))                AS client_id,         MAX(nvl(ifc.disp_name, ''))                AS client_name,         MAX(nvl(ifc.disp_alt_name, ''))            AS client_alt_name,         MAX(nvl(ifc.issuing_iot_name, ''))         AS issuing_iot,         MAX(ifm.oppty_close_date)                  AS decision_date,         MAX(nvl(ifss.sales_stage_name, ''))        AS opportunity_sales_stage,         MAX(nvl(ifss.sales_stage_id, ''))          AS opportunity_sales_stage_id,         nvl(ifm.opportunity_id, '')                AS opportunity_id,         nvl(sum(ifrm.manager_amount_usd), 0)       AS total_ts_amount,         nvl(sum(ifrm.manager_won_amount), 0)         + nvl(sum(ifrm.manager_sol_amount), 0)         + nvl(sum(ifrm.manager_atr_amount), 0)     AS manager_won_solid_atrisk,         nvl(sum(ifrm.manager_key_amount), 0)       AS man_key_stretch,         nvl(sum(ifrm.manager_str_amount), 0)       AS man_stretch,         nvl(sum(ifrm.manager_won_amount), 0)       AS man_won,         nvl(sum(ifrm.manager_sol_amount), 0)       AS man_solid,         nvl(sum(ifrm.manager_atr_amount), 0)       AS man_atrisk,         nvl(sum(ifrm.manager_nir_amount), 0)       AS man_nir,         nvl(sum(ifrm.manager_cq_amount), 0)        AS manager_cq_amount,         nvl(sum(ifrm.manager_nq_amount), 0)        AS manager_nq_amount,         MAX(nvl(o.DESCRIPTION, ''))                AS opportunity_description,         MAX(nvl(o.amount_usdollar, 0))             AS total_opportunity_amount,         MAX(nvl(o.assigned_user_id, ''))           AS assigned_user_id,         MAX(nvl(o.assigned_bp_id, ''))             AS assigned_bp_id,         MAX(nvl(iw.overall_progress, 0))           AS win_plan_progress,         MAX((case when (nvl(o.assigned_user_id, '') = '')         then nvl(bp.name, '')         else nvl(u.first_name, '') || ' ' || nvl(u.last_name, '')         end))                                 AS opportunity_owner,         MAX(nvl(iw.executive_sponsor, ''))         AS executive_sponsor_id,         MAX(nvl(u1.first_name, '') || ' ' || nvl(u1.last_name, ''))         AS ibm_executive_sponsor,         MAX(nvl(iw.snd_sponsor, ''))               AS sd_sponsor_id,         MAX(nvl(u2.first_name, '') || ' ' || nvl(u2.last_name, ''))         AS ibm_sd_sponsor,         MAX(nvl(PARENT_ACC.NAME, ''))   AS    account_name,         MAX(nvl(stc.comments, ''))    AS opportunity_steps_to_closure_summary         FROM IBM_FORECAST_ROADMAP IFRM         INNER JOIN (         SELECT         DISTINCT F_ROADMAP_ID         FROM IBM_FN_TO_ROADMAP IFTR         WHERE IFTR.LOWEST_NODE_ID =?         ) IFTR on IFTR.F_ROADMAP_ID = IFRM.ID         INNER JOIN IBM_FDIM_ROADMAP IFM ON IFM.ID = IFRM.FDIM_ROADMAP_ID AND IFM.DELETED = 0         INNER JOIN IBM_FDIM_SALES_STAGE IFSS ON IFSS.ID = IFRM.FDIM_SALES_STAGE_ID AND IFSS.DELETED = 0         INNER JOIN IBM_FDIM_DATE MIFD ON MIFD.ID = IFRM.MANAGER_FDIM_DATE_ID AND MIFD.DELETED = 0         INNER JOIN IBM_FDIM_CMR IFC on IFC.ID = IFRM.FDIM_CMR_ID AND IFC.DELETED = 0         INNER JOIN OPPORTUNITIES O ON  O.ID = ifm.OPPORTUNITY_ID AND O.deleted=0 AND O.BP_PRIVATE_OPPORTUNITY = '0'          LEFT JOIN IBM_WINPLANS IW ON IW.OPPORTUNITY_ID = ifm.OPPORTUNITY_ID AND IW.DELETED = 0         LEFT JOIN IBM_STEPSTOCLOSURE STC ON STC.OPPORTUNITY_ID = ifm.OPPORTUNITY_ID AND STC.TYPE = 'OSC'         AND STC.DELETED = 0         LEFT JOIN USERS U ON U.ID = O.ASSIGNED_USER_ID AND U.DELETED = 0         LEFT JOIN USERS U1 ON U1.ID = IW.EXECUTIVE_SPONSOR AND U1.DELETED = 0         LEFT JOIN USERS U2 ON U2.ID = IW.SND_SPONSOR AND U2.DELETED = 0         LEFT JOIN IBM_BUSINESSPARTNERS BP ON BP.ID = O.ASSIGNED_BP_ID AND BP.DELETED = 0                   LEFT JOIN ACCOUNTS ACC ON IFC.client_id = ACC.ID AND ACC.DELETED = 0         LEFT JOIN ACCOUNTS PARENT_ACC ON ACC.PARENT_ID = PARENT_ACC.ID AND PARENT_ACC.DELETED = 0          WHERE IFRM.manager_amount_usd!='0'                        AND MIFD.QUARTERLY_TIMEPERIOD_ID = ?                                  AND IFRM.REVENUE_TYPE =  ?                    GROUP BY IFM.OPPORTUNITY_ID WITH UR) tmpifro  order by total_ts_amount desc         LIMIT ?,?         WITH UR FOR FETCH ONLY
### Cause: com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2030][11211][4.23.42] A communication error occurred during operations on the connection's underlying socket, socket input stream,
or socket output stream.  Error location: Reply.fill() - socketInputStream.read (-1).  Message: Connection timed out (Write failed). ERRORCODE=-4499, SQLSTATE=08001]
        at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:251)
        at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:182)
        at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:320)
        at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:318)
        at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:414)

很明顯是: connection 和 DB的連接出問題了,正常情況下,一般過一會,連接好用了,就可以恢復,但是有一種情況是:連接一直不恢復: 用

netstat -anop 命令看連接池狀態發現,如下圖: 其中,端口是50001的是SSL連接,這些連接池中Send-Q中是4101的,表示socket連接的發送隊列出現阻塞,連接有問題。

Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp6 0 0 :::9000 :::* LISTEN 1/java off (0.00/0/0)
tcp6 0 4101 10.244.1.164:39236 XXXXXX:50001 ESTABLISHED 1/java on (104.87/15/0)
tcp6 0 4101 10.244.1.164:42786 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (18.34/15/0)
tcp6 0 4101 10.244.1.164:39214 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (51.62/12/0)
tcp6 0 4101 10.244.1.164:39218 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (105.38/10/0)
tcp6 0 4101 10.244.1.164:37600 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (5.09/8/0)
tcp6 0 4101 10.244.1.164:33852 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (33.45/10/0)
tcp6 0 0 10.244.1.164:39240 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java keepalive (26.85/0/7)
tcp6 0 4101 10.244.1.164:39250 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (26.02/15/0)
tcp6 0 4101 10.244.1.164:39256 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (41.38/14/0)
tcp6 0 4101 10.244.1.164:33860 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (41.89/15/0)
tcp6 0 0 10.244.1.164:34200 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java keepalive (40.17/0/5)
tcp6 0 0 10.244.1.164:39230 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java keepalive (1072.87/0/0)
tcp6 0 0 10.244.1.164:39224 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java keepalive (1990.37/0/0)
tcp6 0 4101 10.244.1.164:39210 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (105.89/14/0)
tcp6 0 0 10.244.1.164:9000 10.244.1.1:36294 TIME_WAIT - timewait (44.47/0/0)
tcp6 0 0 10.244.1.164:39262 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java keepalive (43.75/0/2)
tcp6 0 4101 10.244.1.164:34206 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (84.20/9/0)
tcp6 0 4101 10.244.1.164:33598 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (39.33/13/0)
tcp6 0 4101 10.244.1.164:33584 XX.XX.XXX.XXXX:50001 ESTABLISHED 1/java on (79.27/15/0)

在這種情況下:連接池中的這些連接其實都是有問題的,但是Mybatis每次都從連接池中得到一個connection,並不知道其是否好用。所以就一直阻塞。

二 原因:首先分析了Mybatis原生的連接池的源碼,這里簡單說一下:

PooledDataSource.java :: popConnection() 這個函數表示每次都去得到一個連接,
(1)首先,如果連接池不是空,就從連接池取一個,跳到(3);否則跳到(2)

        (2)新建一個連接

        (3)判斷這個連接是否可用,如果可用,就返回這個連接;否則繼續(1)

這里重點說一下上面標注紅色的步驟,如何判斷連接是否可用:這個調用的是PooledConnection.java::isValid(),  

  /*
   * Method to see if the connection is usable
   *
   * @return True if the connection is usable
   */
  public boolean isValid() {
    return valid && realConnection != null && dataSource.pingConnection(this);
  }

重點是dataSource.pingConnection(this),如果mybatis里配置了poolPingEnabled,就會在拿到這個連接的時候,執行poolPingQuery,相當於檢查數據庫連接是正常,如果正常,就返回這個連接;否則關掉這個壞的連接,然后再去找其他的正常的空閑的連接。

關鍵我代碼里面沒有配置poolPingEnabled, 所以,根本沒去檢查空閑的連接是否好用。導致一直用壞的連接去執行SQL語句。

三 解決方案

       (1) 如果繼續使用mybatis原生的連接池,就配置如下屬性:

   poolPingEnabled - 默認值是false,當值為true的時候,將開啟ping機制。
   poolPingQuery - 對數據庫進行ping時所使用的sql。
   poolPingConnectionsNotUsedFor
        默認值是0,單位是毫秒。我們不能在每次使用連接池之前,都使用ping機制,
        這會使每一條sql的執行,都要額外執行一次ping語句。所以使用此屬性來避免這種不合理做法。
        我們只針對閑置時間超過某個時間的連接,進行ping。本例中的值為1小時,當從連接池中拿出的連接閑置超過1小時,才會對它進行ping。

                  但是這個的效率非常低下,每次執行一條SQL語句,都要執行一個檢查的SQL, 效率不好。

        (2) 使用DBCP的連接池:DBCP連接池功能強大,提供了四種檢測DB連接是否正常的選擇:          

testOnCreate
testOnBorrow
testOnReturn 
testWhileIdle 

             其中testWhileIdle ,是另外啟線程,定時的去檢測連接是否健康,這個比較實用,需要在Mybatis里進行如下配置:

                 <property name="testWhileIdle" value="true"/>             啟用空閑連接健康檢查機制
                <property name="validationQueryTimeout" value="2"/>        如果執行檢查的SQL  2秒沒有結果,就認為連接不健康,就關掉
                <property name="validationQuery" value="SELECT 1 FROM SYSIBM.SYSDUMMY1"/>   執行檢查的SQL, 要一定有返回值,並且快!
                <property name="timeBetweenEvictionRunsMillis" value="20000"/>        多長時間檢查一次
                <property name="numTestsPerEvictionRun" value="20"/>          檢查的時候,每次檢查多少個,是串行檢查的。

 最后,使用DBCP的 testWhileIdle, 可以在連接不健康的情況下,關閉掉不健康的連接。

PoolableConnection.java源碼:
 注意:1 在進行test的時候,如果numTestsPerEvictionRun表示每次檢查多少個,是串行一個一個檢查。
    2  雖然validationQueryTimeout 設置成2秒,但是test 連接的時候,如果是拔網線,依然是20秒才超時,我如果有8個連接,在斷網的情況下要8 * 20 = 160
   秒的時間才能全部釋放連接。
 public void validate(final String sql, int timeoutSeconds) throws SQLException {
        if (fastFailValidation && fatalSqlExceptionThrown) {
            throw new SQLException(Utils.getMessage("poolableConnection.validate.fastFail"));
        }

        if (sql == null || sql.length() == 0) {
            if (timeoutSeconds < 0) {
                timeoutSeconds = 0;
            }
            if (!isValid(timeoutSeconds)) {
                throw new SQLException("isValid() returned false");
            }
            return;
        }

        if (!sql.equals(lastValidationSql)) {
            lastValidationSql = sql;
            // Has to be the innermost delegate else the prepared statement will
            // be closed when the pooled connection is passivated.
            validationPreparedStatement = getInnermostDelegateInternal().prepareStatement(sql);
        }

        if (timeoutSeconds > 0) {
            validationPreparedStatement.setQueryTimeout(timeoutSeconds);
        }

        try (ResultSet rs = validationPreparedStatement.executeQuery()) {
            if (!rs.next()) {
                throw new SQLException("validationQuery didn't return a row");
            }
        } catch (final SQLException sqle) {
            throw sqle;
        }
    }

testWhileIdle : 關注的重點,GenericObjectPool中針對pool管理,起了一個Evict的TimerTask定時線程進行控制(可通過設置參數timeBetweenEvictionRunsMillis>0),定時對線程池中的鏈接進行validateObject校驗,對無效的鏈接進行關閉后,會調用ensureMinIdle,適當建立鏈接保證最小的minIdle連接數。
timeBetweenEvictionRunsMillis,設置的Evict線程的時間,單位ms,大於0才會開啟evict檢查線程

numTestsPerEvictionRun,代表每次檢查鏈接的數量,建議設置和maxActive一樣大,這樣每次可以有效檢查所有的鏈接.

maxWaitMillis:(如果沒有可用連接)池在拋出異常前等待的一個連接被歸還的最大毫秒數,設置為-1則等待時間不確定
RemoveAbandonedOnBorrow:表示DBCP連接池自動管理應程序中使用完畢的連接
removeAbandonedOnMaintenance:表示DBCP連接池自動管理應程序中使用完畢的連接
removeAbandonedTimeout: 表示一個連接在程序中使用完畢后,若在指定時間之內沒有再次使用,則DBCP連接池回收該連接
logAbandoned:   

   以下代碼看出: 每次test的時候,只test處於idle狀態的連接

GenericObjectPool.java
private int getNumTests() {
int numTestsPerEvictionRun = getNumTestsPerEvictionRun();
if (numTestsPerEvictionRun >= 0) {
return Math.min(numTestsPerEvictionRun, idleObjects.size());
} else {
return (int) (Math.ceil(idleObjects.size() /
Math.abs((double) numTestsPerEvictionRun)));
}
}

GenericObjectPool.java
removeAbandoned函數源碼。removeAbandoned只會remove掉ALLOCATED狀態的並且很久未用(removeAbandonedTimeout)的連接。
   並且有兩個地方會調用:
     1: borrowObject():  每次從連接池中取得conn的時候,先調用一下這個函數。

執行的條件是:(getNumIdle() < 2) &&(getNumActive() > getMaxTotal() - 3)
idle的很小, 但是active的很多的情況。
        2: evit() :  每次test evit的時候去調用一下。這個的執行是無條件的,只要配置了就去檢查。
 private void removeAbandoned(AbandonedConfig ac) {
        // Generate a list of abandoned objects to remove
        final long now = System.currentTimeMillis();
        final long timeout =
                now - (ac.getRemoveAbandonedTimeout() * 1000L);
        ArrayList<PooledObject<T>> remove = new ArrayList<PooledObject<T>>();
        Iterator<PooledObject<T>> it = allObjects.values().iterator();
        while (it.hasNext()) {
            PooledObject<T> pooledObject = it.next();
            synchronized (pooledObject) {
                if (pooledObject.getState() == PooledObjectState.ALLOCATED &&
                        pooledObject.getLastUsedTime() <= timeout) {
                    pooledObject.markAbandoned();
                    remove.add(pooledObject);
                }
            }
        }

        // Now remove the abandoned objects
        Iterator<PooledObject<T>> itr = remove.iterator();
        while (itr.hasNext()) {
            PooledObject<T> pooledObject = itr.next();
            if (ac.getLogAbandoned()) {
                pooledObject.printStackTrace(ac.getLogWriter());
            }
            try {
                invalidateObject(pooledObject.getObject());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

 

參考文章:

http://lc87624.iteye.com/blog/1734089
https://www.cnblogs.com/ahang/p/5744530.html
select 1 from sysibm.sysdummy1 或
SELECT current date FROM sysibm.sysdummy1
https://jingyan.baidu.com/article/5552ef4738fbf0518efbc944.html

            

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM