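1. Overview
When input events are handled too slowly, an ANR is triggered. So how does ANR work internally, and which scenarios produce one? As the saying goes, "to do a good job, one must first sharpen one's tools": the previous articles walked through the whole input framework precisely to lay the groundwork for this one. Before analyzing how and when an ANR is triggered, let's first review the input flow.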
1.1 InputReader
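InputReader's main work falls into two parts:
- It calls EventHub's getEvents() to read input_event structs from the /dev/input device nodes and convert them into RawEvent structs. Each RawEvent is then converted by the matching InputMapper into the corresponding EventEntry; for example, a key event becomes a KeyEntry and a touch event becomes a MotionEntry.
  - Conversion result: input_event -> EventEntry;
- It appends the event to the tail of the mInboundQueue queue. Before joining the queue, the event passes through the following two filters:
  - IMS.interceptKeyBeforeQueueing: business logic can be added here before the event is dispatched;
  - IMS.filterInputEvent: can intercept events. Any event for which it returns false is intercepted directly, never joins mInboundQueue, and is not dispatched further; otherwise processing continues with the next step;
  - enqueueInboundEventLocked: puts the event at the tail of mInboundQueue;
  - mLooper->wake: wakes the InputDispatcher thread when needed.
- KeyboardInputMapper.processKey() records the time at which the key down event occurred.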
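The enqueue path above can be sketched as a small self-contained model. This is only an illustration under stated assumptions: `Event`, `InboundQueueModel`, and its two hook members are hypothetical names rather than the real InputDispatcher API, and the rule "wake only when the queue was empty" is a simplification of the "wake when needed" step.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for an input event entry.
struct Event {
    std::string name;
};

// Minimal model of the enqueue path; not the real InputDispatcher.
class InboundQueueModel {
public:
    // Policy hooks modelled after interceptKeyBeforeQueueing and filterInputEvent.
    std::function<void(Event&)> interceptBeforeQueueing = [](Event&) {};
    std::function<bool(const Event&)> filterInputEvent = [](const Event&) { return true; };

    std::deque<Event> inbound;  // stands in for mInboundQueue
    int wakeCount = 0;          // counts simulated mLooper->wake() calls

    // Returns true when the event reached the queue.
    bool notify(Event event) {
        assert(interceptBeforeQueueing && filterInputEvent);
        interceptBeforeQueueing(event);       // policy may add its own logic first
        if (!filterInputEvent(event)) {
            return false;                     // intercepted: never enqueued
        }
        const bool wasEmpty = inbound.empty();
        inbound.push_back(std::move(event));  // append to the tail
        if (wasEmpty) {
            ++wakeCount;                      // assumption: wake only if the queue was empty
        }
        return true;
    }
};
```

Replacing `filterInputEvent` with a predicate that returns false for some events shows that those events never reach `inbound`.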
1.2 InputDispatcher
- dispatchOnceInnerLocked(): takes an EventEntry from InputDispatcher's mInboundQueue queue. The time at which this method starts executing (currentTime) is also the delivery time (deliveryTime) of the subsequent dispatchEntry;
- dispatchKeyLocked(): adds the command doInterceptKeyBeforeDispatchingLockedInterruptible when certain conditions are met;
- enqueueDispatchEntryLocked(): creates a DispatchEntry and adds it to the connection's outbound queue;
- startDispatchCycleLocked(): takes a DispatchEntry from outboundQueue and puts it into the connection's waitQueue queue;
- runCommandsLockedInterruptible(): loops over mCommandQueue and handles every command in it in order. Commands are added to mCommandQueue via postCommandLocked(). The ANR callback command is executed at this point.
- handleTargetsNotReadyLocked(): checks whether the wait has exceeded 5s to decide whether to call onANRLocked().
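In step 15 of the flow, sendMessage dispatches the input event to the app side. When the app has finished handling the event, it sends a finishInputEvent() message. Control then returns to the pollOnce() method.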
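The movement of an entry between the per-connection queues can be modelled in a few lines. The sketch below is a simplification with hypothetical names (`DispatchEntryModel`, `ConnectionModel`); it assumes that the finish signal from the app removes the matching entry from the wait queue, identified here by a sequence number.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical, reduced view of a DispatchEntry.
struct DispatchEntryModel {
    uint32_t seq;          // sequence number used to match the finish signal
    int64_t deliveryTime;  // when the entry was sent, in nanoseconds
};

// Minimal model of one connection's outbound and wait queues.
class ConnectionModel {
public:
    std::deque<DispatchEntryModel> outboundQueue;
    std::deque<DispatchEntryModel> waitQueue;

    // Models enqueueDispatchEntryLocked: queue an entry for sending.
    void enqueue(uint32_t seq) { outboundQueue.push_back({seq, 0}); }

    // Models startDispatchCycleLocked: "send" every pending entry,
    // then park it in waitQueue until the app reports it finished.
    void startDispatchCycle(int64_t now) {
        assert(now >= 0);
        while (!outboundQueue.empty()) {
            DispatchEntryModel entry = outboundQueue.front();
            outboundQueue.pop_front();
            entry.deliveryTime = now;
            waitQueue.push_back(entry);
        }
    }

    // Models the finish signal: drop the matching entry from waitQueue.
    bool finish(uint32_t seq) {
        for (auto it = waitQueue.begin(); it != waitQueue.end(); ++it) {
            if (it->seq == seq) {
                waitQueue.erase(it);
                return true;
            }
        }
        return false;
    }
};
```

In this model, an app that never reports completion leaves entries sitting in `waitQueue`.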
1.3 UI Thread
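- The "InputDispatcher" thread listens on the server end of the socket; when a message arrives, InputDispatcher.handleReceiveCallback() is called back;
- The UI main thread listens on the client end of the socket; when a message arrives, NativeInputEventReceiver.handleEvent() is called back.
An ANR is triggered mainly within the InputDispatcher flow, so the following sections revisit that flow from the ANR perspective.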
3. ANR handling flow
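The ANR time interval is the span from the execution of findFocusedWindowTargetsLocked() during the current event dispatch to the next execution of resetANRTimeoutsLocked(). The reset happens at the following 5 points, all located in InputDispatcher.cpp: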
- resetAndDropEverythingLocked
- releasePendingEventLocked
- setFocusedApplication
- dispatchOnceInnerLocked
- setInputDispatchMode
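Put simply, resetANRTimeoutsLocked gets a chance to run mainly in the following 4 scenarios:
- Unfreezing the screen, and the moments of system boot/shutdown (thawInputDispatchingLw, setEventDispatchingLw)
- A change of the app focused by WMS (WMS.setFocusedApp, WMS.removeAppToken)
- Setting the input filter (IMS.setInputFilter)
- Dispatching an event again (dispatchOnceInnerLocked)
When findFocusedWindowTargetsLocked() on the InputDispatcher thread calls into handleTargetsNotReadyLocked and the 5s timeout has been exceeded, onANRLocked() is called.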
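The interval described above behaves like a deadline that is armed on the first "target not ready" check and cleared by a reset. The following sketch models just that behaviour; `TargetWaitTimer` and its members are hypothetical names, and the 5s value is the one quoted in the text.

```cpp
#include <cassert>
#include <cstdint>

// 5 seconds in nanoseconds, the timeout quoted in the text.
constexpr int64_t kDefaultTimeoutNs = 5000LL * 1000000LL;

// Minimal model of the wait deadline; not the real InputDispatcher state.
class TargetWaitTimer {
public:
    enum class Result { Pending, TimedOut };

    // Models handleTargetsNotReadyLocked: arm the deadline on the first call,
    // then report whether the deadline has been reached.
    Result onTargetNotReady(int64_t now) {
        assert(now >= 0);
        if (!waiting_) {
            waiting_ = true;
            deadline_ = now + kDefaultTimeoutNs;
        }
        return now >= deadline_ ? Result::TimedOut : Result::Pending;
    }

    // Models resetANRTimeoutsLocked: forget the current wait.
    void reset() { waiting_ = false; }

private:
    bool waiting_ = false;
    int64_t deadline_ = 0;
};
```

A `TimedOut` result corresponds to the point where onANRLocked() would be called; any of the reset points listed above starts a fresh interval.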
3.1 onANRLocked
[-> InputDispatcher.cpp]
void InputDispatcher::onANRLocked(
        nsecs_t currentTime, const sp<InputApplicationHandle>& applicationHandle,
        const sp<InputWindowHandle>& windowHandle,
        nsecs_t eventTime, nsecs_t waitStartTime, const char* reason) {
    float dispatchLatency = (currentTime - eventTime) * 0.000001f;
    float waitDuration = (currentTime - waitStartTime) * 0.000001f;
    ALOGI("Application is not responding: %s. "
            "It has been %0.1fms since event, %0.1fms since wait started. Reason: %s",
            getApplicationWindowLabelLocked(applicationHandle, windowHandle).string(),
            dispatchLatency, waitDuration, reason);

    // Capture a snapshot of the state at the time of the ANR
    time_t t = time(NULL);
    struct tm tm;
    localtime_r(&t, &tm);
    char timestr[64];
    strftime(timestr, sizeof(timestr), "%F %T", &tm);
    mLastANRState.clear();
    mLastANRState.append(INDENT "ANR:\n");
    mLastANRState.appendFormat(INDENT2 "Time: %s\n", timestr);
    mLastANRState.appendFormat(INDENT2 "Window: %s\n",
            getApplicationWindowLabelLocked(applicationHandle, windowHandle).string());
    mLastANRState.appendFormat(INDENT2 "DispatchLatency: %0.1fms\n", dispatchLatency);
    mLastANRState.appendFormat(INDENT2 "WaitDuration: %0.1fms\n", waitDuration);
    mLastANRState.appendFormat(INDENT2 "Reason: %s\n", reason);
    dumpDispatchStateLocked(mLastANRState);

    // Add the ANR command to mCommandQueue
    CommandEntry* commandEntry = postCommandLocked(
            & InputDispatcher::doNotifyANRLockedInterruptible);
    commandEntry->inputApplicationHandle = applicationHandle;
    commandEntry->inputWindowHandle = windowHandle;
    commandEntry->reason = reason;
}
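When an ANR occurs, onANRLocked() adds doNotifyANRLockedInterruptible to mCommandQueue. On the dispatcher's next pass through runCommandsLockedInterruptible() in InputDispatcher.dispatchOnce, all commands in mCommandQueue are taken out and executed one by one. The command that corresponds to the ANR is doNotifyANRLockedInterruptible, which is examined next.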
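The deferred-command pattern described here (post now, run later in one batch) can be illustrated with a minimal model. `CommandQueueModel` is a hypothetical name; the real code stores CommandEntry objects holding member-function pointers, whereas this sketch uses `std::function` for brevity.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Minimal model of mCommandQueue; not the real InputDispatcher.
class CommandQueueModel {
public:
    using Command = std::function<void()>;

    // Models postCommandLocked: remember the command, do not run it yet.
    void postCommand(Command command) {
        assert(command);  // an empty std::function would throw when called
        commands_.push_back(std::move(command));
    }

    // Models haveCommandsLocked.
    bool haveCommands() const { return !commands_.empty(); }

    // Models runCommandsLockedInterruptible: run everything queued, in order.
    // Returns true if at least one command ran.
    bool runCommands() {
        if (commands_.empty()) {
            return false;
        }
        while (!commands_.empty()) {
            Command command = std::move(commands_.front());
            commands_.pop_front();
            command();
        }
        return true;
    }

private:
    std::deque<Command> commands_;
};
```

Posting a command does not run it; the work happens only when `runCommands()` drains the queue, which is why the ANR notification is decoupled from the code path that detected the timeout.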
3.2 doNotifyANRLockedInterruptible
[-> InputDispatcher.cpp]
void InputDispatcher::doNotifyANRLockedInterruptible(
        CommandEntry* commandEntry) {
    mLock.unlock();

    // [see section 3.3]
    nsecs_t newTimeout = mPolicy->notifyANR(
            commandEntry->inputApplicationHandle, commandEntry->inputWindowHandle,
            commandEntry->reason);

    mLock.lock();

    // newTimeout = 5s [see section 3.8]
    resumeAfterTargetsNotReadyTimeoutLocked(newTimeout,
            commandEntry->inputWindowHandle != NULL
                    ? commandEntry->inputWindowHandle->getInputChannel() : NULL);
}
mPolicy refers to NativeInputManager.
3.3 NativeInputManager.notifyANR
[-> com_android_server_input_InputManagerService.cpp]
nsecs_t NativeInputManager::notifyANR(const sp<InputApplicationHandle>& inputApplicationHandle,
        const sp<InputWindowHandle>& inputWindowHandle, const String8& reason) {
    JNIEnv* env = jniEnv();

    jobject inputApplicationHandleObj =
            getInputApplicationHandleObjLocalRef(env, inputApplicationHandle);
    jobject inputWindowHandleObj =
            getInputWindowHandleObjLocalRef(env, inputWindowHandle);
    jstring reasonObj = env->NewStringUTF(reason.string());

    // Call the Java method [see section 3.4]
    jlong newTimeout = env->CallLongMethod(mServiceObj,
            gServiceClassInfo.notifyANR, inputApplicationHandleObj, inputWindowHandleObj,
            reasonObj);
    if (checkAndClearExceptionFromCallback(env, "notifyANR")) {
        newTimeout = 0; // an exception was thrown: clear it and reset the timeout
    }
    ...
    return newTimeout;
}
First, take a look at register_android_server_InputManager:
int register_android_server_InputManager(JNIEnv* env) {
    int res = jniRegisterNativeMethods(env, "com/android/server/input/InputManagerService",
            gInputManagerMethods, NELEM(gInputManagerMethods));

    jclass clazz;
    FIND_CLASS(clazz, "com/android/server/input/InputManagerService");
    ...
    GET_METHOD_ID(gServiceClassInfo.notifyANR, clazz,
            "notifyANR",
            "(Lcom/android/server/input/InputApplicationHandle;Lcom/android/server/input/InputWindowHandle;Ljava/lang/String;)J");
    ...
}
So gServiceClassInfo.notifyANR refers to IMS.notifyANR.
3.4 IMS.notifyANR
[-> InputManagerService.java]
private long notifyANR(InputApplicationHandle inputApplicationHandle,
        InputWindowHandle inputWindowHandle, String reason) {
    // [see section 3.5]
    return mWindowManagerCallbacks.notifyANR(
            inputApplicationHandle, inputWindowHandle, reason);
}
Here mWindowManagerCallbacks refers to the InputMonitor object.
3.5 InputMonitor.notifyANR
[-> InputMonitor.java]
public long notifyANR(InputApplicationHandle inputApplicationHandle,
        InputWindowHandle inputWindowHandle, String reason) {
    AppWindowToken appWindowToken = null;
    WindowState windowState = null;
    boolean aboveSystem = false;
    synchronized (mService.mWindowMap) {
        if (inputWindowHandle != null) {
            windowState = (WindowState) inputWindowHandle.windowState;
            if (windowState != null) {
                appWindowToken = windowState.mAppToken;
            }
        }
        if (appWindowToken == null && inputApplicationHandle != null) {
            appWindowToken = (AppWindowToken)inputApplicationHandle.appWindowToken;
        }
        // Log the input event dispatching timeout
        if (windowState != null) {
            Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
                    + "sending to " + windowState.mAttrs.getTitle()
                    + ". Reason: " + reason);
            int systemAlertLayer = mService.mPolicy.windowTypeToLayerLw(
                    WindowManager.LayoutParams.TYPE_SYSTEM_ALERT);
            aboveSystem = windowState.mBaseLayer > systemAlertLayer;
        } else if (appWindowToken != null) {
            Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
                    + "sending to application " + appWindowToken.stringName
                    + ". Reason: " + reason);
        } else {
            Slog.i(WindowManagerService.TAG, "Input event dispatching timed out "
                    + ". Reason: " + reason);
        }
        mService.saveANRStateLocked(appWindowToken, windowState, reason);
    }
    if (appWindowToken != null && appWindowToken.appToken != null) {
        // [see section 3.6.1]
        boolean abort = appWindowToken.appToken.keyDispatchingTimedOut(reason);
        if (! abort) {
            return appWindowToken.inputDispatchingTimeoutNanos; // 5s
        }
    } else if (windowState != null) {
        // [see section 3.6.2]
        long timeout = ActivityManagerNative.getDefault().inputDispatchingTimedOut(
                windowState.mSession.mPid, aboveSystem, reason);
        if (timeout >= 0) {
            return timeout * 1000000L; // 5s
        }
    }
    return 0;
}
When an input-related ANR occurs, ANR information is written to the system log with the tag WindowManager. There are 3 main kinds of log message:
- Input event dispatching timed out sending to [windowState.mAttrs.getTitle()]
- Input event dispatching timed out sending to application [appWindowToken.stringName]
- Input event dispatching timed out .
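The choice among the three messages follows directly from which handles are available. As a rough illustration, the selection can be written as one function; `formatAnrLog` is a hypothetical helper, and empty strings stand in for a null windowState or appWindowToken.

```cpp
#include <cassert>
#include <string>

// Builds the log line the way InputMonitor.notifyANR assembles it.
// An empty windowTitle models windowState == null; an empty appName
// models appWindowToken == null. Hypothetical helper for illustration.
std::string formatAnrLog(const std::string& windowTitle,
                         const std::string& appName,
                         const std::string& reason) {
    std::string message = "Input event dispatching timed out ";
    if (!windowTitle.empty()) {
        message += "sending to " + windowTitle;            // window case
    } else if (!appName.empty()) {
        message += "sending to application " + appName;    // application case
    }
    // In the remaining case nothing is appended before the reason.
    return message + ". Reason: " + reason;
}
```

When searching logs, the common prefix "Input event dispatching timed out" matches all three variants.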
3.6 DispatchingTimedOut
3.6.1 Token.keyDispatchingTimedOut
[-> ActivityRecord.java :: Token]
final class ActivityRecord {
    static class Token extends IApplicationToken.Stub {
        public boolean keyDispatchingTimedOut(String reason) {
            ActivityRecord r;
            ActivityRecord anrActivity;
            ProcessRecord anrApp;
            synchronized (mService) {
                r = tokenToActivityRecordLocked(this);
                if (r == null) {
                    return false;
                }
                anrActivity = r.getWaitingHistoryRecordLocked();
                anrApp = r != null ? r.app : null;
            }
            // [see section 3.7]
            return mService.inputDispatchingTimedOut(anrApp, anrActivity, r, false, reason);
        }
        ...
    }
}
3.6.2 AMS.inputDispatchingTimedOut
public long inputDispatchingTimedOut(int pid, final boolean aboveSystem, String reason) {
    ...
    ProcessRecord proc;
    long timeout;
    synchronized (this) {
        synchronized (mPidsSelfLocked) {
            proc = mPidsSelfLocked.get(pid); // look up the process record by pid
        }
        timeout = getInputDispatchingTimeoutLocked(proc);
    }
    // [see section 3.7]
    if (!inputDispatchingTimedOut(proc, null, null, aboveSystem, reason)) {
        return -1;
    }
    return timeout;
}
The input dispatching timeout is KEY_DISPATCHING_TIMEOUT, i.e. timeout = 5s.
3.7 AMS.inputDispatchingTimedOut
public boolean inputDispatchingTimedOut(final ProcessRecord proc,
        final ActivityRecord activity, final ActivityRecord parent,
        final boolean aboveSystem, String reason) {
    ...
    final String annotation;
    if (reason == null) {
        annotation = "Input dispatching timed out";
    } else {
        annotation = "Input dispatching timed out (" + reason + ")";
    }

    if (proc != null) {
        ...
        // Hand the ANR handling over to the "ActivityManager" thread via the handler
        mHandler.post(new Runnable() {
            public void run() {
                appNotResponding(proc, activity, parent, aboveSystem, annotation);
            }
        });
    }
    return true;
}
appNotResponding outputs information such as the traces of the important processes at the time of the ANR. Back in [section 3.2], once the ANR has been handled, resumeAfterTargetsNotReadyTimeoutLocked is called.
3.8 resumeAfterTargetsNotReadyTimeoutLocked
[-> InputDispatcher.cpp]
void InputDispatcher::resumeAfterTargetsNotReadyTimeoutLocked(nsecs_t newTimeout,
        const sp<InputChannel>& inputChannel) {
    if (newTimeout > 0) {
        // Extend the timeout by another 5s
        mInputTargetWaitTimeoutTime = now() + newTimeout;
    } else {
        // Give up.
        mInputTargetWaitTimeoutExpired = true;

        // Input state will not be realistic. Mark it out of sync.
        if (inputChannel.get()) {
            ssize_t connectionIndex = getConnectionIndexLocked(inputChannel);
            if (connectionIndex >= 0) {
                sp<Connection> connection = mConnectionsByFd.valueAt(connectionIndex);
                sp<InputWindowHandle> windowHandle = connection->inputWindowHandle;

                if (windowHandle != NULL) {
                    const InputWindowInfo* info = windowHandle->getInfo();
                    if (info) {
                        ssize_t stateIndex = mTouchStatesByDisplay.indexOfKey(info->displayId);
                        if (stateIndex >= 0) {
                            mTouchStatesByDisplay.editValueAt(stateIndex).removeWindow(
                                    windowHandle);
                        }
                    }
                }

                if (connection->status == Connection::STATUS_NORMAL) {
                    CancelationOptions options(CancelationOptions::CANCEL_ALL_EVENTS,
                            "application not responding");
                    synthesizeCancelationEventsForConnectionLocked(connection, options);
                }
            }
        }
    }
}
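Stripped of the connection cleanup, the function above makes a two-way decision on the value returned by the policy. The sketch below isolates that decision; `WaitState` and `resumeAfterTimeout` are hypothetical names, and the cancellation of pending events in the give-up branch is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced view of the dispatcher's wait state.
struct WaitState {
    int64_t timeoutTime = 0;  // models mInputTargetWaitTimeoutTime
    bool expired = false;     // models mInputTargetWaitTimeoutExpired
};

// Returns true when the dispatcher keeps waiting, false when it gives up.
bool resumeAfterTimeout(WaitState& state, int64_t now, int64_t newTimeout) {
    assert(now >= 0);
    if (newTimeout > 0) {
        state.timeoutTime = now + newTimeout;  // wait for another interval
        return true;
    }
    state.expired = true;  // give up on the current target
    return false;
}
```

In the listings above, a value of 0 comes back from notifyANR when the ANR handling aborts the wait or when the Java callback throws; a positive value (5s in nanoseconds) extends the wait.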
4. Input deadlock monitoring mechanism
4.1 IMS.start
[-> InputManagerService.java]
public void start() {
    ...
    Watchdog.getInstance().addMonitor(this);
    ...
}
InputManagerService implements the Watchdog.Monitor interface and, during startup, adds itself to the monitor queue of the Watchdog thread.
4.2 IMS.monitor
Watchdog then calls IMS.monitor() periodically.
public void monitor() {
    synchronized (mInputFilterLock) { }
    nativeMonitor(mPtr);
}
nativeMonitor goes through JNI and enters the following method:
static void nativeMonitor(JNIEnv*, jclass, jlong ptr) {
    NativeInputManager* im = reinterpret_cast<NativeInputManager*>(ptr);
    im->getInputManager()->getReader()->monitor(); // see section 4.3
    im->getInputManager()->getDispatcher()->monitor(); // see section 4.4
}
4.3 InputReader.monitor
[-> InputReader.cpp]
void InputReader::monitor() {
    // Acquire and release mLock once to make sure the reader is not deadlocked
    mLock.lock();
    mEventHub->wake();
    mReaderIsAliveCondition.wait(mLock);
    mLock.unlock();

    // Check the EventHub [see section 4.3.1]
    mEventHub->monitor();
}
After acquiring mLock, it enters the wait() method of the Condition and waits to be woken by the broadcast() in the InputReader thread's loopOnce().
void InputReader::loopOnce() {
    size_t count = mEventHub->getEvents(timeoutMillis, mEventBuffer, EVENT_BUFFER_SIZE);
    ...
    {
        AutoMutex _l(mLock);
        mReaderIsAliveCondition.broadcast();
        if (count) {
            processEventsLocked(mEventBuffer, count);
        }
    }
    ...
    mQueuedListener->flush();
}
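The monitor()/loopOnce() pair above is a liveness handshake: the checker blocks on a condition until the loop thread completes another iteration. A portable sketch with the standard library is shown below. `LoopLivenessModel` and `probeRunningLoop` are hypothetical names, and two details are assumptions made for the sketch: a timeout is added so that a dead loop can be reported as `false` (the listing above waits without a timeout and leaves detection to Watchdog), and an iteration counter guards against spurious wakeups.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal model of the "is the loop thread alive" handshake.
class LoopLivenessModel {
public:
    // Called by the loop thread once per iteration (models the broadcast()).
    void loopOnce() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++iterations_;
        alive_.notify_all();
    }

    // Called by the checker: true if the loop completed another iteration
    // before the timeout, false otherwise.
    bool monitor(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned long seen = iterations_;
        return alive_.wait_for(lock, timeout, [&] { return iterations_ != seen; });
    }

private:
    std::mutex mutex_;
    std::condition_variable alive_;
    unsigned long iterations_ = 0;
};

// Usage sketch: run a loop thread briefly and probe it once.
bool probeRunningLoop() {
    LoopLivenessModel model;
    std::atomic<bool> stop{false};
    std::thread loop([&] {
        while (!stop.load()) {
            model.loopOnce();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
    const bool alive = model.monitor(std::chrono::milliseconds(2000));
    stop.store(true);
    loop.join();
    return alive;
}
```

If the loop thread were stuck while holding the mutex, `monitor()` would block at the lock acquisition itself, which is the situation the lock-and-wait check is meant to expose.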
4.3.1 EventHub.monitor
[-> EventHub.cpp]
void EventHub::monitor() {
    // Acquire and release mLock once to make sure the EventHub is not deadlocked
    mLock.lock();
    mLock.unlock();
}
4.4 InputDispatcher
[-> InputDispatcher.cpp]
void InputDispatcher::monitor() {
    mLock.lock();
    mLooper->wake();
    mDispatcherIsAliveCondition.wait(mLock);
    mLock.unlock();
}
After acquiring mLock, it enters the wait() method of the Condition and waits to be woken by the broadcast() in the InputDispatcher thread's dispatchOnce().
void InputDispatcher::dispatchOnce() {
    nsecs_t nextWakeupTime = LONG_LONG_MAX;
    {
        AutoMutex _l(mLock);
        mDispatcherIsAliveCondition.broadcast();

        if (!haveCommandsLocked()) {
            dispatchOnceInnerLocked(&nextWakeupTime);
        }

        if (runCommandsLockedInterruptible()) {
            nextWakeupTime = LONG_LONG_MIN;
        }
    }

    nsecs_t currentTime = now();
    int timeoutMillis = toMillisecondTimeoutDelay(currentTime, nextWakeupTime);
    mLooper->pollOnce(timeoutMillis); // enters epoll_wait
}
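4.5 Summary
By adding InputManagerService to Watchdog's monitor queue, the system periodically checks whether a deadlock has occurred. The whole check covers deadlock detection for EventHub, InputReader, InputDispatcher, and InputManagerService. The principle is simple: try to acquire each lock and then release it.
Finally, adb shell dumpsys input shows the current input state of the device. The output consists of three parts, EventHub.dump(), InputReader.dump(), and InputDispatcher.dump(); in addition, if an input ANR has occurred, the state of the last ANR is also printed.
Here mPendingEvent represents the event currently being handled.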
5. Summary
5.1 ANR categories
This is done by InputMonitor.notifyANR (section [3.5]). When an ANR occurs, the following message appears in the system log with TAG=WindowManager:
Input event dispatching timed out xxx. Reason: + reason
where xxx takes one of these values:
- Window type: sending to windowState.mAttrs.getTitle()
- Application type: sending to application appWindowToken.stringName
- Other: empty.
The Reason falls mainly into the following types:
5.1.1 Reason types
Produced by checkWindowReadyForMoreInputLocked (section [2.3.1]); the ANR reasons fall mainly into the following categories:
- No window, but a focused application: Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.
- Window paused: Waiting because the [targetType] window is paused.
- Window not connected: Waiting because the [targetType] window's input channel is not registered with the input dispatcher. The window may be in the process of being removed.
- Window connection dead: Waiting because the [targetType] window's input connection is [Connection.Status]. The window may be in the process of being removed.
- Window connection full: Waiting because the [targetType] window's input channel is full. Outbound queue length: [outboundQueue length]. Wait queue length: [waitQueue length].
- Key event, with a non-empty outbound queue or wait queue: Waiting to send key event because the [targetType] window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: [outboundQueue length]. Wait queue length: [waitQueue length].
- Non-key event, with a non-empty wait queue whose head event was dispatched more than 500ms ago: Waiting to send non-key event because the [targetType] window has not finished processing certain input events that were delivered to it over 500ms ago. Wait queue length: [waitQueue length]. Wait queue head age: [wait duration].
where
- targetType: either "focused" or "touched"
- Connection.Status: one of "NORMAL", "BROKEN", "ZOMBIE"
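The window-related reasons above amount to a chain of checks on the target window and its connection. The sketch below expresses that chain with shortened reason strings. `WindowSnapshot` and `notReadyReason` are hypothetical names; the order of the checks is assumed to follow the order of the list, and treating a head age of exactly 500ms as a timeout is also an assumption. The "no window, but a focused application" case is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

enum class ConnectionStatus { Normal, Broken, Zombie };

// Hypothetical flattened view of the state the checks look at.
struct WindowSnapshot {
    bool paused = false;
    bool channelRegistered = true;
    ConnectionStatus status = ConnectionStatus::Normal;
    bool channelFull = false;
    std::size_t outboundQueueLength = 0;
    std::size_t waitQueueLength = 0;
    int64_t waitQueueHeadAgeMs = 0;
};

constexpr int64_t kStreamAheadTimeoutMs = 500;  // the 500ms quoted in the text

// Returns an empty string when the window can take more input,
// otherwise a shortened form of the matching reason.
std::string notReadyReason(const WindowSnapshot& w, bool isKeyEvent) {
    if (w.paused) return "window is paused";
    if (!w.channelRegistered) return "input channel is not registered";
    if (w.status != ConnectionStatus::Normal) return "input connection is not NORMAL";
    if (w.channelFull) return "input channel is full";
    if (isKeyEvent) {
        // Key events wait until everything sent earlier has been handled.
        if (w.outboundQueueLength > 0 || w.waitQueueLength > 0) {
            return "key event waits for earlier events";
        }
    } else if (w.waitQueueLength > 0 && w.waitQueueHeadAgeMs >= kStreamAheadTimeoutMs) {
        // Non-key events may stream ahead unless the oldest one is overdue.
        return "non-key event waits for events delivered over 500ms ago";
    }
    return "";
}
```

The asymmetry between the last two branches is the useful part: a single unhandled event blocks further key events immediately, while touch events keep flowing until the head of the wait queue is overdue.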
In addition, updateDispatchStatisticsLocked() can be implemented in the two methods findFocusedWindowTargetsLocked and findTouchedWindowTargetsLocked to help analyze ANR problems.
5.2 Categories of dropped events
Produced by dropInboundEventLocked (section [2.1.2]), which outputs the reason an event was dropped:
- DROP_REASON_POLICY: "inbound event was dropped because the policy consumed it";
- DROP_REASON_DISABLED: "inbound event was dropped because input dispatch is disabled";
- DROP_REASON_APP_SWITCH: "inbound event was dropped because of pending overdue app switch";
- DROP_REASON_BLOCKED: "inbound event was dropped because the current application is not responding and the user has started interacting with a different application";
- DROP_REASON_STALE: "inbound event was dropped because it is stale";
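For log analysis it can help to have this mapping in code form. The sketch below reproduces the five reason-to-message pairs from the list; the `DropReason` enum and `dropReasonMessage` function are hypothetical names and do not mirror the actual declarations in InputDispatcher.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum mirroring the DROP_REASON_* names from the list.
enum class DropReason { Policy, Disabled, AppSwitch, Blocked, Stale };

// Maps a drop reason to the message quoted in the list above.
const char* dropReasonMessage(DropReason reason) {
    switch (reason) {
        case DropReason::Policy:
            return "inbound event was dropped because the policy consumed it";
        case DropReason::Disabled:
            return "inbound event was dropped because input dispatch is disabled";
        case DropReason::AppSwitch:
            return "inbound event was dropped because of pending overdue app switch";
        case DropReason::Blocked:
            return "inbound event was dropped because the current application is not "
                   "responding and the user has started interacting with a different "
                   "application";
        case DropReason::Stale:
            return "inbound event was dropped because it is stale";
    }
    return "unknown";  // unreachable for valid enum values
}
```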
Other:
- doDispatchCycleFinishedLockedInterruptible records events whose dispatch took longer than 2s;
- findFocusedWindowTargetsLocked can collect statistics on wait durations.