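Today I discussed with someone the pros and cons of two ways of writing asynchronous logic in a synchronous style: the CPS transformation into closure callbacks (typical of C# and JS), and coroutines like Lua's, which have a real stack of their own and can yield and resume. That left me wondering which of the two costs more. My guess was that when a call is interrupted by only one or two asynchronous operations (that is, 2 callbacks or 2 yields), closure callbacks perform better, because the coroutine approach has to create a coroutine with a full stack, which is comparatively heavyweight. If a single call contains many asynchronous operations, however, the coroutine approach should win: no matter how many times it yields, the coroutine is created only once, whereas the callback approach must create a new closure at every step, and the GC cost of that is not small. On to the test code.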
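To make the two styles concrete before the benchmarks, here is a minimal sketch that runs the same two-step asynchronous logic both ways. The names `pending`, `async_cb`, `async_co` and `run` are hypothetical and stand in for a real event loop; they are not part of the benchmark code in this post.

```lua
-- Hypothetical event queue standing in for a real asynchronous API.
local pending = {}

-- Callback style: register a continuation that receives the value later.
local function async_cb(value, cb)
  pending[#pending + 1] = function() cb(value) end
end

-- Coroutine style: park the running coroutine; the value comes back on resume.
local function async_co(value)
  local co = coroutine.running()  -- only the first return value is kept
  pending[#pending + 1] = function() coroutine.resume(co, value) end
  return coroutine.yield()
end

-- Drain the queue until no work remains.
local function run()
  while #pending > 0 do
    local batch = pending
    pending = {}
    for i = 1, #batch do batch[i]() end
  end
end

local results = {}

-- Two asynchronous steps written as nested closures (CPS).
async_cb(1, function(a)
  async_cb(2, function(b)
    results.cps = a + b
  end)
end)

-- The same two steps written sequentially inside one coroutine.
coroutine.resume(coroutine.create(function()
  local a = async_co(10)
  local b = async_co(20)
  results.co = a + b
end))

run()
print(results.cps, results.co)  -- prints 3 and 30
```

The CPS version allocates one closure per step, while the coroutine version allocates one coroutine up front; that difference is what the tests below try to measure.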
CPS:
    local count = 1000000
    local list1 = {}
    local list2 = {}
    local clock = os.clock
    local insert = table.insert
    local remove = table.remove

    -- Queue a callback to run in the next round.
    local function setcb(fn)
        insert(list1, fn)
    end

    local function test1()
        setcb(function() end)
    end

    local time1 = clock() -- start
    for i = 1, count do
        test1()
    end
    local time2 = clock() -- all calls issued
    while true do
        list1, list2 = list2, list1
        for i = 1, #list2 do
            remove(list2)()
        end
        if #list1 == 0 then
            break
        end
    end
    local time3 = clock() -- all callbacks finished
    print(time2 - time1, time3 - time2)
coroutine:
    local count = 1000000
    local list1 = {}
    local list2 = {}
    local clock = os.clock
    local insert = table.insert
    local remove = table.remove
    local create = coroutine.create
    local yield = coroutine.yield
    local running = coroutine.running
    local resume = coroutine.resume

    -- Queue the current coroutine for wake-up, then suspend it.
    local function setcb()
        -- Parentheses keep only the thread: on Lua 5.2+ running() also
        -- returns a boolean, which would make insert() raise an error.
        insert(list1, (running()))
        yield()
    end

    local function test2()
        setcb()
    end

    local function test1()
        resume(create(test2))
    end

    local time1 = clock() -- start
    for i = 1, count do
        test1()
    end
    local time2 = clock() -- all calls issued
    while true do
        list1, list2 = list2, list1
        for i = 1, #list2 do
            resume(remove(list2))
        end
        if #list1 == 0 then
            break
        end
    end
    local time3 = clock() -- all wake-ups finished
    print(time2 - time1, time3 - time2)
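Output:
The coroutine version is considerably slower than closure callbacks, both when issuing the calls and when waking up / calling back.
(PS: a side note. I originally set count = 10000000, but the coroutine test failed with an out-of-memory error, so I had to drop it by an order of magnitude.)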
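The out-of-memory failure can be sanity-checked by estimating how much memory one suspended coroutine holds. The sketch below is an assumption-laden estimate, not a measurement from the original test: it parks a hypothetical batch of `N` coroutines with the collector stopped and extrapolates to 10,000,000. The figure depends on the Lua version and build.

```lua
local N = 10000  -- hypothetical sample size; small enough to run anywhere

local function body() coroutine.yield() end

-- Preallocate the holding table so its growth is not counted.
local parked = {}
for i = 1, N do parked[i] = false end

collectgarbage("collect")
collectgarbage("stop")
local before = collectgarbage("count")  -- kilobytes in use

for i = 1, N do
  local co = coroutine.create(body)
  coroutine.resume(co)  -- start it, so it is suspended at the yield
  parked[i] = co
end

local per_co_kb = (collectgarbage("count") - before) / N
collectgarbage("restart")

print(string.format("~%.2f KB per suspended coroutine, ~%.0f MB for 10,000,000",
  per_co_kb, per_co_kb * 1e7 / 1024))
```

If the per-coroutine figure is on the order of a kilobyte, ten million live coroutines need several gigabytes, which is consistent with the failure described above.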
Next, I increased the number of callbacks per call.
CPS:
    local count = 1000000
    local list1 = {}
    local list2 = {}
    local clock = os.clock
    local insert = table.insert
    local remove = table.remove

    -- Queue a callback to run in the next round.
    local function setcb(fn)
        insert(list1, fn)
    end

    -- Seven chained asynchronous steps, each one a nested closure.
    local function test1()
        setcb(function()
            setcb(function()
                setcb(function()
                    setcb(function()
                        setcb(function()
                            setcb(function()
                                setcb(function() end)
                            end)
                        end)
                    end)
                end)
            end)
        end)
    end

    local time1 = clock() -- start
    for i = 1, count do
        test1()
    end
    local time2 = clock() -- all calls issued
    while true do
        list1, list2 = list2, list1
        for i = 1, #list2 do
            remove(list2)()
        end
        if #list1 == 0 then
            break
        end
    end
    local time3 = clock() -- all callbacks finished
    print(time2 - time1, time3 - time2)
coroutine:
    local count = 1000000
    local list1 = {}
    local list2 = {}
    local clock = os.clock
    local insert = table.insert
    local remove = table.remove
    local create = coroutine.create
    local yield = coroutine.yield
    local running = coroutine.running
    local resume = coroutine.resume

    -- Queue the current coroutine for wake-up, then suspend it.
    local function setcb()
        -- Parentheses keep only the thread: on Lua 5.2+ running() also
        -- returns a boolean, which would make insert() raise an error.
        insert(list1, (running()))
        yield()
    end

    -- Seven asynchronous steps inside a single coroutine.
    local function test2()
        setcb()
        setcb()
        setcb()
        setcb()
        setcb()
        setcb()
        setcb()
    end

    local function test1()
        resume(create(test2))
    end

    local time1 = clock() -- start
    for i = 1, count do
        test1()
    end
    local time2 = clock() -- all calls issued
    while true do
        list1, list2 = list2, list1
        for i = 1, #list2 do
            resume(remove(list2))
        end
        if #list1 == 0 then
            break
        end
    end
    local time3 = clock() -- all wake-ups finished
    print(time2 - time1, time3 - time2)
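Output:
On callback cost the coroutine version is still behind, but the gap is now fairly small. Startup is another matter: a coroutine has to allocate a relatively large stack, which is heavy compared with a closure, so issuing the calls is still far slower than with closure callbacks.
Finally, I raised the number of asynchronous calls within one invocation to 10000 (they have to be split across multiple functions, otherwise Lua reports the error: chunk has too many syntax levels). The comparison is as follows (count was changed to 1000 for both versions):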
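The code for the 10000-step test is not shown in this post. As one possible way to structure it, the sketch below avoids deep syntactic nesting by having each CPS step register a fresh closure through a recursive helper, while the coroutine side simply loops. `STEPS`, `drain`, `cps_step` and `co_body` are hypothetical names, and the sketch runs a single invocation rather than count = 1000.

```lua
local STEPS = 10000  -- asynchronous steps per invocation, as in the text
local list1, list2 = {}, {}
local insert, remove = table.insert, table.remove
local resume, yield, running = coroutine.resume, coroutine.yield, coroutine.running

-- Process queued items round by round until nothing is left.
local function drain(wake)
  while #list1 > 0 do
    list1, list2 = list2, list1
    for _ = 1, #list2 do wake(remove(list2)) end
  end
end

-- CPS: every step allocates a new closure for the remaining work.
local cps_done = 0
local function cps_step(n)
  if n == 0 then
    cps_done = cps_done + 1
    return
  end
  insert(list1, function() cps_step(n - 1) end)
end

-- Coroutine: one coroutine, parked and resumed STEPS times.
local co_done = 0
local function co_body()
  for _ = 1, STEPS do
    insert(list1, (running()))  -- parentheses keep only the thread
    yield()
  end
  co_done = co_done + 1
end

local t0 = os.clock()
cps_step(STEPS)
drain(function(fn) fn() end)
local t1 = os.clock()
resume(coroutine.create(co_body))
drain(resume)
local t2 = os.clock()

print(string.format("CPS %.4fs, coroutine %.4fs", t1 - t0, t2 - t1))
```

Unlike hand-written nested closures, the recursive helper captures only `n`, so the timings it produces are not directly comparable with the figures reported here.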
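At this point the coroutine's advantage in callback cost shows up. In practice, though, a single invocation is very unlikely to make this many asynchronous calls.
Next, memory usage.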
CPS:
local count = 100000 local list1 = {} local list2 = {} local clock = os.clock local insert = table.insert local remove = table.remove local function setcb(fn) insert(list1, fn) end local function test1() setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function() setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function() setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function()setcb(function() end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end) end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end) end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end)end) end collectgarbage("collect") collectgarbage("stop") local count1 = collectgarbage("count") for i = 1, count do test1() end local count2 = collectgarbage("count") while true do list1, list2 = list2, list1 for i = 1, #list2 do remove(list2)() end if #list1 == 0 then break end end local count3 = collectgarbage("count") print(count2 - count1, count3 - count2, count3 - count1)
coroutine:
    local count = 100000
    local list1 = {}
    local list2 = {}
    local clock = os.clock
    local insert = table.insert
    local remove = table.remove
    local create = coroutine.create
    local yield = coroutine.yield
    local running = coroutine.running
    local resume = coroutine.resume

    -- Queue the current coroutine for wake-up, then suspend it.
    local function setcb()
        -- Parentheses keep only the thread: on Lua 5.2+ running() also
        -- returns a boolean, which would make insert() raise an error.
        insert(list1, (running()))
        yield()
    end

    -- Sixty asynchronous steps inside a single coroutine.
    local function test2()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
        setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb() setcb()
    end

    local function test1()
        resume(create(test2))
    end

    -- Stop the collector so that every allocation is counted.
    collectgarbage("collect")
    collectgarbage("stop")
    local count1 = collectgarbage("count")
    for i = 1, count do
        test1()
    end
    local count2 = collectgarbage("count")
    while true do
        list1, list2 = list2, list1
        for i = 1, #list2 do
            resume(remove(list2))
        end
        if #list1 == 0 then
            break
        end
    end
    local count3 = collectgarbage("count")
    -- KB allocated while issuing calls, while resuming, and in total
    print(count2 - count1, count3 - count2, count3 - count1)
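Output:
The coroutine version does use much less memory than closure callbacks.
So whether to favor memory or speed is a trade-off you have to make for yourself.
This test is far from comprehensive, and many cases were not covered (for example, adding several local variables may affect the performance and memory usage of closure callbacks). Also, because Lua has no built-in CPS transformation, the callback version ends up in callback hell, which makes writing the code a far worse experience than with coroutines. This test is therefore mainly meant as a reference for readers who plan to implement their own programming language.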
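The untested case of captured local variables can be probed with a sketch like the one below. It is not part of the original benchmark: `bytes_per_object` is a hypothetical helper, and it compares the allocation cost of a closure capturing one local with that of a closure capturing four. Absolute numbers depend on the Lua version and build.

```lua
-- Hypothetical helper: average bytes allocated per object returned by make().
local function bytes_per_object(n, make)
  local keep = {}
  for i = 1, n do keep[i] = false end  -- preallocate; table growth is not counted
  collectgarbage("collect")
  collectgarbage("stop")
  local before = collectgarbage("count")  -- kilobytes in use
  for i = 1, n do keep[i] = make(i) end
  local after = collectgarbage("count")
  collectgarbage("restart")
  return (after - before) * 1024 / n
end

-- A closure that captures one local.
local one = bytes_per_object(10000, function(i)
  local a = i
  return function() return a end
end)

-- A closure that captures four locals.
local four = bytes_per_object(10000, function(i)
  local a, b, c, d = i, i, i, i
  return function() return a + b + c + d end
end)

print(string.format("1 upvalue: ~%.0f bytes, 4 upvalues: ~%.0f bytes", one, four))
```

Each captured local adds to the size of every closure that captures it, so a deep callback chain that carries many locals through each level pays that cost at every step, whereas a coroutine keeps its locals on its one stack.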