IOException: Connection reset by peer and java.nio.channels.ClosedChannelException: null in Netty


Recently our system started logging a lot of IOException: Connection reset by peer and ClosedChannelException: null errors.

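After digging through the code and running some tests, I found that Connection reset is thrown by the eventloop's unsafe.read() call when the client does not yet know that the channel has been closed.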

ClosedChannelException, on the other hand, is generally raised by Netty itself. ClosedChannel-related code can be found in both AbstractChannel and SslHandler.

AbstractChannel 

static final ClosedChannelException CLOSED_CHANNEL_EXCEPTION = new ClosedChannelException();

...

    static {
        CLOSED_CHANNEL_EXCEPTION.setStackTrace(EmptyArrays.EMPTY_STACK_TRACE);
        NOT_YET_CONNECTED_EXCEPTION.setStackTrace(EmptyArrays.EMPTY_STACK_TRACE);
    }

...

        @Override
        public void write(Object msg, ChannelPromise promise) {
            ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
            if (outboundBuffer == null) {
                // If the outboundBuffer is null we know the channel was closed and so
                // need to fail the future right away. If it is not null the handling of the rest
                // will be done in flush0()
                // See https://github.com/netty/netty/issues/2362
                safeSetFailure(promise, CLOSED_CHANNEL_EXCEPTION);
                // release message now to prevent resource-leak
                ReferenceCountUtil.release(msg);
                return;
            }
            outboundBuffer.addMessage(msg, promise);
        }

This ClosedChannelException shows up in many places in the code. The gist is that if write is called after the channel has been closed, the future of that write is marked as failed and its cause is set to ClosedChannelException. SslHandler does much the same.
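To make the pattern concrete without pulling in Netty, here is a minimal JDK-only sketch of the same idea. The class ClosedWriteSketch and its methods are hypothetical names of mine, and CompletableFuture stands in for Netty's ChannelPromise. It is an illustration under those assumptions, not Netty's implementation.

```java
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class ClosedWriteSketch {
    // One shared instance with its stack trace stripped, mirroring the static block above.
    static final ClosedChannelException CLOSED = new ClosedChannelException();

    static {
        CLOSED.setStackTrace(new StackTraceElement[0]);
    }

    private boolean open = true;

    void close() {
        open = false;
    }

    // Hypothetical write(): fails the returned future right away when the channel is closed.
    CompletableFuture<Void> write(byte[] msg) {
        CompletableFuture<Void> promise = new CompletableFuture<>();
        if (!open) {
            promise.completeExceptionally(CLOSED);
        } else {
            promise.complete(null); // pretend the bytes were queued
        }
        return promise;
    }

    // Issues a write after close() and returns the failure cause (null if the write succeeded).
    static Throwable causeOfWriteAfterClose() {
        ClosedWriteSketch ch = new ClosedWriteSketch();
        ch.close();
        try {
            ch.write("123".getBytes()).join();
            return null;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        Throwable cause = causeOfWriteAfterClose();
        System.out.println(cause + ", stack frames: " + cause.getStackTrace().length);
    }
}
```

Because the exception is a shared instance with an empty stack trace, the log entry carries no hint of which write failed. That matches what we saw in our logs.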

-----------------

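Back to Connection reset by peer. Simulating it is fairly easy: on the server, install a handler that closes the channel as soon as channelActive fires; on the client, register a listener that sends request data immediately after the connect succeeds. As follows: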

client

    public static void main(String[] args) throws IOException, InterruptedException {
        Bootstrap b = new Bootstrap();
        b.group(new NioEventLoopGroup())
                .channel(NioSocketChannel.class)
                .handler(new ChannelInitializer<NioSocketChannel>() {
                    @Override
                    protected void initChannel(NioSocketChannel ch) throws Exception {
                    }
                });
        b.connect("localhost", 8090).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                if (future.isSuccess()) {
                    future.channel().write(Unpooled.buffer().writeBytes("123".getBytes()));
                    future.channel().flush();
                }
            }
        });
    }

server

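import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.channel.socket.nio.NioSocketChannel;

public class SimpleServer {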

    public static void main(String[] args) throws Exception {

        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .option(ChannelOption.SO_REUSEADDR, true)
                .childHandler(new ChannelInitializer<NioSocketChannel>() {
                    @Override
                    protected void initChannel(NioSocketChannel ch) throws Exception {
                        ch.pipeline().addLast(new SimpleServerHandler());
                    }
                });
        b.bind(8090).sync().channel().closeFuture().sync();
    }
}


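import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

public class SimpleServerHandler extends ChannelInboundHandlerAdapter {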
    @Override
    public void channelActive(ChannelHandlerContext ctx) throws Exception {
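        // Close as soon as the connection is active. No sync() here: blocking on a
        // future from inside the event loop risks a BlockingOperationException.
        ctx.channel().close();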
    }

    @Override
    public void channelRead(ChannelHandlerContext ctx, final Object msg) throws Exception {
        System.out.println(123);
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        System.out.println("inactive");
    }
}

 

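Here is why this setup triggers the connection reset by peer exception. Once the connect succeeds, the client first runs the connect-success listener. By that time the server has already closed the channel, and the close does produce an event on the client side (it triggers a client read event whose read returns -1, where -1 means the channel is closed; the client's channelInactive and the change of the channel's active state both happen at that point). That event, however, is processed after the connect-success listener, so inside the listener the channel does not yet know it has been disconnected and goes ahead with write and flush. After flush is called, the eventloop enters its OP_READ handling, and that is where unsafe.read() throws the connection reset exception. The eventloop code is as follows: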

NioEventLoop

private static void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
        final NioUnsafe unsafe = ch.unsafe();
        if (!k.isValid()) {
            // close the channel if the key is not valid anymore
            unsafe.close(unsafe.voidPromise());
            return;
        }

        try {
            int readyOps = k.readyOps();
            // Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
            // to a spin loop
            if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
                unsafe.read();
                if (!ch.isOpen()) {
                    // Connection already closed - no need to handle write.
                    return;
                }
            }
            if ((readyOps & SelectionKey.OP_WRITE) != 0) {
                // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
                ch.unsafe().forceFlush();
            }
            if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
                // remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
                // See https://github.com/netty/netty/issues/924
                int ops = k.interestOps();
                ops &= ~SelectionKey.OP_CONNECT;
                k.interestOps(ops);

                unsafe.finishConnect();
            }
        } catch (CancelledKeyException e) {
            unsafe.close(unsafe.voidPromise());
        }
    }

That is how connection reset by peer comes about.
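For readers who want to see a read fail with a reset outside Netty, here is a small JDK-only sketch. Note that it swaps in a different trigger from the scenario above: instead of racing a write against the server's close, the server side sets SO_LINGER to 0 before closing, which on common platforms aborts the connection with an RST, as far as I know. ResetSketch and readAfterAbort are hypothetical names, and the exact exception message varies by OS and JDK version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ResetSketch {
    // Returns the exception thrown by the client's read() after the peer aborts the
    // connection, or null if the read did not fail.
    static IOException readAfterAbort() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                SocketChannel accepted = server.accept();
                // Assumption: linger 0 makes close() send an RST instead of a FIN.
                accepted.setOption(StandardSocketOptions.SO_LINGER, 0);
                accepted.close();
                try {
                    client.read(ByteBuffer.allocate(16)); // blocks until the RST arrives
                    return null;
                } catch (IOException e) {
                    return e;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // setup failure, unrelated to the demo
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterAbort());
    }
}
```

The point of the sketch is only that the reset surfaces on the read path, which is the same place unsafe.read() sits in the eventloop code above.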

------------------

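Now let's look at how ClosedChannelException arises. Reproducing it is also simple. First, to be clear: it is not true that ClosedChannelException only appears when the client closes the channel itself. Below are two client setups that both produce it.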

client 1: the client closes the channel itself

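import io.netty.bootstrap.Bootstrap;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelFutureListener;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioSocketChannel;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

public class SimpleClient {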

    private static final Logger logger = LoggerFactory.getLogger(SimpleClient.class);

    public static void main(String[] args) throws IOException, InterruptedException {
        Bootstrap b = new Bootstrap();
        b.group(new NioEventLoopGroup())
                .channel(NioSocketChannel.class)
                .handler(new ChannelInitializer<NioSocketChannel>() {
                    @Override
                    protected void initChannel(NioSocketChannel ch) throws Exception {
                    }
                });
        b.connect("localhost", 8090).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                if (future.isSuccess()) {
                    future.channel().close();
                    future.channel().write(Unpooled.buffer().writeBytes("123".getBytes())).addListener(new ChannelFutureListener() {
                        @Override
                        public void operationComplete(ChannelFuture future) throws Exception {
                            if (!future.isSuccess()) {
                                logger.error("Error", future.cause());
                            }
                        }
                    });
                    future.channel().flush();
                }
            }
        });
    }
}

 

As long as close is called before write, write is guaranteed to see the channel in the closed state, so the write fails and the cause on its future is ClosedChannelException.
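The JDK's own SocketChannel behaves analogously, which makes for a quick check without Netty. The sketch below (WriteAfterCloseSketch is a hypothetical name) closes a connected channel locally and then writes to it. It also hints at where the ": null" in the log line likely comes from: ClosedChannelException has no message, so a logger that prints the class name followed by the message prints null.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class WriteAfterCloseSketch {
    // Connects to a local listener, closes the client channel, then tries to write.
    // Returns the exception thrown by write(), or null if the write went through.
    static IOException writeAfterClose() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            client.close(); // local close, like future.channel().close() above
            try {
                client.write(ByteBuffer.wrap("123".getBytes()));
                return null;
            } catch (IOException e) {
                return e;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // setup failure, unrelated to the demo
        }
    }

    public static void main(String[] args) {
        IOException e = writeAfterClose();
        System.out.println(e.getClass().getName() + ": " + e.getMessage());
    }
}
```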

--------------------

client 2: ClosedChannelException caused by the server

public class SimpleClient {

    private static final Logger logger = LoggerFactory.getLogger(SimpleClient.class);

    public static void main(String[] args) throws IOException, InterruptedException {
        Bootstrap b = new Bootstrap();
        b.group(new NioEventLoopGroup())
                .channel(NioSocketChannel.class)
                .handler(new ChannelInitializer<NioSocketChannel>() {
                    @Override
                    protected void initChannel(NioSocketChannel ch) throws Exception {
                    }
                });
        Channel channel = b.connect("localhost", 8090).sync().channel();
        Thread.sleep(3000);
        channel.writeAndFlush(Unpooled.buffer().writeBytes("123".getBytes())).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                if (!future.isSuccess()) {
                    logger.error("error", future.cause());
                }
            }
        });
    }
}

server

public class SimpleServer {

    public static void main(String[] args) throws Exception {

        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .option(ChannelOption.SO_REUSEADDR, true)
                .childHandler(new ChannelInitializer<NioSocketChannel>() {
                    @Override
                    protected void initChannel(NioSocketChannel ch) throws Exception {
                        ch.pipeline().addLast(new SimpleServerHandler());
                    }
                });
        b.bind(8090).sync().channel().closeFuture().sync();
    }
}

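In this case the server closes the channel while the client is still sleeping. During the sleep, the client's eventLoop handles the server's close event: processSelectedKey enters the OP_READ branch, the read returns -1, and the client's channelInactive event fires. When the client wakes up and calls writeAndFlush, its channel has already become inactive, so the write fails with ClosedChannelException as the cause.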
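The "read returns -1" step can be observed with plain JDK channels as well. In this sketch (EofSketch and readAfterPeerClose are hypothetical names) the server side closes the accepted connection normally, and the client's blocking read is then expected to report end-of-stream rather than throw.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class EofSketch {
    // Returns what the client's read() reports after the peer closes the connection normally.
    static int readAfterPeerClose() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                server.accept().close(); // orderly close: the peer sends a FIN
                return client.read(ByteBuffer.allocate(16)); // blocks until the FIN arrives
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read returned " + readAfterPeerClose());
    }
}
```

An orderly close is signalled through the return value of read, which is why the client in this case ends up with a quietly inactive channel and only notices on the next write.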

