Introduction

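Data races are among the most common and hardest-to-debug types of bugs in concurrent systems. A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write. See The Go Memory Model for details.
Here is an example of a data race that can lead to crashes and memory corruption: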

package main

import "fmt"

func main() {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // First conflicting access.
		c <- true
	}()
	m["2"] = "b" // Second conflicting access.
	<-c
	for k, v := range m {
		fmt.Println(k, v)
	}
}
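
One way to remove the race in the example above is to guard every map access with a mutex. The following is a minimal sketch, not part of the original example; the helper name fillMap is hypothetical and exists only so the guarded writes can be exercised on their own.

```go
package main

import (
	"fmt"
	"sync"
)

// fillMap is a hypothetical helper: it performs the same two writes as the
// racy example, but each map write holds mu.
func fillMap() map[string]string {
	var mu sync.Mutex
	m := make(map[string]string)
	c := make(chan bool)
	go func() {
		mu.Lock()
		m["1"] = "a" // Guarded write from the second goroutine.
		mu.Unlock()
		c <- true
	}()
	mu.Lock()
	m["2"] = "b" // Guarded write from the calling goroutine.
	mu.Unlock()
	<-c // The receive orders the goroutine's write before the reads below.
	return m
}

func main() {
	for k, v := range fillMap() {
		fmt.Println(k, v)
	}
}
```

Moving the write to m["2"] after the channel receive would also work here, since the two writes would then no longer be concurrent.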

Usage

To help diagnose such bugs, Go includes a built-in data race detector. To use it, add the -race flag to the go command:

$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package

Report Format

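When the race detector finds a data race in the program, it prints a report. The report contains stack traces for the conflicting accesses, as well as the stacks where the involved goroutines were created. Here is an example: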

WARNING: DATA RACE
Read by goroutine 185:
  net.(*pollServer).AddFD()
      src/net/fd_unix.go:89 +0x398
  net.(*pollServer).WaitWrite()
      src/net/fd_unix.go:247 +0x45
  net.(*netFD).Write()
      src/net/fd_unix.go:540 +0x4d4
  net.(*conn).Write()
      src/net/net.go:129 +0x101
  net.func·060()
      src/net/timeout_test.go:603 +0xaf

Previous write by goroutine 184:
  net.setWriteDeadline()
      src/net/sockopt_posix.go:135 +0xdf
  net.setDeadline()
      src/net/sockopt_posix.go:144 +0x9c
  net.(*conn).SetDeadline()
      src/net/net.go:161 +0xe3
  net.func·061()
      src/net/timeout_test.go:616 +0x3ed

Goroutine 185 (running) created at:
  net.func·061()
      src/net/timeout_test.go:609 +0x288

Goroutine 184 (running) created at:
  net.TestProlongTimeout()
      src/net/timeout_test.go:618 +0x298
  testing.tRunner()
      src/testing/testing.go:301 +0xe8

Options

The GORACE environment variable sets race detector options. The format is:

GORACE="option1=val1 option2=val2"

The options are:

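log_path (default stderr): The race detector writes its report to a file named log_path.pid. The special names stdout and stderr cause reports to be written to standard output and standard error, respectively.
exitcode (default 66): The exit status to use when exiting after a detected race.
strip_path_prefix (default ""): Strip this prefix from all reported file paths, to make reports more concise.
history_size (default 1): The per-goroutine memory access history is 32K * 2**history_size elements. Increasing this value can avoid a "failed to restore the stack" error in reports, at the cost of increased memory usage.
halt_on_error (default 0): Controls whether the program exits after reporting the first data race.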

Example:

$ GORACE="log_path=/tmp/race/report strip_path_prefix=/my/go/sources/" go test -race

Excluding Tests

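When you build with the -race flag, the go command defines the additional build tag race. You can use this tag to exclude some code and tests when running the race detector. Some examples: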

// +build !race

package foo

// The test contains a data race. See issue 123.
func TestFoo(t *testing.T) {
	// ...
}

// The test fails under the race detector due to timeouts.
func TestBar(t *testing.T) {
	// ...
}

// The test takes too long under the race detector.
func TestBaz(t *testing.T) {
	// ...
}

How To Use

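To start, run your tests using the race detector (go test -race). The race detector only finds races that happen at runtime, so it cannot find races in code paths that are not executed. If your tests have incomplete coverage, you may find more races by running a binary built with -race under a realistic workload.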
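
Because the detector only sees code that actually runs concurrently, it can help to drive shared state from several goroutines at once. The following is a minimal sketch of that idea; the counter type and the hammer helper are hypothetical names for illustration, not part of any Go package.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type under test; mu guards n.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// hammer starts the given number of worker goroutines, has each one call fn
// the given number of times, and waits for all of them to finish.
func hammer(workers, calls int, fn func()) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < calls; i++ {
				fn()
			}
		}()
	}
	wg.Wait()
}

func main() {
	var c counter
	hammer(8, 1000, c.inc)
	fmt.Println(c.n) // Safe to read: wg.Wait has returned.
}
```

If the lock in inc were removed, running this program with go run -race would be expected to produce a report, since the increments would then be unsynchronized.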

Typical Data Races

Here are some typical data races. All of them can be detected with the race detector.

Race on loop counter

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i < 5; i++ {
		go func() {
			fmt.Println(i) // Not the 'i' you are looking for.
			wg.Done()
		}()
	}
	wg.Wait()
}

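The variable i in the function literal is the same variable used by the loop, so the read in the goroutine races with the loop increment. (This program typically prints 55555, not 01234.) Note that since Go 1.22 each loop iteration gets its own i, so this race applies to code built with earlier language versions. The program can be fixed by making a copy of the variable: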

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i < 5; i++ {
		go func(j int) {
			fmt.Println(j) // Good. Read local copy of the loop counter.
			wg.Done()
		}(i)
	}
	wg.Wait()
}
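
Another common way to make the copy is to shadow the loop variable inside the loop body. The following is a minimal sketch of that variant; the collect helper is a hypothetical name, and it gathers the values over a channel so the result can be checked regardless of goroutine scheduling.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts n goroutines and returns the loop values they observed.
func collect(n int) []int {
	var wg sync.WaitGroup
	out := make(chan int, n) // Buffered so that no sender blocks.
	for i := 0; i < n; i++ {
		i := i // New variable per iteration; the closure captures this copy.
		wg.Add(1)
		go func() {
			defer wg.Done()
			out <- i
		}()
	}
	wg.Wait()
	close(out)
	var seen []int
	for v := range out {
		seen = append(seen, v)
	}
	sort.Ints(seen) // Goroutine order is not deterministic.
	return seen
}

func main() {
	fmt.Println(collect(5)) // [0 1 2 3 4]
}
```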

Accidentally shared variable

// ParallelWrite writes data to file1 and file2, returns the errors.
func ParallelWrite(data []byte) chan error {
	res := make(chan error, 2)
	f1, err := os.Create("file1")
	if err != nil {
		res <- err
	} else {
		go func() {
			// This err is shared with the main goroutine,
			// so the write races with the write below.
			_, err = f1.Write(data)
			res <- err
			f1.Close()
		}()
	}
	f2, err := os.Create("file2") // The second conflicting write to err.
	if err != nil {
		res <- err
	} else {
		go func() {
			_, err = f2.Write(data)
			res <- err
			f2.Close()
		}()
	}
	return res
}

The fix is to introduce new variables in the goroutines (note the use of :=):

			...
			_, err := f1.Write(data)
			...
			_, err := f2.Write(data)
			...

Unprotected global variable

If the following code is called from several goroutines, it leads to races on the service map. Concurrent reads and writes of the same map are not safe:

var service map[string]net.Addr

func RegisterService(name string, addr net.Addr) {
	service[name] = addr
}

func LookupService(name string) net.Addr {
	return service[name]
}

To make the code safe, protect the accesses with a mutex:

var (
	service   map[string]net.Addr
	serviceMu sync.Mutex
)

func RegisterService(name string, addr net.Addr) {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	service[name] = addr
}

func LookupService(name string) net.Addr {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	return service[name]
}

Primitive unprotected variable

Data races can happen on variables of primitive types as well (bool, int, int64, etc.), as in this example:

type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	w.last = time.Now().UnixNano() // First conflicting access.
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			// Second conflicting access.
			if w.last < time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}

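Even such "innocent" data races can lead to hard-to-debug problems caused by non-atomicity of the memory accesses, interference with compiler optimizations, or reordering issues when accessing processor memory.

A typical fix for this race is to use a channel or a mutex. To preserve the lock-free behavior, one can also use the sync/atomic package.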

type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	atomic.StoreInt64(&w.last, time.Now().UnixNano())
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			if atomic.LoadInt64(&w.last) < time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}

Supported Systems

The race detector runs on darwin/amd64, freebsd/amd64, linux/amd64, and windows/amd64.

Runtime Overhead

The cost of race detection varies by program, but for a typical program, memory usage may increase by 5-10x and execution time by 2-20x.