Go: sync.Pool

Introduction

Pool’s purpose is to cache allocated but unused items for later reuse, relieving pressure on the garbage collector. That is, it makes it easy to build efficient, thread-safe free lists. However, it is not suitable for all free lists.

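sync.Pool is a component of the sync package. It lets you reuse objects and so avoid the performance cost of creating and destroying them over and over.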

APIs

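sync.Pool exposes two methods:

Put stores an object in the sync.Pool

Get takes an object out of the sync.Pool

If the sync.Pool is empty when Get is called and its New field is set, Get calls New and returns the freshly created object to the caller; if New is nil, Get returns nil

To control what a new object looks like, set the New function when initializing the sync.Pool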
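The behavior of Get on an empty pool can be checked with a small program. This is a minimal sketch; the helper name getFromEmptyPools is made up for illustration and is not part of the sync package.

```go
package main

import (
	"fmt"
	"sync"
)

// getFromEmptyPools calls Get on two empty pools: one without New and
// one with New set. It also reports how many times New ran.
func getFromEmptyPools() (withoutNew, withNew any, newCalls int) {
	// No New function: Get on an empty pool returns nil.
	var bare sync.Pool
	withoutNew = bare.Get()

	// With New: Get on an empty pool calls New and returns its result.
	pool := sync.Pool{New: func() any {
		newCalls++
		return new(int)
	}}
	withNew = pool.Get()
	return withoutNew, withNew, newCalls
}

func main() {
	a, b, n := getFromEmptyPools()
	fmt.Println(a == nil, b != nil, n) // true true 1
}
```

Because Get can return nil when New is unset, code that type-asserts the result directly should always set New.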

Basic usage

The basic steps for using sync.Pool look like this:

import "sync"

type Object struct{}

var pool sync.Pool

func init() {
	// 1. Before using the pool, tell it how to create a new object
	pool.New = func() interface{} {
		return new(Object)
	}
}

func demo0() {
	// 2. Take an object out of the pool
	// obj := new(Object) // instead of calling new to create one
	obj := pool.Get().(*Object)

	// 3. Use obj
	// ...

	// 4. Reset obj
	// ...

	// 5. Put obj back into the pool for the next user
	pool.Put(obj)
}

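Note: objects stored in a sync.Pool should be reusable

Characteristics

sync.Pool is goroutine-safe: multiple goroutines can use the same sync.Pool concurrently

sync.Pool does not guarantee that stored objects are kept: any item may be removed at any time without notification, and garbage collection drops idle items from the pool (since Go 1.13, an item that stays unused across two consecutive GC cycles is released)

The size of a sync.Pool cannot be set manually; it is bounded only by available memory

Get may hand back exactly the object that was last passed to Put, in whatever state it was left, because Pool does no clearing of its own. We should not rely on this, though: the documentation promises no relationship between values passed to Put and values returned by Get, and under concurrent use no ordering can be guaranteed. The best practice is to reset the object before calling Put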
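The points about goroutine safety and resetting before Put can be combined in one sketch. It assumes a pool of bytes.Buffer values; the names bufPool, render, and renderAll are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New runs only when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render formats a greeting using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // clear state before handing the buffer back
		bufPool.Put(buf)
	}()
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result outlives the reset
}

// renderAll calls render from n goroutines at once, sharing bufPool.
func renderAll(n int) []string {
	out := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = render(fmt.Sprintf("worker-%d", i))
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	for _, s := range renderAll(4) {
		fmt.Println(s)
	}
}
```

Since every buffer is reset before Put, it does not matter whether a goroutine receives a brand-new buffer or one that another goroutine used a moment ago.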

How the fmt package uses sync.Pool

Start with fmt.Printf:

func Printf(format string, a ...any) (n int, err error) {
	return Fprintf(os.Stdout, format, a...)
}

It calls Fprintf internally, so let's look at that next:

func Fprintf(w io.Writer, format string, a ...any) (n int, err error) {
	p := newPrinter()
	p.doPrintf(format, a)
	n, err = w.Write(p.buf)
	p.free()
	return
}

The first thing Fprintf does is obtain a new printer object. Does that allocate memory on every call? Let's look at newPrinter:

func newPrinter() *pp {
	p := ppFree.Get().(*pp) // take a pp object out of the pool
	p.panicking = false
	p.erroring = false
	p.wrapErrs = false
	p.fmt.init(&p.buf)
	return p
}

The function returns p, and p comes from a sync.Pool!

Now look at ppFree:

var ppFree = sync.Pool{
	New: func() any { return new(pp) },
}

So ppFree is a sync.Pool whose New function creates a pp object

What does a pp object contain?

type buffer []byte

// pp is used to store a printer's state and is reused with sync.Pool to avoid allocations.
type pp struct {
	buf buffer

	// arg holds the current item, as an interface{}.
	arg any
	// ... remaining fields omitted
}

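The main thing a pp object holds is a buffer, which serves as the output buffer

fmt.Printf is typically called many times in a project. If every call had to create a pp object, the total number of allocations would grow considerably

By using sync.Pool, the fmt package avoids most of those allocations, which improves performance

Let's look at Fprintf again: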

func Fprintf(w io.Writer, format string, a ...any) (n int, err error) {
	p := newPrinter() // gets a pp object from the sync.Pool
	p.doPrintf(format, a)
	n, err = w.Write(p.buf)
	p.free() // resets the pp object and returns it to the sync.Pool
	return
}

Pay particular attention to the free method:

// free saves used pp structs in ppFree; avoids an allocation per invocation.
func (p *pp) free() {
	if cap(p.buf) > 64*1024 {
		p.buf = nil
	} else {
		p.buf = p.buf[:0]
	}
	if cap(p.wrappedErrs) > 8 {
		p.wrappedErrs = nil
	}

	p.arg = nil
	p.value = reflect.Value{}
	p.wrappedErrs = p.wrappedErrs[:0]
	ppFree.Put(p) // put the pp object back into the sync.Pool
}

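As we can see, the way fmt uses sync.Pool follows the steps described earlier: Get, use, reset, Put. Note that free also drops buffers whose capacity has grown beyond 64 KiB instead of keeping them in the pool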
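The size check in free can be borrowed for your own pools, so that one unusually large request does not leave an oversized buffer pinned in the pool. Below is a minimal sketch of that idea; maxPooledCap, putBuffer, and newBuffer are hypothetical names, and the 64 KiB limit is simply copied from fmt rather than tuned.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// maxPooledCap is an assumed limit; fmt uses 64 KiB for its own buffers.
const maxPooledCap = 64 * 1024

var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

// newBuffer takes a buffer from the pool and writes n zero bytes into it.
func newBuffer(n int) *bytes.Buffer {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Write(make([]byte, n))
	return buf
}

// putBuffer returns buf to the pool unless it has grown too large.
// It reports whether the buffer was kept.
func putBuffer(buf *bytes.Buffer) bool {
	if buf.Cap() > maxPooledCap {
		return false // leave the oversized buffer to the garbage collector
	}
	buf.Reset()
	bufPool.Put(buf)
	return true
}

func main() {
	fmt.Println(putBuffer(newBuffer(5)))        // true: small buffer is kept
	fmt.Println(putBuffer(newBuffer(128 * 1024))) // false: too large to keep
}
```

Keeping pooled items at roughly the same size makes the memory held by the pool predictable.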

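Use cases

sync.Pool is a good choice wherever memory is allocated and released repeatedly. Frequent allocation and release puts a burden on the GC. sync.Pool caches objects that are temporarily unused so that the next caller can reuse them directly instead of allocating again, which reduces GC pressure and improves performance.

Examples include the standard library's fmt package and the gin framework's management of its Context objects

The lifetime of objects in a Pool is affected by the GC

Therefore, sync.Pool is not suitable as a connection pool, because a connection pool has to manage the lifetime of its objects itself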
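To make the per-request context idea concrete, here is a sketch built on net/http. The requestCtx type, its reset method, and the serve helper are hypothetical; they only illustrate the pattern and are not gin's actual types or API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// requestCtx is a hypothetical per-request context (not gin's real type).
type requestCtx struct {
	w      http.ResponseWriter
	r      *http.Request
	values map[string]string
}

// reset prepares a pooled context for a new request.
func (c *requestCtx) reset(w http.ResponseWriter, r *http.Request) {
	c.w, c.r = w, r
	for k := range c.values {
		delete(c.values, k)
	}
}

var ctxPool = sync.Pool{
	New: func() any { return &requestCtx{values: make(map[string]string)} },
}

// handler borrows a context from the pool for the duration of one request.
func handler(w http.ResponseWriter, r *http.Request) {
	c := ctxPool.Get().(*requestCtx)
	c.reset(w, r)
	defer func() {
		c.w, c.r = nil, nil // drop references so the request can be collected
		ctxPool.Put(c)
	}()

	c.values["path"] = c.r.URL.Path
	fmt.Fprintf(c.w, "path=%s", c.values["path"])
}

// serve runs handler against an in-memory request and returns the body.
func serve(path string) string {
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(serve("/a"))
	fmt.Println(serve("/b"))
}
```

The context must not be used after the handler returns, because another request may already be reusing it.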
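The effect of the GC on pooled objects can be observed directly. The sketch below uses a hypothetical conn type as a stand-in for a real connection and forces two collections with runtime.GC; with the Go runtime as of 1.13 and later, the stored object should be gone afterwards and Get should fall back to New.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// conn stands in for a resource with identity, such as a network connection.
type conn struct{ id int }

// survivesGC stores one conn in a pool, forces two GC cycles, and
// reports whether Get still returns that same conn.
func survivesGC() bool {
	pool := sync.Pool{New: func() any { return &conn{id: -1} }}
	pool.Put(&conn{id: 42})

	// The first cycle moves idle items aside, the second releases them.
	runtime.GC()
	runtime.GC()

	return pool.Get().(*conn).id == 42
}

func main() {
	fmt.Println("pooled conn survived:", survivesGC())
}
```

A real connection dropped this way would never be closed explicitly, which is one reason a connection pool needs its own lifetime management.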

Example

The following simple example shows the basic way to use sync.Pool and the performance gain it brings

buffer.go

package buffer

type Buffer struct {
	data []byte
}

func NewBuffer(size int) *Buffer {
	return &Buffer{
		data: make([]byte, 0, size),
	}
}

func (b *Buffer) Write(p []byte) error {
	b.data = append(b.data, p...)
	return nil // error handling is omitted here
}

func (b *Buffer) Read() (n []byte, err error) {
	res := make([]byte, len(b.data))
	copy(res, b.data)
	return res, nil // error handling is omitted here
}

func (b *Buffer) Clear() {
	b.data = b.data[:0]
}

buffer_test.go

package buffer

import (
	"sync"
	"testing"
)

func TestBuffer(t *testing.T) {
	b := NewBuffer(1024)
	if err := b.Write([]byte("Hello, world!")); err != nil {
		t.Errorf("Error writing to buffer: %v", err)
	}

	tmp, _ := b.Read()
	if string(tmp) != "Hello, world!" {
		t.Errorf("Expected 'Hello, world!', got '%s'", string(tmp))
	}

	b.Write([]byte("\nGoodbye, world!"))
	tmp, _ = b.Read()
	if string(tmp) != "Hello, world!\nGoodbye, world!" {
		t.Errorf("Expected 'Hello, world!\nGoodbye, world!', got '%s'", string(tmp))
	}

	b.Clear()
	tmp, _ = b.Read()
	if string(tmp) != "" {
		t.Errorf("Expected '', got '%s'", string(tmp))
	}
}

func BenchmarkBuffer(b *testing.B) {
	op := func(buffer *Buffer) {
		buffer.Write([]byte("Hello, world!"))
		tmp, _ := buffer.Read()
		if string(tmp) != "Hello, world!" {
			b.Errorf("Expected 'Hello, world!', got '%s'", string(tmp))
		}
	}

	for i := 0; i < b.N; i++ {
		buffer := NewBuffer(1024)
		op(buffer)
	}
}

func BenchmarkBufferWithPool(b *testing.B) {
	pool := sync.Pool{
		New: func() any {
			return NewBuffer(1024)
		},
	}

	op := func(buffer *Buffer) {
		buffer.Write([]byte("Hello, world!"))
		tmp, _ := buffer.Read()
		if string(tmp) != "Hello, world!" {
			b.Errorf("Expected 'Hello, world!', got '%s'", string(tmp))
		}
	}

	for i := 0; i < b.N; i++ {
		buffer := pool.Get().(*Buffer)
		op(buffer)
		buffer.Clear()
		pool.Put(buffer)
	}
}

The benchmark results are as follows:

BenchmarkBuffer

Sky_Lee@SkyLeeMacBook-Pro test % go test -count=5 -benchmem -run=^$ -bench ^BenchmarkBuffer$ test/buffer
goos: darwin
goarch: amd64
pkg: test/buffer
cpu: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
BenchmarkBuffer-8        4847030               247.0 ns/op          1040 B/op          2 allocs/op
BenchmarkBuffer-8        4912086               240.2 ns/op          1040 B/op          2 allocs/op
BenchmarkBuffer-8        4875544               245.5 ns/op          1040 B/op          2 allocs/op
BenchmarkBuffer-8        4924796               246.3 ns/op          1040 B/op          2 allocs/op
BenchmarkBuffer-8        4623854               244.6 ns/op          1040 B/op          2 allocs/op
PASS
ok      test/buffer     7.365s

BenchmarkBufferWithPool

Sky_Lee@SkyLeeMacBook-Pro test % go test -count=5 -benchmem -run=^$ -bench ^BenchmarkBufferWithPool$ test/buffer
goos: darwin
goarch: amd64
pkg: test/buffer
cpu: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
BenchmarkBufferWithPool-8       19232821                53.97 ns/op           16 B/op          1 allocs/op
BenchmarkBufferWithPool-8       21850822                54.03 ns/op           16 B/op          1 allocs/op
BenchmarkBufferWithPool-8       20813865                53.75 ns/op           16 B/op          1 allocs/op
BenchmarkBufferWithPool-8       20970561                55.24 ns/op           16 B/op          1 allocs/op
BenchmarkBufferWithPool-8       19820340                54.02 ns/op           16 B/op          1 allocs/op
PASS
ok      test/buffer     6.032s

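First, let's look at BenchmarkBuffer, which does not use a pool:

4,623,854 to 4,924,796 iterations (varies between runs)
Each operation takes about 240 to 247 nanoseconds on average (ns/op)
Each operation allocates 1040 bytes (B/op)
Each operation performs 2 memory allocations (allocs/op)

Now, let's look at BenchmarkBufferWithPool, which uses a pool:

19,232,821 to 21,850,822 iterations (varies between runs)
Each operation takes about 53.75 to 55.24 nanoseconds on average (ns/op)
Each operation allocates 16 bytes (B/op)
Each operation performs 1 memory allocation (allocs/op)

From these results we can draw the following conclusions:

The pooled version (BenchmarkBufferWithPool) is much faster than the version without a pool (BenchmarkBuffer). Average time per operation drops from about 245 ns to about 54 ns, roughly a 4.5x speedup.
The pooled version also allocates far less memory. Bytes allocated per operation drop from 1040 to 16, a reduction of more than 98%.
The pooled version performs fewer allocations as well: 1 per operation instead of 2. The remaining 16 B allocation most likely comes from the copy that Read makes of the 13-byte payload, which the pool does not affect.

In summary, using a pool can significantly improve both speed and memory efficiency. This is very useful in scenarios that allocate and release memory frequently, where it reduces GC pressure and improves overall program performance.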
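Allocation counts can also be compared outside a benchmark with testing.AllocsPerRun from the standard library. The sketch below is an assumption-laden illustration: slicePool, sink, and measure are hypothetical names, and the workload is a simplified stand-in for the Buffer type above.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// sink keeps each freshly allocated slice reachable so that the compiler
// cannot place it on the stack.
var sink *[]byte

// slicePool stores pointers to slices so that Put itself does not allocate.
var slicePool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

// measure returns the average number of allocations per call
// without the pool (fresh) and with it (pooled).
func measure() (fresh, pooled float64) {
	fresh = testing.AllocsPerRun(1000, func() {
		b := make([]byte, 0, 1024)
		sink = &b
	})
	pooled = testing.AllocsPerRun(1000, func() {
		b := slicePool.Get().(*[]byte)
		*b = append((*b)[:0], "Hello, world!"...)
		slicePool.Put(b)
	})
	return fresh, pooled
}

func main() {
	fresh, pooled := measure()
	fmt.Printf("fresh: %.0f allocs/op, pooled: %.0f allocs/op\n", fresh, pooled)
}
```

The exact numbers depend on the Go version and compiler optimizations, but the pooled path should report fewer allocations than the fresh one.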

Source code analysis

For an analysis of the source code, see the article 深度解密 Go 语言之 sync.Pool ("A Deep Dive into Go's sync.Pool")