How do I use thread-local storage in Python?

4 Answers

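Thread-local storage is useful, for example, when you have a pool of worker threads and each thread needs access to its own resource, such as a network or database connection. Note that the threading module uses the regular concept of threads (which can access the process's global data), although the global interpreter lock limits how much these threads help with CPU-bound work. The multiprocessing module, by contrast, creates a new subprocess for each worker, so every global is already private to its process, which has the same effect as thread-local storage.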
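To make the worker-pool case concrete, here is a minimal sketch in which every pool thread lazily opens its own "connection". `FakeConnection`, `get_connection`, and `handle` are hypothetical names invented for this illustration; a real pool would open an actual network or database connection instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One module-level local object; each thread sees its own attributes on it.
_local = threading.local()


class FakeConnection:
    """Hypothetical stand-in for a network or database connection."""

    def __init__(self, owner):
        self.owner = owner  # name of the thread that opened the connection


def get_connection():
    """Return the calling thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection(threading.current_thread().name)
        _local.conn = conn
    return conn


def handle(task_id):
    conn = get_connection()
    # True when the connection was opened by the thread that is using it now.
    return task_id, conn.owner == threading.current_thread().name


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(handle, range(12)))

print(all(same_thread for _, same_thread in results))  # True
```

Each worker thread opens at most one connection however many tasks it handles, and no thread ever receives a connection opened by another thread.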

The threading module

Here is a simple example:

```
import threading
from threading import current_thread

threadLocal = threading.local()

def hi():
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("Nice to meet you", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

hi(); hi()
```

This prints:

```
Nice to meet you MainThread
Welcome back MainThread
```
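The run above only involves the main thread. As a rough sketch of the same idea with several threads, the variant below calls `hi()` twice from each of three workers and records the greetings instead of printing them directly. The names `worker`, `greetings`, and `greetings_lock` are assumptions added for the illustration.

```python
import threading

thread_local = threading.local()  # created once, at module level
greetings = []                    # shared list, used only to collect results
greetings_lock = threading.Lock()


def hi():
    if getattr(thread_local, "initialized", None) is None:
        thread_local.initialized = True
        message = "Nice to meet you"
    else:
        message = "Welcome back"
    with greetings_lock:
        greetings.append((threading.current_thread().name, message))


def worker():
    hi()
    hi()


threads = [threading.Thread(target=worker, name=f"worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sort so the report is stable regardless of thread scheduling.
for name, message in sorted(greetings):
    print(name, message)
```

Every worker is greeted with "Nice to meet you" exactly once and "Welcome back" on its second call, because the `initialized` flag set by one thread is invisible to the others.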

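One important point that is easy to overlook: a threading.local() object only needs to be created once, not once per thread and not once per function call. Module (global) level or class level is the ideal place for it.

Here is why: threading.local() creates a new instance every time it is called (just like any factory or class call), so calling it repeatedly keeps replacing the original object with a fresh, empty one, which is almost certainly not what you want. When a thread accesses an existing threadLocal variable (or whatever it is called), it gets its own private view of that variable.

This will not work as intended: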

```
import threading
from threading import current_thread

def wont_work():
    threadLocal = threading.local()  # Oops: this creates a new, empty local object on every call!
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("First time for", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

wont_work(); wont_work()
```

It produces this output:

```
First time for MainThread
First time for MainThread
```

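The multiprocessing module

All global variables are effectively local to each worker, because the multiprocessing module runs every worker in its own process rather than in a thread.

Consider this example, where the processed counter plays the role of thread-local storage: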

```
from multiprocessing import Pool
from random import random
from time import sleep
import os

# Module-level counter; every worker process gets its own copy.
processed = 0

def f(x):
    sleep(random())
    global processed
    processed += 1  # only changes the copy in the current process
    print("Processed by %s: %s" % (os.getpid(), processed))
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    print(pool.map(f, range(10)))
```

It will output something like this:

```
Processed by 7636: 1
Processed by 9144: 1
Processed by 5252: 1
Processed by 7636: 2
Processed by 6248: 1
Processed by 5252: 2
Processed by 6248: 2
Processed by 9144: 2
Processed by 7636: 3
Processed by 5252: 3
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Of course, the process IDs, the count reached by each process, and the order of the lines will vary from run to run.

Answered 2024-05-06T15:36:30+08:00

As noted in the question, Alex Martelli gives a solution here. This function lets us use a factory function to generate a default value for each thread.

```
# Code originally posted by Alex Martelli
# Modified to use standard Python variable name conventions
import threading

threadlocal = threading.local()


def threadlocal_var(varname, factory, *args, **kwargs):
    # Look up the attribute in the calling thread's view of the namespace.
    v = getattr(threadlocal, varname, None)
    if v is None:
        # First access in this thread: build the default and cache it.
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v
```
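A short usage sketch may help show when the factory actually runs. The helper is repeated so the example is self-contained; `make_buffer`, `factory_calls`, `results`, and `worker` are hypothetical names added only for this demonstration.

```python
import threading

threadlocal = threading.local()


def threadlocal_var(varname, factory, *args, **kwargs):
    # Same helper as in the answer, repeated so this sketch runs on its own.
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v


factory_calls = []  # records which threads triggered the factory
results = {}        # thread name -> final contents of that thread's buffer


def make_buffer():
    factory_calls.append(threading.current_thread().name)
    return []


def worker():
    for i in range(3):
        # The factory runs only on the first call made by each thread.
        threadlocal_var("buffer", make_buffer).append(i)
    results[threading.current_thread().name] = list(
        threadlocal_var("buffer", make_buffer)
    )


threads = [threading.Thread(target=worker, name=f"worker-{i}") for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(factory_calls))  # ['worker-0', 'worker-1']
print(results["worker-0"])    # [0, 1, 2]
```

One caveat follows from the `is None` check: a factory that legitimately returns `None` would be invoked again on every call, so this helper suits factories that always return a real object.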

Answered 2024-05-06T15:36:30+08:00

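Thread-local storage can be thought of simply as a namespace whose values are accessed with attribute notation. The difference is that each thread transparently gets its own set of attributes and values, so one thread never sees another thread's values.

Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.

Here is a simple example: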
```
import threading

class Worker(threading.Thread):
    ns = threading.local()
    def run(self):
        self.ns.val = 0
        for i in range(5):
            self.ns.val += 1
            print("Thread:", self.name, "value:", self.ns.val)

w1 = Worker()
w2 = Worker()
w1.start()
w2.start()
w1.join()
w2.join()
```
Output:
```
Thread: Thread-1 value: 1
Thread: Thread-2 value: 1
Thread: Thread-1 value: 2
Thread: Thread-2 value: 2
Thread: Thread-1 value: 3
Thread: Thread-2 value: 3
Thread: Thread-1 value: 4
Thread: Thread-2 value: 4
Thread: Thread-1 value: 5
Thread: Thread-2 value: 5
```
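Note how each thread maintains its own counter, even though the ns attribute is a class member and is therefore shared between the threads.

The same example could have used an instance variable or a local variable, but that would not demonstrate much, because nothing would be shared in that case (a plain dict would work just as well). There are cases where you need thread-local storage as an instance variable or a local variable, but they tend to be relatively rare (and rather subtle).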
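As one possible illustration of the instance-variable case, the sketch below gives every object its own threading.local, so state is separated both per instance and per thread. `PerThreadCounter`, `requests`, `errors`, and `seen` are hypothetical names chosen for this example and do not come from the answer.

```python
import threading


class PerThreadCounter:
    """Hypothetical counter whose count is separate per instance and per thread."""

    def __init__(self):
        # Instance-level local: a separate namespace for every instance.
        self._local = threading.local()

    def increment(self):
        self._local.count = getattr(self._local, "count", 0) + 1
        return self._local.count


requests = PerThreadCounter()
errors = PerThreadCounter()
seen = {}  # thread name -> (requests count, errors count) seen by that thread


def worker(n_requests, n_errors):
    r = e = 0
    for _ in range(n_requests):
        r = requests.increment()
    for _ in range(n_errors):
        e = errors.increment()
    seen[threading.current_thread().name] = (r, e)


t1 = threading.Thread(target=worker, args=(3, 1), name="worker-0")
t2 = threading.Thread(target=worker, args=(5, 2), name="worker-1")
t1.start()
t2.start()
t1.join()
t2.join()

print(seen["worker-0"], seen["worker-1"])  # (3, 1) (5, 2)
main_count = requests.increment()
print(main_count)  # 1, because the main thread starts from its own empty view
```

Both instances are shared by both threads, yet no count leaks between threads or between instances, which is the situation where a plain attribute or dict would no longer be enough.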
Answered 2024-05-06T15:36:30+08:00
You can also write:
```
import threading
mydata = threading.local()
mydata.x = 1
```
mydata.x will exist only in the current thread.
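A small check of that claim, as a sketch: a second thread looks for `mydata.x`, finds nothing, and then sets its own value without disturbing the main thread's. The `observed` dict and `worker` function are assumptions added for the demonstration.

```python
import threading

mydata = threading.local()
mydata.x = 1  # set in the main thread only

observed = {}


def worker():
    # This thread has never assigned mydata.x, so the attribute is missing here.
    observed["has_x_before"] = hasattr(mydata, "x")
    mydata.x = 2  # visible only inside this worker thread
    observed["x_in_worker"] = mydata.x


t = threading.Thread(target=worker)
t.start()
t.join()

print(observed["has_x_before"])  # False
print(observed["x_in_worker"])   # 2
print(mydata.x)                  # 1, the main thread's value is untouched
```

Reading `mydata.x` directly in the worker before assigning it would raise AttributeError, which is why code that shares a local object across threads usually reads it with `getattr` and a default.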
Answered 2024-05-06T15:36:30+08:00
