The GIL (Global Interpreter Lock) has long been a safety mechanism for CPython's memory management, ensuring thread safety when working with Python objects.
However, it also comes with a performance cost, especially for embarrassingly parallel, CPU-bound tasks, which cannot truly run in parallel with threads. Even if we spawn multiple threads, only one can execute Python bytecode at a time, leaving multi-core processors underutilized.
Traditionally, the workaround was to use `multiprocessing.Pool` for parallel execution.
While effective, it brings its own overhead:
The exciting news? Python 3.13 introduced experimental GIL-free (free-threaded) builds of CPython, and the line continues with `python 3.14t`.
This means we can run threads in parallel without relying on heavy multiprocessing for CPU-bound tasks.
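Before benchmarking anything, it helps to confirm which kind of interpreter is actually running. Here is a minimal sketch of such a check. The `gil_status` helper is a name made up for this example; `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()` are the standard-library hooks, and the latter only exists on Python 3.13 and newer, so the sketch falls back to "GIL enabled" on older versions.

```python
import sys
import sysconfig


def gil_status() -> str:
    """Describe the build type and whether the GIL is currently active.

    Hypothetical helper for illustration only.
    """
    # Py_GIL_DISABLED is 1 on free-threaded builds, and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() was added in Python 3.13. On older versions the
    # GIL is always on, so fall back to a function that returns True.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)

    if not free_threaded_build:
        return "standard build, GIL enabled"
    state = "enabled" if is_gil_enabled() else "disabled"
    return f"free-threaded build, GIL {state}"


print(sys.version)
print(gil_status())
```

Even on a free-threaded build the GIL can be switched back on at runtime, which is why the sketch checks both the build flag and the live state.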
Python 3.14t removes the GIL, allowing threads to run truly in parallel on multi-core processors. To explore this new feature, I wrote a simple Python script that checks whether numbers are prime using multiple threads. The idea was inspired by Fluent Python by Luciano Ramalho, a great resource if you want to dive deeper into threading concepts.
Below is a quick snippet of the threaded prime-checking logic. You can find the full script here.
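import math
from threading import Thread

...  # Omitting TEST_CASES; NUMBERS (used below) comes from the full script


def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division by odd numbers."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    # Only odd divisors up to the integer square root need checking.
    root = math.isqrt(n)
    for i in range(3, root + 1, 2):
        if n % i == 0:
            return False
    return True


class IsPrimeWorker:
    def __init__(self, n):
        self.n = n
        self.name = hash(n)
        self.result = None

    def run(self):
        # Store the result on the worker so it can be read after join().
        self.result = is_prime(self.n)

    @classmethod
    def create_workers(cls):
        return [cls(n) for n in NUMBERS]


def main():
    workers = IsPrimeWorker.create_workers()
    threads = [Thread(target=worker.run) for worker in workers]
    # Start every thread first, then wait for all of them to finish.
    for t in threads:
        t.start()
    for t in threads:
        t.join()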
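The snippet above relies on `NUMBERS`, which is not shown here, and it never reads the results back. If you want something you can paste and run right away, a minimal self-contained sketch might look like the following. The `NUMBERS` values are placeholders chosen for illustration, not the test cases from the full script, and returning a dict from `main()` is my own addition.

```python
import math
from threading import Thread

# Placeholder inputs for illustration; the full script uses its own test cases.
NUMBERS = [2, 3, 4, 97, 100, 7919]


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True


class IsPrimeWorker:
    def __init__(self, n: int):
        self.n = n
        self.result = None

    def run(self) -> None:
        self.result = is_prime(self.n)


def main() -> dict[int, bool]:
    workers = [IsPrimeWorker(n) for n in NUMBERS]
    threads = [Thread(target=worker.run) for worker in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # After join(), every worker has finished and its result is safe to read.
    return {worker.n: worker.result for worker in workers}


results = main()
print(results)
```

Each worker writes only to its own `result` attribute, so no lock is needed in this sketch, with or without the GIL.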
To run the script with both Python 3.14 (with GIL) and Python 3.14t (GIL-free), run the following commands:
# Run with GIL enabled
uv run -p 3.14 is_prime.py
# Run with GIL removed
uv run -p 3.14t is_prime.py
To benchmark the performance difference between the two versions, use:
uv run --python 3.14 benchmark.py
The benchmark script is available here if you'd like to try it yourself.
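If you are wondering how a CPU figure above 100% can come about, here is one way to measure it from inside the process. This is not the benchmark script itself, only a sketch under my own assumptions: `measure` and `SAMPLE_NUMBERS` are hypothetical names, wall time comes from `time.perf_counter()`, and CPU time comes from `time.process_time()`, which adds up user and system CPU time across all threads of the process.

```python
import math
import time
from threading import Thread


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True


def measure(numbers: list[int]) -> tuple[float, float, float]:
    """Return (wall seconds, CPU seconds, CPU percent) for a threaded run."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # process-wide, so it covers all threads

    threads = [Thread(target=is_prime, args=(n,)) for n in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    # More CPU seconds than wall seconds means several cores were busy at once.
    cpu_percent = 100.0 * cpu / wall if wall > 0 else 0.0
    return wall, cpu, cpu_percent


# Placeholder inputs for illustration: 2**31 - 1 is a Mersenne prime.
SAMPLE_NUMBERS = [2_147_483_647] * 8
wall, cpu, cpu_percent = measure(SAMPLE_NUMBERS)
print(f"wall={wall:.3f}s cpu={cpu:.3f}s cpu%={cpu_percent:.0f}")
```

With the GIL, CPU time cannot meaningfully exceed wall time for pure-Python work, so the percentage stays near 100. On a free-threaded build it can climb toward 100% times the number of busy cores.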
Here are the benchmark results from my MacBook:
Testing Python 3.14 (GIL)...
Run 1: 12.51s, 100%, 16.7MB
Run 2: 12.43s, 100%, 15.2MB
Run 3: 12.77s, 100%, 15.0MB
Testing Python 3.14t (No GIL)...
Run 1: 4.84s, 569%, 22.2MB
Run 2: 4.69s, 575%, 22.1MB
Run 3: 4.83s, 565%, 27.3MB
| Metric | Python 3.14 (GIL) | Python 3.14t (No GIL) | Difference |
|---|---|---|---|
| Wall Time | 12.57 s | 4.79 s | 2.63× faster |
| CPU Usage | 100 % | 570 % | 5.7× cores |
| Memory Usage | 15.6 MB | 23.9 MB | 1.53× overhead |
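The table values are the averages of the three runs listed above, and the last column is the ratio of those averages. A short sketch of that arithmetic, using only the numbers from the run logs (the variable and function names are my own):

```python
from statistics import mean

# (wall seconds, CPU percent, memory MB) for each run, copied from the logs.
gil_runs = [(12.51, 100, 16.7), (12.43, 100, 15.2), (12.77, 100, 15.0)]
nogil_runs = [(4.84, 569, 22.2), (4.69, 575, 22.1), (4.83, 565, 27.3)]


def averages(runs):
    """Average each metric (wall, CPU, memory) over all runs."""
    wall, cpu, mem = zip(*runs)
    return mean(wall), mean(cpu), mean(mem)


gil_wall, gil_cpu, gil_mem = averages(gil_runs)
nogil_wall, nogil_cpu, nogil_mem = averages(nogil_runs)

speedup = gil_wall / nogil_wall    # how many times faster the 3.14t run is
cores = nogil_cpu / gil_cpu        # CPU usage relative to one busy core
mem_ratio = nogil_mem / gil_mem    # memory overhead of the 3.14t run

print(f"Wall:   {gil_wall:.2f} s vs {nogil_wall:.2f} s ({speedup:.2f}x faster)")
print(f"CPU:    {gil_cpu:.0f} % vs {nogil_cpu:.0f} % ({cores:.1f}x)")
print(f"Memory: {gil_mem:.1f} MB vs {nogil_mem:.1f} MB ({mem_ratio:.2f}x)")
```

Note that the speedup is computed from the unrounded averages, which is why it comes out as 2.63 and not the 2.62 you would get by dividing the rounded table values.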
This comparison, fresh from my own local machine 🤗, reveals the following:
Things to watch out for with threading:

If your code already uses threads, try `python 3.14t` to speed it up without rewriting it.
If you are using a `ProcessPool`, consider switching to a `ThreadPool` with `python 3.14t` to leverage parallel threads without the overhead of multiprocessing.
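To make the pool switch concrete, here is a rough sketch using `concurrent.futures`. I am assuming that "ProcessPool" and "ThreadPool" map to `ProcessPoolExecutor` and `ThreadPoolExecutor`; the `check_all` helper and `SAMPLE_NUMBERS` are hypothetical names for this example. Because both executors share the same interface, the swap comes down to one argument.

```python
import math
from concurrent.futures import Executor, ThreadPoolExecutor


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True


def check_all(
    numbers: list[int],
    executor_cls: type[Executor] = ThreadPoolExecutor,
    max_workers: int = 4,
) -> dict[int, bool]:
    """Run is_prime over numbers with the given executor class.

    Passing ProcessPoolExecutor instead would give the multiprocessing
    behaviour; the rest of the code stays the same.
    """
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return dict(zip(numbers, pool.map(is_prime, numbers)))


# Placeholder inputs for illustration: 7919 is prime, 7921 is 89 squared.
SAMPLE_NUMBERS = [2, 15, 97, 7919, 7921]
results = check_all(SAMPLE_NUMBERS)
print(results)
```

On a standard build this threaded version still runs one thread at a time for CPU-bound work, so the benefit only shows up under `python 3.14t`.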