Concurrency in Python lets a program make progress on multiple tasks over the same period of time, which makes applications more responsive and efficient. Don’t confuse it with true parallel execution on multiple CPU cores: concurrency is an interleaving or overlapping of tasks. This guide explains how concurrency works in Python.
Responsiveness and performance have become major considerations in programming, so understanding concurrency is essential for developers who want to build efficient, scalable applications. In this article, you’ll learn what concurrency is, how it differs from parallelism, and how the multiprocessing module in Python helps you achieve true parallel execution using multiple CPU cores.
Concurrency in Python means executing multiple tasks in overlapping time periods so that the system stays busy instead of sitting idle. The tasks do not all need to run at the same instant. It can be achieved using the following techniques:
Threading allows you to run multiple threads (smaller units of a process) within a single process. Threads share memory space, making communication easy but requiring careful synchronization to avoid conflicts.
import threading
import time

def download_file(file_num):
    print(f"Downloading file {file_num}...")
    # Simulate a delay
    time.sleep(2)
    print(f"File {file_num} downloaded.")

threads = []
for i in range(3):
    t = threading.Thread(target=download_file, args=(i + 1,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print("All downloads completed.")
Note: Due to the Global Interpreter Lock (GIL), threads in Python cannot achieve true parallelism for CPU-heavy tasks.
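To see the effect of the GIL, the sketch below times the same CPU-bound loop run sequentially and then in threads. The function names and the workload size are illustrative assumptions, not part of any library. On a standard GIL-enabled CPython build the two timings are usually close, although exact numbers depend on the machine and Python version.

```python
import threading
import time


def count_down(n):
    """CPU-bound busy loop: pure Python arithmetic, no I/O."""
    while n > 0:
        n -= 1


def run_sequential(n, repeats):
    """Run the loop `repeats` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the loop in `repeats` threads at once; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary workload size chosen for illustration
    print(f"sequential: {run_sequential(n, 4):.2f}s")
    print(f"threaded:   {run_threaded(n, 4):.2f}s")
```

If the threaded run is not noticeably faster, that is the GIL at work: the threads take turns executing bytecode instead of running side by side.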
AsyncIO lets you write asynchronous code using the async and await keywords, and it is common in larger applications organized into Python packages. It is well suited to applications that need to handle many I/O operations without blocking the main thread.
import asyncio

async def fetch_data(task_id):
    print(f"Fetching data for task {task_id}...")
    await asyncio.sleep(2)
    print(f"Data fetched for task {task_id}.")

async def main():
    tasks = [fetch_data(i) for i in range(3)]
    await asyncio.gather(*tasks)

asyncio.run(main())
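Coroutines can also return values. The sketch below is a variation on the example above, with made-up delays and result strings, showing that asyncio.gather() hands back the results in the order the coroutines were passed in, even when they finish in a different order.

```python
import asyncio
import time


async def fetch_data(task_id, delay):
    """Stand-in for an I/O call; asyncio.sleep yields control to the event loop."""
    await asyncio.sleep(delay)
    return f"result-{task_id}"


async def main():
    start = time.perf_counter()
    # gather() runs the coroutines concurrently and returns results in argument order.
    results = await asyncio.gather(
        fetch_data(1, 0.3),
        fetch_data(2, 0.1),
        fetch_data(3, 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the waits overlap, the elapsed time should be close to the longest single delay rather than the sum of all three.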
Multiprocessing creates separate processes, each with its own Python interpreter and memory space. This allows Python to take advantage of multiple CPU cores for true parallelism.
import multiprocessing

def compute_square(n):
    print(f"Computing square of {n}")
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4]
    with multiprocessing.Pool() as pool:
        results = pool.map(compute_square, numbers)
    print("Squares:", results)
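Python’s Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode at once. The multiprocessing module works around this limitation by spawning multiple processes, each with its own Python interpreter and memory space. This means CPU-bound work can run on several cores at the same time, but processes do not share ordinary Python objects, so data has to be passed between them explicitly.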
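One consequence of separate memory spaces is that a change made in a child process is invisible to the parent. The following minimal sketch illustrates this; the counter variable and the bump function are hypothetical names chosen for the demonstration.

```python
import multiprocessing

counter = 0  # module-level state; each process works on its own copy


def bump():
    """Runs in the child process and changes only the child's copy of counter."""
    global counter
    counter += 100
    print("child sees counter =", counter)


if __name__ == "__main__":
    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    # The parent's counter is untouched because memory is not shared.
    print("parent sees counter =", counter)
```

With threads, the same code would update a single shared counter. With processes, results have to be sent back explicitly, for example through a queue or a pool's return values.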
‘Concurrency vs. parallelism’ is a common source of confusion. Although the terms sound similar, they describe distinct approaches to managing multiple tasks in Python, and each suits different scenarios. Understanding the difference is crucial for optimizing performance in Python applications. The table below summarizes it:
| Aspect | Concurrency | Parallelism |
| Definition | Multiple tasks make progress together | Multiple tasks run simultaneously |
| Goal | Efficiently manage multiple operations | Speed up execution through simultaneous processing |
| Example | Switching between tasks (like multitasking) | Running tasks on separate CPU cores |
Multiprocessing is a Python technique for achieving true parallelism by running multiple tasks in separate processes. It is ideal for CPU-bound work such as data processing, machine learning, and simulations, because it lets a program take full advantage of multi-core processors by executing code in parallel on different cores.
Developers can use it for:
Here’s a simple example demonstrating how multiprocessing works in Python:
import multiprocessing

def square_numbers():
    # The result is discarded; the loop exists only to generate CPU load.
    for i in range(1_000_000):
        i * i

if __name__ == "__main__":
    processes = []
    # Start one worker process per CPU core reported by the system.
    for _ in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=square_numbers)
        processes.append(p)
        p.start()

    # Wait for every worker to finish.
    for p in processes:
        p.join()

    print("Completed multiprocessing example.")
This script creates multiple processes, each running independently, which the operating system can schedule on different CPU cores. The join() method ensures that the main program waits for all processes to complete before exiting.
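The return value of a Process target is discarded, so a worker that computes something needs a channel to send it back. The sketch below is one possible approach using multiprocessing.Queue; the helper names sum_of_squares and worker and the input sizes are assumptions made for illustration.

```python
import multiprocessing


def sum_of_squares(n):
    """CPU-bound helper: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def worker(n, queue):
    # A Process target's return value is discarded, so send the result explicitly.
    queue.put((n, sum_of_squares(n)))


if __name__ == "__main__":
    queue = multiprocessing.Queue()
    inputs = [10, 100, 1000]
    processes = [
        multiprocessing.Process(target=worker, args=(n, queue)) for n in inputs
    ]
    for p in processes:
        p.start()
    # Read one result per worker before join() so no worker blocks on a full pipe.
    results = dict(queue.get() for _ in inputs)
    for p in processes:
        p.join()
    print(results)
```

Results arrive in completion order, which is why each one is tagged with its input before being put on the queue.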
When using concurrency in Python, you will encounter two types of tasks: CPU-bound and I/O-bound. Each calls for a different technique. The table below shows which to choose:
| Task Type | Examples | Best Technique |
| CPU-bound | Image processing, mathematical computations, ML training | Multiprocessing |
| I/O-bound | API calls, database queries, file operations, network requests | Threading or AsyncIO |
When multiple threads access shared data such as global variables in Python, you must ensure thread safety to avoid race conditions and data corruption.
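Common synchronization tools in the threading module include Lock, RLock, Semaphore, Event, and Condition. The example below uses a Lock: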
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(10000):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final counter value:", counter)
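A Lock admits one thread at a time. When a limited number of threads may proceed together, threading.Semaphore is an alternative. The sketch below is a hedged illustration: the limit of two and the sleep that stands in for real work are arbitrary assumptions.

```python
import threading
import time

MAX_CONCURRENT = 2  # hypothetical limit, e.g. connections a service allows
semaphore = threading.Semaphore(MAX_CONCURRENT)
state_lock = threading.Lock()  # protects the two counters below
active = 0
peak = 0


def limited_task():
    global active, peak
    with semaphore:  # at most MAX_CONCURRENT threads are inside this block at once
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # stand-in for real work
        with state_lock:
            active -= 1


threads = [threading.Thread(target=limited_task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent tasks:", peak)
```

The peak never exceeds MAX_CONCURRENT, even though six threads were started.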
The concurrent.futures module provides a simple way to run tasks asynchronously using either threads or processes.
from concurrent.futures import ThreadPoolExecutor

def greet(name):
    return f"Hello, {name}!"

with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(greet, ["Alice", "Bob", "Charlie"])
    for res in results:
        print(res)
You can also use ProcessPoolExecutor for CPU-bound tasks.
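As a rough sketch of what that might look like, the example below spreads a CPU-bound prime count across worker processes. The count_primes helper and the chosen limits are illustrative assumptions; only ProcessPoolExecutor and its map() method come from the standard library.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound helper: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [10_000, 20_000, 30_000]
    with ProcessPoolExecutor() as executor:
        # map() preserves input order, just like ThreadPoolExecutor.map().
        for limit, total in zip(limits, executor.map(count_primes, limits)):
            print(f"primes below {limit}: {total}")
```

The `if __name__ == "__main__":` guard matters here because worker processes may re-import the main module when they start.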
| Task Type | Best Approach | Why |
| CPU-bound | Multiprocessing | True parallelism, bypasses the GIL |
| I/O-bound | Threading/AsyncIO | Efficient for waiting on I/O |
| Many network tasks | AsyncIO | Scales well with lots of connections |
Understanding best practices is one of the most important parts of learning a technical concept. Keep the following best practices in mind when using concurrency in Python.
Beyond best practices, it also saves time to know which mistakes to avoid. Here are some of them:
Understanding concurrency in Python and how to use multiprocessing can dramatically improve your program’s efficiency. It also helps you choose the right concurrency model for your task type. With this knowledge, you can optimize CPU usage, reduce runtime, and build scalable systems, an important part of modern Python developer skills.
The Global Interpreter Lock (GIL) is a mechanism in CPython, the reference implementation of Python, that ensures only one thread can execute Python bytecode at any given time.
The GIL typically allows only one thread to execute Python bytecode at a time, which limits true parallelism for CPU-bound tasks.
The threading, multiprocessing, asyncio, concurrent.futures, and subprocess modules are the common approaches to achieving concurrency in Python.
Yes. The choice of concurrency approach affects performance: multiprocessing suits CPU-bound work, while threading or asyncio suits I/O-bound work.