Chapter 13: Concurrency and Parallelism
Concurrency and parallelism are essential concepts for improving the efficiency and performance of Python programs. They enable the execution of tasks in overlapping time periods or simultaneously, making better use of system resources.
Key Concepts
Concurrency:
Refers to tasks running independently but not necessarily simultaneously.
Suitable for I/O-bound operations (e.g., reading files, network requests).
Parallelism:
Involves tasks running simultaneously on multiple processors or cores.
Suitable for CPU-bound operations (e.g., mathematical computations).
Multithreading
Threads are lightweight units of execution that share the memory space of the process that created them. Python’s threading module allows concurrent execution of code using threads.
Creating Threads
import threading

def print_numbers():
    for i in range(5):
        print(f"Thread: {i}")

# Create and start threads
thread = threading.Thread(target=print_numbers)
thread.start()
# Wait for the thread to finish
thread.join()
print("Main program finished.")
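To see why threads suit I/O-bound work, the following minimal sketch times the same set of waits run one after another and then in separate threads. The `fake_io` helper is hypothetical: it uses `time.sleep` as a stand-in for a real I/O wait such as a network request, and the delay values are arbitrary.

```python
import threading
import time


def fake_io(delay):
    """Hypothetical stand-in for an I/O wait (e.g., a network request)."""
    time.sleep(delay)


def run_sequential(delays):
    """Run each wait one after another and return the elapsed time."""
    start = time.perf_counter()
    for delay in delays:
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(delays):
    """Run each wait in its own thread and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


delays = [0.2, 0.2, 0.2]
sequential = run_sequential(delays)
threaded = run_threaded(delays)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The sequential run takes roughly the sum of the delays, while the threaded run takes roughly the longest single delay, because the waits overlap.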
Thread Safety
Use locks to avoid race conditions when threads access shared resources.
import threading
lock = threading.Lock()

def safe_increment(counter):
    # counter is a one-element list, so every thread updates the same object
    with lock:
        counter[0] += 1
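As a rough illustration of the lock in use, the sketch below has four threads update one shared counter. It is a variant of the function above, not a replacement for it: the extra `times` parameter, the thread count, and the iteration count are assumptions chosen for the demonstration.

```python
import threading

lock = threading.Lock()


def safe_increment(counter, times):
    """Increment counter[0] `times` times, taking the lock for each update."""
    for _ in range(times):
        with lock:
            counter[0] += 1


counter = [0]  # one-element list so all threads share the same mutable object
threads = [
    threading.Thread(target=safe_increment, args=(counter, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])  # 40000
```

Because every read-modify-write of `counter[0]` happens while the lock is held, no update is lost and the final value equals the total number of increments.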
Multiprocessing
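The multiprocessing module creates separate processes, each with its own memory space and its own Python interpreter. In standard CPython builds, the global interpreter lock (GIL) lets only one thread of a process execute Python bytecode at a time, whereas separate processes can run on different cores simultaneously, which makes multiprocessing well suited to CPU-bound tasks.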
Creating Processes
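import multiprocessing

def compute_square(number):
    print(f"Square of {number}: {number ** 2}")

# The guard stops child processes from re-running this block when they
# import the module under the "spawn" start method (Windows, macOS)
if __name__ == "__main__":
    # Create and start the process
    process = multiprocessing.Process(target=compute_square, args=(5,))
    process.start()
    # Wait for the process to finish
    process.join()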
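Because each process has its own memory space, a change made in a child process is not visible in the parent. The sketch below illustrates this with a module-level variable. The names `counter`, `increment_in_child`, and `run_demo` are hypothetical, and a `multiprocessing.Queue` is used here only to report the child's value back to the parent.

```python
import multiprocessing

counter = 0  # module-level value; each process works on its own copy


def increment_in_child(queue):
    """Runs in the child process: change the child's copy and report it."""
    global counter
    counter += 1
    queue.put(counter)


def run_demo():
    """Start one child and return (value seen in child, value in parent)."""
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=increment_in_child, args=(queue,))
    child.start()
    child_value = queue.get()  # read the result before joining the child
    child.join()
    return child_value, counter  # counter here is the parent's copy


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print(f"child saw {child_value}, parent still has {parent_value}")
    # child saw 1, parent still has 0
```

To share state between processes, data has to be passed explicitly, for example through a queue as above, instead of relying on ordinary variables.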
Using Process Pool
The Pool class manages a group of worker processes and distributes tasks among them.
from multiprocessing import Pool

def square(num):
    return num ** 2

if __name__ == "__main__":
    # Create a pool of four worker processes
    with Pool(4) as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])
    print(results)  # Output: [1, 4, 9, 16, 25]
Async Programming
Asynchronous programming lets a single thread switch between tasks while they wait, so one slow operation does not block the rest of the program. The asyncio module provides the event loop and the async/await tools for asynchronous programming in Python.
Asyncio Basics
import asyncio

async def greet(name):
    await asyncio.sleep(1)
    print(f"Hello, {name}!")

async def main():
    await asyncio.gather(greet("Alice"), greet("Bob"))

asyncio.run(main())
# Output:
# Hello, Alice!
# Hello, Bob!
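The two greetings wait at the same time, not one after the other. The following sketch makes that visible by timing the call to `asyncio.gather`. It is a variant of the example above under stated assumptions: `greet` takes a hypothetical `delay` parameter and returns its message instead of printing it, and the 0.3-second delay is arbitrary.

```python
import asyncio
import time


async def greet(name, delay):
    """Wait for `delay` seconds without blocking, then return a greeting."""
    await asyncio.sleep(delay)
    return f"Hello, {name}!"


async def main():
    start = time.perf_counter()
    # gather runs both coroutines concurrently and keeps the argument order
    greetings = await asyncio.gather(greet("Alice", 0.3), greet("Bob", 0.3))
    elapsed = time.perf_counter() - start
    return greetings, elapsed


greetings, elapsed = asyncio.run(main())
print(greetings)  # ['Hello, Alice!', 'Hello, Bob!']
print(f"elapsed: {elapsed:.2f}s")
```

The elapsed time should be close to a single delay and not the sum of both, since each `await asyncio.sleep` hands control back to the event loop while it waits.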
Async with HTTP Requests
# aiohttp is a third-party package (install with: pip install aiohttp)
import aiohttp
import asyncio

async def fetch_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

async def main():
    url = "https://example.com"
    content = await fetch_url(url)
    print(content)

asyncio.run(main())
Choosing Between Concurrency and Parallelism
Scenario          | Recommendation
I/O-bound tasks   | Use multithreading or asyncio
CPU-bound tasks   | Use multiprocessing
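One way to keep this choice flexible is sketched below with the standard library's `concurrent.futures` module, which this chapter does not otherwise cover. Its thread and process executors share one interface, so switching strategy is a one-argument change. The helpers `io_task`, `cpu_task`, and `run_all` are hypothetical names for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time


def io_task(delay):
    """I/O-bound stand-in: mostly waiting."""
    time.sleep(delay)
    return delay


def cpu_task(n):
    """CPU-bound stand-in: pure computation."""
    return sum(i * i for i in range(n))


def run_all(executor_cls, func, inputs, workers=4):
    """Apply func to every input using the given executor class."""
    with executor_cls(max_workers=workers) as executor:
        return list(executor.map(func, inputs))


if __name__ == "__main__":
    # Threads for the I/O-bound task, processes for the CPU-bound task
    print(run_all(ThreadPoolExecutor, io_task, [0.1, 0.1, 0.1]))
    print(run_all(ProcessPoolExecutor, cpu_task, [10_000, 20_000]))
```

Since `executor.map` returns results in input order for both executor types, the calling code stays the same whichever row of the table applies.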
Exercises
Exercise 1:
Write a multithreaded program that prints numbers from 1 to 10 in two separate threads.
Solution:
import threading

def print_numbers(start, end):
    for i in range(start, end + 1):
        print(i)

thread1 = threading.Thread(target=print_numbers, args=(1, 5))
thread2 = threading.Thread(target=print_numbers, args=(6, 10))
thread1.start()
thread2.start()
thread1.join()
thread2.join()
Exercise 2:
Write a multiprocessing program to calculate the factorial of numbers using a process pool.
Solution:
from multiprocessing import Pool
import math

def factorial(n):
    return math.factorial(n)

if __name__ == "__main__":
    # Create a pool of three worker processes
    with Pool(3) as pool:
        results = pool.map(factorial, [1, 2, 3, 4, 5])
    print(results)  # Output: [1, 2, 6, 24, 120]
Exercise 3:
Write an asyncio program to fetch data from three different URLs concurrently.
Solution:
import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = ["https://example.com", "https://httpbin.org", "https://python.org"]
    # Share one session across all requests instead of opening one per URL
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    for content in results:
        print(content[:100])  # Print first 100 characters of each response

asyncio.run(main())
Best Practices
Use threading for I/O-bound tasks that involve waiting (e.g., file reads).
Use multiprocessing for CPU-intensive computations.
Use asyncio for non-blocking, high-performance I/O tasks.
Avoid race conditions by using synchronization tools like locks.
Test concurrent and parallel code for thread safety and correctness.
In the next chapter, we will explore testing and debugging, focusing on writing unit tests, mocking, and using debugging tools in Python.