Chapter 13: Concurrency and Parallelism

Concurrency and parallelism are two ways to improve the throughput and responsiveness of Python programs. Concurrency lets tasks make progress in overlapping time periods, while parallelism runs tasks at the same instant on separate processors or cores. Both make better use of system resources.

Key Concepts

Concurrency:
- Refers to tasks that make progress in overlapping time periods, but not necessarily at the same instant.
- Suitable for I/O-bound operations (e.g., reading files, network requests).
Parallelism:
- Involves tasks running simultaneously on multiple processors or cores.
- Suitable for CPU-bound operations (e.g., mathematical computations).
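
One way to picture concurrency without parallelism is a toy scheduler that interleaves two tasks on a single thread. The sketch below is only an illustration of the idea: the names task and run_round_robin are hypothetical, and this is not how the threading or asyncio modules are implemented.

```python
from collections import deque

def task(name, steps):
    """A toy task that yields control back to the scheduler after each step."""
    for step in range(1, steps + 1):
        yield f"{name} step {step}"

def run_round_robin(tasks):
    """Interleave tasks on one thread: concurrent, but never parallel."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))
            queue.append(current)  # not finished: go to the back of the line
        except StopIteration:
            pass  # task finished: drop it from the queue
    return log

log = run_round_robin([task("A", 2), task("B", 3)])
print(log)
# ['A step 1', 'B step 1', 'A step 2', 'B step 2', 'B step 3']
```

Both tasks are "in progress" over the same period, yet only one step runs at any moment. Parallelism would require the steps to execute on different cores at the same time.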

Multithreading

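Threads are lightweight units of execution that run inside a single process and share its memory space. Python’s threading module runs code concurrently using threads. In standard CPython builds, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads help most with I/O-bound work rather than CPU-bound work.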

Creating Threads

import threading

def print_numbers():
    for i in range(5):
        print(f"Thread: {i}")

# Create and start a thread
thread = threading.Thread(target=print_numbers)
thread.start()

# Wait for the thread to finish
thread.join()
print("Main program finished.")
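
Because threads share memory, a worker can hand results back to the main program by writing into a shared object. The following is a minimal sketch under that assumption; the results dictionary and the sum_range function are hypothetical names chosen for illustration.

```python
import threading

results = {}  # shared by every thread in this process

def sum_range(name, start, end):
    # Each thread writes to its own key, so no lock is needed in this sketch.
    results[name] = sum(range(start, end + 1))

threads = [
    threading.Thread(target=sum_range, args=("low", 1, 5)),
    threading.Thread(target=sum_range, args=("high", 6, 10)),
]

for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until both workers have stored their result

print(results["low"], results["high"])  # 15 40
```

The args tuple passes positional arguments to the target function. Writing to separate keys avoids conflicts here; when several threads update the same value, a lock is needed.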

Thread Safety

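A race condition occurs when several threads read and modify shared state at the same time, so the result depends on timing. Use a lock so that only one thread at a time can run the code that touches the shared resource.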

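import threading

lock = threading.Lock()

def safe_increment(counter):
    # counter is a one-element list, so the update is visible to all threads
    with lock:  # only one thread at a time may run this block
        counter[0] += 1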
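
To see the lock in use, the sketch below runs safe_increment from several threads at once. The worker function, the thread count, and the repeat count are assumptions made for this illustration.

```python
import threading

lock = threading.Lock()
counter = [0]  # one-element list shared by all threads

def safe_increment(counter):
    with lock:  # serialize the read-modify-write step
        counter[0] += 1

def worker(repeats):
    for _ in range(repeats):
        safe_increment(counter)

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])  # 40000
```

With the lock in place, the final count always equals 4 x 10,000. Without it, two threads could read the same old value and one of the increments could be lost.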

Multiprocessing

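The multiprocessing module creates separate processes, each with its own Python interpreter and independent memory space. Because the processes can run on different cores at the same time, it is well suited to CPU-bound tasks.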

Creating Processes

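import multiprocessing

def compute_square(number):
    print(f"Square of {number}: {number ** 2}")

# The guard is required on platforms that start child processes by
# re-importing this module (e.g., Windows and macOS).
if __name__ == "__main__":
    # Create and start a process, then wait for it to finish
    process = multiprocessing.Process(target=compute_square, args=(5,))
    process.start()
    process.join()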
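
Independent memory means that a change made in a child process is not visible in the parent. The sketch below illustrates this and uses multiprocessing.Queue to send a value back. The names counter, increment_and_report, and run_demo are hypothetical.

```python
import multiprocessing

counter = 0  # module-level state; every process works on its own copy

def increment_and_report(queue):
    global counter
    counter += 1        # changes only the child's copy
    queue.put(counter)  # send the child's value back to the parent

def run_demo():
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=increment_and_report, args=(queue,))
    process.start()
    child_value = queue.get()  # read the result before joining
    process.join()
    return child_value, counter

if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print(f"child saw {child_value}, parent still has {parent_value}")
    # child saw 1, parent still has 0
```

Data that must cross the process boundary has to be passed explicitly, for example through a queue, a pipe, or the return values of a process pool.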

Using Process Pool

The Pool class manages a fixed set of worker processes and distributes the items of an iterable across them.

from multiprocessing import Pool

def square(num):
    return num ** 2

if __name__ == "__main__":
    # Create a pool of 4 worker processes
    with Pool(4) as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])
    print(results)  # Output: [1, 4, 9, 16, 25]
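
pool.map passes a single argument to the worker function. When the function takes several arguments, Pool.starmap unpacks each tuple of the iterable into positional arguments. A minimal sketch, where the power function and the input pairs are made up for illustration:

```python
from multiprocessing import Pool

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    pairs = [(2, 3), (3, 2), (4, 2)]
    with Pool(2) as pool:
        # Each tuple is unpacked: power(2, 3), power(3, 2), power(4, 2)
        results = pool.starmap(power, pairs)
    print(results)  # [8, 9, 16]
```

As with map, the results come back in the same order as the inputs, regardless of which worker finished first.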

Async Programming

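Asynchronous programming lets a single thread switch to another task whenever the current one is waiting (for example, on a network response) instead of blocking the whole program. Python's asyncio module provides the event loop that runs coroutines defined with async def.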

Asyncio Basics

import asyncio

async def greet(name):
    await asyncio.sleep(1)
    print(f"Hello, {name}!")

async def main():
    await asyncio.gather(greet("Alice"), greet("Bob"))

asyncio.run(main())
# Output (both lines appear after about one second, because the two calls wait concurrently):
# Hello, Alice!
# Hello, Bob!
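
The benefit of asyncio.gather shows up when the same waits are timed both ways. The sketch below compares awaiting three coroutines one after another with awaiting them together. The wait_briefly helper and the 0.2-second delays are assumptions for this illustration, and exact timings will vary by machine.

```python
import asyncio
import time

async def wait_briefly(delay):
    await asyncio.sleep(delay)  # stands in for an I/O wait
    return delay

async def main():
    delays = [0.2, 0.2, 0.2]

    # Sequential: each await finishes before the next one starts
    start = time.perf_counter()
    for delay in delays:
        await wait_briefly(delay)
    sequential = time.perf_counter() - start

    # Concurrent: all three waits overlap
    start = time.perf_counter()
    await asyncio.gather(*(wait_briefly(d) for d in delays))
    concurrent = time.perf_counter() - start

    return sequential, concurrent

sequential, concurrent = asyncio.run(main())
print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The sequential version should take roughly the sum of the delays, while the gathered version should take roughly as long as the longest single delay.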

Async with HTTP Requests

# aiohttp is a third-party package; install it with: pip install aiohttp
import aiohttp
import asyncio

async def fetch_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

async def main():
    url = "https://example.com"
    content = await fetch_url(url)
    print(content)

asyncio.run(main())
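
Network calls can hang, so it is common to bound how long a request may take. The sketch below uses asyncio.wait_for for this. To stay runnable without network access it replaces the HTTP call with a simulated fake_fetch coroutine; that function, fetch_with_timeout, the labels "fast" and "slow", and the delays are all hypothetical.

```python
import asyncio

async def fake_fetch(label, delay):
    # Simulated request: sleeps instead of contacting a server
    await asyncio.sleep(delay)
    return f"response from {label}"

async def fetch_with_timeout(label, delay, timeout):
    try:
        return await asyncio.wait_for(fake_fetch(label, delay), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # give up on requests that take too long

async def main():
    return await asyncio.gather(
        fetch_with_timeout("fast", delay=0.05, timeout=0.5),
        fetch_with_timeout("slow", delay=1.0, timeout=0.2),
    )

results = asyncio.run(main())
print(results)  # ['response from fast', None]
```

wait_for cancels the inner coroutine when the timeout expires, so the slow request does not keep running in the background.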

Choosing Between Concurrency and Parallelism

| Scenario | Recommendation |
| --- | --- |
| I/O-bound tasks | Use multithreading or asyncio |
| CPU-bound tasks | Use multiprocessing |
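
The standard library's concurrent.futures module offers thread and process pools behind the same interface, which makes it easy to apply this choice in code. The following is a sketch under assumed workloads: io_task, cpu_task, and run_all are hypothetical names, and time.sleep stands in for real I/O.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def io_task(n):
    time.sleep(0.1)  # stands in for waiting on a file or network
    return n

def cpu_task(n):
    return sum(i * i for i in range(n))  # pure computation

def run_all(executor_class, func, inputs):
    # Both executor classes support the same map() call
    with executor_class(max_workers=4) as executor:
        return list(executor.map(func, inputs))

if __name__ == "__main__":
    # I/O-bound work: threads are enough, since they mostly wait
    print(run_all(ThreadPoolExecutor, io_task, [1, 2, 3]))
    # CPU-bound work: processes can use several cores
    print(run_all(ProcessPoolExecutor, cpu_task, [10, 100, 1000]))
```

Switching between the two strategies only requires changing the executor class, so it is practical to measure both on a real workload before committing to one.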

Exercises

Exercise 1:

Write a multithreaded program that prints the numbers from 1 to 10, with the range split across two separate threads.

Solution:

import threading

def print_numbers(start, end):
    for i in range(start, end + 1):
        print(i)

thread1 = threading.Thread(target=print_numbers, args=(1, 5))
thread2 = threading.Thread(target=print_numbers, args=(6, 10))

thread1.start()
thread2.start()

thread1.join()
thread2.join()

Exercise 2:

Write a multiprocessing program to calculate the factorial of numbers using a process pool.

Solution:

from multiprocessing import Pool
import math

def factorial(n):
    return math.factorial(n)

if __name__ == "__main__":
    # Distribute the inputs across 3 worker processes
    with Pool(3) as pool:
        results = pool.map(factorial, [1, 2, 3, 4, 5])
    print(results)  # Output: [1, 2, 6, 24, 120]

Exercise 3:

Write an asyncio program to fetch data from three different URLs concurrently.

Solution:

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = ["https://example.com", "https://httpbin.org", "https://python.org"]
    # Share one session across all requests so connections can be reused
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    for content in results:
        print(content[:100])  # Print first 100 characters of each response

asyncio.run(main())

Best Practices

- Use threading for I/O-bound tasks that spend most of their time waiting (e.g., file reads).
- Use multiprocessing for CPU-intensive computations.
- Use asyncio for non-blocking I/O, especially when many operations are in flight at once (e.g., network requests).
- Avoid race conditions by using synchronization tools like locks.
- Test concurrent and parallel code for thread safety and correctness.

In the next chapter, we will explore testing and debugging, focusing on writing unit tests, mocking, and using debugging tools in Python.

