Chapter 13: Concurrency and Parallelism

Concurrency and parallelism are two ways to improve the efficiency and performance of Python programs. Concurrency lets tasks make progress in overlapping time periods, while parallelism runs tasks at the same instant. Both make better use of system resources.

Key Concepts

  1. Concurrency:

    • Refers to tasks making progress in overlapping time periods, not necessarily at the same instant.

    • Suitable for I/O-bound operations (e.g., reading files, network requests).

  2. Parallelism:

    • Involves tasks running simultaneously on multiple processors or cores.

    • Suitable for CPU-bound operations (e.g., mathematical computations).

Multithreading

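Threads are lightweight units of execution that run inside a single process and share its memory space. Python’s threading module allows concurrent execution of code using threads. In standard CPython builds, the global interpreter lock (GIL) allows only one thread to execute Python bytecode at a time, so threads mainly speed up I/O-bound work rather than CPU-bound work.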

Creating Threads

import threading

def print_numbers():
    for i in range(5):
        print(f"Thread: {i}")

# Create and start a thread
thread = threading.Thread(target=print_numbers)
thread.start()

# Wait for the thread to finish
thread.join()
print("Main program finished.")

Thread Safety

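A race condition occurs when two or more threads read and modify shared data at the same time, so the result depends on timing. Use a lock (threading.Lock) so that only one thread at a time can run the code that touches the shared resource.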
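As a minimal sketch of the idea, the example below has four threads update one shared counter. The names counter, counter_lock, and increment are illustrative and not taken from the original chapter.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    """Add 1 to the shared counter `times` times, taking the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:  # only one thread at a time may run this block
            counter += 1

# Start four threads that all update the same counter
threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

The with statement releases the lock even if the body raises an exception, which is why it is usually preferred over calling acquire() and release() by hand.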

Multiprocessing

The multiprocessing module creates separate processes with independent memory spaces, making it ideal for CPU-bound tasks.

Creating Processes
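The following is a minimal sketch of starting processes by hand, not necessarily the example the chapter originally showed here. Because child processes do not share the parent's memory, it sends results back through a multiprocessing.Queue. The names square, worker, and result_queue are illustrative.

```python
import multiprocessing

def square(n):
    return n * n

def worker(n, result_queue):
    """Runs in a child process: compute a value and send it back to the parent."""
    result_queue.put((n, square(n)))

if __name__ == "__main__":
    result_queue = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=worker, args=(n, result_queue))
        for n in range(3)
    ]
    for p in processes:
        p.start()

    # Collect one result per process before joining
    results = dict(result_queue.get() for _ in processes)

    for p in processes:
        p.join()
    print(sorted(results.items()))
```

The if __name__ == "__main__": guard matters on platforms that start child processes by re-importing the main module. Without it, each child would try to start children of its own.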

Using Process Pool

The Pool class manages multiple processes efficiently.
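A short sketch of a pool in use, assuming a simple square function as the workload (the function and the pool size of 4 are illustrative choices):

```python
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    # The pool distributes the inputs across 4 worker processes
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

pool.map returns the results in the same order as the inputs, regardless of which worker finished first.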

Async Programming

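Asynchronous programming lets a program start a task and move on to other work while that task waits, instead of blocking. In Python, the asyncio module provides this: coroutines defined with async def run on an event loop and hand control back to it at each await.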

Asyncio Basics
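A minimal sketch of the basic building blocks: a coroutine, await, asyncio.gather, and asyncio.run. The greet coroutine and its delays are made up for illustration.

```python
import asyncio

async def greet(name, delay):
    # await hands control back to the event loop while this coroutine sleeps
    await asyncio.sleep(delay)
    return f"Hello, {name}!"

async def main():
    # gather runs both coroutines concurrently and returns
    # their results in the order the coroutines were passed in
    return await asyncio.gather(greet("Alice", 0.2), greet("Bob", 0.1))

results = asyncio.run(main())
print(results)  # ['Hello, Alice!', 'Hello, Bob!']
```

Although Bob's coroutine finishes first, the result list follows the argument order, and the whole run takes about 0.2 seconds rather than 0.3.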

Async with HTTP Requests
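asyncio does not include an HTTP client, and third-party libraries such as aiohttp are a common choice for this. The sketch below stays within the standard library instead: it runs the blocking urllib.request.urlopen call in worker threads through asyncio.to_thread (Python 3.9+). This is a substitute technique, not necessarily what the chapter originally showed. The helper names fetch_status and fetch_all are assumptions.

```python
import asyncio
import urllib.request

def fetch_status(url):
    """Blocking helper: return (url, HTTP status code)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return url, response.status

async def fetch_all(urls, fetch=fetch_status):
    """Run the blocking `fetch` once per URL in worker threads, concurrently."""
    tasks = [asyncio.to_thread(fetch, url) for url in urls]
    return await asyncio.gather(*tasks)

# Example call (performs real network requests, so it is left commented out):
# results = asyncio.run(fetch_all(["https://example.com", "https://www.python.org"]))
# print(results)
```

The fetch parameter is there so a stub can be swapped in for testing without touching the network.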

Choosing Between Concurrency and Parallelism

Scenario           Recommendation
I/O-bound tasks    Use multithreading or asyncio
CPU-bound tasks    Use multiprocessing

Exercises

Exercise 1:

Write a multithreaded program that prints numbers from 1 to 10 in two separate threads.

Solution:
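One possible solution, sketched with an illustrative label argument so the output shows which thread printed each line:

```python
import threading

def print_numbers(label):
    for i in range(1, 11):
        print(f"{label}: {i}")

# Two threads, each printing 1 to 10
threads = [
    threading.Thread(target=print_numbers, args=(f"Thread-{n}",))
    for n in (1, 2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lines from the two threads may interleave differently on each run, because the scheduler decides when each thread gets to run.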

Exercise 2:

Write a multiprocessing program to calculate the factorial of numbers using a process pool.

Solution:
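One possible solution, using math.factorial for the computation and an arbitrary list of inputs chosen for illustration:

```python
import math
from multiprocessing import Pool

def factorial(n):
    return math.factorial(n)

if __name__ == "__main__":
    numbers = [5, 10, 15]
    # With no argument, Pool() sizes itself from the available CPU count
    with Pool() as pool:
        results = pool.map(factorial, numbers)
    for n, value in zip(numbers, results):
        print(f"{n}! = {value}")
```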

Exercise 3:

Write an asyncio program to fetch data from three different URLs concurrently.

Solution:
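One possible solution is sketched below. To keep it runnable without network access or third-party packages, it simulates each download with asyncio.sleep. The URLs and delays are hypothetical placeholders, and a real program would replace the body of fetch with an actual HTTP request.

```python
import asyncio

async def fetch(url, delay):
    """Simulated download: wait `delay` seconds instead of making a real request."""
    await asyncio.sleep(delay)
    return f"data from {url}"

async def main():
    # Hypothetical URLs and delays, chosen only for illustration
    jobs = [
        fetch("https://example.com/a", 0.3),
        fetch("https://example.com/b", 0.1),
        fetch("https://example.com/c", 0.2),
    ]
    # All three "downloads" wait at the same time
    return await asyncio.gather(*jobs)

results = asyncio.run(main())
for line in results:
    print(line)
```

Because the three waits overlap, the run takes roughly as long as the slowest job (about 0.3 seconds) rather than the sum of all three.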

Best Practices

  1. Use threading for I/O-bound tasks that involve waiting (e.g., file reads).

  2. Use multiprocessing for CPU-intensive computations.

  3. Use asyncio for large numbers of concurrent, non-blocking I/O tasks (e.g., many network connections).

  4. Avoid race conditions by using synchronization tools like locks.

  5. Test concurrent and parallel code for thread safety and correctness.

In the next chapter, we will explore testing and debugging, focusing on writing unit tests, mocking, and using debugging tools in Python.
