I’ve been fairly interested in Python and multithreaded programming recently, so I asked myself, “what better way to learn it than to try and teach it?” So here it is, a simplified explanation of Python threads: what they are, how to create one, their advantages, and ways to communicate between them. This will be a kind of spin-off of the “A basic introduction” series about programming languages and tools I’ve been creating. If you want to learn the basics of Python, read this article.
What Are Threads?
Think of a program as a beehive and its threads as the bees. The important parts are run by the queen, or in our case the main thread, while minor, background, or side tasks can run in parallel, handled by a tiny worker bee or a group of them.
This is basically the idea behind threads: a simple way to parallelize work by breaking a big task into multiple independent ones and sharing those between “worker units” that run them, so the program wastes less time waiting and, where the runtime allows it, makes better use of multi-core processors.
Why Use Threads?
This is a question some people have when they first learn about this topic. Why use multiple threads? Can’t I run it all in the main part of a program?
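The answer is yes, but sometimes you shouldn’t. For example, say you have to iterate through an array with just 100 elements. Any modern programming language can do that faster than the blink of an eye. Scale that to 100 billion, however, and it will take a very long time, even if written in Rust. If you split those 100 billion elements between 12 threads on a 12-core machine, the iteration can run up to about 12 times faster in a language whose threads truly run in parallel. One caveat for Python: the standard CPython interpreter has a global interpreter lock (GIL) that lets only one thread execute Python bytecode at a time, so pure-Python number crunching sees little or no speedup from threads.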
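To make the "split the work" idea concrete, here is a minimal sketch. The helper name chunked_sum and its num_threads parameter are made up for illustration. Each thread sums one slice of the list and the main thread adds up the partial results. Keep in mind that in standard CPython the GIL stops pure-Python arithmetic like this from running truly in parallel, so this shows the splitting pattern rather than a guaranteed speedup.

```python
import threading


def chunked_sum(values, num_threads=4):
    """Sum `values` by handing each thread one contiguous slice (hypothetical helper)."""
    chunk_size = (len(values) + num_threads - 1) // num_threads  # ceiling division
    partial_sums = [0] * num_threads  # one slot per thread, so no two threads share an index

    def worker(index):
        start = index * chunk_size
        partial_sums[index] = sum(values[start:start + chunk_size])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every slice before combining
    return sum(partial_sums)


numbers = list(range(1_000))
print(chunked_sum(numbers))  # 499500
```

Giving every thread its own slot in partial_sums avoids having several threads update one shared total at the same time.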
Similarly, some tasks like I/O (reading from and writing to disk, for example) block a thread at particular points. While one thread waits, the others can keep working, so spawning more of them usually gets you more operations per second. This is where threads help the most in Python.
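Here is a rough sketch of that effect. It assumes time.sleep is a fair stand-in for a blocking I/O call, and the function names (fake_download, run_sequential, run_threaded) are invented for this example. Running four "downloads" one after another takes roughly four times as long as running them in four threads, because a sleeping thread doesn't stop the others.

```python
import threading
import time


def fake_download(seconds):
    """Stand-in for a blocking I/O call such as a disk read or a network request."""
    time.sleep(seconds)


def run_sequential(count, seconds):
    """Run the fake downloads one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(count):
        fake_download(seconds)
    return time.perf_counter() - start


def run_threaded(count, seconds):
    """Run each fake download in its own thread and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(seconds,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


print(f"sequential: {run_sequential(4, 0.2):.2f}s")
print(f"threaded:   {run_threaded(4, 0.2):.2f}s")
```

The exact timings depend on your machine, but the threaded run should land close to the duration of a single download.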
How to Use Threads in Python
In Python, we use the standard library module threading to work with threads. It ships with the default interpreter, so there’s no need to install anything else. Here’s how it works:
Step 1: Import the threading Library
To get started, import the threading module:
import threading
Step 2: Create a Thread
Now you can create a thread by defining a function and passing it to the Thread class:
def my_function():
    # Code to be executed by the thread goes here
    pass

my_thread = threading.Thread(target=my_function)
Step 3: Start the Thread
Once you’ve created a thread, you need to start it using the start() method:
my_thread.start()
This initiates the execution of my_function in a separate thread.
Step 4: Wait for the Thread to Finish (Optional)
If you want to wait for the thread to finish before proceeding in the main body of your program, you can use the join() method:
my_thread.join()
This ensures that your main program waits for my_thread to complete its work.
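Most real functions take arguments, so here is a small sketch of steps 2 to 4 with parameters. The Thread class accepts args (a tuple) and kwargs (a dict) for that. The greet function and the results list are made up for this example.

```python
import threading


def greet(name, results, punctuation="!"):
    """Hypothetical worker: builds a greeting and stores it in a shared list."""
    results.append(f"Hello, {name}{punctuation}")


results = []

# args is a tuple of positional arguments, kwargs a dict of keyword arguments
worker = threading.Thread(
    target=greet,
    args=("bees", results),
    kwargs={"punctuation": "?"},
)
worker.start()
worker.join()  # block until greet() has returned

print(results)            # ['Hello, bees?']
print(worker.is_alive())  # False, because join() already returned
```

Note the trailing comma if you pass a single positional argument: args=("bees",) is a tuple, while args=("bees") is just a string.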
Step 5: Putting It All Together
Here’s a complete example of using threads in Python:
import threading

def print_numbers():
    for i in range(1, 6):
        print(f"Number: {i}")

def print_letters():
    for letter in 'abcde':
        print(f"Letter: {letter}")

# Create two threads
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_letters)

# Start the threads
thread1.start()
thread2.start()

# Wait for both threads to finish
thread1.join()
thread2.join()

print("Both threads have finished.")
In this example, print_numbers and print_letters are executed concurrently in separate threads, so the order in which their lines interleave can vary from run to run.
Communication Between Threads
You’ll need a way to communicate between threads if you hope to accomplish any slightly complex task. In Python, you can use various tools to help threads share information and work together effectively. Let’s explore two common ways to do this:
1. Shared Data (Simple Variables)
Imagine you have two threads, Thread A and Thread B, and you want them to cooperate by sharing some information.
Example: Suppose both threads need to keep track of a shared counter.
- Create a variable (e.g., shared_counter) that both threads can access.
- When Thread A or Thread B wants to update the counter, they can simply modify this shared variable.
import threading

shared_counter = 0  # This is the shared variable

def thread_a_function():
    global shared_counter
    shared_counter += 2  # Thread A increments the counter

def thread_b_function():
    global shared_counter
    shared_counter -= 1  # Thread B decrements the counter

# Create the threads
thread_a = threading.Thread(target=thread_a_function)
thread_b = threading.Thread(target=thread_b_function)

# Start the threads
thread_a.start()
thread_b.start()

# Wait for both threads to finish
thread_a.join()
thread_b.join()

print(f"Shared counter value: {shared_counter}")
In this example, both Thread A and Thread B can access and modify the shared_counter variable. Be careful, though, as shared data can lead to race conditions or unexpected behavior if not synchronized properly.
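One common way to synchronize access is threading.Lock. The sketch below is only an illustration, and the SafeCounter class is a made-up name. It wraps the counter so that only one thread at a time can run the read-modify-write step, which keeps updates from getting lost when many threads increment at once.

```python
import threading


class SafeCounter:
    """Hypothetical counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time can be inside this block
        with self._lock:
            self.value += 1


def add_many(counter, times):
    for _ in range(times):
        counter.increment()


counter = SafeCounter()
threads = [threading.Thread(target=add_many, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Using the lock in a with statement guarantees it is released even if the code inside raises an exception.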
2. Queues (Safe Messaging)
Usually, you want threads to communicate by passing messages safely, without having to manage locks yourself. This is where queues come into play.
Example: Suppose you have a producer thread that generates data and a consumer thread that processes that data.
Here’s how you can use a queue to facilitate communication:
- Import the queue module from Python’s standard library.
- Create a queue object that both threads can access.
- The producer thread puts messages (data) into the queue, and the consumer thread retrieves and processes them.
import threading
import queue

# Create a queue
data_queue = queue.Queue()

def producer_thread():
    for i in range(1, 6):
        data = f"Data {i}"
        data_queue.put(data)  # Put data into the queue

def consumer_thread():
    while True:
        data = data_queue.get()  # Get data from the queue
        if data is None:
            break  # Exit when there's no more data
        print(f"Processing: {data}")

# Create the threads
producer = threading.Thread(target=producer_thread)
consumer = threading.Thread(target=consumer_thread)

# Start the threads
producer.start()
consumer.start()

# Wait for the producer to finish
producer.join()

# Signal the consumer to exit when there's no more data
data_queue.put(None)
consumer.join()

print("Both threads have finished.")
In this example, the producer thread puts data into the data_queue, and the consumer thread retrieves and processes it safely. Using queues helps prevent issues like data conflicts and ensures smooth communication between threads.
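The same pattern stretches to several consumers. The sketch below is one possible way to do it, with made-up names (consume, run_pipeline) and a made-up task of doubling numbers. The one thing to watch is that every consumer needs its own None marker, otherwise some of them would wait on the queue forever.

```python
import queue
import threading

SENTINEL = None  # same "no more data" marker as in the example above


def consume(task_queue, results, results_lock):
    """Hypothetical consumer: doubles each number until it sees the sentinel."""
    while True:
        item = task_queue.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        with results_lock:
            results.append(item * 2)


def run_pipeline(items, num_consumers=2):
    task_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()
    consumers = [
        threading.Thread(target=consume, args=(task_queue, results, results_lock))
        for _ in range(num_consumers)
    ]
    for c in consumers:
        c.start()
    for item in items:
        task_queue.put(item)
    for _ in consumers:
        task_queue.put(SENTINEL)  # one sentinel per consumer so each one exits
    for c in consumers:
        c.join()
    return sorted(results)  # sorted because completion order is not fixed


print(run_pipeline([1, 2, 3, 4, 5]))  # [2, 4, 6, 8, 10]
```

Because None is the stop marker here, the items themselves must never be None.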
Remember, communication between threads should be well-coordinated to avoid problems like race conditions or deadlocks. These simple techniques can help you make your threads work together efficiently.