Python Semaphore

Summary: in this tutorial, you will learn how to use Python Semaphore to control the number of threads that can access a shared resource simultaneously.

Introduction to the Python Semaphore

A Python semaphore is a synchronization primitive that allows you to control access to a shared resource. Basically, a semaphore is a counter associated with a lock that limits the number of threads that can access a shared resource simultaneously.

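By limiting how many threads can enter a section of code at once, a semaphore helps prevent thread synchronization issues like race conditions, where multiple threads attempt to access the resource at the same time and interfere with each other’s operations. Note that a semaphore with a count greater than one still admits several threads at once, so any data those threads modify inside that section needs its own protection, such as a lock.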

A semaphore maintains a count. When a thread wants to access the shared resource, the semaphore checks the count.

If the count is greater than zero, the semaphore decrements the count and lets the thread access the resource. If the count is zero, the semaphore blocks the thread until the count becomes greater than zero.

A semaphore has two main operations:

  • Acquire: the acquire operation checks the count and decrements it if it is greater than zero. If the count is zero, the semaphore blocks the thread until another thread releases the semaphore.
  • Release: the release operation increments the count, which allows a waiting thread to acquire the semaphore.
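The two operations can be observed directly. The following is a minimal sketch, assuming a semaphore with an initial count of 2 chosen only for illustration. It calls acquire() with blocking=False, so that a failed acquire returns False instead of making the thread wait:

```python
import threading

# Assumed initial count of 2, chosen only for illustration
semaphore = threading.Semaphore(2)

first = semaphore.acquire(blocking=False)   # count 2 -> 1, succeeds
second = semaphore.acquire(blocking=False)  # count 1 -> 0, succeeds
third = semaphore.acquire(blocking=False)   # count is 0, fails without waiting

print(first, second, third)  # True True False

semaphore.release()                         # count 0 -> 1
fourth = semaphore.acquire(blocking=False)  # succeeds again

print(fourth)  # True
```

With the default blocking=True, the third call would wait until another thread called release().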

Using a Python semaphore

To use a semaphore, follow these steps:

First, import the threading module:

import threading

Second, create a Semaphore object and specify the number of threads that can acquire it at the same time:

semaphore = threading.Semaphore(3)

In this example, we create a Semaphore object that only allows up to three threads to acquire it at the same time.

Third, acquire a semaphore from a thread by calling the acquire() method:

semaphore.acquire()

If the semaphore count is zero, the thread will wait until another thread releases the semaphore. Once the thread has acquired the semaphore, it can execute the critical section of code.

Finally, release a semaphore after running the critical section of code by calling the release() method:

semaphore.release()

To ensure a semaphore is properly acquired and released, even if an exception occurs while running the critical section of code, you can use the with statement:

with semaphore:
    # Code within this block has acquired the semaphore

    # Perform operations on the shared resource
    # ...
    
# The semaphore is released outside the with block

The with statement acquires and releases the semaphore automatically, making your code less error-prone.
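To see what the with statement saves you from writing, the sketch below compares it with a manual acquire() and release() wrapped in try...finally. The risky_work() function is a hypothetical critical section that always raises an exception. The final loop counts the free slots to check that both forms released the semaphore:

```python
import threading

semaphore = threading.Semaphore(3)
results = []

def risky_work():
    # Hypothetical critical section that always fails
    raise ValueError("something went wrong")

# Manual form: release() must sit in a finally block,
# otherwise the exception would leave the slot taken forever
semaphore.acquire()
try:
    risky_work()
except ValueError as error:
    results.append(f"manual: {error}")
finally:
    semaphore.release()

# Equivalent form: the with statement releases on exit,
# whether the block finishes normally or raises
try:
    with semaphore:
        risky_work()
except ValueError as error:
    results.append(f"with: {error}")

# Count the free slots by acquiring without blocking
free_slots = 0
while semaphore.acquire(blocking=False):
    free_slots += 1

print(free_slots)  # 3
```

All three slots are still available after both failures, so neither form leaked the semaphore.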

Python semaphore example

The following example illustrates how to use a semaphore to limit the maximum number of concurrent downloads to three using multithreading in Python:

import threading
import urllib.request

MAX_CONCURRENT_DOWNLOADS = 3
semaphore = threading.Semaphore(MAX_CONCURRENT_DOWNLOADS)

def download(url):
    with semaphore:
        print(f"Downloading {url}...")
        
        response = urllib.request.urlopen(url)
        data = response.read()
        
        print(f"Finished downloading {url}")

        return data

        

def main():
    # URLs to download
    urls = [
        'https://www.ietf.org/rfc/rfc791.txt',
        'https://www.ietf.org/rfc/rfc792.txt',
        'https://www.ietf.org/rfc/rfc793.txt',
        'https://www.ietf.org/rfc/rfc794.txt',
        'https://www.ietf.org/rfc/rfc795.txt',
    ]

    # Create threads for each download
    threads = []
    for url in urls:
        thread = threading.Thread(target=download, args=(url,))
        threads.append(thread)
        thread.start()

    # Wait for all threads to complete
    for thread in threads:
        thread.join()


if __name__ == '__main__':
    main()

Output:

Downloading https://www.ietf.org/rfc/rfc791.txt...
Downloading https://www.ietf.org/rfc/rfc792.txt...
Downloading https://www.ietf.org/rfc/rfc793.txt...
Finished downloading https://www.ietf.org/rfc/rfc792.txt
Downloading https://www.ietf.org/rfc/rfc794.txt...
Finished downloading https://www.ietf.org/rfc/rfc791.txt
Downloading https://www.ietf.org/rfc/rfc795.txt...
Finished downloading https://www.ietf.org/rfc/rfc793.txt
Finished downloading https://www.ietf.org/rfc/rfc794.txt
Finished downloading https://www.ietf.org/rfc/rfc795.txt

The output shows that at most three threads download at the same time:

Downloading https://www.ietf.org/rfc/rfc791.txt...
Downloading https://www.ietf.org/rfc/rfc792.txt...
Downloading https://www.ietf.org/rfc/rfc793.txt...

Once the number of threads reaches three, the next thread needs to wait for the semaphore to be released by another thread.

For example, the following shows that thread #2 (downloading rfc792.txt) completed and released the semaphore, and the next thread then started downloading the URL https://www.ietf.org/rfc/rfc794.txt:

Finished downloading https://www.ietf.org/rfc/rfc792.txt
Downloading https://www.ietf.org/rfc/rfc794.txt...
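Because the order of the output depends on network timing, you may prefer to check the limit without downloading anything. The following is a minimal sketch under stated assumptions: time.sleep() stands in for the download, and the stats dictionary and stats_lock are hypothetical helpers that record how many workers are inside the semaphore at once:

```python
import threading
import time

MAX_CONCURRENT = 3
semaphore = threading.Semaphore(MAX_CONCURRENT)

# Hypothetical bookkeeping, guarded by its own lock
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()

def worker():
    with semaphore:
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])

        time.sleep(0.1)  # stands in for the download

        with stats_lock:
            stats["active"] -= 1

# Start five workers, more than the semaphore admits at once
threads = [threading.Thread(target=worker) for _ in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Peak concurrent workers: {stats['peak']}")
```

The reported peak never exceeds MAX_CONCURRENT, even though five threads were started.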

How the program works.

First, import the threading and urllib.request modules:

import threading
import urllib.request

Second, create a Semaphore object that limits the number of threads that can download at the same time to three:

MAX_CONCURRENT_DOWNLOADS = 3
semaphore = threading.Semaphore(MAX_CONCURRENT_DOWNLOADS)

Third, define the download() function that downloads from a URL. The download function acquires and releases the semaphore using the with statement. It also uses the urllib.request module to download data from a URL:

def download(url):
    with semaphore:
        print(f"Downloading {url}...")
        
        response = urllib.request.urlopen(url)
        data = response.read()
        
        print(f"Finished downloading {url}")

        return data

Fourth, define the main() function that creates five threads based on a URL list and starts them to download data:

def main():
    # URLs to download
    urls = [
        'https://www.ietf.org/rfc/rfc791.txt',
        'https://www.ietf.org/rfc/rfc792.txt',
        'https://www.ietf.org/rfc/rfc793.txt',
        'https://www.ietf.org/rfc/rfc794.txt',
        'https://www.ietf.org/rfc/rfc795.txt',
    ]

    # Create threads for each download
    threads = []
    for url in urls:
        thread = threading.Thread(target=download, args=(url,))
        threads.append(thread)
        thread.start()

    # Wait for all threads to complete
    for thread in threads:
        thread.join()

Finally, call the main() function in the if __name__ == '__main__' block:

if __name__ == '__main__':
    main()
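Note that a Thread object discards the return value of its target function, so the data returned by download() is not available to main() in the program above. One possible way to keep the results, sketched here under assumptions, is to store them in a shared dictionary guarded by a lock. The fetch() function is a hypothetical stand-in for the network call so that the sketch runs offline, and the example.com URLs are placeholders:

```python
import threading

MAX_CONCURRENT_DOWNLOADS = 3
semaphore = threading.Semaphore(MAX_CONCURRENT_DOWNLOADS)

# Shared storage for the results, guarded by its own lock
results = {}
results_lock = threading.Lock()

def fetch(url):
    # Hypothetical stand-in for urllib.request.urlopen(url).read()
    return f"contents of {url}".encode()

def download(url):
    # The semaphore limits only the download itself
    with semaphore:
        data = fetch(url)

    # Store the result instead of returning it
    with results_lock:
        results[url] = data

# Placeholder URLs for illustration
urls = [f"https://example.com/file{i}.txt" for i in range(5)]

threads = [threading.Thread(target=download, args=(url,)) for url in urls]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(results))  # 5
```

The semaphore and the lock play different roles here: the semaphore caps the number of concurrent downloads, while the lock protects the dictionary that all threads write to.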

Summary

  • Use a Python semaphore to control the number of threads that can access a shared resource simultaneously.