Technology Tales

Notes drawn from experiences in consumer and enterprise technology

16:35, 5th February 2023

Python PIL | UnsharpMask() method

The Python Imaging Library, known as PIL, offers tools for image manipulation, including the UnsharpMask() filter within the ImageFilter module, which sharpens an image by blurring a copy of it and then boosting contrast where the original differs from that blur. The filter takes three parameters: radius sets the size of the blur, percent sets the strength of the sharpening effect and threshold sets the minimum brightness change that will be sharpened. Example code demonstrates how to apply this filter to an image, adjusting these parameters to achieve different levels of detail enhancement, with visual outputs illustrating the effects of varying the settings.
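As a rough illustration of the filter described above, the following sketch assumes that the Pillow package is installed. The synthetic test image and the parameter values are assumptions chosen for demonstration, not settings taken from the article.

```python
from PIL import Image, ImageDraw, ImageFilter

# Build a small synthetic image so the sketch needs no file on disk.
image = Image.new("RGB", (64, 64), "white")
ImageDraw.Draw(image).rectangle([16, 16, 48, 48], fill="gray")

# Soften the edges first so that there is something to sharpen.
soft = image.filter(ImageFilter.GaussianBlur(radius=2))

# radius: size of the blur, percent: sharpening strength,
# threshold: minimum brightness change that will be sharpened.
sharpened = soft.filter(
    ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3)
)

print(sharpened.size, sharpened.mode)
```

With a real photograph, Image.open() would replace the synthetic image, and the three parameters would usually be tuned by eye.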

16:34, 5th February 2023

10 Python Image Manipulation Tools You Can Try Today

Python offers a range of libraries for image processing, enabling tasks such as filtering, enhancement, feature extraction and computer vision applications. Scikit-Image provides research-grade tools with peer-reviewed code, while NumPy and SciPy support array manipulation and advanced image operations. Pillow simplifies common tasks like cropping and colour conversion, and OpenCV-Python delivers high-performance computer vision capabilities. Other libraries such as Mahotas, SimpleITK and PgMagick serve specific needs, from rapid prototyping to handling complex image formats, with varying levels of ease of use and maintenance. These tools collectively provide developers with versatile options for manipulating digital images, whether for basic adjustments or sophisticated analysis.

23:47, 3rd February 2023

Python enumerate(): Simplify Looping With Counters

Exploring Python's enumerate() function reveals its utility in simplifying loops that require both index and value from an iterable. The function pairs each element with its position, eliminating the need to manually track counters. This approach is particularly effective in scenarios like extracting specific segments of a string, linking related lists, or generating sequential pairs.

While enumerate() is powerful, alternatives such as zip() and itertools offer more refined solutions for multi-sequence iteration or complex patterns. Understanding when to use each tool ensures cleaner, more efficient code. The function's flexibility, including the ability to adjust starting indices, makes it a staple in Pythonic programming. By mastering enumerate() and its counterparts, developers can write more readable and maintainable loops tailored to their specific needs.
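A minimal sketch of the ideas above follows. The list contents are hypothetical values chosen only to show the counter, the adjustable starting index and the use of an index to link two related lists.

```python
seasons = ["spring", "summer", "autumn", "winter"]
temperatures = [12, 21, 14, 4]  # hypothetical values, one per season

# enumerate() pairs each element with its position; start=1 shifts the counter.
numbered = list(enumerate(seasons, start=1))

# The index can be used to look up the matching element in a related list.
linked = [(season, temperatures[i]) for i, season in enumerate(seasons)]

for position, season in numbered:
    print(f"{position}. {season}")
```

When the only goal is to walk two lists side by side, zip(seasons, temperatures) would express the same pairing without any index.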

23:46, 3rd February 2023

When to Use a List Comprehension in Python

Python list comprehensions offer a concise and readable way to create and transform lists by combining a loop and optional conditional logic into a single line, serving as an alternative to traditional for loops and the map() function. They support filtering through conditional statements, can be extended to create sets and dictionaries, and the walrus operator (:=) introduced in Python 3.8 allows values to be assigned within a comprehension's conditional clause.

However, they are not always the best choice. Nested comprehensions can reduce code clarity, and because a list comprehension builds the entire list in memory, it is unsuitable for very large datasets. Generator expressions are more appropriate there, since they evaluate values lazily and keep a smaller memory footprint. When performance is a priority, profiling tools such as the timeit module can help determine which approach, whether a list comprehension, map() or a standard loop, is most efficient for a given situation.
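The trade-offs above can be sketched briefly. The helper name slow_transform and the sample ranges are hypothetical, and the timing line is only a suggestion of how timeit might be used rather than a benchmark.

```python
import sys
import timeit

numbers = range(1_000)

# List comprehension with a filter: the whole list is built in memory.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]

# Generator expression: values are produced lazily, one at a time.
lazy_squares = (n * n for n in numbers if n % 2 == 0)
list_size = sys.getsizeof(squares_of_evens)
generator_size = sys.getsizeof(lazy_squares)
total = sum(lazy_squares)


def slow_transform(x):
    """Hypothetical stand-in for an expensive calculation."""
    return (x * 3) % 7


# The walrus operator (Python 3.8+) keeps the computed value for reuse.
kept = [y for x in range(10) if (y := slow_transform(x)) > 3]

# timeit reports the time for a number of repetitions of a callable.
elapsed = timeit.timeit(lambda: [n * n for n in numbers], number=200)
print(f"list: {list_size} bytes, generator: {generator_size} bytes")
print(f"200 runs of the comprehension took {elapsed:.4f} s")
```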

23:44, 3rd February 2023

How to return multiple values from a function in Python

Returning multiple values from a Python function can be achieved by separating them with commas in the return statement, which results in a tuple being returned. This approach allows the function to return several values at once, and the returned tuple can be unpacked into separate variables for individual use. Alternatively, a list can be returned by enclosing the values in square brackets, which gives the caller a mutable data structure instead. Both methods deliver multiple results from a single call, with the tuple method being particularly concise and commonly used in practice.
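Both styles can be shown in a few lines. The function names below are hypothetical and exist only for illustration.

```python
def min_max(values):
    """Return two values separated by a comma, which packs them into a tuple."""
    return min(values), max(values)


def min_max_list(values):
    """Return the same two values as a list instead."""
    return [min(values), max(values)]


result = min_max([3, 1, 4, 1, 5])

# The tuple can be unpacked into separate variables.
lowest, highest = result

as_list = min_max_list([3, 1, 4, 1, 5])
print(result, lowest, highest, as_list)
```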

23:43, 3rd February 2023

Parallel For-Loop With a Multiprocessing Pool

Python's multiprocessing.Pool class allows sequential for loops to be converted into parallel operations that utilise all available CPU cores simultaneously. Before parallelising a loop, the code must be refactored so that each iteration calls a self-contained target function with its own arguments and no reliance on shared resources, which helps avoid concurrency issues such as race conditions.

The pool's map() method handles single-argument tasks and returns results once all tasks are complete, while starmap() serves the same purpose for functions requiring multiple arguments. For greater memory efficiency and responsiveness, imap() issues tasks one at a time as workers become available and yields each result, in task order, as soon as it is ready, rather than waiting for the entire batch to complete.

By default, the pool creates one worker process per logical CPU core, though this can be adjusted manually via the processes argument. The number of logical CPU cores on a system can be retrieved using either multiprocessing.cpu_count() or os.cpu_count(), with logical cores often being twice the number of physical cores on processors that support hyperthreading. The pool is best suited to computationally intensive tasks involving small amounts of data, whereas input/output-bound tasks are generally better handled using thread-based concurrency such as ThreadPoolExecutor.
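A minimal sketch of the three pool methods follows. The task functions cube and power are hypothetical stand-ins for a computationally intensive loop body, and the guard around the pool is needed on platforms that start worker processes by importing the main module.

```python
import multiprocessing


def cube(n):
    """Self-contained single-argument task with no shared state."""
    return n ** 3


def power(base, exponent):
    """Self-contained task that needs two arguments."""
    return base ** exponent


if __name__ == "__main__":
    print("logical cores:", multiprocessing.cpu_count())

    # With no processes argument, one worker per logical core is created.
    with multiprocessing.Pool() as pool:
        # map(): one argument per task, returns once every task is done.
        cubes = pool.map(cube, range(5))

        # starmap(): each tuple is unpacked into the function's arguments.
        powers = pool.starmap(power, [(2, 3), (3, 2)])

        # imap(): results are yielded in order as they become ready.
        for value in pool.imap(cube, range(5)):
            print(value)

    print(cubes, powers)
```

Passing processes=2 to Pool() would cap the number of workers, which may help when other work is competing for the same cores.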

23:42, 3rd February 2023

Pandas – Create DataFrame From Multiple Series

Creating a DataFrame from multiple Pandas Series involves using the concat() function to merge them as columns, aligning their index values and filling gaps with NaN where Series lengths differ. Each Series can be assigned a name to serve as a column header, and custom index labels can be applied to both Series and the resulting DataFrame. Techniques such as reset_index() allow for reorganising the DataFrame's structure, while additional Series can be appended to existing DataFrames by specifying new column names. This approach facilitates the construction of structured tabular data from individual Series, ensuring flexibility in data alignment and formatting.
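The alignment behaviour described above can be sketched as follows, assuming that pandas is installed. The Series names and values are hypothetical.

```python
import pandas as pd

# Each Series carries a name that becomes its column header.
names = pd.Series(["Ada", "Grace", "Linus"], name="name")
ages = pd.Series([36, 45], name="age")  # one element shorter than names

# axis=1 places the Series side by side, aligned on their index values.
frame = pd.concat([names, ages], axis=1)

# A further Series can be added by assigning it to a new column name.
frame["city"] = pd.Series(["London", "New York", "Helsinki"])

print(frame)
```

Because ages has no value at index 2, that cell is filled with NaN and the age column is stored as floating point.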

23:41, 3rd February 2023

How to run command or code in parallel in bash shell under Linux or Unix

Running multiple commands or code simultaneously in a Bash shell on Linux and Unix-like systems can be achieved through several methods. The simplest approach involves appending an ampersand to a command to push it into the background, with the built-in wait command used to pause execution until all background processes have completed before continuing. This allows multiple processes to run concurrently within a script. A more powerful option is GNU parallel, a shell tool that executes jobs across one or more computers simultaneously, accepting input such as file lists, URLs or hostnames, and offering features like job limiting, SSH-based remote execution and filename replacement strings for batch operations such as image conversion or file compression. The xargs command also provides parallel execution capabilities. GNU parallel can be installed on Debian and Ubuntu systems via apt, on RHEL and CentOS via yum and on Fedora via dnf.

23:39, 3rd February 2023

How to Rename Columns in Pandas

Renaming columns in a Pandas DataFrame can be achieved through three primary approaches. The first method involves specifying individual column names and their new labels using the rename function, which allows selective updates without altering other column names. The second method replaces all column names simultaneously by directly assigning a new list of names to the DataFrame's columns attribute, which is particularly efficient when renaming most or all columns. The third method uses string manipulation to replace specific characters across all column names, such as removing a common prefix or suffix, which is useful for standardising naming conventions. Each technique offers flexibility depending on the scope and nature of the renaming task, ensuring that users can efficiently manage column labels in their datasets.
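The three approaches might look like this in practice, assuming that pandas is installed. The column names and the col_ prefix are hypothetical.

```python
import pandas as pd

frame = pd.DataFrame({"col_a": [1, 2], "col_b": [3, 4], "col_c": [5, 6]})

# 1. rename(): change selected columns and leave the rest untouched.
renamed = frame.rename(columns={"col_a": "alpha"})

# 2. Assign a complete list of new names to the columns attribute.
relabelled = frame.copy()
relabelled.columns = ["x", "y", "z"]

# 3. String manipulation across all names, here removing a common prefix.
stripped = frame.copy()
stripped.columns = stripped.columns.str.replace("col_", "", regex=False)

print(list(renamed.columns), list(relabelled.columns), list(stripped.columns))
```

Note that rename() returns a new DataFrame by default, whereas assigning to the columns attribute changes the DataFrame in place, which is why copies are taken here.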

23:28, 3rd February 2023

Python: Iterate over multiple lists simultaneously

Python offers multiple approaches to iterate over multiple lists simultaneously, enabling the processing of related data in a synchronised manner. The zip() function pairs elements from each list, stopping when the shortest list ends, while itertools.zip_longest() continues iteration until all lists are exhausted, filling missing values with None or a specified fill value. In addition, enumerate() supplies an index that can be used to access the corresponding elements of other lists, and generator expressions can be used with zip() for efficient iteration over large datasets. Each method provides distinct advantages depending on the specific requirements of the task, such as handling varying list lengths or optimising memory usage.
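The difference between these approaches shows up most clearly when the lists differ in length. The sketch below uses hypothetical names and scores.

```python
from itertools import zip_longest

names = ["Ada", "Grace", "Linus"]
scores = [90, 85]  # deliberately one element shorter

# zip() stops as soon as the shortest list is exhausted.
paired = list(zip(names, scores))

# zip_longest() continues to the longest list, padding with fillvalue.
padded = list(zip_longest(names, scores, fillvalue=0))

# enumerate() gives an index for looking up the other list by position.
by_index = [
    (i, name, scores[i]) for i, name in enumerate(names) if i < len(scores)
]

# A generator expression over zip() processes pairs lazily.
total = sum(score for _, score in zip(names, scores))

print(paired, padded, by_index, total)
```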
