Context

Using print() statements is the most common way to output information in a Python program. (The other alternative would be to write out relevant information directly to a file or database.)

And the usage of print statements is very easy to pick up. Overall, Python is said to be 'a language that fits your brain' - there are even T-shirts :-) - and I mostly agree with that. I love how intuitive coding in Python feels after having worked with the language for a while and taken the time to become familiar with the basic protocols that all Python objects adhere to.

One aspect, however, that I frequently need to look up again is Python's string formatting specification. That specification forms a kind of mini-language which allows me to tell Python to format outputs just the way I intend them to look: for example, to keep multiple output rows nicely aligned, to explicitly specify the precision of decimals, or to display percentages.

This isn't a complex topic. But it's not completely intuitive either. So I felt this would make a good fit for a small post to solidify my learning and collect the important details here for future reference. And maybe in the process of writing the information sinks in deep enough this time, so I won't need to look this up (as often) anymore going forward.

Quick aside on the different forms of string formatting in Python

Python has (at least) three different ways to do string formatting:

  • The old-school 'my %s string' % 'legacy' string formatting syntax (similar to the C language)
  • The 'my {} string'.format('useful') string formatting syntax
  • The modern f-String syntax: f'my {"modern".upper()} string' (since Python 3.6; reusing the outer quote character inside the braces only works from Python 3.12 onwards)
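
For comparison, here is a minimal sketch that builds the same string with each of the three styles. The variable names are only illustrative.

```python
name = 'world'

# 1. Old-school % formatting
old_style = 'Hello, %s!' % name

# 2. str.format()
format_method = 'Hello, {}!'.format(name)

# 3. f-String (Python 3.6+)
f_string = f'Hello, {name}!'

print(old_style, format_method, f_string, sep='\n')
```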

I won't go into detail about these. I, like most modern Pythonistas, use f-Strings exclusively these days for the clarity they provide. In fact, it is said that f-Strings were so popular that their introduction in Python 3.6 contributed significantly to many older projects finally making the switch from Python 2.7 to Python 3.

One important note: while f-Strings should be the go-to for all standard output formatting needs these days, there are cases when working with the logging module where the % syntax is still recommended (especially in combination with services like Sentry). But that shall not be of concern to us today.

So from here onwards, the usage of f-Strings is assumed.

The Python Format Specification Language

To format a single value, we can use the format() function:

x = 3.141_592_653       # float with 9 decimals

format(x, '.2f')        # '3.14'
format(x, '>6.2f')      # '  3.14'
format(x, '#<10.2f')    # '3.14######'

The format() function takes the object and a format specifier that determines how the object will be formatted.
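
The same format specifier can be used in all the places Python accepts one. As a small sketch (variable names are illustrative), the three calls below should produce identical strings; in an f-String or str.format() the specifier goes after the colon.

```python
x = 3.141_592_653

via_function = format(x, '>8.2f')      # built-in format()
via_f_string = f'{x:>8.2f}'            # spec after the colon in an f-String
via_method = '{:>8.2f}'.format(x)      # spec after the colon in str.format()

print(repr(via_function), repr(via_f_string), repr(via_method))
```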

The general format of the format specifier is:

[fill][align][sign][0][width][,][.precision][type]

So there are eight possible specification elements that can be included. Each of them is optional. Next, let's see what each of these elements does.
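
As a quick preview before going through the elements one by one, here is a sketch that uses most of them in a single specifier. The values are made up for illustration.

```python
amount = 1234567.891

# fill='*', align='>', sign='+', width=20, separator=',', precision='.2', type='f'
full_spec = format(amount, '*>+20,.2f')   # '*******+1,234,567.89'

# sign='+', [0] flag, width=8, precision='.2', type='f'
zero_flag = format(42, '+08.2f')          # '+0042.00'

print(full_spec)
print(zero_flag)
```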

I will first cover the last four, as they (together with [align]) are the most common:

  • [width]
    • -> An integer that specifies the minimum field width of the resulting string
    • Sample: for e in dir(obj): print(f'{e:20}')
      • This will make sure that multiple elements e – even if they are of varying length (up to 20 characters in this case) – will all take up exactly the same amount of space in the output string
  • [,]
    • -> Can be , or _ and is used as a separator between each group of three digits for easier legibility
    • Sample: x=1000000; f'{x:_}' will display as '1_000_000'
  • [.precision]
    • -> How many decimal places to display
    • Sample: x=3.141_592_653; f'{x:.2f}' will display '3.14'
  • [type]
    • -> Indicates the type of data, which can affect its representation, e.g. display an integer as a binary or hexadecimal number
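
The elements above can be tried out in isolation. The following is a small sketch with made-up values; the last example relies on the fact that for strings the precision acts as a maximum length.

```python
# Width is a minimum: strings are left-aligned by default, numbers right-aligned.
text_padded = f'{"abc":10}'          # 'abc       '
number_padded = f'{42:10}'           # '        42'
too_long = f'{"abcdefghijkl":5}'     # 'abcdefghijkl' (width never truncates)

# Separators between groups of three digits.
with_commas = f'{1000000:,}'         # '1,000,000'
with_underscores = f'{1000000:_}'    # '1_000_000'

# Precision: decimal places for floats, maximum length for strings.
rounded = f'{3.141592653:.2f}'       # '3.14'
truncated = f'{"abcdef":.3}'         # 'abc'

print(repr(text_padded), repr(number_padded), with_commas, rounded)
```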

The most common types are:

  • Integers
    • d: integer
    • b: binary integer
    • x: hexadecimal integer
  • Float or Percentage
    • f: float
    • e: float in scientific notation
    • %: percentage (multiplies a number by 100 and appends a % sign)
  • String or Character
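    • s: string (the default for str objects)
      • Only valid for strings. To format any other object this way, convert it first, e.g. with !s in an f-String, which uses str(obj) to generate the string (str() in turn falls back to repr(obj) if __str__() is not defined).
    • c: a single character (converts an integer to the corresponding Unicode character)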
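
A short sketch of the type codes listed above, using arbitrary sample values:

```python
n = 255

as_decimal = format(n, 'd')               # '255'
as_binary = format(n, 'b')                # '11111111'
as_hex = format(n, 'x')                   # 'ff'

as_float = format(1234.5678, '.1f')       # '1234.6'
as_scientific = format(1234.5678, '.3e')  # '1.235e+03'
as_percent = format(0.25, '.0%')          # '25%'

as_string = format('hi', 's')             # 'hi'
as_char = format(65, 'c')                 # 'A' (integer code point to character)

print(as_decimal, as_binary, as_hex, as_float, as_scientific, as_percent, as_char)
```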

As we see, with just four out of the eight specification building blocks we can already achieve a lot.

Now, let's see what the remaining four spec elements can be used for (with [align] being by far the most common of the remaining ones):

  • [fill]
    • -> Takes a character to use as padding e.g. # or *
  • [align]
    • -> One of <,^,> for left, center, right-align
  • [sign]
    • -> One of +,-, or space
      • A space adds a leading space for positive numbers so all numbers align
      • A + will add a leading sign to all numbers
      • - is the default and only will show a leading minus sign for negative numbers
  • [0]
    • -> If this zero is provided, numeric values are padded with leading zeroes to fill the allocated space
      • Can also use '0>10.2f' to pad with zeroes on the left
      • But using [0] before the width with '>010.2f' makes it explicit that it is the padding of numbers that we are interested in
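
To see fill and alignment in action, here is a brief sketch with sample values. The last two lines illustrate one difference worth knowing: with a negative number, a '0' fill character pads in front of the sign, while the [0] flag on its own pads between the sign and the digits.

```python
left = format('hi', '<6')        # 'hi    '
center = format('hi', '*^10')    # '****hi****'
right = format('hi', '>6')       # '    hi'

neg_fill = format(-42, '0>6d')   # '000-42' (fill character, sign ends up in the middle)
neg_zero = format(-42, '06d')    # '-00042' ([0] flag, sign stays in front)

print(repr(left), repr(center), repr(right), neg_fill, neg_zero)
```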

Examples of usage

# Demonstrate zero padding
x = 42
s = format(x, '0>5d')  # s = '00042'
s = format(x, '>05d')  # s = '00042'

# Demonstrate alignment of pos. and neg. numbers
x = 42
y = -42
format(x, '+d') # '+42'
format(y, '+d') # '-42'
format(x, ' d') # ' 42'

# Demonstrate usage of percentages
x = 0.054_32
format(x, '.2%')    # '5.43%'

# Demonstrate conversion to binary & hexadecimal numbers
x = 3
y = 255
format(x, 'd')  # '3'
format(x, 'b')  # '11'
format(x, 'x')  # '3'
format(y, 'x')  # 'ff'

Using f-String to dynamically adjust the format

x = 3.141_592_653
width = 6
precision = 2
s = f'{x:#>{width}.{precision}f}'    # s = '##3.14'
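
One place where nested fields come in handy is when the width depends on the data. The following is a sketch with made-up rows; rows, name_width and lines are illustrative names only.

```python
rows = [('apples', 1.5), ('kiwis', 12.25), ('watermelons', 3.0)]

# Compute the column width from the longest name instead of hard-coding it.
name_width = max(len(name) for name, _ in rows)

lines = [f'{name:<{name_width}} {price:>6.2f}' for name, price in rows]

for line in lines:
    print(line)
```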

Usage of !r and !s

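Using !r or !s forces the object to be converted to a string via repr() or str(), respectively, before the formatting is applied. Without a conversion, Python calls format() on the object itself, which for objects that don't define their own __format__() gives the same result as str() as long as no format spec is given. Using !r can be useful in debugging contexts when we need a more complete representation of the object and its current state.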

f'{my_obj!r:spec}'   # Equivalent to `repr(my_obj).__format__('spec')`
f'{my_obj!s:spec}'   # Equivalent to `str(my_obj).__format__('spec')`
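
As an illustration, here is a sketch with a hypothetical Point class (not from any library) that defines both __str__() and __repr__():

```python
class Point:
    """Hypothetical example class with distinct str() and repr() output."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point(x={self.x}, y={self.y})'

    def __str__(self):
        return f'({self.x}, {self.y})'


p = Point(1, 2)

as_str = f'{p!s}'        # '(1, 2)'
as_repr = f'{p!r}'       # 'Point(x=1, y=2)'

# Converting first lets string format specs such as '>10' apply to the result.
padded = f'{p!s:>10}'    # '    (1, 2)'

print(as_str, as_repr, repr(padded))
```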

Usage of f'{my_var=}'

Sometimes, for debugging, it is useful to not only display the value of an object but also the name of the variable that contains it. We can use f'{my_var=}' to obtain both:

my_var = 42
f'{my_var=}'         # results in 'my_var=42'
f'my_var={my_var!r}' # equivalent but more verbose (= uses repr() by default)

Note: Python doesn't actually access the variable name when using this syntax (which in general is not easy to achieve as the variable is merely a pointer to the object) but instead it uses the literal text of the expression provided inside the {}. So f'{"demo".upper()=}' would result in "demo".upper()='DEMO'.
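
The = specifier (available since Python 3.8) can also be combined with a format spec or a conversion. A small sketch with sample values:

```python
x = 3.141_592_653
name = 'demo'

plain = f'{x=}'              # 'x=3.141592653'
with_spec = f'{x=:.2f}'      # 'x=3.14'
with_spaces = f'{x = }'      # 'x = 3.141592653' (whitespace around = is kept)

quoted = f'{name=}'          # "name='demo'" (repr() is used by default)
unquoted = f'{name=!s}'      # 'name=demo'

print(plain, with_spec, with_spaces, quoted, unquoted, sep='\n')
```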

Trick: Combining padding and conditional end= for nice formatting

I like to think of this one as "The \t Trick" and frequently use it to get a nicer overview of all the names available when familiarizing myself with a module with dir().

import pandas as pd

# Display all names in the pandas namespace, nicely formatted
for m in dir(pd):
    print(f'{m:<20}', end='\t' if len(m) < 20 else '\n')
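
To try the same idea without pandas installed, here is a sketch that wraps the logic in a hypothetical helper, layout_names (my own name, not a library function), which returns the text instead of printing it. Names shorter than the width are padded and followed by a tab; longer names end the line.

```python
def layout_names(names, width=20):
    """Pad each name to `width`; short names are followed by a tab, long ones by a newline."""
    parts = []
    for name in names:
        end = '\t' if len(name) < width else '\n'
        parts.append(f'{name:<{width}}{end}')
    return ''.join(parts)


demo = layout_names(['abs', 'a_rather_long_function_name', 'zip'])
print(demo)

# The same helper works on any module or object, e.g. print(layout_names(dir(str)))
```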

And that's it for now. :-)

This post grew longer than I expected. But the details and code examples provided here should cover most common situations. For more detailed information you can refer to the resources at the bottom.

Reference / Further Reading

