Why format() is more flexible than % string operations
I think you should really stick to the format() method of str, because it is the preferred way to format strings and will probably replace the string formatting operation in the future.
Furthermore, it has some really good features, such as combining position-based formatting with keyword-based formatting:
>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11] # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'
Even the following will work:
>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'
There is also one other reason in favor of format(). Because it is a method, it can be passed as a callback, as in the following example:
>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"
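The same callback idea extends to dictionaries. The following is a minimal sketch, assuming Python 3.2 or later (where str.format_map is available); the records list and the describe name are made up for illustration:

```python
# Hypothetical records; each dict supplies the keyword fields of the template.
records = [
    {'name': 'John', 'age': 98},
    {'name': 'Jane', 'age': 87},
]

# A bound method is an ordinary callable, so it can be handed to map().
describe = 'My name is {name}, I am {age} years old'.format_map

sentences = list(map(describe, records))
for sentence in sentences:
    print(sentence)
```

Each dictionary is passed to format_map as a single argument, so no lambda or unpacking is needed.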
Isn't it a lot more flexible than the string formatting operation?
See more examples on the documentation page for a comparison between % operations and the .format() method.
Comparing tuple-based % string formatting with dictionary-based
Generally there are three ways of invoking % string operations (yes, three, not two), all of this form:
base_string % values
They differ by the type of values (which in turn depends on the content of base_string):
it can be a tuple, then the values are substituted one by one, in the order they appear in the tuple,
>>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
'Three first values are: 3.140000, 2.710000 and 1.000000'
it can be a dict (dictionary), then the values are substituted based on the keywords,
>>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
'My name is John, I am 98 years old'
it can be a single value, if base_string contains a single place where the value should be inserted:
>>> 'This is a string: %s' % 'abc'
'This is a string: abc'
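One practical consequence of the tuple form and the single-value form sharing the same operator is sketched below. The point variable is an illustrative assumption, not part of the original answer:

```python
point = (3, 4)  # hypothetical value that happens to be a tuple

# A tuple on the right-hand side is always treated as the tuple form,
# so a single %s placeholder with a 2-tuple raises TypeError.
try:
    'Point: %s' % point
except TypeError as exc:
    error = str(exc)

# Wrapping the value in a 1-tuple passes it as one value.
wrapped = 'Point: %s' % (point,)

# str.format does not unpack its arguments, so no wrapping is needed.
formatted = 'Point: {}'.format(point)

print(error)
print(wrapped)
print(formatted)
```

This is one reason the single-value form is best kept for values that are known not to be tuples.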
There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which is able to combine some features, as mentioned above).
But there is something that is specific to the dictionary-based string formatting operation and unavailable in the other two forms of the % operation. It is the ability to fill the specifiers from actual variable names in a simple manner:
>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
# some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'
Just for the record: of course the above could easily be done with format() by unpacking the dictionary like this:
>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
Does anyone else have an idea what could be a feature specific to one type of string formatting operation, but not to the other? It could be quite interesting to hear about it.
It's just for the looks. You can see at one glance what the format is. Many of us like readability better than micro-optimization.
Let's see what IPython's %timeit says:
Python 3.7.2 (default, Jan 3 2019, 02:55:40)
IPython 5.8.0
Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz
In [1]: %timeit root = "sample"; output = "output"; path = "{}/{}".format(root, output)
The slowest run took 12.44 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 223 ns per loop
In [2]: %timeit root = "sample"; output = "output"; path = root + '/' + output
The slowest run took 13.82 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 101 ns per loop
In [3]: %timeit root = "sample"; output = "output"; path = "%s/%s" % (root, output)
The slowest run took 27.97 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 155 ns per loop
In [4]: %timeit root = "sample"; output = "output"; path = f"{root}/{output}"
The slowest run took 19.52 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 77.8 ns per loop
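For readers without IPython, a rough equivalent can be sketched with the standard library's timeit module. This is an approximation under stated assumptions: the two assignments are moved into setup (the runs above timed them as part of the statement), the loop count of 100,000 is arbitrary, and the absolute numbers will differ by machine and Python version:

```python
import timeit

# Assumption: the assignments live in setup, so only the formatting is timed.
setup = 'root = "sample"; output = "output"'

statements = {
    'str.format': '"{}/{}".format(root, output)',
    'concatenation': "root + '/' + output",
    '% operator': '"%s/%s" % (root, output)',
    'f-string': 'f"{root}/{output}"',
}

loops = 100_000
results = {}
for label, stmt in statements.items():
    # repeat() returns one total time per run; the minimum is the least noisy.
    best = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=5))
    results[label] = best / loops * 1e9  # nanoseconds per loop
    print(f'{label:>14}: {results[label]:.1f} ns per loop')
```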
Best Answer
To answer the general question first, you would use printing in general to output information in your scripts to the screen when you're writing code to ensure that you're getting what you expect.
As your code becomes more sophisticated, you may find that logging would be better than printing, but that's information for another answer.
There is a big difference between printing and the representations of return values that are echoed in an interactive session with the Python interpreter. Printing writes to your standard output. The echoed representation of an expression's return value (which shows up in your Python shell if it is not None) is not shown when the equivalent code runs in a script.
1. Printing
In Python 2, we had print statements. In Python 3, we get a print function, which we can also use in Python 2.
Print Statements with Commas (Python 2)
The print statement with commas separating items uses a space to separate them. A trailing comma causes another space to be appended instead of a newline. Without a trailing comma, a newline character is appended to your printed item.
You could put each item on a separate print statement and use a comma after each and they would print the same, on the same line.
For example (this would only work in a script; in an interactive shell, you'd get a new prompt after every line):
Would output:
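A minimal sketch of the trailing-comma behavior described above. The Python 2 statements are shown as comments because they are not valid Python 3; the runnable lines use the Python 3 print function with end=' ' to produce the same single line. The strings are placeholders, not the original example:

```python
import io

# The output is collected in a buffer so that it can be inspected afterwards.
buffer = io.StringIO()

# Python 2:  print "Hello",    (trailing comma: a space instead of a newline)
print("Hello", end=' ', file=buffer)

# Python 2:  print "World"     (no trailing comma: a newline is appended)
print("World", file=buffer)

same_line = buffer.getvalue()
print(same_line, end='')
```

Both items end up on one line, separated by a single space and followed by a newline.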
Print Function
With the built-in print function from Python 3, also available in Python 2.6 and 2.7 with this import:
from __future__ import print_function
you can declare a separator and an end, which gives us a lot more flexibility:
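As a short sketch of the sep and end keywords (the items printed are arbitrary, and the output is written to an in-memory buffer only so that it can be inspected):

```python
import io

# Custom separator and line ending.
buffer = io.StringIO()
print('a', 'b', 'c', sep='-', end='!\n', file=buffer)
custom = buffer.getvalue()

# Same call relying on the defaults.
buffer = io.StringIO()
print('a', 'b', 'c', file=buffer)
default = buffer.getvalue()

print(repr(custom))
print(repr(default))
```

On Python 2.6 and 2.7 the same calls need the print_function import at the top of the module.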
The defaults are ' ' for sep and '\n' for end.
2. String Concatenation
Concatenation creates each string in memory, and then combines them together at their ends in a new string (so this may not be very memory friendly), and then prints them to your output at the same time. This is good when you need to join strings, likely constructed elsewhere, together.
will print
Be careful when joining literals of other types to strings in this manner: convert them to strings first.
prints
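To illustrate the two cases described in this section, here is a minimal sketch; the strings and the count variable are placeholders chosen for illustration, not the original example:

```python
# Plain concatenation: every operand is already a string.
greeting = 'Hello'
name = 'World'
message = greeting + ', ' + name + '!'
print(message)

# A non-string value has to be converted with str() before it is joined.
count = 3
report = 'Found ' + str(count) + ' items'
print(report)
```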
If you attempt to concatenate the integer without coercing it to a string first:
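A hedged sketch of that failure, using a made-up count variable. The exception is caught here only so that the script keeps running; the exact wording of the message varies between Python versions:

```python
count = 3  # hypothetical integer value
raised = False

try:
    # str + int is not defined, so this line raises TypeError.
    report = 'Found ' + count + ' items'
except TypeError as exc:
    raised = True
    print('TypeError:', exc)
```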
This should demonstrate that you should only ever attempt to concatenate variables that are known to be strings. The new way of formatting demonstrated next handles this issue for you.
3. String Interpolation
The formatting you're demonstrating is the old style of string interpolation, borrowed from C. It takes the template string and creates a new string in a single step. What it does is fairly straightforward. You should use this when you are likely to be building up a fairly large template (at 3+ lines and 3+ variables, you definitely should be doing it this way).
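As a minimal sketch of the old style, with a template and values invented for illustration:

```python
# %s inserts str(value), %d formats an integer, %.1f a float with one decimal.
template = '%s is %d years old and %.1f m tall'
sentence = template % ('John', 30, 1.8)
print(sentence)
```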
The new way of doing that would be to do this (using the index of the arguments):
or in Python 2.7 or 3 (using the implied index):
or with named arguments (this is semantically easy to read, but the code doesn't look very DRY, i.e. Don't Repeat Yourself):
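The three variants just listed can be sketched as follows; the name and age values are placeholders, not taken from the original answer:

```python
# Explicit index: an argument can be reused or reordered.
by_index = '{0} is {1} years old; {0} likes Python'.format('John', 30)

# Implied index: fields are filled left to right.
implied = '{} is {} years old'.format('John', 30)

# Named arguments: readable, at the cost of repeating each name.
named = '{name} is {age} years old'.format(name='John', age=30)

print(by_index)
print(implied)
print(named)
```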
The biggest benefit of this over % style formatting (not demonstrated here) is that it lets you combine positional and keyword arguments.
New in Python 3.6, format literals
Python 3.6 introduced format literals (f-strings), with a more elegant syntax (less redundancy). The simple syntax is something like:
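A minimal sketch, assuming Python 3.6 or later; the variable names are illustrative:

```python
name = 'John'
age = 30

# Names inside the braces are looked up in the enclosing scope.
sentence = f'{name} is {age} years old'
print(sentence)
```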
The format literals can actually execute code in-place:
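A short sketch of expressions evaluated inside the braces, again assuming Python 3.6 or later and using a made-up list:

```python
items = ['a', 'b', 'c']

# Function calls, method calls and arithmetic are evaluated when the
# string is built.
summary = f'{len(items)} items, first is {items[0].upper()}, total {2 + 3}'
print(summary)
```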