Python String Formatting Concatenation – Is String Formatting More Pythonic Than String Concatenation in Python 3?

Tags: python, string, string-concatenation, string-formatting

So I'm programming a text game in Python 3.4 that requires the use of the print() function very often to display variables to the user.

The two ways I've always done this are string formatting and string concatenation:

print('{} has {} health left.'.format(player, health))

And,

print(player + ' has ' + str(health) + ' health left.')

So which is better? They're both equally readable and quick to type, and they seem to perform the same. Which one is more Pythonic, and why?

I'm asking because I couldn't find an answer to this on Stack Overflow that wasn't about Java.

Best Answer

It depends on how long your string is and how many variables it contains. For your use case I believe str.format() is better, as it reads more cleanly and, I believe, performs better.

Sometimes, for longer strings, + looks cleaner because each variable stays at the position where it belongs in the string, and you don't have to move your eyes around to map each {} to the corresponding variable.
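One way to soften that trade-off, sketched here with a hypothetical third variable `location` that is not in the question, is to use named fields, so that each placeholder says which value fills it:

```python
player = 'Arbiter'
health = 100
location = 'the Citadel'  # hypothetical extra variable, for illustration only

# Positional fields: the reader has to count {} to find the matching argument.
positional = '{} has {} health left and is standing in {}.'.format(
    player, health, location)

# Named fields: each placeholder names the value that fills it.
named = '{player} has {health} health left and is standing in {location}.'.format(
    player=player, health=health, location=location)

# Concatenation keeps values in place but needs an explicit str() for the int.
concatenated = (player + ' has ' + str(health) +
                ' health left and is standing in ' + location + '.')

print(named)
```

All three produce the same text; which one reads best is largely a matter of taste and string length.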

If you can manage to upgrade to Python 3.6, you can use the newer, more intuitive f-string syntax shown below and get the best of both worlds:

player = 'Arbiter'
health = 100
print(f'{player} has {health} health left.')
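As a small sketch of why this can feel like the best of both worlds: an f-string keeps each value in place like concatenation does, yet still accepts the same format specifications as str.format(). The `max_health` value below is an assumption made up for the example, and the code needs Python 3.6 or later.

```python
player = 'Arbiter'
health = 100
max_health = 250  # hypothetical value, not from the question

# Expressions are evaluated inline; the part after ':' is a format spec.
line = f'{player:<10}|{health:>5}|{health / max_health:.0%}'

# The same specs work with str.format(), with the values moved to the end.
same_line = '{:<10}|{:>5}|{:.0%}'.format(player, health, health / max_health)

print(line)
```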

If you have a very large string, I recommend using a template engine like Jinja2 (http://jinja.pocoo.org/docs/dev/) or something along those lines.
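Jinja2 is a third-party package, so as a dependency-free stand-in the sketch below uses the standard library's `string.Template` rather than Jinja2 itself. The idea of keeping a larger text separate from the values that fill it is similar, although `Template` has none of Jinja2's loops or conditionals. The template text is made up for illustration.

```python
from string import Template

# A longer message kept separate from the code that fills it in.
status_template = Template(
    '$player has $health health left.\n'
    '${player}, it is your turn.'
)

# substitute() raises KeyError if a placeholder has no matching value.
message = status_template.substitute(player='Arbiter', health=100)
print(message)
```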

Ref: https://www.python.org/dev/peps/pep-0498/

Related Solutions

Python – String Concatenation vs String Substitution

Concatenation is (significantly) faster on my machine. But stylistically, I'm willing to pay the price of substitution if performance is not critical. And if I need formatting, there's no need to even ask the question: there's no option but to use interpolation/templating.

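>>> import timeit
>>> DOMAIN = 'http://stackoverflow.com'  # assumed value, inferred from the output below
>>> QUESTIONS = '/questions'  # assumed value, inferred from the output below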
>>> def so_q_sub(n):
...  return "%s%s/%d" % (DOMAIN, QUESTIONS, n)
...
>>> so_q_sub(1000)
'http://stackoverflow.com/questions/1000'
>>> def so_q_cat(n):
...  return DOMAIN + QUESTIONS + '/' + str(n)
...
>>> so_q_cat(1000)
'http://stackoverflow.com/questions/1000'
>>> t1 = timeit.Timer('so_q_sub(1000)','from __main__ import so_q_sub')
>>> t2 = timeit.Timer('so_q_cat(1000)','from __main__ import so_q_cat')
>>> t1.timeit(number=10000000)
12.166618871951641
>>> t2.timeit(number=10000000)
5.7813972166853773
>>> t1.timeit(number=1)
1.103492206766532e-05
>>> t2.timeit(number=1)
8.5206360154188587e-06

>>> def so_q_tmp(n):
...  return "{d}{q}/{n}".format(d=DOMAIN,q=QUESTIONS,n=n)
...
>>> so_q_tmp(1000)
'http://stackoverflow.com/questions/1000'
>>> t3= timeit.Timer('so_q_tmp(1000)','from __main__ import so_q_tmp')
>>> t3.timeit(number=10000000)
14.564135316080637

>>> def so_q_join(n):
...  return ''.join([DOMAIN,QUESTIONS,'/',str(n)])
...
>>> so_q_join(1000)
'http://stackoverflow.com/questions/1000'
>>> t4= timeit.Timer('so_q_join(1000)','from __main__ import so_q_join')
>>> t4.timeit(number=10000000)
9.4431309007150048
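The session above can be gathered into one script that is easier to rerun on a current interpreter. This is only a sketch: the `DOMAIN` and `QUESTIONS` values are inferred from the printed URLs, the f-string variant `so_q_fstr` and the helper `run_benchmark` are additions that were not part of the original measurements, and the timings will differ by machine and Python version.

```python
import timeit

DOMAIN = 'http://stackoverflow.com'  # assumed, inferred from the output above
QUESTIONS = '/questions'             # assumed, inferred from the output above


def so_q_sub(n):
    return "%s%s/%d" % (DOMAIN, QUESTIONS, n)


def so_q_cat(n):
    return DOMAIN + QUESTIONS + '/' + str(n)


def so_q_tmp(n):
    return "{d}{q}/{n}".format(d=DOMAIN, q=QUESTIONS, n=n)


def so_q_join(n):
    return ''.join([DOMAIN, QUESTIONS, '/', str(n)])


def so_q_fstr(n):
    # f-string variant (Python 3.6+), not part of the original benchmark
    return f'{DOMAIN}{QUESTIONS}/{n}'


candidates = [so_q_sub, so_q_cat, so_q_tmp, so_q_join, so_q_fstr]


def run_benchmark(number=100000):
    """Time each candidate and return {function name: seconds}."""
    results = {}
    for func in candidates:
        results[func.__name__] = timeit.timeit(lambda: func(1000), number=number)
    return results


if __name__ == '__main__':
    # Print the fastest first; the ordering is whatever this machine measures.
    for name, seconds in sorted(run_benchmark().items(), key=lambda kv: kv[1]):
        print(f'{name:<10} {seconds:.4f} s')
```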

Python String Formatting Options – Pros and Cons

Why `format()` is more flexible than `%` string operations

I think you should really stick to the format() method of str, because it is the preferred way to format strings and will probably replace the % string formatting operation in the future.

Furthermore, it has some really good features, such as the ability to combine position-based formatting with keyword-based formatting:

>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11] # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'

Even the following will work:

>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'
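One limit on this flexibility is worth a short sketch: automatic numbering (`{}`) and explicit numbering (`{0}`, `{1}`) each combine with keyword fields, but the two numbering styles cannot be mixed in one format string. The variable names follow the examples above; `mixed_error` is just a holder introduced for the demonstration.

```python
some_date = {'month': 'January', 'day': '1st'}
diff = [3, 11]  # years, months

# Automatic ({}) or explicit ({0}, {1}) numbering both combine with keywords...
auto = 'I will be {} years and {} months on {month} {day}'.format(*diff, **some_date)
manual = 'On {month} {day} it will be {1} months, {0} years'.format(*diff, **some_date)

# ...but automatic and explicit numbering cannot share one format string.
try:
    '{} years and {1} months'.format(*diff)
except ValueError as exc:
    mixed_error = str(exc)
else:
    mixed_error = None

print(auto)
print(manual)
print(mixed_error)
```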

There is one other reason in favor of format(). Because it is a method, it can be passed as a callback, as in the following example:

>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
...
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"

Isn't that a lot more flexible than the % string formatting operation?

See the documentation page for more examples comparing % operations with the .format() method.

Comparing tuple-based `%` string formatting with dictionary-based

Generally, there are three ways of invoking % string operations (yes, three, not two), all of this form:

base_string % values

They differ in the type of values (which in turn depends on the content of base_string):

It can be a tuple; the values are then substituted one by one, in the order they appear in the tuple:

>>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
'Three first values are: 3.140000, 2.710000 and 1.000000'

It can be a dict (dictionary); the values are then substituted by keyword:

>>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
'My name is John, I am 98 years old'

It can be a single value, if base_string contains a single place where a value should be inserted:
```
>>> 'This is a string: %s' % 'abc'
'This is a string: abc'
```

There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which is able to combine some features, as mentioned above).
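One practical consequence of the tuple and single-value forms sharing the same operator is sketched below: a single value that happens to be a tuple is treated as the tuple form. The `point` variable and the other names are made up for the illustration.

```python
point = (3, 4)  # hypothetical value for illustration

# Tuple form: one specifier per element.
as_pair = 'x=%d, y=%d' % point

# Single-value form breaks when the value is itself a tuple:
# the tuple is unpacked, leaving two arguments for one %s.
try:
    'point is %s' % point
except TypeError as exc:
    error = str(exc)
else:
    error = None

# Wrapping the value in a one-element tuple makes the intent explicit.
wrapped = 'point is %s' % (point,)

# format() takes the value as an ordinary argument, so no wrapping is needed.
formatted = 'point is {}'.format(point)

print(as_pair)
print(error)
print(wrapped)
```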

But there is something specific to the dictionary-based string formatting operation that is not available in the other two forms of the % operation: the ability to fill the specifiers from actual variable names in a simple manner:

>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
>>> # some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'

Just for the record: the above could of course be easily replaced with format(), by unpacking the dictionary like this:

>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
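The same idea can be sketched inside a function, where locals() holds only the function's own variables. The function name `introduce` is hypothetical. `str.format_map()` accepts the mapping directly without unpacking, and on Python 3.6+ an f-string reads the local names without any mapping at all.

```python
def introduce(name, surname, age):
    # Snapshot the locals first, so later assignments do not leak in.
    values = dict(locals())  # {'name': ..., 'surname': ..., 'age': ...}

    via_percent = 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % values
    via_format = 'My name is {surname}, {name} {surname}. I am {age}.'.format(**values)
    via_format_map = 'My name is {surname}, {name} {surname}. I am {age}.'.format_map(values)
    via_fstring = f'My name is {surname}, {name} {surname}. I am {age}.'

    # All four spellings should produce the same text.
    assert via_percent == via_format == via_format_map == via_fstring
    return via_fstring


print(introduce('John', 'Smith', 87))
```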

Does anyone else have an idea what could be a feature specific to one type of string formatting operation, but not to the other? It could be quite interesting to hear about it.
