Python Strings – Differences Between Commas, Concatenation, and Formatters

I am learning Python (2.7) on my own.
I have learned that we can use the following ways to put strings and variables together in printing:

x = "Hello"
y = "World"

By using commas:

print "I am printing" , x, y  # I know that using comma gives automatic space

By using concatenation:

print "I am printing" + " " + x + " " + y

By using string formatters:

print "I am printing %s %s" % (x, y)

In this case all three print the same:

I am printing Hello World

What is the difference between the three and are there any particular instances where one is preferred over the other?

Best Answer

To answer the general question first, you would use printing in general to output information in your scripts to the screen when you're writing code to ensure that you're getting what you expect.

As your code becomes more sophisticated, you may find that logging would be better than printing, but that's information for another answer.

There is a big difference between printing and the representations of return values that are echoed in an interactive session with the Python interpreter. Printing writes to your standard output. The echoed representation of an expression's value (which shows up in your Python shell if it is not None) does not appear at all when the equivalent code runs in a script.
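To make that difference concrete, here is a minimal sketch, written for Python 3 (contextlib.redirect_stdout does not exist in Python 2). The variable names are made up for illustration. It compares what the shell would echo, which is repr() of the value, with what print() actually writes to standard output.

```python
import io
from contextlib import redirect_stdout

value = "Hello\tWorld"

# The interactive shell echoes repr(value) for a bare expression.
echoed = repr(value)

# print() writes str(value) followed by a newline to standard output.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print(value)
printed = buffer.getvalue()

print(echoed)            # quotes and the \t escape are visible
print(printed, end="")   # a real tab character, no quotes
```

In a script, only the two print() calls at the end produce output; a bare `value` on its own line would produce nothing.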

1. Printing

In Python 2, we had print statements. In Python 3, we get a print function, which we can also use in Python 2.

Print Statements with Commas (Python 2)

The print statement with commas separating items uses a space to separate them. A trailing comma suppresses the newline, and the next item printed is separated from the previous one by a space. Without a trailing comma, a newline character is appended to the printed output.

You could put each item on a separate print statement and use a comma after each and they would print the same, on the same line.

For example (this only works as shown in a script; in an interactive shell, you'd get a new prompt after every line):

x = "Hello"
y = "World"

print "I am printing",
print x,
print y

Would output:

I am printing Hello World

Print Function

With the built-in print function from Python 3, also available in Python 2.6 and 2.7 with this import:

from __future__ import print_function

you can declare a separator and an end, which gives us a lot more flexibility:

>>> print('hello', 'world', sep='-', end='\n****\n')
hello-world
****
>>>

The defaults are ' ' for sep and '\n' for end:

>>> print('hello', 'world')
hello world
>>>
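As a rough sketch of how end relates to the trailing comma of the Python 2 print statement, the three-statement example from earlier can be written with the print function in Python 3. The buffer here is only an assumption made so the combined output can be inspected; normally you would print straight to the screen.

```python
import io

x = "Hello"
y = "World"

# Collect the output in memory instead of on the screen.
buffer = io.StringIO()

# end=" " plays the role of the trailing comma: no newline, just a space.
print("I am printing", end=" ", file=buffer)
print(x, end=" ", file=buffer)
print(y, file=buffer)  # default end="\n" finishes the line

three_calls = buffer.getvalue()

# A single call with the default sep=" " produces the same text.
buffer = io.StringIO()
print("I am printing", x, y, file=buffer)
one_call = buffer.getvalue()

print(three_calls == one_call)
```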

2. String Concatenation

Concatenation creates each string in memory and then joins them end to end into a new string (so this may not be very memory friendly when there are many pieces), which is then printed to your output in one go. This is good when you need to join strings that were likely constructed elsewhere.

print('hello' + '-' + 'world')

will print

hello-world

Be careful when joining values of other types to strings in this manner: convert them to strings first.

print('here is a number: ' + str(2))

prints

here is a number: 2

If you attempt to concatenate the integer without coercing it to a string first:

>>> print('here is a number: ' + 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects

This should demonstrate that you should only ever attempt to concatenate values that are known to be strings. The newer way of formatting demonstrated in the next section handles the conversion for you.
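When there are several pieces of mixed types, one common alternative to a chain of + operators is str.join with an explicit str() conversion. This is only a sketch with made-up values; it also shows that the failed concatenation raises an exception you can catch, and it deliberately avoids relying on the exact error message, which differs between Python versions.

```python
parts = ["here is a number:", 2, "and a float:", 2.5]

# str.join only accepts strings, so convert every item first.
joined = " ".join(str(part) for part in parts)
print(joined)

# Concatenating a str and an int directly raises TypeError.
try:
    "here is a number: " + 2
    raised = False
except TypeError:
    raised = True

print(raised)
```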

3. String Interpolation

The formatting you're demonstrating is the old style of string interpolation, borrowed from C. It takes the template string and creates a new string from it in a single step. What it does is fairly straightforward. You should use this when you are likely to be building up a fairly large template (at 3+ lines and 3+ variables, you definitely should be doing it this way).

The new way of doing that would be to do this (using the index of the arguments):

print('I am printing {0} and {1}'.format(x, y))

or in Python 2.7 or 3.1+ (using the implied index):

print('I am printing {} and {}'.format(x, y))

or with named arguments (this is semantically easy to read, but the code doesn't look very DRY, i.e. Don't Repeat Yourself):

print('I am printing {x} and {y}'.format(x=x, y=y))

The biggest benefit of this over % style formatting (not demonstrated here) is that it lets you combine positional and keyword arguments:

print('I am printing {0} and {y}'.format(x, y=y))
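For the larger templates mentioned above, a sketch like the following may help. The field names (name, count, total) and the report layout are invented for illustration; the format specifications after the colon (d, .2f) are part of the standard format() mini-language.

```python
# A multi-line template defined once and filled in later.
template = (
    "Name:  {name}\n"
    "Items: {count:d}\n"
    "Total: {total:.2f}\n"
)

# Named fields keep the call readable even with several values.
report = template.format(name="World", count=3, total=9.5)
print(report, end="")
```

Keeping the template separate from the values makes it easy to reuse the same layout with different data.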

New in Python 3.6, format literals

Python 3.6 introduced format literals (formatted string literals, commonly called f-strings), with a more elegant syntax (less redundancy). The simple syntax is something like:

print(f'I am printing {x} and {y}')

The format literals can actually evaluate expressions in place:

>>> print(f'I am printing {"hello".capitalize()} and {"Wo" + "rld"}')
I am printing Hello and World
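Format literals accept the same format specifications and conversions as format(). The following is a small sketch that needs Python 3.6 or later; the price variable is made up for the example.

```python
x = "Hello"
y = "World"
price = 3.14159

# An expression, a method call, and a format specification in one literal.
line = f"I am printing {x} and {y.upper()}, price {price:.2f}"
print(line)

# !r applies repr() to the value, which is handy when debugging.
debug = f"x is {x!r}"
print(debug)
```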

Why `format()` is more flexible than `%` string operations

I think you should really stick to the format() method of str, because it is the preferred way to format strings in new code. (The % string formatting operation has not been removed, though, and is still supported.)

Furthermore, it has some really good features, such as combining position-based formatting with keyword-based formatting:

>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11] # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'

even the following will work:

>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'

There is also one other reason in favor of format(). Because it is a method, it can be passed as a callback like in the following example:

>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
...
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"

Isn't it a lot more flexible than the % string formatting operation?

See the documentation page for more examples comparing % operations with the .format() method.

Comparing tuple-based `%` string formatting with dictionary-based

Generally there are three ways of invoking the % string operation (yes, three, not two), all of this form:

base_string % values

They differ in the type of values (which in turn follows from the content of base_string):

It can be a tuple, in which case the placeholders are replaced one by one, in the order the items appear in the tuple:

>>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
'Three first values are: 3.140000, 2.710000 and 1.000000'

It can be a dict (dictionary), in which case the placeholders are replaced based on the keys:

>>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
'My name is John, I am 98 years old'

It can be a single value, if base_string contains a single place where the value should be inserted:
```
>>> 'This is a string: %s' % 'abc'
'This is a string: abc'
```

There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which is able to combine some features, as mentioned above).
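One practical consequence of the tuple and single-value forms sharing the same syntax: if the single value happens to be a tuple itself, % treats it as a sequence of values. A minimal sketch of the pitfall and two ways around it (the point variable is made up for illustration):

```python
point = (1, 2)

# The tuple is unpacked as two values for one placeholder, so this fails.
try:
    text = "Point: %s" % point
except TypeError:
    text = None

# Wrapping the value in a one-element tuple makes the intent explicit.
safe_percent = "Point: %s" % (point,)

# format() has no such ambiguity.
safe_format = "Point: {}".format(point)

print(text)
print(safe_percent)
print(safe_format)
```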

But there is something specific to the dictionary-based string formatting operation that is not available with the other two ways of using %. This is the ability to use actual variable names as the format specifiers in a simple manner:

>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
# some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'

Just for the record: of course the above could easily be replaced by using format() and unpacking the dictionary, like this:

>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
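If relying on locals() feels too implicit, the same template can be filled from an explicit dictionary. This is a sketch of one possible alternative, not something the answer above prescribes: the person dictionary is made up here, str.format_map needs Python 3.2 or later, and the f-string line needs Python 3.6 or later.

```python
name = "John"
surname = "Smith"
age = 87

# An explicit mapping shows exactly which names the template depends on.
person = {"name": name, "surname": surname, "age": age}
template = "My name is {surname}, {name} {surname}. I am {age}."

from_dict = template.format(**person)     # unpack into keyword arguments
from_map = template.format_map(person)    # use the mapping directly
from_fstring = f"My name is {surname}, {name} {surname}. I am {age}."

print(from_dict)
print(from_dict == from_map == from_fstring)
```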

Does anyone else have an idea what could be a feature specific to one type of string formatting operation, but not to the other? It could be quite interesting to hear about it.

Python – Format Strings vs Concatenation

It's just for the looks. You can see at one glance what the format is. Many of us like readability better than micro-optimization.

Let's see what IPython's %timeit says:

Python 3.7.2 (default, Jan  3 2019, 02:55:40)
IPython 5.8.0
Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz

In [1]: %timeit root = "sample"; output = "output"; path = "{}/{}".format(root, output)
The slowest run took 12.44 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 223 ns per loop

In [2]: %timeit root = "sample"; output = "output"; path = root + '/' + output
The slowest run took 13.82 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 101 ns per loop

In [3]: %timeit root = "sample"; output = "output"; path = "%s/%s" % (root, output)
The slowest run took 27.97 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 155 ns per loop

In [4]: %timeit root = "sample"; output = "output"; path = f"{root}/{output}"
The slowest run took 19.52 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 77.8 ns per loop
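Outside IPython, a comparable measurement can be sketched with the standard timeit module. The loop count and repeat count below are arbitrary choices, the numbers will differ from machine to machine, and the f-string statement needs Python 3.6 or later.

```python
import timeit

# Shared setup, matching the variables used in the session above.
setup = 'root = "sample"; output = "output"'

statements = {
    "format": '"{}/{}".format(root, output)',
    "concat": "root + '/' + output",
    "percent": '"%s/%s" % (root, output)',
    "fstring": 'f"{root}/{output}"',
}

loops = 100000
timings = {}
for label, stmt in statements.items():
    # Take the best of three runs to reduce noise from other processes.
    timings[label] = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=3))

# Report the fastest first, in nanoseconds per loop.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{:<8} {:.1f} ns per loop".format(label, seconds / loops * 1e9))
```

Differences of this size only matter in hot loops, so readability is usually the better guide.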
