You are here: Home ‣ Dive Into Python 3 ‣
Difficulty level: ♦♦♦♦♦
Porting Code to Python 3 with 2to3
❝ Life is pleasant. Death is peaceful. It’s the transition that’s troublesome. ❞
— Isaac Asimov (attributed)
Diving In
So much has changed between Python 2 and Python 3, there are vanishingly few programs that will run unmodified under both. But don’t despair! To help with this transition, Python 3 comes with a utility script called 2to3
, which takes your actual Python 2 source code as input and auto-converts as much as it can to Python 3. Case study: porting chardet
to Python 3 describes how to run the 2to3
script, then shows some things it can’t fix automatically. This appendix documents what it can fix automatically.
print
statement
In Python 2, print
was a statement. Whatever you wanted to print simply followed the print
keyword. In Python 3, print()
is a function. Whatever you want to print, pass it to print()
like any other function.
Notes | Python 2 | Python 3 |
---|---|---|
① | print
| print()
|
② | print 1
| print(1)
|
③ | print 1, 2
| print(1, 2)
|
④ | print 1, 2,
| print(1, 2, end=' ')
|
⑤ | print >>sys.stderr, 1, 2, 3
| print(1, 2, 3, file=sys.stderr)
|
- To print a blank line, call
print()
without any arguments. - To print a single value, call
print()
with one argument. - To print two values separated by a space, call
print()
with two arguments. - This one is a little tricky. In Python 2, if you ended a
print
statement with a comma, it would print the values separated by spaces, then print a trailing space, then stop without printing a carriage return. (Technically, it’s a little more complicated than that. Theprint
statement in Python 2 used a now-deprecated attribute called softspace. Instead of printing a space, Python 2 would setsys.stdout.softspace
to 1. The space character wasn’t really printed until something else got printed on the same line. If the nextprint
statement printed a carriage return,sys.stdout.softspace
would be set to 0 and the space would never be printed. You probably never noticed the difference unless your application was sensitive to the presence or absence of trailing whitespace inprint
-generated output.) In Python 3, the way to do this is to passend=' '
as a keyword argument to theprint()
function. Theend
argument defaults to'\n'
(a carriage return), so overriding it will suppress the carriage return after printing the other arguments. - In Python 2, you could redirect the output to a pipe — like
sys.stderr
— by using the>>pipe_name
syntax. In Python 3, the way to do this is to pass the pipe in thefile
keyword argument. Thefile
argument defaults tosys.stdout
(standard out), so overriding it will output to a different pipe instead.
Unicode string literals
Python 2 had two string types: Unicode strings and non-Unicode strings. Python 3 has one string type: Unicode strings.
Notes | Python 2 | Python 3 |
---|---|---|
① | u'PapayaWhip'
| 'PapayaWhip'
|
② | ur'PapayaWhip\foo'
| r'PapayaWhip\foo'
|
- Unicode string literals are simply converted into string literals, which, in Python 3, are always Unicode.
- Unicode raw strings (in which Python does not auto-escape backslashes) are converted to raw strings. In Python 3, raw strings are always Unicode.
unicode()
global function
Python 2 had two global functions to coerce objects into strings: unicode()
to coerce them into Unicode strings, and str()
to coerce them into non-Unicode strings. Python 3 has only one string type, Unicode strings, so the str()
function is all you need. (The unicode()
function no longer exists.)
Notes | Python 2 | Python 3 |
---|---|---|
unicode(anything)
| str(anything)
|
long
data type
Python 2 had separate int
and long
types for non-floating-point numbers. An int
could not be any larger than sys.maxint
, which varied by platform. Longs were defined by appending an L
to the end of the number, and they could be, well, longer than ints. In Python 3, there is only one integer type, called int
, which mostly behaves like the long
type in Python 2. Since there are no longer two types, there is no need for special syntax to distinguish them.
Further reading: PEP 237: Unifying Long Integers and Integers.
Notes | Python 2 | Python 3 |
---|---|---|
① | x = 1000000000000L
| x = 1000000000000
|
② | x = 0xFFFFFFFFFFFFL
| x = 0xFFFFFFFFFFFF
|
③ | long(x)
| int(x)
|
④ | type(x) is long
| type(x) is int
|
⑤ | isinstance(x, long)
| isinstance(x, int)
|
- Base 10 long integer literals become base 10 integer literals.
- Base 16 long integer literals become base 16 integer literals.
- In Python 3, the old
long()
function no longer exists, since longs don’t exist. To coerce a variable to an integer, use theint()
function. - To check whether a variable is an integer, get its type and compare it to
int
, notlong
. - You can also use the
isinstance()
function to check data types; again, useint
, notlong
, to check for integers.
<> comparison
Python 2 supported <>
as a synonym for !=
, the not-equals comparison operator. Python 3 supports the !=
operator, but not <>
.
Notes | Python 2 | Python 3 |
---|---|---|
① | if x <> y:
| if x != y:
|
② | if x <> y <> z:
| if x != y != z:
|
- A simple comparison.
- A more complex comparison between three values.
has_key()
dictionary method
In Python 2, dictionaries had a has_key()
method to test whether the dictionary had a certain key. In Python 3, this method no longer exists. Instead, you need to use the in
operator.
Notes | Python 2 | Python 3 |
---|---|---|
① | a_dictionary.has_key('PapayaWhip')
| 'PapayaWhip' in a_dictionary
|
② | a_dictionary.has_key(x) or a_dictionary.has_key(y)
| x in a_dictionary or y in a_dictionary
|
③ | a_dictionary.has_key(x or y)
| (x or y) in a_dictionary
|
④ | a_dictionary.has_key(x + y)
| (x + y) in a_dictionary
|
⑤ | x + a_dictionary.has_key(y)
| x + (y in a_dictionary)
|
- The simplest form.
- The
in
operator takes precedence over theor
operator, so there is no need for parentheses aroundx in a_dictionary
or aroundy in a_dictionary
. - On the other hand, you do need parentheses around
x or y
here, for the same reason —in
takes precedence overor
. (Note: this code is completely different from the previous line. Python interpretsx or y
first, which results in either x (if x is true in a boolean context) or y. Then it takes that singular value and checks whether it is a key in a_dictionary.) - The
+
operator takes precedence over thein
operator, so this form technically doesn’t need parentheses aroundx + y
, but2to3
includes them anyway. - This form definitely needs parentheses around
y in a_dictionary
, since the+
operator takes precedence over thein
operator.
Dictionary methods that return lists
In Python 2, many dictionary methods returned lists. The most frequently used methods were keys()
, items()
, and values()
. In Python 3, all of these methods return dynamic views. In some contexts, this is not a problem. If the method’s return value is immediately passed to another function that iterates through the entire sequence, it makes no difference whether the actual type is a list or a view. In other contexts, it matters a great deal. If you were expecting a complete list with individually addressable elements, your code will choke, because views do not support indexing.
Notes | Python 2 | Python 3 |
---|---|---|
① | a_dictionary.keys()
| list(a_dictionary.keys())
|
② | a_dictionary.items()
| list(a_dictionary.items())
|
③ | a_dictionary.iterkeys()
| iter(a_dictionary.keys())
|
④ | [i for i in a_dictionary.iterkeys()]
| [i for i in a_dictionary.keys()]
|
⑤ | min(a_dictionary.keys())
| no change |
2to3
errs on the side of safety, converting the return value fromkeys()
to a static list with thelist()
function. This will always work, but it will be less efficient than using a view. You should examine the converted code to see if a list is absolutely necessary, or if a view would do.- Another view-to-list conversion, with the
items()
method.2to3
will do the same thing with thevalues()
method. - Python 3 does not support the
iterkeys()
method anymore. Usekeys()
, and if necessary, convert the view to an iterator with theiter()
function. 2to3
recognizes when theiterkeys()
method is used inside a list comprehension, and converts it to thekeys()
method (without wrapping it in an extra call toiter()
). This works because views are iterable.2to3
recognizes that thekeys()
method is immediately passed to a function which iterates through an entire sequence, so there is no need to convert the return value to a list first. Themin()
function will happily iterate through the view instead. This applies tomin()
,max()
,sum()
,list()
,tuple()
,set()
,sorted()
,any()
, andall()
.
Modules that have been renamed or reorganized
Several modules in the Python Standard Library have been renamed. Several other modules which are related to each other have been combined or reorganized to make their association more logical.
http
In Python 3, several related HTTP modules have been combined into a single package, http
.
Notes | Python 2 | Python 3 |
---|---|---|
① | import httplib
| import http.client
|
② | import Cookie
| import http.cookies
|
③ | import cookielib
| import http.cookiejar
|
④ |
| import http.server
|
- The
http.client
module implements a low-level library that can request HTTP resources and interpret HTTP responses. - The
http.cookies
module provides a Pythonic interface to browser cookies that are sent in aSet-Cookie:
HTTP header. - The
http.cookiejar
module manipulates the actual files on disk that popular web browsers use to store cookies. - The
http.server
module provides a basic HTTP server.
urllib
Python 2 had a rat’s nest of overlapping modules to parse, encode, and fetch URLs. In Python 3, these have all been refactored and combined in a single package, urllib
.
Notes | Python 2 | Python 3 |
---|---|---|
① | import urllib
| import urllib.request, urllib.parse, urllib.error
|
② | import urllib2
| import urllib.request, urllib.error
|
③ | import urlparse
| import urllib.parse
|
④ | import robotparser
| import urllib.robotparser
|
⑤ |
|
|
⑥ |
|
|
- The old
urllib
module in Python 2 had a variety of functions, includingurlopen()
for fetching data andsplittype()
,splithost()
, andsplituser()
for splitting a URL into its constituent parts. These functions have been reorganized more logically within the newurllib
package.2to3
will also change all calls to these functions so they use the new naming scheme. - The old
urllib2
module in Python 2 has been folded into theurllib
package in Python 3. All yoururllib2
favorites — thebuild_opener()
method,Request
objects, andHTTPBasicAuthHandler
and friends — are still available. - The
urllib.parse
module in Python 3 contains all the parsing functions from the oldurlparse
module in Python 2. - The
urllib.robotparser
module parsesrobots.txt
files. - The
FancyURLopener
class, which handles HTTP redirects and other status codes, is still available in the newurllib.request
module. Theurlencode()
function has moved tourllib.parse
. - The
Request
object is still available inurllib.request
, but constants likeHTTPError
have been moved tourllib.error
.
Did I mention that 2to3
will rewrite your function calls too? For example, if your Python 2 code imports the urllib
module and calls urllib.urlopen()
to fetch data, 2to3
will fix both the import statement and the function call.
Notes | Python 2 | Python 3 |
---|---|---|
|
|
dbm
All the various DBM clones are now in a single package, dbm
. If you need a specific variant like GNU DBM, you can import the appropriate module within the dbm
package.
Notes | Python 2 | Python 3 |
---|---|---|
import dbm
| import dbm.ndbm
| |
import gdbm
| import dbm.gnu
| |
import dbhash
| import dbm.bsd
| |
import dumbdbm
| import dbm.dumb
| |
| import dbm
|
xmlrpc
XML-RPC is a lightweight method of performing remote RPC calls over HTTP. The XML-RPC client library and several XML-RPC server implementations are now combined in a single package, xmlrpc
.
Notes | Python 2 | Python 3 |
---|---|---|
import xmlrpclib
| import xmlrpc.client
| |
| import xmlrpc.server
|
Other modules
Notes | Python 2 | Python 3 |
---|---|---|
① |
| import io
|
② |
| import pickle
|
③ | import __builtin__
| import builtins
|
④ | import copy_reg
| import copyreg
|
⑤ | import Queue
| import queue
|
⑥ | import SocketServer
| import socketserver
|
⑦ | import ConfigParser
| import configparser
|
⑧ | import repr
| import reprlib
|
⑨ | import commands
| import subprocess
|
- A common idiom in Python 2 was to try to import
cStringIO as StringIO
, and if that failed, to importStringIO
instead. Do not do this in Python 3; theio
module does it for you. It will find the fastest implementation available and use it automatically. - A similar idiom was used to import the fastest pickle implementation. Do not do this in Python 3; the
pickle
module does it for you. - The
builtins
module contains the global functions, classes, and constants used throughout the Python language. Redefining a function in thebuiltins
module will redefine the global function everywhere. That is exactly as powerful and scary as it sounds. - The
copyreg
module adds pickle support for custom types defined in C. - The
queue
module implements a multi-producer, multi-consumer queue. - The
socketserver
module provides generic base classes for implementing different kinds of socket servers. - The
configparser
module parses INI-style configuration files. - The
reprlib
module reimplements the built-inrepr()
function, with additional controls on how long the representations can be before they are truncated. - The
subprocess
module allows you to spawn processes, connect to their pipes, and obtain their return codes.
Relative imports within a package
A package is a group of related modules that function as a single entity. In Python 2, when modules within a package need to reference each other, you use import foo
or from foo import Bar
. The Python 2 interpreter first searches within the current package to find foo.py
, and then moves on to the other directories in the Python search path (sys.path
). Python 3 works a bit differently. Instead of searching the current package, it goes directly to the Python search path. If you want one module within a package to import another module in the same package, you need to explicitly provide the relative path between the two modules.
Suppose you had this package, with multiple files in the same directory:
chardet/ | +--__init__.py | +--constants.py | +--mbcharsetprober.py | +--universaldetector.py
Now suppose that universaldetector.py
needs to import the entire constants.py
file and one class from mbcharsetprober.py
. How do you do it?
Notes | Python 2 | Python 3 |
---|---|---|
① | import constants
| from . import constants
|
② | from mbcharsetprober import MultiByteCharSetProber
| from .mbcharsetprober import MultiByteCharsetProber
|
- When you need to import an entire module from elsewhere in your package, use the new
from . import
syntax. The period is actually a relative path from this file (universaldetector.py
) to the file you want to import (constants.py
). In this case, they are in the same directory, thus the single period. You can also import from the parent directory (from .. import anothermodule
) or a subdirectory. - To import a specific class or function from another module directly into your module’s namespace, prefix the target module with a relative path, minus the trailing slash. In this case,
mbcharsetprober.py
is in the same directory asuniversaldetector.py
, so the path is a single period. You can also import form the parent directory (from ..anothermodule import AnotherClass
) or a subdirectory.
next()
iterator method
In Python 2, iterators had a next()
method which returned the next item in the sequence. That’s still true in Python 3, but there is now also a global next()
function that takes an iterator as an argument.
Notes | Python 2 | Python 3 |
---|---|---|
① | anIterator.next()
| next(anIterator)
|
② | a_function_that_returns_an_iterator().next()
| next(a_function_that_returns_an_iterator())
|
③ |
|
|
④ |
| no change |
⑤ |
|
|
- In the simplest case, instead of calling an iterator’s
next()
method, you now pass the iterator itself to the globalnext()
function. - If you have a function that returns an iterator, call the function and pass the result to the
next()
function. (The2to3
script is smart enough to convert this properly.) - If you define your own class and mean to use it as an iterator, define the
__next__()
special method. - If you define your own class and just happen to have a method named
next()
that takes one or more arguments,2to3
will not touch it. This class can not be used as an iterator, because itsnext()
method takes arguments. - This one is a bit tricky. If you have a local variable named next, then it takes precedence over the new global
next()
function. In this case, you need to call the iterator’s special__next__()
method to get the next item in the sequence. (Alternatively, you could also refactor the code so the local variable wasn’t named next, but2to3
will not do that for you automatically.)
filter()
global function
In Python 2, the filter()
function returned a list, the result of filtering a sequence through a function that returned True
or False
for each item in the sequence. In Python 3, the filter()
function returns an iterator, not a list.
Notes | Python 2 | Python 3 |
---|---|---|
① | filter(a_function, a_sequence)
| list(filter(a_function, a_sequence))
|
② | list(filter(a_function, a_sequence))
| no change |
③ | filter(None, a_sequence)
| [i for i in a_sequence if i]
|
④ | for i in filter(None, a_sequence):
| no change |
⑤ | [i for i in filter(a_function, a_sequence)]
| no change |
- In the most basic case,
2to3
will wrap a call tofilter()
with a call tolist()
, which simply iterates through its argument and returns a real list. - However, if the call to
filter()
is already wrapped inlist()
,2to3
will do nothing, since the fact thatfilter()
is returning an iterator is irrelevant. - For the special syntax of
filter(None, ...)
,2to3
will transform the call into a semantically equivalent list comprehension. - In contexts like
for
loops, which iterate through the entire sequence anyway, no changes are necessary. - Again, no changes are necessary, because the list comprehension will iterate through the entire sequence, and it can do that just as well if
filter()
returns an iterator as if it returns a list.
map()
global function
In much the same way as filter()
, the map()
function now returns an iterator. (In Python 2, it returned a list.)
Notes | Python 2 | Python 3 |
---|---|---|
① | map(a_function, 'PapayaWhip')
| list(map(a_function, 'PapayaWhip'))
|
② | map(None, 'PapayaWhip')
| list('PapayaWhip')
|
③ | map(lambda x: x+1, range(42))
| [x+1 for x in range(42)]
|
④ | for i in map(a_function, a_sequence):
| no change |
⑤ | [i for i in map(a_function, a_sequence)]
| no change |
- As with
filter()
, in the most basic case,2to3
will wrap a call tomap()
with a call tolist()
. - For the special syntax of
map(None, ...)
, the identity function,2to3
will convert it to an equivalent call tolist()
. - If the first argument to
map()
is a lambda function,2to3
will convert it to an equivalent list comprehension. - In contexts like
for
loops, which iterate through the entire sequence anyway, no changes are necessary. - Again, no changes are necessary, because the list comprehension will iterate through the entire sequence, and it can do that just as well if
map()
returns an iterator as if it returns a list.
reduce()
global function
In Python 3, the reduce()
function has been removed from the global namespace and placed in the functools
module.
Notes | Python 2 | Python 3 |
---|---|---|
reduce(a, b, c)
|
|
apply()
global function
Python 2 had a global function called apply()
, which took a function f and a list [a, b, c]
and returned f(a, b, c)
. You can accomplish the same thing by calling the function directly and passing it the list of arguments preceded by an asterisk. In Python 3, the apply()
function no longer exists; you must use the asterisk notation.
Notes | Python 2 | Python 3 |
---|---|---|
① | apply(a_function, a_list_of_args)
| a_function(*a_list_of_args)
|
② | apply(a_function, a_list_of_args, a_dictionary_of_named_args)
| a_function(*a_list_of_args, **a_dictionary_of_named_args)
|
③ | apply(a_function, a_list_of_args + z)
| a_function(*a_list_of_args + z)
|
④ | apply(aModule.a_function, a_list_of_args)
| aModule.a_function(*a_list_of_args)
|
- In the simplest form, you can call a function with a list of arguments (an actual list like
[a, b, c]
) by prepending the list with an asterisk (*
). This is exactly equivalent to the oldapply()
function in Python 2. - In Python 2, the
apply()
function could actually take three parameters: a function, a list of arguments, and a dictionary of named arguments. In Python 3, you can accomplish the same thing by prepending the list of arguments with an asterisk (*
) and the dictionary of named arguments with two asterisks (**
). - The
+
operator, used here for list concatenation, takes precedence over the*
operator, so there is no need for extra parentheses arounda_list_of_args + z
. - The
2to3
script is smart enough to convert complexapply()
calls, including calling functions within imported modules.
intern()
global function
In Python 2, you could call the intern()
function on a string to intern it as a performance optimization. In Python 3, the intern()
function has been moved to the sys
module.
Notes | Python 2 | Python 3 |
---|---|---|
intern(aString)
| sys.intern(aString)
|
exec
statement
Just as the print
statement became a function in Python 3, so too has the exec
statement. The exec()
function takes a string which contains arbitrary Python code and executes it as if it were just another statement or expression. exec()
is like eval()
, but even more powerful and evil. The eval()
function can only evaluate a single expression, but exec()
can execute multiple statements, imports, function declarations — essentially an entire Python program in a string.
Notes | Python 2 | Python 3 |
---|---|---|
① | exec codeString
| exec(codeString)
|
② | exec codeString in a_global_namespace
| exec(codeString, a_global_namespace)
|
③ | exec codeString in a_global_namespace, a_local_namespace
| exec(codeString, a_global_namespace, a_local_namespace)
|
- In the simplest form, the
2to3
script simply encloses the code-as-a-string in parentheses, sinceexec()
is now a function instead of a statement. - The old
exec
statement could take a namespace, a private environment of globals in which the code-as-a-string would be executed. Python 3 can also do this; just pass the namespace as the second argument to theexec()
function. - Even fancier, the old
exec
statement could also take a local namespace (like the variables defined within a function). In Python 3, theexec()
function can do that too.
execfile
statement
Like the old exec
statement, the old execfile
statement will execute strings as if they were Python code. Where exec
took a string, execfile
took a filename. In Python 3, the execfile
statement has been eliminated. If you really need to take a file of Python code and execute it (but you’re not willing to simply import it), you can accomplish the same thing by opening the file, reading its contents, calling the global compile()
function to force the Python interpreter to compile the code, and then call the new exec()
function.
Notes | Python 2 | Python 3 |
---|---|---|
execfile('a_filename')
| exec(compile(open('a_filename').read(), 'a_filename', 'exec'))
|
repr
literals (backticks)
In Python 2, there was a special syntax of wrapping any object in backticks (like `x`
) to get a representation of the object. In Python 3, this capability still exists, but you can no longer use backticks to get it. Instead, use the global repr()
function.
Notes | Python 2 | Python 3 |
---|---|---|
① | `x`
| repr(x)
|
② | `'PapayaWhip' + `2``
| repr('PapayaWhip' + repr(2))
|
- Remember, x can be anything — a class, a function, a module, a primitive data type, &c. The
repr()
function works on everything. - In Python 2, backticks could be nested, leading to this sort of confusing (but valid) expression. The
2to3
tool is smart enough to convert this into nested calls torepr()
.
try...except
statement
The syntax for catching exceptions has changed slightly between Python 2 and Python 3.
Notes | Python 2 | Python 3 |
---|---|---|
① |
|
|
② |
|
|
③ |
| no change |
④ |
| no change |
- Instead of a comma after the exception type, Python 3 uses a new keyword,
as
. - The
as
keyword also works for catching multiple types of exceptions at once. - If you catch an exception but don’t actually care about accessing the exception object itself, the syntax is identical between Python 2 and Python 3.
- Similarly, if you use a fallback to catch all exceptions, the syntax is identical.
☞You should never use a fallback to catch all exceptions when importing modules (or most other times). Doing so will catch things like
KeyboardInterrupt
(if the user pressed Ctrl-C to interrupt the program) and can make it more difficult to debug errors.
raise
statement
The syntax for raising your own exceptions has changed slightly between Python 2 and Python 3.
Notes | Python 2 | Python 3 |
---|---|---|
① | raise MyException
| unchanged |
② | raise MyException, 'error message'
| raise MyException('error message')
|
③ | raise MyException, 'error message', a_traceback
| raise MyException('error message').with_traceback(a_traceback)
|
④ | raise 'error message'
| unsupported |
- In the simplest form, raising an exception without a custom error message, the syntax is unchanged.
- The change becomes noticeable when you want to raise an exception with a custom error message. Python 2 separated the exception class and the message with a comma; Python 3 passes the error message as a parameter.
- Python 2 supported a more complex syntax to raise an exception with a custom traceback (stack trace). You can do this in Python 3 as well, but the syntax is quite different.
- In Python 2, you could raise an exception with no exception class, just an error message. In Python 3, this is no longer possible.
2to3
will warn you that it was unable to fix this automatically.
throw
method on generators
In Python 2, generators have a throw()
method. Calling a_generator.throw()
raises an exception at the point where the generator was paused, then returns the next value yielded by the generator function. In Python 3, this functionality is still available, but the syntax is slightly different.
Notes | Python 2 | Python 3 |
---|---|---|
① | a_generator.throw(MyException)
| no change |
② | a_generator.throw(MyException, 'error message')
| a_generator.throw(MyException('error message'))
|
③ | a_generator.throw('error message')
| unsupported |
- In the simplest form, a generator throws an exception without a custom error message. In this case, the syntax has not changed between Python 2 and Python 3.
- If the generator throws an exception with a custom error message, you need to pass the error string to the exception when you create it.
- Python 2 also supported throwing an exception with only a custom error message. Python 3 does not support this, and the
2to3
script will display a warning telling you that you will need to fix this code manually.
xrange()
global function
In Python 2, there were two ways to get a range of numbers: range()
, which returned a list, and xrange()
, which returned an iterator. In Python 3, range()
returns an iterator, and xrange()
doesn’t exist.
Notes | Python 2 | Python 3 |
---|---|---|
① | xrange(10)
| range(10)
|
② | a_list = range(10)
| a_list = list(range(10))
|
③ | [i for i in xrange(10)]
| [i for i in range(10)]
|
④ | for i in range(10):
| no change |
⑤ | sum(range(10))
| no change |
- In the simplest case, the
2to3
script will simply convertxrange()
torange()
. - If your Python 2 code used
range()
, the2to3
script does not know whether you needed a list, or whether an iterator would do. It errs on the side of caution and coerces the return value into a list by calling thelist()
function. - If the
xrange()
function was inside a list comprehension, the2to3
script is clever enough not to wrap therange()
function with a call tolist()
. The list comprehension will work just fine with the iterator that therange()
function returns. - Similarly, a
for
loop will work just fine with an iterator, so there is no need to change anything here. - The
sum()
function will also work with an iterator, so2to3
makes no changes here either. Like dictionary methods that return views instead of lists, this applies tomin()
,max()
,sum()
,list()
,tuple()
,set()
,sorted()
,any()
, andall()
.
raw_input()
and input()
global functions
Python 2 had two global functions for asking the user for input on the command line. The first, called input()
, expected the user to enter a Python expression (and returned the result). The second, called raw_input()
, just returned whatever the user typed. This was wildly confusing for beginners and widely regarded as a “wart” in the language. Python 3 excises this wart by renaming raw_input()
to input()
, so it works the way everyone naively expects it to work.
Notes | Python 2 | Python 3 |
---|---|---|
① | raw_input()
| input()
|
② | raw_input('prompt')
| input('prompt')
|
③ | input()
| eval(input())
|
- In the simplest form,
raw_input()
becomesinput()
. - In Python 2, the
raw_input()
function could take a prompt as a parameter. This has been retained in Python 3. - If you actually need to ask the user for a Python expression to evaluate, use the
input()
function and pass the result toeval()
.
func_*
function attributes
In Python 2, code within functions can access special attributes about the function itself. In Python 3, these special function attributes have been renamed for consistency with other attributes.
Notes | Python 2 | Python 3 |
---|---|---|
① | a_function.func_name
| a_function.__name__
|
② | a_function.func_doc
| a_function.__doc__
|
③ | a_function.func_defaults
| a_function.__defaults__
|
④ | a_function.func_dict
| a_function.__dict__
|
⑤ | a_function.func_closure
| a_function.__closure__
|
⑥ | a_function.func_globals
| a_function.__globals__
|
⑦ | a_function.func_code
| a_function.__code__
|
- The
__name__
attribute (previouslyfunc_name
) contains the function’s name. - The
__doc__
attribute (previouslyfunc_doc
) contains the docstring that you defined in the function’s source code. - The
__defaults__
attribute (previouslyfunc_defaults
) is a tuple containing default argument values for those arguments that have default values. - The
__dict__
attribute (previouslyfunc_dict
) is the namespace supporting arbitrary function attributes. - The
__closure__
attribute (previouslyfunc_closure
) is a tuple of cells that contain bindings for the function’s free variables. - The
__globals__
attribute (previouslyfunc_globals
) is a reference to the global namespace of the module in which the function was defined. - The
__code__
attribute (previouslyfunc_code
) is a code object representing the compiled function body.
xreadlines()
I/O method
In Python 2, file objects had an xreadlines()
method which returned an iterator that would read the file one line at a time. This was useful in for
loops, among other places. In fact, it was so useful, later versions of Python 2 added the capability to file objects themselves.
In Python 3, the xreadlines()
method no longer exists. 2to3
can fix the simple cases, but some edge cases will require manual intervention.
Notes | Python 2 | Python 3 |
---|---|---|
① | for line in a_file.xreadlines():
| for line in a_file:
|
② | for line in a_file.xreadlines(5):
| no change (broken) |
- If you used to call
xreadlines()
with no arguments,2to3
will convert it to just the file object. In Python 3, this will accomplish the same thing: read the file one line at a time and execute the body of thefor
loop. - If you used to call
xreadlines()
with an argument (the number of lines to read at a time),2to3
will not fix it, and your code will fail with anAttributeError: '_io.TextIOWrapper' object has no attribute 'xreadlines'
. You can manually changexreadlines()
toreadlines()
to get it to work in Python 3. (Thereadlines()
method now returns an iterator, so it is just as efficient asxreadlines()
was in Python 2.)
☃
lambda
functions that take a tuple instead of multiple parameters
In Python 2, you could define anonymous lambda
functions which took multiple parameters by defining the function as taking a tuple with a specific number of items. In effect, Python 2 would “unpack” the tuple into named arguments, which you could then reference (by name) within the lambda
function. In Python 3, you can still pass a tuple to a lambda
function, but the Python interpreter will not unpack the tuple into named arguments. Instead, you will need to reference each argument by its positional index.
Notes | Python 2 | Python 3 |
---|---|---|
① | lambda (x,): x + f(x)
| lambda x1: x1[0] + f(x1[0])
|
② | lambda (x, y): x + f(y)
| lambda x_y: x_y[0] + f(x_y[1])
|
③ | lambda (x, (y, z)): x + y + z
| lambda x_y_z: x_y_z[0] + x_y_z[1][0] + x_y_z[1][1]
|
④ | lambda x, y, z: x + y + z
| unchanged |
- If you had defined a
lambda
function that took a tuple of one item, in Python 3 that would become alambda
with references to x1[0]. The name x1 is autogenerated by the2to3
script, based on the named arguments in the original tuple. - A
lambda
function with a two-item tuple (x, y) gets converted to x_y with positional arguments x_y[0] and x_y[1]. - The
2to3
script can even handlelambda
functions with nested tuples of named arguments. The resulting Python 3 code is a bit unreadable, but it works the same as the old code did in Python 2. - You can define
lambda
functions that take multiple arguments. Without parentheses around the arguments, Python 2 just treats it as alambda
function with multiple arguments; within thelambda
function, you simply reference the arguments by name, just like any other function. This syntax still works in Python 3.
Special method attributes
In Python 2, class methods can reference the class object in which they are defined, as well as the method object itself. im_self
is the class instance object; im_func
is the function object; im_class
is the class of im_self
. In Python 3, these special method attributes have been renamed to follow the naming conventions of other attributes.
Notes | Python 2 | Python 3 |
---|---|---|
aClassInstance.aClassMethod.im_func
| aClassInstance.aClassMethod.__func__
| |
aClassInstance.aClassMethod.im_self
| aClassInstance.aClassMethod.__self__
| |
aClassInstance.aClassMethod.im_class
| aClassInstance.aClassMethod.__self__.__class__
|
__nonzero__
special method
In Python 2, you could build your own classes that could be used in a boolean context. For example, you could instantiate the class and then use the instance in an if
statement. To do this, you defined a special __nonzero__()
method which returned True
or False
, and it was called whenever the instance was used in a boolean context. In Python 3, you can still do this, but the name of the method has changed to __bool__()
.
Notes | Python 2 | Python 3 |
---|---|---|
① |
|
|
② |
| no change |
- Instead of
__nonzero__()
, Python 3 calls the__bool__()
method when evaluating an instance in a boolean context. - However, if you have a
__nonzero__()
method that takes arguments, the2to3
tool will assume that you were using it for some other purpose, and it will not make any changes.
Octal literals
The syntax for defining base 8 (octal) numbers has changed slightly between Python 2 and Python 3.
Notes | Python 2 | Python 3 |
---|---|---|
x = 0755
| x = 0o755
|
sys.maxint
Due to the integration of the long
and int
types, the sys.maxint
constant is no longer accurate. Because the value may still be useful in determining platform-specific capabilities, it has been retained but renamed as sys.maxsize
.
Notes | Python 2 | Python 3 |
---|---|---|
① | from sys import maxint
| from sys import maxsize
|
② | a_function(sys.maxint)
| a_function(sys.maxsize)
|
maxint
becomesmaxsize
.- Any usage of
sys.maxint
becomessys.maxsize
.
callable()
global function
In Python 2, you could check whether an object was callable (like a function) with the global callable()
function. In Python 3, this global function has been eliminated. To check whether an object is callable, check for the existence of the __call__()
special method.
Notes | Python 2 | Python 3 |
---|---|---|
callable(anything)
| hasattr(anything, '__call__')
|
zip()
global function
In Python 2, the global zip()
function took any number of sequences and returned a list of tuples. The first tuple contained the first item from each sequence; the second tuple contained the second item from each sequence; and so on. In Python 3, zip()
returns an iterator instead of a list.
Notes | Python 2 | Python 3 |
---|---|---|
① | zip(a, b, c)
| list(zip(a, b, c))
|
② | d.join(zip(a, b, c))
| no change |
- In the simplest form, you can get the old behavior of the
zip()
function by wrapping the return value in a call tolist()
, which will run through the iterator thatzip()
returns and return a real list of the results. - In contexts that already iterate through all the items of a sequence (such as this call to the
join()
method), the iterator thatzip()
returns will work just fine. The2to3
script is smart enough to detect these cases and make no change to your code.
StandardError
exception
In Python 2, StandardError
was the base class for all built-in exceptions other than StopIteration
, GeneratorExit
, KeyboardInterrupt
, and SystemExit
. In Python 3, StandardError
has been eliminated; use Exception
instead.
Notes | Python 2 | Python 3 |
---|---|---|
x = StandardError()
| x = Exception()
| |
x = StandardError(a, b, c)
| x = Exception(a, b, c)
|
types
module constants
The types
module contains a variety of constants to help you determine the type of an object. In Python 2, it contained constants for all primitive types like dict
and int
. In Python 3, these constants have been eliminated; just use the primitive type name instead.
Notes | Python 2 | Python 3 |
---|---|---|
types.UnicodeType
| str
| |
types.StringType
| bytes
| |
types.DictType
| dict
| |
types.IntType
| int
| |
types.LongType
| int
| |
types.ListType
| list
| |
types.NoneType
| type(None)
| |
types.BooleanType
| bool
| |
types.BufferType
| memoryview
| |
types.ClassType
| type
| |
types.ComplexType
| complex
| |
types.EllipsisType
| type(Ellipsis)
| |
types.FloatType
| float
| |
types.ObjectType
| object
| |
types.NotImplementedType
| type(NotImplemented)
| |
types.SliceType
| slice
| |
types.TupleType
| tuple
| |
types.TypeType
| type
| |
types.XRangeType
| range
|
☞
types.StringType
gets mapped tobytes
instead ofstr
because a Python 2 “string” (not a Unicode string, just a regular string) is really just a sequence of bytes in a particular character encoding.
isinstance()
global function
The isinstance()
function checks whether an object is an instance of a particular class or type. In Python 2, you could pass a tuple of types, and isinstance()
would return True
if the object was any of those types. In Python 3, you can still do this, but passing the same type twice is deprecated.
Notes | Python 2 | Python 3 |
---|---|---|
isinstance(x, (int, float, int))
| isinstance(x, (int, float))
|
basestring
datatype
Python 2 had two string types: Unicode and non-Unicode. But there was also another type, basestring
. It was an abstract type, a superclass for both the str
and unicode
types. It couldn’t be called or instantiated directly, but you could pass it to the global isinstance()
function to check whether an object was either a Unicode or non-Unicode string. In Python 3, there is only one string type, so basestring
has no reason to exist.
Notes | Python 2 | Python 3 |
---|---|---|
isinstance(x, basestring)
| isinstance(x, str)
|
itertools
module
Python 2.3 introduced the itertools
module, which defined variants of the global zip()
, map()
, and filter()
functions that returned iterators instead of lists. In Python 3, those global functions return iterators, so those functions in the itertools
module have been eliminated. (There are still lots of useful functions in the itertools
module, just not these.)
Notes | Python 2 | Python 3 |
---|---|---|
① | itertools.izip(a, b)
| zip(a, b)
|
② | itertools.imap(a, b)
| map(a, b)
|
③ | itertools.ifilter(a, b)
| filter(a, b)
|
④ | from itertools import imap, izip, foo
| from itertools import foo
|
- Instead of
itertools.izip()
, just use the globalzip()
function. - Instead of
itertools.imap()
, just usemap()
. itertools.ifilter()
becomesfilter()
.- The
itertools
module still exists in Python 3, it just doesn’t have the functions that have migrated to the global namespace. The2to3
script is smart enough to remove the specific imports that no longer exist, while leaving other imports intact.
sys.exc_type
, sys.exc_value
, sys.exc_traceback
Python 2 had three variables in the sys
module that you could access while an exception was being handled: sys.exc_type
, sys.exc_value
, sys.exc_traceback
. (Actually, these date all the way back to Python 1.) Ever since Python 1.5, these variables have been deprecated in favor of sys.exc_info()
, which is a function that returns a tuple containing those three values. In Python 3, these individual variables have finally gone away; you must use the sys.exc_info()
function.
Notes | Python 2 | Python 3 |
---|---|---|
sys.exc_type
| sys.exc_info()[0]
| |
sys.exc_value
| sys.exc_info()[1]
| |
sys.exc_traceback
| sys.exc_info()[2]
|
List comprehensions over tuples
In Python 2, if you wanted to code a list comprehension that iterated over a tuple, you did not need to put parentheses around the tuple values. In Python 3, explicit parentheses are required.
Notes | Python 2 | Python 3 |
---|---|---|
[i for i in 1, 2]
| [i for i in (1, 2)]
|
os.getcwdu()
function
Python 2 had a function named os.getcwd()
, which returned the current working directory as a (non-Unicode) string. Because modern file systems can handle directory names in any character encoding, Python 2.3 introduced os.getcwdu()
. The os.getcwdu()
function returned the current working directory as a Unicode string. In Python 3, there is only one string type (Unicode), so os.getcwd()
is all you need.
Notes | Python 2 | Python 3 |
---|---|---|
os.getcwdu()
| os.getcwd()
|
Metaclasses
In Python 2, you could create metaclasses either by defining the metaclass
argument in the class declaration, or by defining a special class-level __metaclass__
attribute. In Python 3, the class-level attribute has been eliminated.
Notes | Python 2 | Python 3 |
---|---|---|
① |
| unchanged |
② |
|
|
③ |
|
|
- Declaring the metaclass in the class declaration worked in Python 2, and it still works the same in Python 3.
- Declaring the metaclass in a class attribute worked in Python 2, but doesn’t work in Python 3.
- The
2to3
script is smart enough to construct a valid class declaration, even if the class is inherited from one or more base classes.
Matters of style
The rest of the “fixes” listed here aren’t really fixes per se. That is, the things they change are matters of style, not substance. They work just as well in Python 3 as they do in Python 2, but the developers of Python have a vested interest in making Python code as uniform as possible. To that end, there is an official Python style guide which outlines — in excruciating detail — all sorts of nitpicky details that you almost certainly don’t care about. And given that 2to3
provides such a great infrastructure for converting Python code from one thing to another, the authors took it upon themselves to add a few optional features to improve the readability of your Python programs.
set()
literals (explicit)
In Python 2, the only way to define a literal set in your code was to call set(a_sequence)
. This still works in Python 3, but a clearer way of doing it is to use the new set literal notation: curly braces. This works for everything except empty sets, because dictionaries also use curly braces, so {}
is an empty dictionary, not an empty set.
☞The
2to3
script will not fixset()
literals by default. To enable this fix, specify -f set_literal on the command line when you call2to3
.
Notes | Before | After |
---|---|---|
set([1, 2, 3])
| {1, 2, 3}
| |
set((1, 2, 3))
| {1, 2, 3}
| |
set([i for i in a_sequence])
| {i for i in a_sequence}
|
buffer()
global function (explicit)
Python objects implemented in C can export a “buffer interface,” which allows other Python code to directly read and write a block of memory. (That is exactly as powerful and scary as it sounds.) In Python 3, buffer()
has been renamed to memoryview()
. (It’s a little more complicated than that, but you can almost certainly ignore the differences.)
☞The
2to3
script will not fix thebuffer()
function by default. To enable this fix, specify -f buffer on the command line when you call2to3
.
Notes | Before | After |
---|---|---|
x = buffer(y)
| x = memoryview(y)
|
Whitespace around commas (explicit)
Despite being draconian about whitespace for indenting and outdenting, Python is actually quite liberal about whitespace in other areas. Within lists, tuples, sets, and dictionaries, whitespace can appear before and after commas with no ill effects. However, the Python style guide states that commas should be preceded by zero spaces and followed by one. Although this is purely an aesthetic issue (the code works either way, in both Python 2 and Python 3), the 2to3
script can optionally fix this for you.
☞The
2to3
script will not fix whitespace around commas by default. To enable this fix, specify -f wscomma on the command line when you call2to3
.
Notes | Before | After |
---|---|---|
a ,b
| a, b
| |
{a :b}
| {a: b}
|
Common idioms (explicit)
There were a number of common idioms built up in the Python community. Some, like the while 1:
loop, date back to Python 1. (Python didn’t have a true boolean type until version 2.3, so developers used 1
and 0 instead.) Modern Python programmers should train their brains to use modern versions of these idioms instead.
☞The
2to3
script will not fix common idioms by default. To enable this fix, specify -f idioms on the command line when you call2to3
.
Notes | Before | After |
---|---|---|
|
| |
type(x) == T
| isinstance(x, T)
| |
type(x) is T
| isinstance(x, T)
| |
|
|
© 2001–11 Mark Pilgrim