MY PYTHON CODE ISN’T WORKING!! We’ve all been there, right? This is a series where I’ll share miscellaneous tips I’ve learned for troubleshooting Python code. It’s aimed at people who are relatively new to Python. In this first post, I’d like to cover one of those common things you’ll run into: the traceback.
Reading Python Tracebacks
Many times an error in Python code is accompanied by a traceback. If you want to get really good at troubleshooting Python programs, you’ll need to become comfortable reading one: you should be able to look at a traceback and have a general idea of what happened. One of the things I always notice when working with people new to Python is how puzzled they look when they first see a traceback.
So let’s work through an example. Consider this script:
import httplib2


def a():
    b()


def b():
    c()

def c():
    d()

def d():
    h = httplib2.Http()
    h.request(uri=None)


a()
When this script is run we get this traceback:
Traceback (most recent call last):
  File "issue.py", line 19, in <module>
    a()
  File "issue.py", line 5, in a
    b()
  File "issue.py", line 9, in b
    c()
  File "issue.py", line 12, in c
    d()
  File "issue.py", line 16, in d
    h.request(uri=None)
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 1394, in request
    (scheme, authority, request_uri, defrag_uri) = urlnorm(uri)
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 206, in urlnorm
    (scheme, authority, path, query, fragment) = parse_uri(uri)
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 202, in parse_uri
    groups = URI.match(uri).groups()
TypeError: expected string or buffer
While this can look intimidating at first, there are a few basic things to remember when reading a traceback:
- The oldest frame in the stack is at the top, and the newest frame is at the bottom. This means that the bottom of the traceback output is where the uncaught exception was originally raised. This is the opposite of other languages such as Java and C/C++, where the first line shows the newest frame (the frame where the uncaught exception originated).
- Pay attention to the filenames associated with each level of the traceback, and pay attention to where the frames jump across modules and package “types” (more on this later).
- Read the bottommost line to see the actual exception message.
- Above all, remember that the traceback alone may not be sufficient to understand what went wrong.
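As a small aside on the third point, the standard library’s traceback module can produce that bottommost line on its own, which is handy when logging. This is only a minimal sketch, assuming Python 3; the helper name exception_summary is made up for illustration and is not part of any library.

```python
import traceback


def exception_summary(exc):
    """Return the final 'ExceptionType: message' line of a traceback."""
    # format_exception_only() returns a list of strings; for ordinary
    # exceptions the last entry is the "Type: message" line.
    lines = traceback.format_exception_only(type(exc), exc)
    return lines[-1].strip()


try:
    int("not a number")
except ValueError as exc:
    summary = exception_summary(exc)
    print(summary)
```

The printed line is the same text you would find at the very bottom of the full traceback.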
So let’s see how we can apply these steps to the traceback above.
First, let’s use the first item: the stack frames go from the oldest frame at the beginning of the output to the newest frame at the bottom. To be absolutely clear, in the above code, the call chain is: a() -> b() -> c() -> d() -> httplib2.Http.request, which then calls urlnorm and parse_uri inside httplib2. The oldest stack frame is associated with the a() function call (it’s the call that triggered all the remaining calls), and the newest stack frames are inside httplib2, entered through httplib2.Http.request (it’s the call from our code that led to the exception being raised). Conceptually, you can think of a Python traceback as growing downwards: any time something is pushed onto the stack, it is appended to the output, and when something is popped off the stack, its output is removed from the end of the output.
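The ordering can be checked without httplib2. The sketch below assumes Python 3 and uses stand-in functions a, b and c plus a made-up helper, call_chain; it uses traceback.extract_tb to list the function names in the same order the traceback prints them.

```python
import traceback


def a():
    b()


def b():
    c()


def c():
    raise ValueError("boom")


def call_chain(func):
    """Return function names from the traceback, oldest frame first."""
    try:
        func()
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        return [frame.name for frame in frames]
    return []


print(call_chain(a))  # ['call_chain', 'a', 'b', 'c']
```

The first entry is the frame that caught the exception, and the last entry is the frame that raised it, matching the top-to-bottom order of the printed traceback.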
Now let’s apply the second item: pay attention to the filenames associated with each level of the traceback. Right off the bat we can see there are two main modules involved in this interaction. There’s the issue module, which looks like this in the traceback:
  File "issue.py", line 19, in <module>
    a()
  File "issue.py", line 5, in a
    b()
and there’s httplib2, which looks like this in the traceback:
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 1394, in request
    (scheme, authority, request_uri, defrag_uri) = urlnorm(uri)
There are a few important observations:
- The length of the filenames. In this case the issue.py filename suggests that this originated from our current working directory, hence the relative path.
- The error actually occurs in a 3rd party library (the last three frames of the traceback).
We know that the error occurs in a 3rd party library because the location of this library is under the “site-packages” directory. As a rule of thumb, if something is under the “site-packages” directory it’s a third party module (i.e. not something in the Python standard library). This is typically where packages installed with pip are placed (e.g. pip install httplib2).
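That rule of thumb is simple enough to write down as code. The following is a rough heuristic sketch, not a robust classifier: the function name classify_frame and its three labels are invented for this example, and the extra check for “dist-packages” is an assumption that covers Debian-style system installs.

```python
import os


def classify_frame(filename):
    """Guess where a traceback frame's file comes from, based on its path."""
    parts = os.path.normpath(filename).split(os.sep)
    # Packages installed with pip normally live under site-packages
    # (dist-packages on Debian/Ubuntu system Pythons).
    if "site-packages" in parts or "dist-packages" in parts:
        return "third-party"
    # A relative path suggests a script run from the current directory.
    if not os.path.isabs(filename):
        return "local"
    return "other"


print(classify_frame("issue.py"))
print(classify_frame(
    "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/"
    "site-packages/httplib2/__init__.py"
))
```

The “other” bucket is deliberately vague: an absolute path outside site-packages could be the standard library or your own code run with a full path.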
The second item also says to pay attention to where the frames jump across modules or package “types.” In this traceback we can see that we jump across modules and package “types” here:
  File "issue.py", line 16, in d
    h.request(uri=None)
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 1394, in request
    (scheme, authority, request_uri, defrag_uri) = urlnorm(uri)
In these four lines we can see that we jump from issue.py to httplib2. By jumping across package “types”, I simply mean where we jump from our own modules/packages to either standard library packages or 3rd party packages. From the four lines shown above we can see that by calling h.request() we jump into the httplib2 module.
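Spotting that jump can also be done mechanically. The sketch below copies the (filename, line, function) triples from the traceback above into a list and looks for the first place where a frame outside site-packages is followed by one inside it. The helper name find_boundary is hypothetical, and the site-packages test is the same rule of thumb as before.

```python
HTTPLIB2 = (
    "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/"
    "site-packages/httplib2/__init__.py"
)

# (filename, line number, function name), oldest frame first.
FRAMES = [
    ("issue.py", 19, "<module>"),
    ("issue.py", 5, "a"),
    ("issue.py", 9, "b"),
    ("issue.py", 12, "c"),
    ("issue.py", 16, "d"),
    (HTTPLIB2, 1394, "request"),
    (HTTPLIB2, 206, "urlnorm"),
    (HTTPLIB2, 202, "parse_uri"),
]


def find_boundary(frames):
    """Return (last own frame, first library frame), or None if no jump."""
    for current, following in zip(frames, frames[1:]):
        in_library_now = "site-packages" in current[0]
        in_library_next = "site-packages" in following[0]
        if not in_library_now and in_library_next:
            return current, following
    return None


own, library = find_boundary(FRAMES)
print(own)      # the last frame in our code
print(library)  # the first frame in the third party library
```

The last frame in our own code is usually the most useful starting point, because it is the code we can actually change.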
Now let’s apply the third item: read the bottommost line to see the actual exception message. In our example, the actual exception that’s raised is:
TypeError: expected string or buffer
Admittedly, not the most helpful error message. If we look at the line before this line, we can see the actual line that caused this TypeError:
groups = URI.match(uri).groups()
The two most likely things to cause a TypeError would be the call to match() or the call to groups(). Noticing that the uri argument appears in multiple frames of the traceback, our first guess would be that the value of uri is causing the TypeError. If we work from the bottom up until we no longer see the uri parameter mentioned, we can see that it’s first mentioned here:
  File "issue.py", line 16, in d
    h.request(uri=None)
  File "/Users/jsaryer/.virtualenvs/90a/lib/python2.7/site-packages/httplib2/__init__.py", line 1394, in request
    (scheme, authority, request_uri, defrag_uri) = urlnorm(uri)
Given that the h.request(uri=None) call comes from our code, this is probably the first place we should look.
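To test the guess that a None value is enough to trigger the TypeError, we can try the same kind of call in isolation. This is a sketch assuming Python 3 (where the message text differs from the Python 2.7 one above); PATTERN is a simple made-up regex standing in for httplib2’s URI pattern, and try_parse is a hypothetical helper.

```python
import re

# Hypothetical pattern standing in for httplib2's URI regex.
PATTERN = re.compile(r"(\w+)://(.+)")


def try_parse(value):
    """Return the name of the exception raised, or None on success."""
    try:
        PATTERN.match(value).groups()
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__
    return None


print(try_parse(None))                  # TypeError
print(try_parse("not a uri"))           # AttributeError
print(try_parse("http://example.com"))  # None
```

Of the two candidates, match() is the one that raises TypeError when handed None. A string that simply fails to match makes match() return None, and the failure then surfaces as an AttributeError from groups() instead.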
It turns out that the uri parameter needs to be a string:
h = httplib2.Http()
# request() returns a (response, content) tuple
response, content = h.request(uri='http://www.google.com')
Now, it doesn’t always work out as nicely as this, but a basic example like this one serves as a foundation for further debugging techniques.