Optimising Python code without profiling is like navigating a maze blindfolded - you might get lucky, but you’ll probably waste time.
Profiling is the process of measuring how your code performs, whether it’s tracking execution time or memory usage. Without it, you’re guessing where the bottlenecks are, and guesses are often wrong. In this post, we’ll explore how to profile Python code for both time and memory usage, interpret the results, and use that data to make your code faster and more efficient.
Why Profile Python Code?
Profiling helps you answer critical questions about your code’s performance:
- Is your function slow because it’s doing too much work, or because it’s calling an inefficient library?
- Is your code using too much memory, and if so, where is that memory being allocated?
- Are there hidden inefficiencies in your algorithms or data structures?
By profiling your code, you can focus your optimisation efforts where they matter most, saving time and frustration.
Built-in Python Profiling Tools
Python provides two powerful built-in tools for profiling:
- cProfile: Measures execution time and function call statistics.
- tracemalloc: Tracks memory allocations and identifies memory leaks.
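Before combining them, it may help to see each tool on its own. The following is a minimal sketch, not part of the original utility: build_squares is a hypothetical workload, and time_profile and memory_profile are illustrative helper names.

```python
import cProfile
import io
import pstats
import tracemalloc


def build_squares(n: int) -> list[int]:
    """Hypothetical workload: build a list of n squares."""
    return [i * i for i in range(n)]


def time_profile(n: int) -> str:
    """Run build_squares under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    build_squares(n)
    profiler.disable()

    # pstats formats the raw profiler data; here it writes into a string buffer
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumtime").print_stats(5)
    return buffer.getvalue()


def memory_profile(n: int) -> tuple[int, int]:
    """Run build_squares under tracemalloc and return (current, peak) in bytes."""
    tracemalloc.start()
    squares = build_squares(n)  # keep a reference so the list counts as "current"
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del squares
    return current, peak


if __name__ == "__main__":
    print(time_profile(100_000))
    current, peak = memory_profile(100_000)
    print(f"current={current / 1024:.1f} KiB, peak={peak / 1024:.1f} KiB")
```

Both helpers follow the same start, run, stop, report pattern, which is what makes them easy to wrap in a single context manager.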
Let’s combine these tools into a reusable context manager that profiles both time and memory in a single run.
The profile_code Context Manager
Here’s a context manager that profiles execution time and memory usage:
import cProfile
import io
import linecache
import pstats
import tracemalloc
from contextlib import contextmanager
from textwrap import dedent
from typing import Literal


@contextmanager
def profile_code(include: tuple[Literal["time", "memory"], ...] = ("time", "memory")):
    """
    Profile execution time with cProfile and memory usage with tracemalloc.

    Args:
        include: A tuple of strings specifying what to profile ("time", "memory", or both).
    """
    print("=" * 60)
    print(f"{' & '.join(include).upper()} PROFILING")
    print("-" * 60)

    # Create profiler
    profiler = cProfile.Profile()

    # Start profiling
    if "memory" in include:
        tracemalloc.start()
    if "time" in include:
        profiler.enable()

    try:
        yield
    finally:
        # Stop profiling and print results, even if the profiled code raised
        if "time" in include:
            profiler.disable()

            # Get execution time statistics
            string_io = io.StringIO()
            stats = pstats.Stats(profiler, stream=string_io)
            _ = stats.strip_dirs()
            _ = stats.sort_stats("cumtime")
            _ = stats.print_stats(20)
            print(dedent(string_io.getvalue()).strip())

        if "memory" in include:
            # Get memory statistics
            current, peak = tracemalloc.get_traced_memory()
            print(f"Current memory usage: {current / 1024 / 1024:.2f} MB")
            print(f"Peak memory usage: {peak / 1024 / 1024:.2f} MB\n")

            # Get top memory allocations
            snapshot = tracemalloc.take_snapshot()
            tracemalloc.stop()
            snapshot = snapshot.filter_traces(
                (
                    tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
                    tracemalloc.Filter(False, "<unknown>"),
                )
            )
            top_stats = snapshot.statistics("lineno")

            print("Top 10 memory-consuming lines:")
            for index, stat in enumerate(top_stats[:10], 1):
                frame = stat.traceback[0]
                print(
                    f"#{index}: {frame.filename}:{frame.lineno}: {stat.size / 1024:.1f} KiB"
                )
                line = linecache.getline(frame.filename, frame.lineno).strip()
                if line:
                    print(f"    {line}")

            other = top_stats[10:]
            if other:
                size = sum(stat.size for stat in other)
                print(f"{len(other)} other lines: {size / 1024:.1f} KiB")
            total = sum(stat.size for stat in top_stats)
            print(f"Total allocated size: {total / 1024:.1f} KiB")
How It Works
The profile_code context manager works as follows:
- Start Profiling: When you enter the context manager, it initialises cProfile for time profiling and tracemalloc for memory profiling, based on the include parameter.
- Execute Your Code: The code inside the with block runs while being profiled.
- Stop Profiling and Print Results: When you exit the context manager, it stops profiling and prints:
  - Execution time statistics (top 20 functions by cumulative time).
  - Memory usage statistics (current and peak memory usage, top 10 memory-consuming lines).
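The enter, run, exit sequence described above is the general shape of any generator-based context manager. As a stripped-down sketch of that shape, here is a hypothetical timer helper (not part of profile_code) that only records wall-clock time. Wrapping the yield in try/finally is one way to make sure the exit step runs even if the body raises.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(results: dict[str, float], label: str = "elapsed"):
    """Hypothetical helper: store the wall-clock time of the with block in results."""
    start = time.perf_counter()  # runs when the with block is entered
    try:
        yield  # the body of the with block runs here
    finally:
        # runs when the with block exits, whether or not the body raised
        results[label] = time.perf_counter() - start


if __name__ == "__main__":
    timings: dict[str, float] = {}
    with timer(timings, "sum"):
        sum(range(1_000_000))
    print(f"sum took {timings['sum']:.4f} s")
```

Everything before the yield is setup, everything after it is teardown and reporting.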
Example Usage
Let’s use the profile_code context manager to profile a slow function:
import time


def slow_function(duration):
    """A function that simulates a time-consuming task."""
    print(f"Running slow_function for {duration} seconds...")
    time.sleep(duration)
    print("slow_function finished.")


def fast_function():
    """A function that performs a quick task."""
    print("Running fast_function...")
    total = 0
    for i in range(10000):
        total += i
    print("fast_function finished.")


def process_data():
    """A function that calls other functions."""
    print("Starting data processing...")
    slow_function(2)
    for _ in range(3):
        fast_function()
    print("Data processing finished.")


if __name__ == '__main__':
    with profile_code():
        process_data()
This will output:
- The top 20 functions by execution time.
- The current and peak memory usage.
- The top 10 lines where memory is allocated.
Profiling Results
============================================================
TIME & MEMORY PROFILING
------------------------------------------------------------
Starting data processing...
Running slow_function for 2 seconds...
slow_function finished.
Running fast_function...
fast_function finished.
Running fast_function...
fast_function finished.
Running fast_function...
fast_function finished.
Data processing finished.
20 function calls in 2.038 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 2.038 2.038 test.py:103(process_data)
1 0.000 0.000 2.005 2.005 test.py:87(slow_function)
1 2.005 2.005 2.005 2.005 {built-in method time.sleep}
3 0.032 0.011 0.033 0.011 test.py:94(fast_function)
10 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 contextlib.py:145(__exit__)
1 0.000 0.000 0.000 0.000 {built-in method builtins.next}
1 0.000 0.000 0.000 0.000 test.py:12(profile_code)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Current memory usage: 0.02 MB
Peak memory usage: 0.02 MB
Top 10 memory-consuming lines:
#1: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:230: 2.1 KiB
fragment = fragment[:-1]
#2: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:229: 1.5 KiB
dict[fragment] = tup
#3: /Users/toby/dev/projects/tobydevlin.com-3.0/test.py:32: 1.4 KiB
profiler.enable()
#4: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:264: 1.1 KiB
stats_list.append((cc, nc, tt, ct) + func +
#5: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:547: 1.1 KiB
return os.path.basename(filename), line, name
#6: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:289: 1.1 KiB
newcallers[func_strip_path(func2)] = caller
#7: /Users/toby/dev/projects/tobydevlin.com-3.0/test.py:46: 1.1 KiB
print(dedent(string_io.getvalue()).strip())
#8: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/pstats.py:296: 1.0 KiB
newstats[newfunc] = (cc, nc, tt, ct, newcallers)
#9: /Users/toby/dev/projects/tobydevlin.com-3.0/test.py:38: 0.9 KiB
profiler.disable()
#10: /opt/homebrew/Cellar/python@3.13/3.13.11/Frameworks/Python.framework/Versions/3.13/lib/python3.13/cProfile.py:59: 0.8 KiB
entries = self.getstats()
80 other lines: 9.2 KiB
Total allocated size: 21.4 KiB
The most important section is the table sorted by cumtime (cumulative time). This column shows the total time spent in a function, including all the functions it calls. It’s the best indicator of where your program is spending the most time overall.
- Top Bottleneck: Look at the first few lines. You can see that process_data is at the top, but the real workhorse of time consumption is slow_function, which directly calls time.sleep. The cumtime of 2.005 seconds for slow_function is almost entirely spent in the time.sleep call.
- Function Calls: The ncalls column tells you how many times a function was called. fast_function was called 3 times, but its total time is negligible compared to slow_function.
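Because cumtime includes time spent in callees, wrapper functions rank high even when they do little work themselves. Sorting the same data by tottime (time spent in a function's own body) can make the function doing the work easier to spot. The sketch below illustrates the difference. The functions inner, outer and report are hypothetical names chosen for this example.

```python
import cProfile
import io
import pstats


def inner(n: int) -> int:
    """Hypothetical leaf function: the loop here is where time is actually spent."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def outer(n: int) -> int:
    """Hypothetical wrapper: high cumtime, but almost no tottime of its own."""
    return inner(n)


def report(sort_key: str, n: int = 200_000) -> str:
    """Profile outer(n) and return the pstats table sorted by sort_key."""
    profiler = cProfile.Profile()
    profiler.enable()
    outer(n)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.strip_dirs().sort_stats(sort_key).print_stats(5)
    return buffer.getvalue()


if __name__ == "__main__":
    print(report("cumtime"))  # outer and inner should both rank near the top
    print(report("tottime"))  # inner should rank above outer
```

In the results above, the same reading applies: slow_function has a high cumtime, but its tottime is 0.000 because the time belongs to time.sleep.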
The memory usage for this script is very low (0.02 MB). The “Top 10 memory-consuming lines” are mostly showing memory used by the profiler itself (pstats.py, cProfile.py), not the actual code. For this particular run, memory is not a concern.
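If the profiler's own allocations crowd out your code in the report, one option is to filter them from the snapshot. The sketch below assumes that excluding the pstats, cProfile and tracemalloc source files by path is enough for your case. The helper name top_allocations is hypothetical.

```python
import cProfile
import pstats
import tracemalloc


def top_allocations(snapshot: tracemalloc.Snapshot, limit: int = 10):
    """Return the largest allocation sites, ignoring profiler and import machinery."""
    filtered = snapshot.filter_traces(
        (
            tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
            tracemalloc.Filter(False, "<unknown>"),
            # Assumption: these modules are ordinary .py files with a __file__ path
            tracemalloc.Filter(False, pstats.__file__),
            tracemalloc.Filter(False, cProfile.__file__),
            tracemalloc.Filter(False, tracemalloc.__file__),
        )
    )
    return filtered.statistics("lineno")[:limit]


if __name__ == "__main__":
    tracemalloc.start()
    payload = [bytes(1000) for _ in range(1000)]  # hypothetical workload
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()

    for stat in top_allocations(snapshot):
        frame = stat.traceback[0]
        print(f"{frame.filename}:{frame.lineno}: {stat.size / 1024:.1f} KiB")
```

Filtering only changes what is reported. The current and peak figures from get_traced_memory still include the profiler's overhead.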
Key Takeaways
In this post, we explored how to profile Python code for time and memory usage. Here’s what you should remember:
- Profiling is essential: It helps you identify bottlenecks in your code so you can optimise the right parts.
- Use cProfile for time profiling: It measures execution time and function call statistics.
- Use tracemalloc for memory profiling: It tracks memory allocations and identifies memory leaks.
- The profile_code context manager: Combines both tools into a reusable utility for profiling time and memory in a single run.
- Always profile before optimising: Don’t guess where the bottlenecks are; let the data guide you.
Try It Yourself!
Now that you know how to profile your Python code, it’s time to put it into practice:
- Profile your own code: Use the profile_code context manager above to identify bottlenecks in your projects.
- Experiment with optimisations: Try different approaches (e.g., list comprehensions, generator expressions) and measure the impact.
- Share your results: Let me know in the comments what you discovered. Did profiling reveal any surprises?
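As a starting point for that kind of experiment, here is a small sketch that compares the peak memory of a list comprehension against a generator expression using tracemalloc. The helper peak_memory and the two sum_with_* functions are hypothetical names for illustration. The exact numbers will vary by Python version and platform.

```python
import tracemalloc
from collections.abc import Callable


def peak_memory(func: Callable[[], object]) -> int:
    """Return the peak number of bytes traced by tracemalloc while func runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def sum_with_list(n: int = 100_000) -> int:
    # Builds the whole list in memory before summing
    return sum([i * i for i in range(n)])


def sum_with_generator(n: int = 100_000) -> int:
    # Produces one value at a time, so no full list is ever held
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    list_peak = peak_memory(sum_with_list)
    gen_peak = peak_memory(sum_with_generator)
    print(f"list comprehension peak:   {list_peak / 1024:.1f} KiB")
    print(f"generator expression peak: {gen_peak / 1024:.1f} KiB")
```

The same measure-then-compare loop works for time: wrap each variant in the profiler, change one thing, and check whether the numbers actually moved.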
Happy profiling!
