Fast Python

The Problem

  • Your Language is FAST
  • Python is SLOW

Python

  • Is a widely used general-purpose, high-level programming language
  • Its design philosophy emphasizes code readability
  • Its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java
  • More from Wikipedia

The Zen of Python

import this
  • Beautiful is better than ugly
  • Explicit is better than implicit
  • Simple is better than complex
  • Complex is better than complicated
  • Readability counts

The Problem Again

  • Python Is Simple
  • Python Is Easy
  • You write Python code without much care
  • You assume Python is SLOW

Finding the Solution

  1. Profile and optimize your existing code
  2. Use a C module (or write your own)
  3. Try a JIT-enabled interpreter like Jython or PyPy
  4. Parallelize your workload

Profile and Optimize

Tools

The Function

def calc(operation, a, b):
    mapping = {
        'add': lambda a, b: a + b,
        'subtract': lambda a, b: a - b,
        'multiply': lambda a, b: a * b,
        'divide': lambda a, b: a / b,
    }
    return mapping[operation](a, b)

%timeit

In [16]: %timeit calc('divide', 1, 2)
1000000 loops, best of 3: 699 ns per loop
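
Outside IPython, a similar measurement can be sketched with the standard library timeit module. The NUMBER constant is an arbitrary choice for illustration, and the figures will differ from the ones on the slide depending on machine and Python version.

```python
import timeit


def calc(operation, a, b):
    # Same function as on the slide: the dict is rebuilt on every call.
    mapping = {
        'add': lambda a, b: a + b,
        'subtract': lambda a, b: a - b,
        'multiply': lambda a, b: a * b,
        'divide': lambda a, b: a / b,
    }
    return mapping[operation](a, b)


NUMBER = 100_000  # calls per timing run (arbitrary, for illustration)

# Three timing runs; keeping the fastest mirrors "best of 3" in %timeit.
runs = timeit.repeat("calc('divide', 1, 2)", globals=globals(),
                     repeat=3, number=NUMBER)
best_ns = min(runs) / NUMBER * 1e9
print(f"best of 3: {best_ns:.0f} ns per call")
```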

%prun

In [17]: %prun calc('divide', 1, 2)
         5 function calls in 0.000 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {built-in method exec}
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 <ipython-input-14-cf58e5c65b2c>:1(calc)
        1    0.000    0.000    0.000    0.000 <ipython-input-14-cf58e5c65b2c>:6(<lambda>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
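
%prun is built on cProfile, so roughly the same report can be produced in a plain script. This is a minimal sketch; sorting by 'tottime' is assumed to correspond to the "Ordered by: internal time" line above.

```python
import cProfile
import io
import pstats


def calc(operation, a, b):
    # Same function as on the slide.
    mapping = {
        'add': lambda a, b: a + b,
        'subtract': lambda a, b: a - b,
        'multiply': lambda a, b: a * b,
        'divide': lambda a, b: a / b,
    }
    return mapping[operation](a, b)


profiler = cProfile.Profile()
# runcall profiles a single call and returns that call's result.
result = profiler.runcall(calc, 'divide', 1, 2)

# Collect the report in a buffer, sorted by internal time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('tottime').print_stats()
print(report.getvalue())
```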

Optimization

def calc(operator, a, b):
    if operator == 'add':
        return a + b
    elif operator == 'subtract':
        return a - b
    elif operator == 'multiply':
        return a * b
    elif operator == 'divide':
        if b == 0:
            raise ValueError('Cannot divide by zero.')
        return a / b
    raise NotImplementedError

%timeit

In [18]: %timeit calc('divide', 1, 2)
1000000 loops, best of 3: 258 ns per loop

%prun

In [19]: %prun calc('divide', 1, 2)
         4 function calls in 0.000 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {built-in method exec}
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 <ipython-input-17-966ad73e599d>:1(calc)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Common Optimizations

  • Do not assign constants inside loops
  • Use the operator module with map/filter/sorted/...
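
Both tips can be sketched on the calc example. This is one possible variant, not the version from the Optimization slide: the OPERATIONS name and the rows data are made up for illustration.

```python
import operator

# Built once at import time instead of on every call or loop iteration.
OPERATIONS = {
    'add': operator.add,
    'subtract': operator.sub,
    'multiply': operator.mul,
    'divide': operator.truediv,
}


def calc(operation, a, b):
    # Only a dict lookup and one call remain on the hot path.
    return OPERATIONS[operation](a, b)


# operator.itemgetter replaces a lambda as the sort key.
rows = [('b', 2), ('a', 3), ('c', 1)]
by_count = sorted(rows, key=operator.itemgetter(1))

print(calc('divide', 1, 2))
print(by_count)
```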

Python 2 optimizations

  • xrange instead of range
  • '{0}'.format(...) instead of '{}'.format(...)

Django optimizations

  • Use .iterator() for big collections
  • Use foreign_id instead of foreign__id

Using a C module

Example. Parsing ISO dates

Slow Python

In [1]: import datetime

In [2]: %timeit datetime.datetime.strptime('2015-06-20', '%Y-%m-%d')
100000 loops, best of 3: 9.16 µs per loop

Example. Parsing ISO dates

Fast Python

In [3]: from ciso8601 import parse_datetime

In [4]: %timeit parse_datetime('2015-06-20')
10000000 loops, best of 3: 144 ns per loop

Result: 63x faster
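
Since ciso8601 is a third-party C extension, code that wants the speedup without a hard dependency could fall back to the standard library. A minimal sketch, assuming date-only input strings in the fallback branch:

```python
from datetime import datetime

try:
    # Optional C extension; used when it is installed.
    from ciso8601 import parse_datetime
except ImportError:
    def parse_datetime(value):
        # Fallback for illustration: handles 'YYYY-MM-DD' strings only.
        return datetime.strptime(value, '%Y-%m-%d')

parsed = parse_datetime('2015-06-20')
print(parsed)
```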

Example. Working with JSON

Slow Python

In [9]: import json
In [10]: %timeit json.dumps({'key': 'value', 'date': datetime.date.today().isoformat()})
100000 loops, best of 3: 8.36 µs per loop

Example. Working with JSON

Fast Python

In [11]: import ujson

In [12]: %timeit ujson.dumps({'key': 'value', 'date': datetime.date.today().isoformat()})
100000 loops, best of 3: 4.6 µs per loop

Result: about 1.8x faster
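
The same optional-import pattern can be sketched for JSON. The fast_json alias is a name chosen here for illustration. ujson is not guaranteed to be a perfect drop-in: its output formatting and supported keyword arguments may differ from json, so the sketch compares decoded values, not the encoded strings.

```python
import datetime
import json

try:
    # Optional C extension; used when it is installed.
    import ujson as fast_json
except ImportError:
    fast_json = json

payload = {'key': 'value', 'date': datetime.date.today().isoformat()}

encoded = fast_json.dumps(payload)
# Decode with the standard library to check the round trip.
decoded = json.loads(encoded)
print(decoded)
```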

Example. Pickling values

Slow Python. Python 2 only

In [2]: import pickle

In [3]: %timeit pickle.dumps({'key': 'value', 'date': datetime.date.today()})
10000 loops, best of 3: 36.1 µs per loop

Example. Pickling values

Fast Python. Python 2 only

In [4]: import cPickle

In [5]: %timeit cPickle.dumps({'key': 'value', 'date': datetime.date.today()})
100000 loops, best of 3: 8.32 µs per loop

Result: 4.3x faster
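
Code that has to run on both Python 2 and Python 3 commonly uses the import fallback below. On Python 3 there is no cPickle module; the standard pickle module uses its C accelerator automatically when it is available.

```python
import datetime

try:
    import cPickle as pickle  # Python 2: explicit C implementation
except ImportError:
    import pickle  # Python 3: C accelerator is picked up automatically

payload = {'key': 'value', 'date': datetime.date.today()}

# Round trip to confirm the value survives serialization.
restored = pickle.loads(pickle.dumps(payload))
print(restored)
```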

Other Libraries

  • Library: slow / fast
  • Memcached: python-memcache / pylibmc
  • HTML Parsing: BeautifulSoup / lxml
  • Decimals: decimal / cdecimal (Python 2 only)

Write your own C module?

Cython lets you:

  • Write Python code that calls back and forth from and to C or C++ code natively at any point
  • Easily tune readable Python code into plain C performance by adding static type declarations
  • Use combined source code level debugging to find bugs in your Python, Cython and C code

Still on Python 2?

Use PyPy

PyPy

PyPy is a fast, compliant alternative implementation of the Python language

  • It is fast. Thanks to its Just-in-Time compiler, Python programs often run faster on PyPy.
  • “If you want your code to run faster, you should probably just use PyPy.” — Guido van Rossum
  • It uses less memory. Memory-hungry Python programs (several hundreds of MBs or more) might end up taking less space than they do in CPython.
  • It is stackless. PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.

PyPy Is Fast

PyPy over time

PyPy Is Good

  • Supports Django, Twisted and other frameworks
  • All you need to do is run pypy yourfile.py instead of python yourfile.py
  • But it targets Python 2.7.9

Parallelize your workload

MapReduce

  • Hadoop
  • Elastic MapReduce
  • Disco
  • And many others

Not MapReduce

  • concurrent.futures
  • multiprocessing
  • eventlet / gevent
  • asyncio
  • Celery / rq

Questions?