Is asyncio stack ready for web development?

I am…

  • Igor Davydenko
  • Python / JavaScript developer
  • Making web applications for the last 14 years
  • Making them primarily in Python for the last 10 years

Agenda, kind of

My latest Python project

  • Written with aiohttp
  • Running in production for 15+ months
  • Project for signing and sharing legal documents in Ukraine
  • Signing process done mostly in JavaScript
  • All other necessary stuff done in Python

Why choose aiohttp?

Over Django / Flask?

Get aboard the hype train

aiohttp to be chosen

When sync frameworks fail

Django request-response cycle

Some general problems with sync flow

  • Serving a large number of concurrent users
  • Handling a large number of connections to data sources
  • Making a large number of requests to external sources

Solving problems with sync flow

  • Scale horizontally
  • Add magic: eventlet / gevent
  • Start looking for other solutions
  • ...
  • Switch to Golang

Other solutions

  • Python developers experienced the same problems
  • They wanted to bring better concurrency to Python
  • asyncio was born
  • asyncio started to be used primarily for web development

import asyncio

asyncio

  • Added to Python in 3.4
  • Just infrastructure for writing concurrent code
  • Async I/O, event loop, coroutines, and tasks
import asyncio

async def hello_world() -> None:
    print('Hello, world!')

asyncio.run(hello_world())

asyncio is just infrastructure

  • asyncio by itself is not enough for web development
  • You need to:
    • have a web framework built on top of it: aiohttp.web
    • communicate with data sources: aiopg, asyncpg, aioredis, aio*
    • communicate with external APIs: aiohttp.client

asyncio stack provides

  • Concurrent code
    • Concurrent view execution
    • Concurrent communication with data sources
    • Concurrent fetching of data from external APIs
  • Deployment without application server
  • async / await all around your code
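
A minimal sketch of what "concurrent" means here, using only the standard library. The `fetch` coroutine and its names are hypothetical, and `asyncio.sleep` stands in for real I/O against a database, cache, or external API:

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    """Pretend to wait on a data source or an external API."""
    await asyncio.sleep(delay)  # stands in for real network I/O
    return f'{name} done'


async def main() -> list:
    # All three waits overlap on a single thread
    return await asyncio.gather(
        fetch('db', 0.2),
        fetch('cache', 0.2),
        fetch('api', 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f'{elapsed:.2f}s')
```

The three calls should finish in roughly the time of the slowest one rather than the sum of all three, which is the property the stack relies on.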

This is why we chose aiohttp

  • Attempt to use as few resources as possible
  • A lot of concurrent requests from users
  • A lot of data to receive (uploaded documents) from users
  • A lot of data to send to external sources
  • Communicate with external APIs

from aiohttp import web

aiohttp.web is not rocket science

  • Very similar to sync frameworks
  • Init app, setup routes, run the app
from aiohttp import web

async def hello_world(request: web.Request) -> web.Response:
    return web.json_response({'data': 'Hello, world!'})

app = web.Application()
app.router.add_get('/', hello_world)

web.run_app(app)

Deploying without application server

yourapp/__main__.py

import asyncio
import sys

import uvloop

from aiohttp import web

from yourapp.app import create_app

if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    app = create_app()
    sys.exit(web.run_app(app))

$ python -m yourapp

Batteries not included

from aiohttp import web
from aiopg.sa import create_engine


async def connect_db(app: web.Application) -> None:
    app['db'] = await create_engine(app['config']['db']['url'])


async def disconnect_db(app: web.Application) -> None:
    db = app.get('db')
    if db:
        db.close()
        await db.wait_closed()


app.on_startup.append(connect_db)
app.on_shutdown.append(disconnect_db)

Views are same old views

This is not production-ready code!

async def upload_document(request: web.Request) -> web.Response:
    data = await request.post()
    files = {}

    for file in data.values():
        files[file.filename] = file.body
        await upload_to_s3(file.filename, file.body)

    async with request.app['db'].acquire() as conn:
        for file in files:
            await insert_document(conn, file)

    return web.json_response(status=201)

REST API?

  • REST API done manually
    • Validate / transform request data with trafaret
    • Process safe data
    • Return JSON response (mostly empty one)
  • For the public API we used Swagger for docs
  • Maybe there is a Django REST Framework equivalent for aiohttp, but we weren't aware of one

GraphQL?

  • GraphQL built on top of hiku:
    • Read only for fetching data
    • Mutations not yet supported
    • As well as many other neat features
  • If you need something more powerful, try graphql-aiohttp

Other?

  • Task queue (Kafka)
  • Internal API for communicating with other holding products
  • And more…

Did we really need aiohttp?

Developers are lazy

  • They want framework X to cover all cases
  • But each tool / library may be good in one particular case
    • Django is good to great for a newspaper site
    • aiohttp is not

Batteries not included

  • No Django admin
  • There is aiohttp-admin, but c'mon. Same as Flask-Admin vs Django Admin
  • No Django REST Framework
  • No ORM

No ORM

And you might not need it

query = (
    sa.select([sa.func.max(document_file_table.c.content_length)])
    .select_from(
        select_from
        .outerjoin(
            document_file_table,
            document_file_table.c.document_id == document_table.c.id,
        )
        .outerjoin(
            document_meta_table,
            document_meta_table.c.id == document_file_table.c.meta_id,
        ),
    )
    .where(sa.and_(
        clause,
        document_meta_table.c.is_current.is_(True),
        document_file_table.c.type == DocumentFileType.original,
    ))
)

Evaluate before you implement

  • We didn't need an admin for managing data
  • We were fine with querying data without an ORM
  • We were fine with omitting a REST API framework
  • We chose aiohttp because of concurrency; let's see how it paid off

Real life lessons

From running aiohttp in production

aiohttp doesn't enforce an app structure

  • The freedom is great
  • But only for a one-dev project
  • More devs -> your project becomes a mess
  • You need to enforce not only coding style, but also agree on the structure of your code

App structure

  • app
    • subapp
      • db.py
      • enums.py
      • tables.py
      • types.py
      • utils.py
      • views.py
      • validators.py
    • __main__.py
    • app.py
    • config.py
    • routes.py
    • signals.py

Settings

  • The freedom. Part 2 :(
  • You need to decide for yourself how to manage your app's settings
  • Our setup:
    • Store settings in *.yaml
    • Provide trafaret schema to validate settings
    • Use trafaret-config for reading settings
  • Error in settings? The app doesn't even start
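
The fail-fast idea can be sketched without any dependencies. This is a stand-in written with plain Python checks, not trafaret or trafaret-config; `SettingsError`, `DBSettings`, `load_db_settings`, and the example URL are all hypothetical names chosen for illustration:

```python
from dataclasses import dataclass


class SettingsError(ValueError):
    """Raised when settings do not match the expected shape."""


@dataclass(frozen=True)
class DBSettings:
    url: str


def load_db_settings(raw: dict) -> DBSettings:
    """Validate already-parsed settings (e.g. the result of reading *.yaml)."""
    db = raw.get('db')
    if not isinstance(db, dict):
        raise SettingsError("'db' section is required")
    url = db.get('url')
    if not isinstance(url, str) or not url:
        raise SettingsError("'db.url' must be a non-empty string")
    return DBSettings(url=url)


# Valid settings pass through; invalid ones raise before the app is created
settings = load_db_settings({'db': {'url': 'postgresql://localhost/yourapp'}})
print(settings.url)
```

Calling such a loader before `create_app()` means a broken config stops the process at startup instead of failing on the first request.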

Expect everything

  • aiohttp is quite a young project
  • You need to expect everything
  • Our code broke twice after aiohttp updates
  • Still using an old uvloop version due to inability to upgrade to 0.9.1

Tests are a necessity

  • With aiohttp it becomes more obvious
  • Try to achieve 85% coverage
  • We started with 5% and each deploy was …
  • You need to test routine things as well
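
A minimal sketch of testing a coroutine with the standard library only (`unittest.IsolatedAsyncioTestCase`, Python 3.8+). The talk does not say which test runner the project used, and `normalize_filename` is a hypothetical helper invented for this example:

```python
import asyncio
import unittest


async def normalize_filename(name: str) -> str:
    """Hypothetical routine helper: trim and lowercase an uploaded file name."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return name.strip().lower()


class NormalizeFilenameTest(unittest.IsolatedAsyncioTestCase):
    async def test_strips_and_lowercases(self) -> None:
        self.assertEqual(
            await normalize_filename('  Contract.PDF '),
            'contract.pdf',
        )


# Run the test case programmatically so the snippet works as a plain script
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeFilenameTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```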

Typing will save you as well

  • Type annotations indirectly force you to write more understandable code
  • Instead of dict you might want to start using namedtuple or dataclass, or at least TypedDict
  • Documentation for your code
  • Your teammates with IDEs will thank you every day
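
As an illustration of replacing a bare dict with a typed structure, here is a small dataclass sketch. The `Document` fields and the `describe` function are hypothetical, not taken from the project:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    id: int
    title: str
    content_length: int


def describe(document: Document) -> str:
    # Attribute access is checked by mypy and completed by the IDE,
    # unlike document['title'] on a plain dict
    return f'{document.title} ({document.content_length} bytes)'


doc = Document(id=1, title='contract.pdf', content_length=2048)
print(describe(doc))
```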

CPU-bound code? Run it in executor

  • The power of asyncio is in the await statement
  • No await – asyncio is not your choice
  • But there is run_in_executor
import asyncio
import io

async def gzip_buffer(buffer: io.BytesIO,
                      *,
                      loop: asyncio.AbstractEventLoop) -> io.BytesIO:
    """Non-blocking way of compressing prepared bytes IO with gzip."""
    # None selects the default executor; sync_gzip_buffer is the blocking part
    return await loop.run_in_executor(None, sync_gzip_buffer, buffer)
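
A self-contained variant of the same idea, as a sketch. The slide does not show `sync_gzip_buffer`, so the body below is an assumed implementation, and `asyncio.get_running_loop()` is used here instead of passing `loop` explicitly:

```python
import asyncio
import gzip
import io


def sync_gzip_buffer(buffer: io.BytesIO) -> io.BytesIO:
    """Blocking gzip compression (assumed implementation)."""
    return io.BytesIO(gzip.compress(buffer.getvalue()))


async def gzip_buffer(buffer: io.BytesIO) -> io.BytesIO:
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor, so the event loop
    # keeps serving other coroutines while compression runs in a worker thread
    return await loop.run_in_executor(None, sync_gzip_buffer, buffer)


payload = b'legal document ' * 1000
compressed = asyncio.run(gzip_buffer(io.BytesIO(payload)))
print(len(payload), '->', len(compressed.getvalue()))
```

A thread pool suits work that releases the GIL or blocks on I/O; for pure-Python CPU-heavy code, passing a `concurrent.futures.ProcessPoolExecutor` as the first argument may be the better fit.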

aiohttp is still in development

  • aiohttp.web is still in semi-active development
  • New features are coming for sure (like @route decorator)
  • aio-* libs might not cover your case
  • You might need to give back to open source
  • Like fixing bugs that block you from updating to a new aiohttp version

web.Application is a context holder

  • It's just a dict
  • You can use your app instance everywhere
  • In web server context
  • In task queue context

Where to find aiohttp devs?

  • Your project grows – you need more devs
  • Where to find them?
  • The market offers many more Django / Flask devs than aiohttp devs
  • It is especially hard to replace a senior / lead dev

How to grow an aiohttp dev?

  • Start with basics: how asyncio works, tests, etc…
  • asyncio takes time to dive into
  • Continue with basics: read & discuss aiohttp code
  • aiohttp is not rocket science
  • It may pay dividends later

More lessons

From running aiohttp app in production

Logs are important

Kibana

Metrics are important

Grafana

Dev env == prod env

  • Try to make dev env as close as possible to prod env
  • vagga makes containers for dev
  • lithos runs containers at staging / prod

Profile your app

import cProfile
import os
import sys

from aiohttp import web

from yourapp.app import create_app

profiler = cProfile.Profile()

if __name__ == '__main__':
    use_profiler = os.environ.get('USE_PROFILER') == '1'
    if use_profiler:
        profiler.enable()

    app = create_app()
    try:
        sys.exit(web.run_app(app))
    finally:
        if use_profiler:
            # Stop collecting before writing stats to disk
            profiler.disable()
            profiler.dump_stats('yourapp.prof')
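
Besides a GUI viewer, a dumped profile can be read with the standard `pstats` module. The sketch below profiles a hypothetical `busy_work` function in place of a real app and writes to a temporary path, so it runs on its own:

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work() -> int:
    """Hypothetical stand-in for code you want to profile."""
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

# Dump to a file, the same way the app does on shutdown
path = os.path.join(tempfile.mkdtemp(), 'yourapp.prof')
profiler.dump_stats(path)

# Load the file back and print the five most expensive calls
out = io.StringIO()
stats = pstats.Stats(path, stream=out)
stats.sort_stats('cumulative').print_stats(5)
report = out.getvalue()
print(report)
```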

Profile your app

QCacheGrind

Debug your app

What's next for asyncio?

What's new in Python 3.7

  • asyncio.run
  • contextvars support
  • More performance improvements
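
A small sketch of the first two items working together, standard library only. The `request_id` variable and the `handle` / `log_line` coroutines are hypothetical; the point is that each task started by `asyncio.gather` gets its own copy of the context:

```python
import asyncio
import contextvars

# Hypothetical per-request value, e.g. for tagging log lines
request_id = contextvars.ContextVar('request_id', default='-')


async def log_line(message: str) -> str:
    await asyncio.sleep(0)  # let the other task run in between
    return f'[{request_id.get()}] {message}'


async def handle(rid: str) -> str:
    # Visible only inside this task's context
    request_id.set(rid)
    return await log_line('handled')


async def main() -> list:
    return await asyncio.gather(handle('req-1'), handle('req-2'))


lines = asyncio.run(main())
print(lines)
```

Neither task sees the other's value, and the default outside the tasks stays untouched, without passing `request_id` through every function call.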

Competitors

  • As async / await is part of Python, other devs are able to build other async libraries
  • curio
  • trio
  • Maybe we are close to curhttp or trhttp :)

Did Python become better after asyncio?

Yes!

Was choosing aiohttp worth it?

Yes!

Would I repeat the ride?

I'm not sure

asyncio stack is harder

  • Still new technology
  • Fewer developers involved
  • More challenges ahead of you

But, asyncio stack is ready

  • Choose wisely
  • Expect everything
  • With more challenges you become a better dev
  • Join the ride!

Questions?

Twitter: @playpausenstop
GitHub: @playpauseandstop