Working with Engines and Connections
A connection pool is a standard technique used to maintain long-running connections in memory for efficient re-use, as well as to provide management for the total number of connections an application might use simultaneously. Particularly for server-side web applications, a connection pool is the standard way to maintain a "pool" of active database connections in memory which are reused across requests.
SQLAlchemy includes several connection pool implementations which integrate with the Engine. They can also be used directly for applications that want to add pooling to an otherwise plain DBAPI approach.
The Engine returned by the create_engine() function in most cases has a QueuePool integrated, pre-configured with reasonable pooling defaults. If you're reading this section only to learn how to enable pooling - congratulations! You're already done.

The most common QueuePool tuning parameters can be passed directly to create_engine() as keyword arguments: pool_size, max_overflow, pool_recycle and pool_timeout. For example:
engine = create_engine('postgresql://me@localhost/mydb',
                       pool_size=20, max_overflow=0)
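The other two parameters can be passed in the same call; a brief sketch, using an illustrative URL, that sets all four of the common knobs:

from sqlalchemy import create_engine

# pool_timeout: seconds to wait for a connection before giving up;
# pool_recycle: maximum age of a connection, in seconds
engine = create_engine('postgresql://me@localhost/mydb',
                       pool_size=20, max_overflow=0,
                       pool_timeout=30, pool_recycle=3600)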
In the case of SQLite, the SingletonThreadPool or NullPool are selected by the dialect to provide greater compatibility with SQLite's threading and locking model, as well as to provide a reasonable default behavior for SQLite "memory" databases, which maintain their entire dataset within the scope of a single connection.
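No pool arguments are needed for typical SQLite usage; a quick sketch, assuming the dialect defaults of this era (NullPool for file-based databases, SingletonThreadPool for memory databases):

from sqlalchemy import create_engine

# file-based database; the dialect selects NullPool by default
file_engine = create_engine('sqlite:///some.db')

# in-memory database; the dialect selects SingletonThreadPool
memory_engine = create_engine('sqlite://')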
All SQLAlchemy pool implementations have in common that none of them "pre-create" connections - all implementations wait until first use before creating a connection. At that point, if no additional concurrent checkout requests for more connections are made, no additional connections are created. This is why it's perfectly fine for create_engine() to default to using a QueuePool of size five without regard to whether or not the application really needs five connections queued up - the pool would only grow to that size if the application actually used five connections concurrently, in which case the usage of a small pool is an entirely appropriate default behavior.
The usual way to use a different kind of pool with create_engine() is to use the poolclass argument. This argument accepts a class imported from the sqlalchemy.pool module, and handles the details of building the pool for you. Common options include specifying QueuePool with SQLite:
from sqlalchemy.pool import QueuePool
engine = create_engine('sqlite:///file.db', poolclass=QueuePool)
Disabling pooling using NullPool:
from sqlalchemy.pool import NullPool
engine = create_engine(
    'postgresql+psycopg2://scott:tiger@localhost/test',
    poolclass=NullPool)
All Pool classes accept an argument creator which is a callable that creates a new connection. create_engine() accepts this function to pass onto the pool via an argument of the same name:
import psycopg2
from sqlalchemy import create_engine

def getconn():
    # note: psycopg2.connect() takes 'user', not 'username'
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    # do things with 'c' to set up
    return c

engine = create_engine('postgresql+psycopg2://', creator=getconn)
For most "initialize on connection" routines, it's more convenient to use the PoolEvents event hooks, so that the usual URL argument to create_engine() is still usable. creator is there as a last resort for when a DBAPI has some form of connect that is not at all supported by SQLAlchemy.
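For example, an initialize-on-connection routine can be written as a "connect" event listener rather than a creator, keeping the URL-based configuration intact; a sketch, where the PRAGMA statement is purely illustrative:

from sqlalchemy import create_engine, event

engine = create_engine('sqlite:///file.db')

@event.listens_for(engine, "connect")
def do_connect(dbapi_connection, connection_record):
    # runs for each new DBAPI connection, before it is pooled
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()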
To use a Pool by itself, the creator function is the only argument that's required and is passed first, followed by any additional options:
import sqlalchemy.pool as pool
import psycopg2

def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    return c

mypool = pool.QueuePool(getconn, max_overflow=10, pool_size=5)
DBAPI connections can then be procured from the pool using the Pool.connect() function. The return value of this method is a DBAPI connection that's contained within a transparent proxy:
# get a connection
conn = mypool.connect()
# use it
cursor = conn.cursor()
cursor.execute("select foo")
The purpose of the transparent proxy is to intercept the close() call, such that instead of the DBAPI connection being closed, it's returned to the pool:
# "close" the connection. Returns
# it to the pool.
conn.close()
The proxy also returns its contained DBAPI connection to the pool when it is garbage collected, though it's not deterministic in Python that this occurs immediately (it is typical with CPython).
The close() step also performs the important task of calling the rollback() method of the DBAPI connection. This is so that any existing transaction on the connection is removed, not only ensuring that no existing state remains on next usage, but also so that table and row locks are released and any isolated data snapshots are removed. This behavior can be disabled using the reset_on_return option of Pool.
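A short sketch of disabling the reset behavior on a standalone pool, assuming the application manages its own transaction cleanup (the sqlite3 creator is illustrative):

import sqlalchemy.pool as pool
import sqlite3

def getconn():
    return sqlite3.connect(':memory:')

# connections are returned to the pool without the rollback() call
mypool = pool.QueuePool(getconn, reset_on_return=False)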
A particular pre-created Pool can be shared with one or more engines by passing it to the pool argument of create_engine():
e = create_engine('postgresql://', pool=mypool)
Connection pools support an event interface that allows hooks to execute upon first connect, upon each new connection, and upon checkout and checkin of connections. See PoolEvents for details.
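As a minimal sketch, a listener for the "checkin" event might look like the following; the logging here is purely illustrative:

from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkin")
def on_checkin(dbapi_connection, connection_record):
    # invoked each time a connection is returned to the pool
    print("connection returned to the pool")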
The connection pool has the ability to refresh individual connections as well as its entire set of connections, setting the previously pooled connections as "invalid". A common use case is to allow the connection pool to gracefully recover when the database server has been restarted, and all previously established connections are no longer functional. There are two approaches to this.
The most common approach is to let SQLAlchemy handle disconnects as they occur, at which point the pool is refreshed. This assumes the Pool is used in conjunction with an Engine. The Engine has logic which can detect disconnection events and refresh the pool automatically. When the Connection attempts to use a DBAPI connection, and an exception is raised that corresponds to a "disconnect" event, the connection is invalidated. The Connection then calls the Pool.recreate() method, effectively invalidating all connections not currently checked out so that they are replaced with new ones upon next checkout:
from sqlalchemy import create_engine, exc
e = create_engine(...)
c = e.connect()

try:
    # suppose the database has been restarted.
    c.execute("SELECT * FROM table")
    c.close()
except exc.DBAPIError as err:
    # an exception is raised, Connection is invalidated.
    # (bound to 'err' so the engine 'e' is not clobbered)
    if err.connection_invalidated:
        print("Connection was invalidated!")

# after the invalidate event, a new connection
# starts with a new Pool
c = e.connect()
c.execute("SELECT * FROM table")
The above example illustrates that no special intervention is needed: the pool continues normally after a disconnection event is detected, though an exception is raised for the failed operation. In a typical web application using an ORM Session, the above condition would correspond to a single request failing with a 500 error, with the web application continuing normally beyond that. Hence the approach is "optimistic" in that frequent database restarts are not anticipated.
An additional setting that can augment the "optimistic" approach is to set the pool recycle parameter. This parameter prevents the pool from using a particular connection that has passed a certain age, and is appropriate for database backends such as MySQL that automatically close connections which have been stale for a particular period of time:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", pool_recycle=3600)
Above, any DBAPI connection that has been open for more than one hour will be invalidated and replaced upon next checkout. Note that the invalidation only occurs during checkout - not on any connections that are held in a checked-out state. pool_recycle is a function of the Pool itself, independent of whether or not an Engine is in use.
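To illustrate, the equivalent recycle setting applied to a standalone pool, where the argument is named recycle rather than pool_recycle (the sqlite3 creator is illustrative):

import sqlalchemy.pool as pool
import sqlite3

def getconn():
    return sqlite3.connect(':memory:')

# connections older than one hour are replaced upon next checkout
mypool = pool.QueuePool(getconn, recycle=3600)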
At the expense of some extra SQL emitted for each connection checked out from the pool, a “ping” operation established by a checkout event handler can detect an invalid connection before it’s used:
from sqlalchemy import exc
from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, "checkout")
def ping_connection(dbapi_connection, connection_record, connection_proxy):
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SELECT 1")
    except Exception:
        # optional - dispose the whole pool
        # instead of invalidating one at a time
        # connection_proxy._pool.dispose()

        # raise DisconnectionError - pool will try
        # connecting again up to three times before raising.
        raise exc.DisconnectionError()
    cursor.close()
Above, the Pool object specifically catches DisconnectionError and attempts to create a new DBAPI connection, up to three times, before giving up and then raising InvalidRequestError, failing the connection. This recipe will ensure that a new Connection will succeed even if connections in the pool have gone stale, provided that the database server is actually running. The expense is that of an additional execution performed per checkout. When using the ORM Session, there is one connection checkout per transaction, so the expense is fairly low. The ping approach above also works with straight connection pool usage, that is, even if no Engine were involved.
The event handler can be tested using a script like the following, restarting the database server at the point at which the script pauses for input:
from sqlalchemy import create_engine
e = create_engine("mysql://scott:tiger@localhost/test", echo_pool=True)

c1 = e.connect()
c2 = e.connect()
c3 = e.connect()
c1.close()
c2.close()
c3.close()

# pool size is now three.

print("Restart the server")
input()

for i in range(10):
    c = e.connect()
    print(c.execute("select 1").fetchall())
    c.close()
class sqlalchemy.pool.Pool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, _dispatch=None, _dialect=None)
Bases: sqlalchemy.log.Identified
Abstract base class for connection pools.
__init__(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, _dispatch=None, _dialect=None)
Construct a Pool.
connect()
Return a DBAPI connection from the pool.
The connection is instrumented such that when its close() method is called, the connection will be returned to the pool.
dispose()
Dispose of this pool.
This method leaves the possibility of checked-out connections remaining open, as it only affects connections that are idle in the pool.
See also the Pool.recreate() method.
recreate()
Return a new Pool, of the same class as this one and configured with identical creation arguments.
This method is used in conjunction with dispose() to close out an entire Pool and create a new one in its place.
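A sketch of using the two together on a standalone pool, mirroring what the Engine does internally when a disconnect is detected (the sqlite3 creator is illustrative):

import sqlalchemy.pool as pool
import sqlite3

mypool = pool.QueuePool(lambda: sqlite3.connect(':memory:'))

new_pool = mypool.recreate()   # fresh pool, identical configuration
mypool.dispose()               # close connections idle in the old pool
mypool = new_pool              # subsequent checkouts get new connections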
unique_connection()
Produce a DBAPI connection that is not referenced by any thread-local context.
This method is different from Pool.connect() only if the use_threadlocal flag has been set to True.
class sqlalchemy.pool.QueuePool(creator, pool_size=5, max_overflow=10, timeout=30, **kw)
Bases: sqlalchemy.pool.Pool
A Pool that imposes a limit on the number of open connections.
QueuePool is the default pooling implementation used for all Engine objects, unless the SQLite dialect is in use.
__init__(creator, pool_size=5, max_overflow=10, timeout=30, **kw)
Construct a QueuePool.
connect()
Return a DBAPI connection from the pool.
The connection is instrumented such that when its close() method is called, the connection will be returned to the pool.
unique_connection()
Produce a DBAPI connection that is not referenced by any thread-local context.
This method is different from Pool.connect() only if the use_threadlocal flag has been set to True.
class sqlalchemy.pool.SingletonThreadPool(creator, pool_size=5, **kw)
Bases: sqlalchemy.pool.Pool
A Pool that maintains one connection per thread.
Maintains one connection per thread, never moving a connection to a thread other than the one in which it was created.
Options are the same as those of Pool, as well as:
Parameters: pool_size – The number of threads in which to maintain connections at once. Defaults to five.
SingletonThreadPool is used by the SQLite dialect automatically when a memory-based database is used. See SQLite.
__init__(creator, pool_size=5, **kw)

class sqlalchemy.pool.AssertionPool(*args, **kw)
Bases: sqlalchemy.pool.Pool
A Pool that allows at most one checked-out connection at any given time.
This will raise an exception if more than one connection is checked out at a time. Useful for debugging code that is using more connections than desired.
Changed in version 0.7: AssertionPool also logs a traceback of where the original connection was checked out, and reports this in the assertion error raised.
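A brief sketch of AssertionPool catching a connection leak (the sqlite3 creator is illustrative):

import sqlalchemy.pool as pool
import sqlite3

apool = pool.AssertionPool(lambda: sqlite3.connect(':memory:'))

c1 = apool.connect()
c2 = apool.connect()   # raises an error: two connections checked out at once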
class sqlalchemy.pool.NullPool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, _dispatch=None, _dialect=None)
Bases: sqlalchemy.pool.Pool
A Pool which does not pool connections.
Instead it literally opens and closes the underlying DB-API connection on each connection open and close.
Reconnect-related functions such as recycle and connection invalidation are not supported by this Pool implementation, since no connections are held persistently.
class sqlalchemy.pool.StaticPool(creator, recycle=-1, echo=None, use_threadlocal=False, logging_name=None, reset_on_return=True, listeners=None, events=None, _dispatch=None, _dialect=None)
Bases: sqlalchemy.pool.Pool
A Pool of exactly one connection, used for all requests.
Reconnect-related functions such as recycle and connection invalidation (which is also used to support auto-reconnect) are not currently supported by this Pool implementation but may be implemented in a future release.
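One sketch of where StaticPool can be selected explicitly is with an in-memory SQLite database, so that every checkout sees the same underlying connection and therefore the same data:

from sqlalchemy import create_engine
from sqlalchemy.pool import StaticPool

# all checkouts share the single DBAPI connection
engine = create_engine('sqlite://', poolclass=StaticPool)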
Any PEP 249 DB-API module can be "proxied" through the connection pool transparently. Usage of the DB-API is exactly as before, except the connect() method will consult the pool. Below we illustrate this with psycopg2:
import sqlalchemy.pool as pool
import psycopg2 as psycopg

psycopg = pool.manage(psycopg)

# then connect normally
connection = psycopg.connect(database='test', user='scott',
                             password='tiger')
This produces a _DBProxy object which supports the same connect() function as the original DB-API module. Upon connection, a connection proxy object is returned, which delegates its calls to a real DB-API connection object. This connection object is stored persistently within a connection pool (an instance of Pool) that corresponds to the exact connection arguments sent to the connect() function.
The connection proxy supports all of the methods on the original connection object, most of which are proxied via __getattr__(). The close() method will return the connection to the pool, and the cursor() method will return a proxied cursor object. Both the connection proxy and the cursor proxy will also return the underlying connection to the pool after they have both been garbage collected, which is detected via weakref callbacks (__del__ is not used).
Additionally, when connections are returned to the pool, a rollback() is issued on the connection unconditionally. This is to release any locks still held by the connection that may have resulted from normal activity.
By default, the connect() method will return the same connection that is already checked out in the current thread. This allows a particular connection to be used in a given thread without needing to pass it around between functions. To disable this behavior, specify use_threadlocal=False to the manage() function.
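A sketch of the non-threadlocal form, using illustrative connection parameters:

import sqlalchemy.pool as pool
import psycopg2

# each connect() yields a pooled connection without regard to
# what the current thread already has checked out
proxied = pool.manage(psycopg2, use_threadlocal=False)
connection = proxied.connect(database='test', user='scott',
                             password='tiger')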
sqlalchemy.pool.manage(module, **params)
Return a proxy for a DB-API module that automatically pools connections.
Given a DB-API 2.0 module and pool management parameters, returns a proxy for the module that will automatically pool connections, creating new connection pools for each distinct set of connection arguments sent to the decorated module's connect() function.
sqlalchemy.pool.clear_managers()
Remove all current DB-API 2.0 managers.
All pools and connections are disposed.