Normally, when the need arises to connect with a relational database from Python, SQLAlchemy and the ORM (Object Relational Mapper) capabilities it comes with, are the natural choice.

Sometimes however, all we want is quick access to an ad-hoc relational database to do some experiments with, to demonstrate/prototype something, or simply to store multiple Pandas DataFrames in a single file with the added benefit of having the schema preserved.
(Side-note: Placing all dfs in a dictionary and pickling or shelving that object would be another way to achieve this latter goal.)

While SQLAlchemy has support for SQLite database files as well, not every environment has the sqlalchemy library readily installed. And in some settings, for example working with a mobile Jupyter Environment on an iPad (with Juno), sqlalchemy can't easily be installed.

In such a situation it can be handy to use SQLite which comes directly bundled with Python via the sqlite3 module.

In this quick post I just want to show how easy it is to use Pandas with SQLlite natively.

Three easy steps

Three steps is all it takes:

  • Open db connection via
    • db_connection = sqlite3.connect(':memory:')
    • Use ':memory:' for a temporary in-memory database or a path to a file e.g. '../data/demo-db.sqlite'
  • Write a table to the database directy from a Pandas object via
    • df.to_sql('demo_table', db_connection, if_exists='fail')
    • Options for if_exists: 'fail','replace','append'
  • Retrieve data from the database via
    • pd.read_sql('SELECT * FROM demo_table', db_connection)
    • In the first argument we can pass any SQL string to be executed against the database

Note: When we supply sqlite3.connect() with a path to a sqlite file that doesn't yet exist it automatically gets created.

And this is already all that's really needed to get started using sqlite from Pandas.

All in one code block for easy copy-paste

import sqlite3
import pandas as pd

# Create sample data
df = pd.DataFrame({"Beautiful Numbers": [42.0, 3.141], 
                   "Meaning": ['The answer to everything', 'Pi']})

# Open connection to in-memory db
db_connection = sqlite3.connect(':memory:')

# Write table to the database
df.to_sql('demo_table', db_connection, if_exists = 'replace')

# Retrieve data from database
sql_query_string = 'SELECT * FROM demo_table'
pd.read_sql(sql_query_string, db_connection)

Possible future additions to this post

  • Add useful utility functions for working with sqlite from pandas
  • E.g. to automatically load a list of table names from the db and store them into DataFrames with matching names.

Reference / Further Reading