An Analytics Extension for Flask
Flask-Bitmapist is a Flask extension that creates a simple interface to the Bitmapist analytics library. It is easy to set up and start collecting data on how users are interacting with an application, then to work with that data and build out cohorts to learn more about user engagement and retention.
What We Wanted and What We Built
For the most part, existing analytics libraries were not meeting our needs. Available options tended to be prohibitively expensive, particularly for small projects, or too resource-intensive to implement again and again across projects.
We found and liked the Bitmapist library because it addresses the cost issue in an open-source way. We were also excited about its bitmap-based approach, which allows bitwise operations to be performed on the data; this means that the library's operations can be extremely fast and lightweight. You can read more about the what, the why, and the how of Bitmapist in the author's Medium post.
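To make the bitmap idea concrete, here is a minimal pure-Python sketch. It is not Bitmapist's implementation (Bitmapist keeps its bitmaps in Redis); it uses plain Python integers as bitmaps, and the helper names set_bit and user_ids are hypothetical, chosen only for illustration. Each event gets one bitmap, and the bit at position n is set when the user with id n triggered the event.

```python
def set_bit(bitmap, user_id):
    """Return a copy of `bitmap` with the bit for `user_id` switched on."""
    return bitmap | (1 << user_id)


def user_ids(bitmap):
    """List the user ids whose bits are set in `bitmap`."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# One bitmap per event: users 1, 3, and 4 logged in; users 3 and 5 purchased.
logged_in = 0
for uid in (1, 3, 4):
    logged_in = set_bit(logged_in, uid)

purchased = 0
for uid in (3, 5):
    purchased = set_bit(purchased, uid)

# A single bitwise operation answers a question about every user at once.
both = logged_in & purchased
either = logged_in | purchased

print(user_ids(both))    # [3]
print(user_ids(either))  # [1, 3, 4, 5]
```

Because the whole user population is packed into one value per event, combining events is a matter of one AND or OR rather than a loop over user records.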
Still, we wanted something that we could essentially drop into a project each time without having to do much else. Most of our projects to date have been built using the Flask microframework for Python, so we decided to write a custom Flask extension.
Thus Flask-Bitmapist was born: a Flask extension that, once installed, requires only three lines of code (one line to import, two lines to initialize) to be ready to register events (i.e., to record given user actions and when they occur).
How to Get Set Up
One of our major goals for Flask-Bitmapist was for it to be easy to set up.
Add it to a project with pip install flask-bitmapist, as you would any other package. Then, import and initialize Flask-Bitmapist with your Flask app, and you are ready to start registering events.
from flask import Flask
from flask_bitmapist import FlaskBitmapist
app = Flask(__name__)
# Initialize Flask-Bitmapist with the Flask app
flaskbitmapist = FlaskBitmapist()
flaskbitmapist.init_app(app)
Note: Since Bitmapist uses Redis to record the registered user events, Redis must also be running for Flask-Bitmapist to work.
How to Register Events
Once you have set up Flask-Bitmapist, you get not one, not two, but four ways to register events.
If you are using Flask-Login, user login/logout events will be registered automatically. Add a mixin to your user model to register changes made to database objects. Use a decorator to register an event for a given view or function call, or call a function directly wherever else you might want to register an event.
Examples of when you might want to use each method:
- Flask-Login: You are already using Flask-Login for user session management, and you want to record when users are logging in and out of your application.
- ORM mixin: You are using SQLAlchemy, and you want to record when changes are being made to (user) objects in the database. (See the “Moving Forward” section for details about additional ORM support.)
- Decorator: You want to register an event for whenever a user accesses a particular view (e.g., to render a template, submit a form, access an API, etc.).
- Function: You want to register events at multiple points throughout a single process, or you want to register different events based on branching or conditional results within some process.
Flask-Login
Flask-Login is a popular library for user session management in Flask applications, so it was important for us to have built-in user login/logout event registration in Flask-Bitmapist.
Setting up Flask-Bitmapist to work with Flask-Login requires nothing beyond setting up both as you normally would. Initialize Flask-Bitmapist and Flask-Login with the Flask app and create your User class (e.g., with the UserMixin); Flask-Bitmapist will then automatically listen for the user_logged_in and user_logged_out signals from Flask-Login to register the corresponding events.
from flask import Flask
from flask_bitmapist import FlaskBitmapist
from flask_login import LoginManager
app = Flask(__name__)
# Initialize Flask-Login with the Flask app
login_manager = LoginManager()
login_manager.init_app(app)
# Initialize Flask-Bitmapist with the Flask app
flaskbitmapist = FlaskBitmapist()
flaskbitmapist.init_app(app)
from flask_login import UserMixin, login_user, logout_user
class User(UserMixin):
    pass
user = User()  # needs an id
login_user(user)
logout_user()
Note: The registered event names will be 'user:logged_in' and 'user:logged_out' for users logging in and out, respectively.
ORM Mixin
As with session management, we wanted event registration for changes to objects in the database to be straightforward.
Import the Bitmapistable mixin and add it to your User class definition. Flask-Bitmapist will then automatically register the appropriate event whenever a user is created, updated, or deleted.
from flask_sqlalchemy import SQLAlchemy
from flask_bitmapist import Bitmapistable
db = SQLAlchemy(app)  # with `app` as the initialized Flask app
class User(db.Model, Bitmapistable):
    pass
Note: We started with the ORM we most commonly use (SQLAlchemy), but we are looking to add support for others (Peewee, MongoEngine, etc.) in the future.
Note: The mixin is currently intended for the application's User (or equivalent) class only, pending implementation of a flexible (i.e., not dependent on using Flask-Login) means of retrieving the current session's user id.
Decorator & Function
Import the mark decorator function from flask_bitmapist and attach it to the desired function, providing the event name and the id of the current user (e.g., taken from Flask-Login's current_user).
from flask_bitmapist import mark
@mark('user:reset_password', user.id)  # with `user` as the current user
def reset_password():
    pass
Import the mark_event function from flask_bitmapist and call it with the event name and user id.
from flask_bitmapist import mark_event
mark_event('user:completed_process', user.id)  # with `user` as the current user
Note: The event name structure ~ 'user:action' is merely a convention. You could, potentially, name events any number of ways, including breaking them down by domain (e.g., 'support:bug_reported') if you so wished.
Note: In most cases, you will probably want the decorator and function to use the current time for registering the event. You can, however, specify a datetime for the event by passing it with the now argument.
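The function form is the natural fit for branching logic. The sketch below is one possible shape for that, under a few assumptions: the handle_payment function and its event names are hypothetical, and mark_event is passed in as a parameter so the example runs without Redis. In an application, you would pass the mark_event imported from flask_bitmapist instead of the stand-in recorder shown here.

```python
from datetime import datetime


def handle_payment(user_id, succeeded, mark_event, paid_at=None):
    """Register a different event depending on how the payment went.

    `mark_event` is injected so this sketch runs on its own; in an
    application it would be the function imported from flask_bitmapist.
    """
    mark_event('user:attempted_payment', user_id)
    if not succeeded:
        mark_event('user:failed_payment', user_id)
    elif paid_at is not None:
        # Backdate the event when the payment time is known, using the
        # `now` keyword argument described in the note above.
        mark_event('user:completed_payment', user_id, now=paid_at)
    else:
        mark_event('user:completed_payment', user_id)


# Stand-in recorder, used only to show which events would be registered.
recorded = []


def fake_mark_event(event_name, user_id, now=None):
    recorded.append((event_name, user_id, now))


handle_payment(7, True, fake_mark_event, paid_at=datetime(2016, 9, 12))
handle_payment(8, False, fake_mark_event)

for entry in recorded:
    print(entry)
```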
How to Use the Data
Flask-Bitmapist provides multiple ways to retrieve and process the data as well.
You can get all of the users registered with an event for a given time/time scale using the get_event_data function. To get a user cohort based on multiple events over a given time frame/time scale, the get_cohort function will serve. Additionally, Flask-Bitmapist by default registers a blueprint with a sample interface for generating a heatmap, a table constructed to visually present the cohort data based on the selected inputs.
Single Event at a Single Time: get_event_data()
The get_event_data function is what you will want to use to get the registered events for a single event (e.g., 'user:spilled_soda') at a single point in time. The optional time_group argument determines the span of time to include (i.e., get the events spanning one day, a week, a month, or a year) and defaults to a scale of 'days'; the optional now argument determines when to pull the events from (e.g., right now, last month, or October 21, 2014).
bitmapist_events = get_event_data('user:spilled_soda', time_group='days', now=datetime.utcnow())
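One way to picture how time_group and now interact: now picks a moment, and time_group picks the size of the bucket around that moment. The sketch below only illustrates that idea and is not Bitmapist's storage format. The bucket_label helper is hypothetical, and the 'weeks' and 'years' values are assumed by analogy with the 'days' and 'months' values used in this post.

```python
from datetime import datetime


def bucket_label(now, time_group='days'):
    """Return a label for the day, week, month, or year containing `now`."""
    if time_group == 'days':
        return now.strftime('%Y-%m-%d')
    if time_group == 'weeks':
        # ISO weeks run Monday to Sunday and can belong to a neighboring year.
        iso_year, iso_week = now.isocalendar()[:2]
        return '{}-W{:02d}'.format(iso_year, iso_week)
    if time_group == 'months':
        return now.strftime('%Y-%m')
    if time_group == 'years':
        return now.strftime('%Y')
    raise ValueError('unknown time_group: {!r}'.format(time_group))


when = datetime(2014, 10, 21)
for group in ('days', 'weeks', 'months', 'years'):
    print(group, bucket_label(when, group))
```

Every event registered inside the same bucket counts toward the same result, which is why a wider time_group returns more users for the same now.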
The function returns a Bitmapist events collection object, which allows you to use Bitmapist's built-in operations (e.g., BitOpAnd, BitOpOr) to combine it with other Bitmapist event collections. Or, if you prefer, you can cast the collection to a list to get just the list of user ids for those users registered with the event within the given time frame.
from bitmapist import BitOpOr  # bitwise OR comes from Bitmapist itself
spilled_popcorn = get_event_data('user:spilled_popcorn')
spilled_cheesepuffs = get_event_data('user:spilled_cheesepuffs')
bitmapist_events = BitOpOr(spilled_popcorn, spilled_cheesepuffs)
user_ids = list(bitmapist_events)
The resulting user_ids will be a list of the ids for the users who either spilled popcorn OR spilled cheese puffs (today, with default time_group of 'days' and default now of datetime.utcnow()).
Multiple Events Over Time: get_cohort()
The get_cohort function takes, at minimum, two event names that form the foundation of a cohort. Optional arguments are available for additional events (to further refine the cohort's users), the time group, and the size of the returned results. The function returns a tuple containing the cohort along with its dates and totals. The returned cohort is a list of lists; each item in a nested list is the count of users who were registered with both the primary event and the secondary event (and any additional events provided) at that time. This perhaps makes more sense with an example.
Say you wanted to look at users who had ordered products from infomercials in the last six months and then, over the next four months, made the first of two easy payments AND either made their second easy payment OR returned the product. The first two event names would be passed as required positional arguments to get_cohort ('user:ordered_infomercial_product' and 'user:made_first_easy_payment' in the example below).
The latter two events would be passed as the optional additional_events argument, with their corresponding operations (see the assignment to additional_events in the example). The number of rows corresponds to how far back to get results (num_rows = 6 for looking at the last six months), and the number of columns corresponds to how far forward from each date to get results (num_cols = 4 for looking at four months from whenever the user ordered the product).
event1 = 'user:ordered_infomercial_product'
event2 = 'user:made_first_easy_payment'
additional_events = [
    {'name': 'user:made_second_easy_payment', 'op': 'and'},
    {'name': 'user:returned_product', 'op': 'or'}
]
cohort, cohort_dates, cohort_totals = get_cohort(
    event1, event2, additional_events=additional_events,
    time_group='months', num_rows=6, num_cols=4
)
The get_cohort function returns the cohort, the cohort dates, and the cohort totals.
- The cohort itself is a list of lists, structured like a matrix or a table, with each element containing the number of users who, from the initial event (e.g., ordered product in May 2016), were registered with the subsequent events (i.e., made first payment AND made second payment OR returned product) for the given date offset (e.g., + 2 months). See the heatmap example below for a visual representation; that table is laid out based on the cohort data's structure.
- The cohort dates will be the list of dates calculated for defining the cohort. If today were September 12, 2016, the cohort dates returned would be (the first day of) April 2016, May 2016, June 2016, July 2016, August 2016, and September 2016.
- The cohort totals will be the count of users for the primary event at each of the above cohort dates. For example, if the first cohort date were April 2016, and 416 users ordered a product in April 2016, the first cohort total will be 416.
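The three return values line up row by row, which the following sketch tries to show without needing Redis. It makes two assumptions: month_starts is a hypothetical helper that reproduces the cohort dates described above for a time group of months, and the sample counts are made up for illustration rather than taken from real data. The retention_rows helper, also hypothetical, turns each row of counts into percentages of that row's total, which is roughly what a heatmap cell would display.

```python
from datetime import date


def month_starts(today, num_rows):
    """Return the first day of each of the last `num_rows` months, oldest first."""
    starts = []
    year, month = today.year, today.month
    for _ in range(num_rows):
        starts.append(date(year, month, 1))
        month -= 1
        if month == 0:
            # Step back across the year boundary.
            year, month = year - 1, 12
    return list(reversed(starts))


def retention_rows(cohort, totals):
    """Convert each row of counts into percentages of that row's total."""
    rows = []
    for counts, total in zip(cohort, totals):
        rows.append([round(100.0 * c / total, 1) if total else 0.0 for c in counts])
    return rows


cohort_dates = month_starts(date(2016, 9, 12), num_rows=6)
print(cohort_dates[0], cohort_dates[-1])

# Made-up counts for two rows with four date offsets each.
sample_cohort = [[312, 208, 104, 52], [50, 25, 10, 5]]
sample_totals = [416, 100]
for row in retention_rows(sample_cohort, sample_totals):
    print(row)
```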
Note: Currently, cohort generation prioritizes OR in the order of operations, and allows OR to operate only on its immediate predecessor. For example, first AND second AND third OR fourth will be handled as first AND second AND (third OR fourth), since progressing through the chain of events otherwise would give ORs too much weight, e.g., (first AND second AND third) OR fourth.
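The grouping rule in this note can be illustrated with ordinary Python sets. This is a sketch of the described behavior, not the library's code: the combine function is hypothetical, sets of user ids stand in for Bitmapist event collections, and how a run of consecutive ORs is treated here (all folded into the same group) is an assumption.

```python
def combine(first, others):
    """Combine sets of user ids, letting each 'or' bind to its predecessor.

    `first` is a set; `others` is a list of (set, op) pairs, op 'and' or 'or'.
    """
    groups = [set(first)]
    for users, op in others:
        if op == 'or':
            # OR widens only the group immediately before it.
            groups[-1] = groups[-1] | users
        else:
            # AND starts a new group that must also be satisfied.
            groups.append(set(users))
    result = groups[0]
    for group in groups[1:]:
        result = result & group
    return result


first, second, third, fourth = {1, 2, 3, 4}, {1, 2, 3}, {1, 2}, {3, 9}

# first AND second AND (third OR fourth)
grouped = combine(first, [(second, 'and'), (third, 'and'), (fourth, 'or')])
print(sorted(grouped))  # [1, 2, 3]

# The left-to-right reading that the note rules out also admits user 9,
# who never triggered the first or second event.
left_to_right = (first & second & third) | fourth
print(sorted(left_to_right))  # [1, 2, 3, 9]
```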
Visualizing a Cohort: Heatmap
By default, Flask-Bitmapist will register a bitmapist blueprint; this can be disabled by setting the BITMAPIST_DISABLE_BLUEPRINT configuration option to True. With the blueprint enabled, visiting /bitmapist/cohort will provide you with a starter interface to retrieve a user cohort based on given events (pulled from your existing Bitmapist event names in Redis) and selected settings. The generated cohort is used to build a heatmap to display the data and help you visualize the results.
Moving Forward
But wait - there’s more! We've got big goals for Flask-Bitmapist moving forward. In the near future, for example, we are planning to add:
- A more robust and flexible object-oriented back-end structure
- Broader ORM support, to include multiple ORM options (Peewee, MongoEngine, etc.)
- Better ORM support, for flexible user id configuration (i.e., so that using the mixin with non-user models will not be dependent on a specific user session manager)
If you'd like to read through the code, or if you'd like to contribute, check out the project on GitHub. We'd also like to see a similar Django integration for Bitmapist, so hit us up if you're interested in helping out.