Introducing the California Civic Data Coalition

We’re here with two ready-made Django applications that make California campaign finance data easier to access

Hello world.

We are the California Civic Data Coalition, a loosely coupled team of reporters and developers from the Los Angeles Times Data Desk, The Center for Investigative Reporting and Stanford’s new Computational Journalism Program.

Our aim: To make California’s public data easier for power users to access. Even though we represent rival media outlets, we’d rather compete at analyzing the data than downloading and parsing it.

Our inspiration: Raw data from CAL-ACCESS, the state of California’s campaign finance and lobbying activity database, is being published online for the first time.

Our opportunity: A two-day summit sponsored by OpenNews last month where we sprinted on two new open-source tools we’re ready to release today.

They are:

  • django-calaccess-raw-data: A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State’s CAL-ACCESS database

  • django-calaccess-campaign-browser: A Django app to refine, review and republish campaign finance data drawn from the California Secretary of State’s CAL-ACCESS database

Both are designed and packaged according to our experimental “pluggable data” method, which you can read about at greater length here. But here’s how to get hacking as soon as possible.

Plugging in

Assuming you have a Django project already set up, installation is simple.

$ pip install django-calaccess-raw-data

Update your settings.py:

DEBUG = False
INSTALLED_APPS = (
    # ... your other apps
    'calaccess_raw',
)

Currently we only support MySQL databases that allow bulk loading via LOAD DATA INFILE (that might sound annoying but it’s pretty handy), so make sure you have that configured in settings.py as well.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'my_calaccess_db',
        'USER': 'username',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '3306',
        # Here's the thing we're talking about
        'OPTIONS': {
            'local_infile': 1,
        }
    }
}
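
For the curious, the bulk loading mentioned above relies on MySQL’s LOAD DATA statement, which reads a delimited file straight into a table. Here’s a minimal sketch of what such a statement looks like when built in Python. It is an illustration, not the app’s actual loading code: the function name, file path, table name and column names are all hypothetical, and the tab delimiter and header row are assumptions about the input file.

```python
def build_load_data_sql(tsv_path, table, columns):
    """Build a MySQL LOAD DATA LOCAL INFILE statement for a tab-delimited file.

    Illustrative only: the path and identifiers are assumed to be trusted,
    so nothing beyond backtick quoting is attempted here.
    """
    column_list = ", ".join("`%s`" % name for name in columns)
    return (
        "LOAD DATA LOCAL INFILE '%s' "
        "INTO TABLE `%s` "
        "FIELDS TERMINATED BY '\\t' "   # assumed: tab-delimited input
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES "               # assumed: first line is a header row
        "(%s)" % (tsv_path, table, column_list)
    )


if __name__ == "__main__":
    # Hypothetical file, table and columns, for demonstration only
    print(build_load_data_sql("/tmp/example.tsv", "EXAMPLE_CD", ["FILING_ID", "AMOUNT"]))
```

The LOCAL keyword is the part that depends on the local_infile option: it tells MySQL to read the file from the client side of the connection rather than from the database server’s own disk.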

Now, sync your database and download that data:

$ python manage.py syncdb
$ python manage.py downloadcalaccessraw
$ python manage.py runserver

You’ve just installed 76 database tables and nearly 35 million records, including all of the campaign finance and lobbying activity records collected by the California government stretching back more than a decade. Visit http://localhost:8000 and you can start exploring them right away.

[Image: admin.png]

Taking it to the next level

We built django-calaccess-raw-data for folks who wanted to build applications on top of CAL-ACCESS. It doesn’t provide much abstraction, and still comes with a bring-your-own-analysis prerequisite, but it makes the database easier to consume.

We also wanted to build a secondary tool to help folks move more quickly. That’s where django-calaccess-campaign-browser comes in. It takes the next step and begins to clean, regroup, filter and transform the massive, hairy state database into something more legible. Installation is just as simple.

$ pip install django-calaccess-campaign-browser

Update your settings.py:

DEBUG = False
INSTALLED_APPS = (
    # ... your other apps
    'calaccess_raw', # Note that calaccess_raw is a dependency!
    'calaccess_campaign_browser',
)

Now, sync your database and build the new, associated tables:

$ python manage.py syncdb
$ python manage.py buildcalaccesscampaignbrowser
$ python manage.py runserver
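
To give a feel for what “clean, regroup, filter and transform” can mean in practice, here’s a small sketch in plain Python. It is not the campaign browser’s actual code. The rows, names and amounts are made up, and the helper functions are hypothetical; the point is just to show variant spellings being merged and incomplete rows being dropped before totals are computed.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical raw rows: (committee name as filed, contribution amount as text)
RAW_ROWS = [
    ("  Friends of Jane Doe 2014 ", "100.00"),
    ("FRIENDS OF JANE DOE 2014", "250.50"),
    ("Citizens for Example", ""),  # missing amount, filtered out below
    ("Citizens for Example", "75"),
]


def clean_name(name):
    """Collapse whitespace and normalize case so variant spellings match."""
    return " ".join(name.split()).upper()


def total_by_committee(rows):
    """Drop rows without an amount, then sum amounts per cleaned name."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at zero
    for name, amount in rows:
        if not amount.strip():
            continue
        totals[clean_name(name)] += Decimal(amount)
    return dict(totals)


if __name__ == "__main__":
    for committee, total in sorted(total_by_committee(RAW_ROWS).items()):
        print(committee, total)
```

Decimal is used instead of float so that dollar amounts add up exactly, which matters once figures are headed for publication.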

[Image: homepage.png]

The campaign browser now provides a simple interface to look up individual filers and search for individual campaign contributions. You can search for a candidate and see all of the committees they created to run for a specific office. And if you want the data for a specific committee, all you have to do is click the download tab and select your preferred format.
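
As a rough illustration of offering the same records in more than one format, here’s a sketch using only Python’s standard library. The function name, the field names and the choice of CSV and JSON are assumptions for the example, not a description of what the download tab does internally.

```python
import csv
import io
import json


def export_records(records, fieldnames, fmt):
    """Serialize a list of dicts as CSV or JSON text."""
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError("unsupported format: %s" % fmt)


if __name__ == "__main__":
    # Made-up contribution records for a hypothetical committee
    rows = [{"contributor": "Example Donor", "amount": "100.00"}]
    print(export_records(rows, ["contributor", "amount"], "csv"))
    print(export_records(rows, ["contributor", "amount"], "json"))
```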

How the calaccess campaign browser interface works

This code base is still a work in progress, however, and its analysis should be considered provisional until it is further tested and debugged. We’re working to better map out the state’s complex system and bulletproof our figures, but we’re not there yet.

Where you come in

This release represents a milestone for our team, but we still have a lot of work to do. This includes but is not limited to:

  • Bulletproofing the analysis process of the campaign browser
  • Expanding our documentation to more fully explain the contents of the raw CAL-ACCESS database
  • Bringing the campaign browser’s approach to the lobbying activity data also provided by CAL-ACCESS (Already underway but far from complete at django-calaccess-lobbying-browser)
  • And, most importantly, generating journalism that demonstrates the power of automating away access to this valuable data set.

Whether you’re a California journalist or developer passionate about our mission, or a curious person who’s looking to contribute, we’ve got plenty of tickets for you.

Keep an eye on the California Civic Data Coalition website for more updates on our progress.