Seven lessons I learned teaching data analysis with Python

A San Diego State journalism professor shares her experience with “First Python Notebook”

As a digital journalism educator at San Diego State University, I am always seeking to provide my students with the latest skills to help them be the best journalists they can when they enter the job market. This past spring I had the opportunity to introduce my students to the Python scripting language via the First Python Notebook site developed by Ben Welsh and the California Civic Data Coalition.

I was first introduced to Ben years ago at an online journalism conference. We kept in touch and reconnected last fall when Investigative Reporters and Editors and our School of Journalism and Media Studies co-hosted a two-day Watchdog Workshop. At the event, Ben introduced a group of journalists and SDSU journalism students to Python via an early version of the Coalition’s lesson plan.

As I worked through the lesson with the students I grew excited by the tools it teaches – the pandas analysis library and the Jupyter Notebook development framework. Right then, I knew that I would bring this exercise into my undergraduate class “JMS 430: Digital Journalism.”

This past spring, I had the opportunity to take my students through an expanded version of the tutorial that Ben gave last fall. Here are some of the lessons I learned that may be helpful for other journalism educators interested in using the Coalition’s open-source curriculum.

Hepner Hall at San Diego State University by Stuart Seeger

1. First Python Notebook is the best place to jump in.

I searched last fall and again this spring for ways to apply Python scripting in a journalism context, and I couldn’t find a better resource than Ben’s site. Some sites I found were heavy on the technical aspects; others talked about the benefits of Python for journalism but stopped there.

Ben’s site is the best resource for journalists and journalism students because it keeps the instructions easy to follow by working through a single data project in a journalism context.

2. Practice the tutorial multiple times and on multiple computers.

When I went through the tutorial on my home and work computers, I ran into issues caused by different settings and operating systems, among other things. Working through these problems left me better prepared for my students when I introduced the tutorial to them on their varying computers.
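One way to catch such differences before class is a short script that reports what each machine has installed. The sketch below is not part of First Python Notebook. The package list (`pandas` and `notebook`) and the function name `check_environment` are assumptions for illustration, so adjust them to whatever the tutorial version you are teaching requires.

```python
import importlib.util
import platform

# Assumed package list for illustration; not taken from the tutorial itself.
REQUIRED_PACKAGES = ["pandas", "notebook"]


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict describing this machine's operating system, Python version
    and whether each named top-level package can be found."""
    report = {
        "os": platform.system(),
        "python_version": platform.python_version(),
        "packages": {},
    }
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report["packages"][name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    result = check_environment()
    print(f"Operating system: {result['os']}")
    print(f"Python version:   {result['python_version']}")
    for name, installed in result["packages"].items():
        status = "found" if installed else "MISSING"
        print(f"{name}: {status}")
```

Running the same script on each home, work and classroom computer gives a quick side-by-side view of what differs before students arrive.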

3. Prepare extra laptops with the necessary applications.

In my case, several of our students have laptops with limited capacity, space or administrative access. So I ended up setting up several Apple laptops with the necessary applications and administrative access so that students could go through the tutorial easily.

Knowing they would not have to use their own computers made the barrier to learning much lower for the students.

4. Allow plenty of time to teach.

The IRE workshop Ben taught last fall lasted three to four hours. I thought having two full lectures devoted to the tutorial would be sufficient, but I was in for a surprise.

I went through the tutorial with my students line by line – so I could introduce the concepts and troubleshoot technical issues along the way – which took a lot longer than I imagined.

Revisiting what we did in the previous class before jumping into the next day’s work also took time. I am not a newbie to computer programming and languages. Whether you are an expert or a novice, it’s better to give yourself too much time than not enough.

I ended up devoting two weeks of class time to the project. In hindsight, I wish I had devoted three weeks instead.

5. Circulate tip sheets for yourself and your students.

As I practiced teaching the class, I drafted a tip sheet for myself documenting how I went through the steps. It served as a teacher’s guide in the classroom, so I could easily look up information about an error, issue or module when a student asked a question.

I also created a shorter version of the tutorial itself, sans screen grabs and trimmed down to just the Python scripts and major milestones in the class.

Providing both helped students see the bigger picture as well as a detailed view of how everything was connected. Offering different formats can make a difference for different kinds of learners.

6. Make the tutorial count for class credit.

For class credit, I had the students answer the specific questions Ben laid out in the tutorial. They then had to turn in their answers based on the scripts they ran.

This allowed me to see whether they could follow along with the tutorial and perform the necessary steps. The assignment also helped the students understand the bigger picture of what the scripts and applications can do in a journalism context, rather than just going through the motions of running scripts.

7. Remember the first time is only the beginning.

At the end of the two weeks, some of my students had completed the tutorial and others hadn’t. Those who didn’t either ran out of time, made script errors or had issues getting the prerequisites installed. Because we were ending the semester, there was not enough time for us to do more. Thus, I wish I had scheduled more time for the tutorial.

The students who did finish were excited at the outcome. They felt rewarded for all the effort they put into the class. They also had a new perspective on how to explore data and use it in news gathering and reporting. Some of them were craving more Python. It made me happy to know that the time we spent was worthwhile.

I am excited to delve deeper into Python, pandas and the Jupyter Notebook so I can go further in my future digital journalism classes.

I am not a Python whiz, but I am willing to learn and try. There is so much power in what it can do. As journalism educators, we should introduce our students to what Python makes possible for news gathering and reporting on important civic issues, one line of code at a time.