AI Basics

In this optional tutorial, we will explore how to use AI assistants to help create data visualizations. Specifically, we can collaborate with AI or "large language models" (LLMs) to build visualizations more quickly. However, we will do so while still applying the design knowledge and multi-disciplinary lenses we've developed together throughout the course.

Contents

Motivation
Role of AI
Prepare
Initial Request
Local Run
First Iteration
Refining Labels
Final Polish
Reflection
Next

Motivation

So far, we have been building data visualizations with a combination of pre-built charting libraries (such as matplotlib) and custom drawing with technologies like Sketchingpy. However, that can sometimes require a lot of code, and getting from one idea to the next can take a long time. What if we could get through those cycles faster? Maybe we could explore more design possibilities!

With that in mind, we next introduce AI assistants, which can help you implement faster. These tools work best when used to accelerate the iteration process we've already seen in the class. In other words, as we will see in a moment, there is still value in building a prototype and experiencing it (as a human) to further improve design choices.

Let's practice this cycle by revisiting the moose and wolves example from the very start of class. However, this time, let's see what it feels like to be accelerated by AI while still applying our critical design perspective.

Note that this first tutorial will discuss the use of chat in order to build visualizations. We will expand to dedicated coding assistants and agentic workflows in the next tutorial.

Role of AI

AI assistants like Claude, ChatGPT, or other language models can help generate visualization code quickly. However, they don't necessarily know about design and can't experience visualizations the same way you (and your human audiences) will. Much of the heavy lifting still comes from guiding solutions to a good conclusion using the techniques we discussed in class. Put another way, they work best when you bring your visualization expertise to direct the conversation, which involves:

Knowing what kinds of visualization will best serve your data and purpose or what you might want to explore.
Recognizing when a visualization generated has issues with readability, accessibility, or other design considerations like poor gestalt.
Providing specific feedback to guide the output in the process of iterating towards a better result.

With all that in mind, think of an AI assistant as a way to accelerate implementation, but ask whether moving faster is always better. How could it be an issue? This depends on the task at hand, but consider that when you write code yourself, the implementation process confronts you with many questions and decisions because the code forces you to be explicit about what you want. When AI helps you move quickly, some of those choices might be made for you, possibly without you realizing it. In other words, you need to be even more conscientious about the concepts from the class to ensure your ideas are layered in. Be careful that AI doesn't leave you with less control.

Prepare

We'll be using matplotlib to create a visualization of predator-prey dynamics between wolf and moose populations at Isle Royale National Park, returning to an example from Lesson 1. This gives us a nice dataset that has some dimensionality but is still tame enough that we can take advantage of some pre-built charts. The goal here is to create a split bar plot showing both wolf and moose populations over time, with wolves extending to the left and moose extending to the right.

Before we can get into it, we need access to an AI assistant that can generate Python for data visualization. I used Claude, which typically offers strong performance in testing, but other options include ChatGPT, Gemini, and Mistral AI. If you need a free option, Mistral has also performed quite well when given a CSV file with the Isle Royale data.

Note: The exact output you get from an AI assistant may differ slightly from what's shown here. That's completely normal and expected! The key learning objective here is to experience the process of iteration: making a request, reviewing the output, identifying what to improve, and refining your prompts accordingly.

Initial Request

Let's start by making our first request to the assistant, just to get a basic initial plot. Try copying this prompt into your chat window with your chosen AI:

Hello! Can you please use matplotlib to generate a graph of wolves vs moose populations as a split bar plot (one going to the left and the other going to the right) where the axes have different colors to correspond to the bar colors for the two series? Please have the left facing axis for wolves go from 0 to 50 and the right facing axis for moose go from 0 to 2500. See https://www.nps.gov/isro/learn/nature/wolf-moose-populations.htm for data.

Check the work: Run the code provided by the AI assistant. You should see a visualization with:

Blue bars extending to the left representing wolf populations
Orange (or similar color) bars extending to the right representing moose populations
A timeline showing years (likely 1980-2019 or similar range)
Color-coordinated axes with scales from 0-50 for wolves and 0-2500 for moose

Take a moment to examine the output. What do you notice? Can you see the predator-prey relationship in the data? The classic predator-prey oscillation pattern might be visible. However, it's likely that the wolf data appear compressed because the AI used a symmetric scale: the range of the actual data on one side is much smaller (up to 50) than on the other (up to 2500). This brings us to our first iteration.
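To make the compression concrete, here is a minimal sketch of one way an assistant might draw this first version: a single axes where wolf counts are negated so their bars point left. The numbers are placeholder values for illustration only, not the Isle Royale counts, and the variable and function names are my own assumptions rather than what your assistant will produce.

```python
import matplotlib.pyplot as plt

# Placeholder values for illustration only; NOT the NPS Isle Royale counts.
years = [2001, 2002, 2003, 2004, 2005]
wolves = [20, 25, 30, 28, 24]
moose = [1000, 1100, 900, 750, 600]


def build_symmetric_plot():
    """Draw both series on one axes that shares a single symmetric scale."""
    fig, ax = plt.subplots(figsize=(8, 4))
    # Wolves are negated so their bars extend to the left of zero.
    ax.barh(years, [-w for w in wolves], color="tab:blue", label="Wolves")
    ax.barh(years, moose, color="tab:orange", label="Moose")
    # One shared scale: the same number of animals per pixel on both sides.
    ax.set_xlim(-2500, 2500)
    ax.set_yticks(years)
    ax.set_ylabel("Year")
    ax.legend()
    return fig, ax


fig, ax = build_symmetric_plot()

# Fraction of the left half that the largest wolf bar fills.
wolf_share = max(wolves) / 2500
```

With these placeholder numbers, the largest wolf bar fills only about 1% of its half of the plot, which is why the wolf oscillations are so hard to read on a shared symmetric scale.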

Troubleshooting: If your AI assistant cannot generate the CSV file, it may be due to restrictions on its environment. You can still continue by downloading a pre-prepared CSV dataset and adding it to your chat.

Local Run

It is likely that the AI assistant wrote and ran Python code on its own. You may also wish to ask it for that Python code and the CSV file it generated. This allows you to inspect the work and run it locally.

As in Tutorial 1, you'll need a Python environment where you can run the generated code. I recommend using Jupyter Lite which does not require installation. After opening the software, select Pyodide.

That said, if you used something other than Jupyter Lite, make sure you have matplotlib installed in your environment. For example, this could be done with (optionally in a virtual environment):

pip install matplotlib

Once you're set up, simply ask your AI assistant to provide the code and data.
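Before plotting locally, it can help to check that the data the assistant hands you looks sensible. The following is a minimal sketch using only the standard library. The column names (year, wolves, moose) are assumptions, so adjust them to whatever your CSV actually contains, and the in-memory sample stands in for the real file with placeholder values.

```python
import csv
import io


def load_populations(csv_file):
    """Read rows into three parallel lists: years, wolves, moose."""
    years, wolves, moose = [], [], []
    for row in csv.DictReader(csv_file):
        # Column names here are assumed; match them to your own file.
        years.append(int(row["year"]))
        wolves.append(int(row["wolves"]))
        moose.append(int(row["moose"]))
    return years, wolves, moose


# Stand-in for the real file; placeholder values, NOT the NPS counts.
sample = io.StringIO("year,wolves,moose\n2001,20,1000\n2002,25,1100\n")
years, wolves, moose = load_populations(sample)

# Quick sanity checks worth running before trusting the chart.
assert len(years) == len(wolves) == len(moose)
assert years == sorted(years)
```

For a real file, you could pass an opened file object instead of the sample, for example one opened with newline="" as the csv module documentation recommends. Spot-checking a few rows against the National Park Service page is a good habit because an assistant can transcribe values incorrectly.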

First Iteration

Let's address this issue where the wolf population changes may be hard to see because they're compressed into a narrow space compared to the moose. Specifically, let's improve this by giving wolves more horizontal space per individual. This will make the harmonic relationship between the two populations easier to see. How about another prompt:

This is great but the wolves are too compressed. Let's have the split at roughly half way across the plot. This means that there is more horizontal space per wolf than per moose. However, this will show the harmonic relationship better.

Check the work: Run the updated code. Now you should see:

Wolves taking up approximately half the plot width (left side)
Moose taking up approximately half the plot width (right side)
Much more visible oscillations in the wolf population
The predator-prey relationship should be much clearer

By allocating equal horizontal space to both scales, we've effectively "zoomed in" on the wolf data. Now you can see how when wolves are high, moose populations tend to be lower. Conversely, when wolves go down (especially around 2015-2018 in the data), moose populations go up.
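One way to get this equal allocation, sketched here under my own assumptions, is to use two side-by-side panels that share the year axis, each with its own horizontal scale. Your assistant may choose a different approach (for example, rescaling within a single axes). The data are placeholder values, not the Isle Royale counts, and the names are hypothetical.

```python
import matplotlib.pyplot as plt

# Placeholder values for illustration only; NOT the NPS Isle Royale counts.
years = [2001, 2002, 2003, 2004, 2005]
wolves = [20, 25, 30, 28, 24]
moose = [1000, 1100, 900, 750, 600]


def build_split_plot():
    """Give each series its own half of the figure and its own scale."""
    fig, (ax_wolves, ax_moose) = plt.subplots(
        1, 2, sharey=True, figsize=(8, 4)
    )
    ax_wolves.barh(years, wolves, color="tab:blue")
    ax_moose.barh(years, moose, color="tab:orange")
    # Reversed limits make the wolf bars grow leftward from the center.
    ax_wolves.set_xlim(50, 0)
    ax_moose.set_xlim(0, 2500)
    ax_wolves.set_yticks(years)
    # Remove the gap so the two panels meet at the shared zero line.
    fig.subplots_adjust(wspace=0)
    return fig, ax_wolves, ax_moose


fig, ax_wolves, ax_moose = build_split_plot()

# How much more horizontal space one wolf gets than one moose.
space_ratio = 2500 / 50
```

Because each panel has its own scale, one wolf now receives 50 times the horizontal space of one moose. That is a deliberate distortion, which is why the color-coded axes in the original prompt matter: they signal to the reader that the two sides are measured differently.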

This is a great example of how your human understanding of a visualization helps guide the AI. This is especially true as the story (or the capability for users to tell their own stories) becomes clearer through experience.

Refining Labels

The visualization is getting clearer but you may also notice that the axis labels might need some work. For instance, there might be overlapping text where the colored "Wolves" and "Moose" labels compete with a black axis title. If that is the case, let's clean this up and make the supporting elements more clear. Try this prompt:

Perfect! We have the colored wolves and moose labels overlapping with the black axis title. Let's set the black axis title to empty and then let's make the colors for both series darker so that it is easy to read wolves and moose. Finally, let's say "Number of Wolves" and "Number of Moose" in that colored text.

Check the work: Run the code and verify:

The black axis title is now gone, eliminating the overlap
The blue and orange colors are darker and hopefully easier to read
The labels now should say "Number of Wolves" and "Number of Moose" instead of just "Wolves" and "Moose"

Notice how this iteration focused on reducing visual clutter and improving readability. This connects to Tufte's concepts of reducing chartjunk and maximizing the data-ink ratio!
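The label changes requested in this step could look something like the following sketch. It assumes the two-panel layout (one axes per series), and the hex colors are my own picks for darker shades rather than anything the assistant is guaranteed to choose. The data are placeholder values, not the Isle Royale counts.

```python
import matplotlib.pyplot as plt

# Assumed darker shades; any pair with good contrast on white would do.
DARK_BLUE = "#1f4e79"
DARK_ORANGE = "#b35900"

# Placeholder values for illustration only; NOT the NPS Isle Royale counts.
years = [2001, 2002, 2003]
wolves = [20, 25, 30]
moose = [1000, 1100, 900]

fig, (ax_wolves, ax_moose) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))
ax_wolves.barh(years, wolves, color=DARK_BLUE)
ax_moose.barh(years, moose, color=DARK_ORANGE)
ax_wolves.set_xlim(50, 0)
ax_moose.set_xlim(0, 2500)
ax_wolves.set_yticks(years)

# Each side gets one descriptive label in its series color, replacing any
# generic black title that would otherwise overlap with it.
ax_wolves.set_xlabel("Number of Wolves", color=DARK_BLUE)
ax_moose.set_xlabel("Number of Moose", color=DARK_ORANGE)
```

Using the same color for the bars and the label lets the label double as a legend, so no separate legend box is needed.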

Final Polish

There's one more improvement we can make. Right now, it's possible that the numerical labels along the horizontal axis (the tick labels showing the actual population numbers) are all in black. Let's make them match their respective data series for maximum clarity. Try this final prompt:

Doing really well! One last thing. The labels for the count (along the bottom horizontal axis). Can you please make those the same color as the series they describe (so one color on left and a different color on right)?

Check the work: Run the final version. You should now see:

Blue tick labels on the left side (for wolf counts)
Orange tick labels on the right side (for moose counts)
The tick marks themselves are also color-coded
The entire visualization now uses color consistently to distinguish the two data series
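In matplotlib, this kind of change is typically a one-line call per axes. The sketch below again assumes a two-panel layout with hypothetical names and assumed colors. The colors argument of tick_params applies to both the tick marks and the tick labels.

```python
import matplotlib.pyplot as plt

WOLF_COLOR = "#1f4e79"   # assumed dark blue
MOOSE_COLOR = "#b35900"  # assumed dark orange

fig, (ax_wolves, ax_moose) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))
ax_wolves.set_xlim(50, 0)
ax_moose.set_xlim(0, 2500)

# colors= sets the tick marks and the tick labels on the chosen axis.
ax_wolves.tick_params(axis="x", colors=WOLF_COLOR)
ax_moose.tick_params(axis="x", colors=MOOSE_COLOR)
```

If your assistant drew everything on a single axes instead, it will likely need to color the tick labels individually depending on which side of zero they fall, so the generated code may look quite different from this.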

Alright... after all of that valuable back and forth, you hopefully now have a polished visualization that clearly shows the predator-prey dynamics between wolves and moose. Just as if you had built it from scratch by hand, iteration with AI can eventually achieve a result where every visual element is intentionally designed to support the viewer's understanding of the data.

Reflection

Let's step back and think about what we just accomplished and what it might tell us about working with AI assistants for data visualization. Specifically, notice that we (likely) didn't get a perfect visualization on the first try. Instead, we went through several rounds as we worked out where we needed to be more precise:

Initial request with basic requirements
Adjusting the spatial allocation to better show the data
Cleaning up labels and improving readability
Fine-tuning color coding for consistency

Think back to your regular iterative process when you write code "by hand" and how similar it is to the way it is often best to work with AI assistants. Importantly, at each step, we have an opportunity to incorporate knowledge from this course and from our experience of a prototype. Just as we created early sketches in code and refined them for higher fidelity in prior tutorials, we can continue to use the lenses of the class in a cycle of critically evaluating ever-improving outputs through ever more specific prompts.

Next

This is a first stab at a relatively simple script. However, what happens when things get longer or more involved? What if we have a chart type for which there isn't already a pre-built implementation? Let's continue onwards to Tutorial 14 to learn about advanced AI techniques including llms.txt and subagents / agentic workflows. From there, you can choose between a browser-based path (Tutorial 14a) or an agents path (Tutorial 14b).

Citations

National Park Service, "Wolf and Moose Populations," Isle Royale National Park, 2025. [Online]. Available: https://www.nps.gov/isro/learn/nature/wolf-moose-populations.htm
J. Hunter, et al., "Matplotlib: Visualization with Python," Matplotlib Development Team, 2025. [Online]. Available: https://matplotlib.org
E. Tufte, "The Visual Display of Quantitative Information," Graphics Press, 2001.
Anthropic, "Claude," Anthropic PBC, 2025. [Online]. Available: https://claude.ai
OpenAI, "ChatGPT," OpenAI, 2025. [Online]. Available: https://chat.openai.com
Google, "Gemini," Google, 2025. [Online]. Available: https://gemini.google.com/
Mistral AI, "Mistral AI," Mistral AI, 2025. [Online]. Available: https://mistral.ai/
Project Jupyter, "Jupyter," Project Jupyter, 2025. [Online]. Available: https://jupyter.org
K. Reitz and T. Schlusser, "The Hitchhiker's Guide to Python," 2025. [Online]. Available: https://docs.python-guide.org