Lollipop charts #
A lollipop chart is a means to represent magnitude. It is formed by placing circular markers at a value, and then drawing a line back to a reference value. There are a few ways one could conceive of forming such a chart:
- The combination of a bar chart and a scatterplot. That is, we can conceive of the chart as a bar chart, where the bars are very thin, and the bars end in a scatterplot circle.
- The combination of a series of line charts and a scatterplot. The scatterplot again supplies the circles. Then, for each circle, we have a line chart that connects back to the reference value.
In this tutorial, I will demonstrate how to form a lollipop chart using the second of these two constructions.
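For comparison, the first construction (thin bars capped with scatter markers) might be described with two trace specifications like the ones below. This is only a sketch: the `width` of 0.05 is an arbitrary choice, the names `stem_spec` and `head_spec` are made up for illustration, and the values are the medal counts used later in this tutorial.

```python
# Values and positions mirror the medal data used later in this tutorial.
gold_medal_counts = [39, 38, 27, 22, 20]
y_pos = list(range(len(gold_medal_counts)))

# Stems: a horizontal bar trace whose bars are made very thin.
stem_spec = dict(
    type="bar",
    orientation="h",
    x=gold_medal_counts,
    y=y_pos,
    width=0.05,  # bar thickness in y-axis units (an arbitrary small value)
    marker=dict(color="darkcyan"),
    showlegend=False,
)

# Heads: a scatter trace of large circular markers at the same coordinates.
head_spec = dict(
    type="scatter",
    mode="markers",
    x=gold_medal_counts,
    y=y_pos,
    marker=dict(size=30, color="darkcyan"),
    showlegend=False,
)

# Assumption: with Plotly installed, go.Figure(data=[stem_spec, head_spec])
# from plotly.graph_objects should accept these plain dictionaries.
```

Both traces share the same coordinates, which is why the bar data and the lollipop data are interchangeable.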
Lollipop charts in Plotly #
Plotly has no direct means of constructing a lollipop chart, but we will see that we can easily build one using a line chart and a scatterplot.
Preparing the data #
The most common usage of lollipop charts is to compare magnitude between categories. This is the same function as a bar chart. So, if you have the data you need for a bar chart, you have the data you need for a lollipop chart. To illustrate this point, perhaps we can return to the data we used earlier in the course to make a bar chart, as seen in the tutorial video on how to make a bar chart. This data was the gold medal count from the Tokyo Olympics:
gold_medal_counts = [39, 38, 27, 22, 20]
country_names = ["USA", "China", "Japan", "Great Britain", "ROC"]
We can continue to use this data to make a lollipop chart.
Plotting the data #
Since we are going to make this figure using the add_scatter method of Plotly, we will need both x and y values.
Let us assume we want to make the lollipops run horizontally. This means that the magnitudes are the x-values. The y-values only need to space the categories out evenly along the $y$-axis, so we might as well assign the numbers 0, 1, 2, and so forth to the data.
y_pos = [0, 1, 2, 3, 4]
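Rather than typing the positions by hand, they could also be derived from the length of the data. This is a small sketch of that idea, not something the rest of the tutorial depends on:

```python
gold_medal_counts = [39, 38, 27, 22, 20]

# One evenly spaced position per category: 0, 1, 2, ...
y_pos = list(range(len(gold_medal_counts)))
print(y_pos)
```

This keeps `y_pos` in step with the data if a country is added or removed.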
Once we have these values, adding the circular markers is not a problem. We simply use the scatterplot function. This might look something like this:
from plotly.subplots import make_subplots

# add data
gold_medal_counts = [39, 38, 27, 22, 20]
country_names = ["USA", "China", "Japan", "Great Britain", "ROC"]
y_pos = [0, 1, 2, 3, 4]

# draw the lollipop heads as large scatter markers
lollipop = make_subplots()
lollipop.add_scatter(x = gold_medal_counts, y = y_pos,
                     mode = "markers", marker = dict(size = 30),
                     showlegend = False)
lollipop.update_xaxes(title = "gold medal counts")
lollipop.update_layout(title = "USA had the most gold medals at the Tokyo Olympics",
                       template = "simple_white", title_x = 0.12)
lollipop.show("png")
Which produces the following:
At present, this is just a scatterplot. We have very large markers, because we want that for the lollipop chart. But this is not yet a lollipop chart. We need to add in the lines to get that.
There are a few ways to add in the lines. For instance, we could just create several lists that have the locations we need for the lines and then add them all. But we can also do this in a for loop, as was first discussed when we were adding error bars to our data. The idea behind a for loop is that we take a collection of items, like those in a list, and do something with each one in turn.
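The first option, building the coordinate lists up front, might be sketched as follows. It assumes the usual Plotly behavior of leaving a gap in a line trace wherever a value is `None`, so that all of the lines can live in one pair of lists. The names `line_x` and `line_y` are illustrative only.

```python
gold_medal_counts = [39, 38, 27, 22, 20]

line_x = []
line_y = []
for i, c in enumerate(gold_medal_counts):
    # Each line runs from x = 0 to the count, at height i.
    # The trailing None marks a break so the lines are not joined together.
    line_x += [0, c, None]
    line_y += [i, i, None]

# Assumption: these lists could then be passed to a single call such as
# lollipop.add_scatter(x = line_x, y = line_y, mode = "lines")
```

The rest of this tutorial takes the for loop route instead, adding one trace per line.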
Thus, we could take the gold medal counts, and then for each count, we could draw a horizontal line from 0 to that medal count. That is, x = [0, <specific medal count>].
This will work fine for the x-values, but not for the y. Here, we need to draw the line at the location on the $y$-axis that corresponds to where the category is. But, these category locations are really just the index for the count. That is, the first medal count has an index of 0, and is drawn at y=0. The second medal count has an index of 1, and is drawn at y=1, and so on.
Thus, if we get the index of the medal count (say… i), we can just use y = [i, i]. Combining this with how we handled the x-data, we will have the two points for a line.
Luckily, in the video on making small multiples we also learned how to get indices as we went through a for loop using enumerate. So if we use a construction like for i, c in enumerate(gold_medal_counts): we will get both the index (i) and the counts (c), which we can use to draw our lines.
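To see what the loop hands us before any plotting is involved, here is a minimal sketch that only collects and prints the two endpoints of each line. The `segments` list is a hypothetical helper used for illustration.

```python
gold_medal_counts = [39, 38, 27, 22, 20]

segments = []
for i, c in enumerate(gold_medal_counts):
    x = [0, c]  # from the reference value 0 to the medal count
    y = [i, i]  # both endpoints sit at the category's index
    segments.append((x, y))
    print(i, c, x, y)
```

Each `(x, y)` pair is exactly what one line needs.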
We can implement this as in the code below, where we draw the lines first, so that the lollipop circles are drawn over the top of them. We also specify that we want lines. Another thing we do is to specify the color of the scatter markers and the lines, because otherwise, the colors will change for each line.
lollipop = make_subplots()

# draw the lines first, so that the circles are drawn over the top of them
for i, c in enumerate(gold_medal_counts):
    lollipop.add_scatter(x = [0, c], y = [i, i],
                         mode = "lines", line = dict(color = "darkcyan", width = 4),
                         showlegend = False)

# draw the lollipop heads in the same color as the lines
lollipop.add_scatter(x = gold_medal_counts, y = y_pos,
                     mode = "markers", marker = dict(size = 30, color = "darkcyan"),
                     showlegend = False)
lollipop.update_xaxes(title = "gold medal counts", range = [0, max(gold_medal_counts)*1.1])
#lollipop.update_yaxes(ticks = "", showticklabels = False)
lollipop.update_layout(title = "USA had the most gold medals at the Tokyo Olympics",
                       template = "simple_white", title_x = 0.12)
lollipop.show("png")
Running this code produces the following rudimentary lollipop chart.
Though this is technically a well-formed lollipop chart, it is not very useful, because we don’t have category labels on the $y$-axis.
Again, there are many ways to add such labels, but I think the simplest is to add an annotation where we want them, simply using the index we are already using to draw the lines. The index can be used to get the name of the country from the country_names list, and then add them to the plot within the same for loop used to draw the lines. We already learned how to add annotations in Plotly, so we use the same approach, as shown in the code below:
lollipop = make_subplots()

for i, c in enumerate(gold_medal_counts):
    # draw the line for this category
    lollipop.add_scatter(x = [0, c], y = [i, i],
                         mode = "lines", line = dict(color = "darkcyan", width = 4),
                         showlegend = False)
    # label the category with the country name, just left of x = 0
    lollipop.add_annotation(text = country_names[i], x = 0, y = i, xanchor = "right",
                            showarrow = False)

# draw the lollipop heads over the lines
lollipop.add_scatter(x = gold_medal_counts, y = y_pos,
                     mode = "markers", marker = dict(size = 30, color = "darkcyan"),
                     showlegend = False)
lollipop.update_xaxes(title = "gold medal counts", range = [0, max(gold_medal_counts)*1.1])
lollipop.update_yaxes(ticks = "", showticklabels = False)
lollipop.update_layout(title = "USA had the most gold medals at the Tokyo Olympics",
                       template = "simple_white", title_x = 0.12)
lollipop.show("png")
lollipop.write_image("lollipop1.svg")
In this code, we also added some $y$-axis formatting in update_yaxes to get rid of the $y$-axis tick marks and the labels, so that we will only have the country names as labels. This produces the following chart:
At this point, I think we have a pretty reasonable lollipop chart, but we can improve it just a bit more. Noting that we simplified the $y$-axis by using annotations, we might replace the $x$-axis with direct labeling, by adding annotations of the values, in white text, within the lollipop head. This can also readily be done in the for loop, as follows:
lollipop = make_subplots()

for i, c in enumerate(gold_medal_counts):
    # draw the line for this category
    lollipop.add_scatter(x = [0, c], y = [i, i],
                         mode = "lines", line = dict(color = "darkcyan", width = 4),
                         showlegend = False)
    # label the category with the country name, just left of x = 0
    lollipop.add_annotation(text = country_names[i], x = 0, y = i, xanchor = "right",
                            showarrow = False)
    # write the value in white text, centered on the lollipop head
    lollipop.add_annotation(text = str(c), x = c, y = i, xanchor = "center",
                            font = dict(color = "white"),
                            showarrow = False)

# draw the lollipop heads over the lines
lollipop.add_scatter(x = gold_medal_counts, y = y_pos,
                     mode = "markers", marker = dict(size = 30, color = "darkcyan"),
                     showlegend = False)
lollipop.update_xaxes(range = [0, max(gold_medal_counts)*1.1], showline = False,
                      ticks = "", showticklabels = False)
lollipop.update_yaxes(ticks = "", showticklabels = False)
lollipop.update_layout(title = "USA had the most gold medals at the Tokyo Olympics",
                       template = "simple_white", title_x = 0.12)
lollipop.show("png")
This produces:
The result is a clean lollipop chart.
The final Marimo notebook #
If you followed along with the above, then the final Marimo notebook will look something like this: