Candlestick Chart

Candlestick plots

Candlestick charts are a standard way of showing stock market data. This is, at its core, a display of change over time, with a data point at each day. However, rather than just a simple line chart or scatter plot, you can think of a candlestick chart as a scatter plot where each marker provides summary statistics for the day. Typically, each marker contains a box, which indicates the opening value for the day and the closing value for the day, then there are lines (called “shadows”) that show the highest value and lowest value for the day. This is explained by this image taken from the article on candlestick charts on Wikipedia:

[Image: candlestick diagram from the Wikipedia article on candlestick charts]

However, there is one point of ambiguity here. It is, of course, possible that the opening value is higher than the closing value (as shown in the image above), but it is also possible that the closing value was higher than the opening value. Since the box is vertically symmetric, there is not a visual cue for this. To add this cue, the ‘candlesticks’ are typically color-coded. For instance, elements shown in red are days where there was a loss in value (closing lower than opening) and elements shown in green are those for which there was a gain in value (closing higher than opening). Though these colors are not great for accessibility, they are the most common convention used.
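The color rule described above amounts to a comparison of two numbers. As a minimal sketch (the function name `candle_color` and the sample prices are made up for illustration, and treating an unchanged day as a gain is an arbitrary choice made here), it could be written as:

```python
def candle_color(open_price, close_price):
    """Return the conventional color for one day's candlestick."""
    # Closing below the open is a loss for the day, drawn in red.
    if close_price < open_price:
        return "red"
    # Otherwise the day closed at or above its open, drawn in green.
    return "green"


# Hypothetical (open, close) pairs, not real market data.
sample_days = [(100.0, 104.5), (104.5, 101.2), (101.2, 101.2)]
colors = [candle_color(o, c) for o, c in sample_days]
print(colors)
```

A plotting library applies a rule of this kind to every day in the data, which is why you never have to assign the colors by hand.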

Taken together, you can quickly get a sense for market movement over time. Here is an example from the Plotly documentation on candlestick charts:

[Image: example candlestick chart from the Plotly documentation]

Though it might not look like the markers shown above, if you zoom in, you will see that it is…

[Image: zoomed-in view of the same chart]

This came from this code:

import plotly.graph_objects as go

import pandas as pd
from datetime import datetime

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')

fig = go.Figure(data=[go.Candlestick(x=df['Date'],
                open=df['AAPL.Open'],
                high=df['AAPL.High'],
                low=df['AAPL.Low'],
                close=df['AAPL.Close'])])

fig.show()

For this assignment, we will just be reproducing this plot. But I want to discuss some new ideas that are present.

First, you see that the data we are reading in comes from the internet! Specifically, it comes from a GitHub repository. If you navigate to that URL, you will find this page:

[Image: the CSV file as displayed at that URL]

So you can see this is just a comma-separated values (CSV) file, as we have been using. There are 11 columns and many, many rows.

The way that we are reading this is also different. We are using a different library than NumPy. Instead, we are using Pandas. Pandas is built on top of NumPy, but is designed around the ‘data frame’. A data frame can be thought of as the Python equivalent of an Excel spreadsheet. When you call pd.read_csv(), it returns such a data frame. Thus, if we print out the data frame (df in this case), we get:

print(df)

[Image: printed output of the data frame]

Here, you are seeing the top of the data frame, where there are columns with names, such as “Date”, “AAPL.Open”, etc. There are 11 of these, but we are only being shown the first three and the last three. The ... indicates that there are values in between that are not being shown.

Similarly, the leftmost column is just the index for the rows, and we can see that we are shown the first five rows as well as the last five rows, with the hidden rows in between indicated by .... Also, we have a summary that verifies that we have 506 rows and 11 columns.
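To see the same ideas on something small enough to read in full, here is a rough sketch that reads a tiny CSV from a string instead of a URL. The three rows of numbers are invented for illustration and are not taken from the Apple data set, and `io.StringIO` is only there so that the example runs without an internet connection.

```python
import io

import pandas as pd

# Invented values laid out like the first few columns of the real file.
csv_text = """Date,AAPL.Open,AAPL.High,AAPL.Low,AAPL.Close
2020-01-01,10.0,12.0,9.5,11.0
2020-01-02,11.0,11.5,10.0,10.2
2020-01-03,10.2,13.0,10.1,12.8
"""

# read_csv accepts a file-like object as well as a path or URL.
small_df = pd.read_csv(io.StringIO(csv_text))

print(small_df.shape)          # (number of rows, number of columns)
print(list(small_df.columns))  # the column names from the header line
print(small_df.head(2))        # the first two rows, with the index on the left
```

With only three rows and five columns, nothing is hidden behind ..., so the printed data frame shows every value.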

The code that generates this figure uses this data frame, assigning values by referencing the data frame (df) and then supplying the name of the column in a manner that is similar to using an index. So accessing the column with the date looks like df['Date'].
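Column names are matched exactly, including capitalization, so the name in the brackets has to agree with the header in the file. The following is a small sketch of that behavior; the data frame `df_small` and its values are placeholders built by hand, not part of the Plotly example.

```python
import pandas as pd

# A tiny hand-made data frame; the values are placeholders.
df_small = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02"],
    "AAPL.Open": [10.0, 11.0],
})

# Selecting one column by name returns a pandas Series.
dates = df_small["Date"]
print(dates.tolist())

# A name that differs only in capitalization is not found.
try:
    df_small["date"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```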

Speaking of dates, you will notice that there was an extra import statement: from datetime import datetime. This imports part of the datetime library, which is used to handle dates and times. In this example, it is not actually needed, and if you remove it, the code should still run. However, if you wanted to do more with the dates and times, like calculate the hours between dates, you would need this library (or something like it). I suspect that the person who wrote this example code just imported this out of habit.
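For a sense of what that import is good for, here is a short sketch of calculating the hours between two points in time. The two timestamps are arbitrary and chosen only for illustration.

```python
from datetime import datetime

# Two arbitrary timestamps: year, month, day, hour, minute.
start = datetime(2015, 2, 17, 9, 30)
end = datetime(2015, 2, 18, 16, 0)

# Subtracting one datetime from another gives a timedelta.
elapsed = end - start

# Convert the elapsed time to hours.
hours = elapsed.total_seconds() / 3600
print(hours)
```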

At any rate, feel free to play around with the code and see what you can change. Really, all I wanted at this point was for you to understand that you can gather data directly from the internet if you like, and to get you used to reading the documentation from Plotly.