This is a quick tutorial on how to fetch stock price data from Yahoo Finance, import it into a Pandas DataFrame and then plot it.

If you're new to data science with Python, I highly recommend reading "A modern guide to getting started with Data Science and Python". I also recommend working with the Anaconda Python distribution.

First, visit Yahoo Finance and search for a ticker. For this tutorial I used SPY, the S&P 500 ETF.

http://finance.yahoo.com

After you've searched for the ticker, click the Historical Prices link.

Scroll to the bottom of the page and find the Download to Spreadsheet link. Right-click it and copy the link address to your clipboard. The link conveniently points to a .CSV file with historical data going back to 1993 (in the case of SPY).

Let's write some Python code. First, import the necessary libraries.

import numpy as np
import matplotlib.pyplot as pp
import pandas as pd
import seaborn
import urllib.request

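Tell the Jupyter notebook to show our plots inline on the screen. The line below is an IPython magic command, so it only works inside a notebook or IPython session.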

%matplotlib inline

Use urllib to fetch the .CSV data file from the link above.

urllib.request.urlretrieve(
    'http://real-chart.finance.yahoo.com/table.csv?s=SPY&d=1&e=12&f=2016&g=d&a=0&b=29&c=1993',
    'spy.csv'
)
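The query string in that link appears to encode the ticker and the date range. Reading the parameters off the URL above, `s` is the ticker, `a`/`b`/`c` look like the start month (zero-based), day and year, `d`/`e`/`f` look like the end month (zero-based), day and year, and `g=d` seems to request daily rows. That reading is an inference from this one link, not documented behaviour, and `build_history_url` is a hypothetical helper name. A minimal sketch that rebuilds the same link from two dates, without making any network request:

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = 'http://real-chart.finance.yahoo.com/table.csv'


def build_history_url(ticker, start, end):
    """Hypothetical helper: rebuild the download link from a ticker and two dates.

    Assumes months are zero-based in the query string (inferred from the link above).
    """
    params = [
        ('s', ticker),
        ('d', end.month - 1), ('e', end.day), ('f', end.year),
        ('g', 'd'),  # assumed to mean daily rows
        ('a', start.month - 1), ('b', start.day), ('c', start.year),
    ]
    return BASE_URL + '?' + urlencode(params)


url = build_history_url('SPY', date(1993, 1, 29), date(2016, 2, 12))
print(url)
```

If the inference holds, changing the ticker or the dates should yield the matching link for another symbol or range.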

Inspect the first 10 lines of the data file.

open('spy.csv', 'r').readlines()[:10]
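Note that readlines() loads the whole file before the slice is taken, and the one-liner never closes the file explicitly. One alternative is itertools.islice inside a with-block, which stops after the requested number of lines. The sketch below writes a small stand-in file (`sample.csv`, filled with made-up rows) so that it runs without the download; in the tutorial you would open 'spy.csv' instead.

```python
from itertools import islice

# Stand-in file with made-up rows; in the tutorial this would be 'spy.csv'.
with open('sample.csv', 'w') as f:
    f.write('Date,Close\n')
    for day in range(1, 21):
        f.write('2016-01-{:02d},{:.2f}\n'.format(day, 100 + day))

# Read only the first 10 lines; the with-block closes the file afterwards.
with open('sample.csv', 'r') as f:
    first_lines = list(islice(f, 10))

for line in first_lines:
    print(line.rstrip())
```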

Import the price data into a Pandas DataFrame using the read_csv function. The first column contains the trading date so tell Pandas to look for dates and parse them into the correct datetime64 data type.

spy = pd.read_csv('spy.csv', parse_dates=['Date'])

Inspect the data types of the columns in the spy DataFrame. Notice that Pandas inferred numeric types for the price columns on its own and, because we passed parse_dates, stored the Date column as datetime64.

spy.dtypes
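To see what parse_dates changes, it can help to read the same text twice, once with the argument and once without. The sketch below uses a few made-up rows in a simplified, assumed column layout (Date, Close, Volume) rather than the real download, so the values are placeholders and not actual SPY prices.

```python
import io

import pandas as pd

# Made-up sample rows in an assumed, simplified layout; not real SPY prices.
text = (
    'Date,Close,Volume\n'
    '2016-02-12,10.4,1000\n'
    '2016-02-11,10.0,1200\n'
)

raw = pd.read_csv(io.StringIO(text))                          # dates left as text
parsed = pd.read_csv(io.StringIO(text), parse_dates=['Date'])  # dates parsed

print(raw.dtypes)
print(parsed.dtypes)
```

Only the second DataFrame has a datetime Date column, which is what the sorting, indexing, and plotting steps in this tutorial rely on.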

Now let's inspect the first 5 rows of the Pandas DataFrame using the head() function. Notice that Yahoo gave us the data in reverse chronological order, with the most recent data first. This is backwards from what we need in order to plot the data.

spy.head()

Fix the sort order using the sort_values() function.

spy = spy.sort_values(by='Date')

Let's make the trading date the index for the Pandas DataFrame using the set_index() function.

spy.set_index('Date', inplace=True)
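The sorting and indexing steps can also be chained into one expression instead of using inplace=True. This is only a stylistic alternative, sketched here on a tiny DataFrame of made-up rows (`df` and its values are placeholders, not real SPY prices) so that it runs on its own.

```python
import pandas as pd

# Made-up rows in reverse chronological order; not real SPY prices.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2016-02-12', '2016-02-11', '2016-02-10']),
    'Close': [10.4, 10.0, 9.9],
})

# Sort oldest-first, then move the dates into the index.
df = df.sort_values(by='Date').set_index('Date')

# The index is now in ascending date order and can be used for lookups.
print(df.index.is_monotonic_increasing)
print(df.loc[pd.Timestamp('2016-02-11'), 'Close'])
```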

Plot the closing price of SPY over the entire date range in our DataFrame.

spy['Close'].plot(figsize=(16, 12))

Use the truncate() function to drop data prior to January 1st, 2015 and plot the result, i.e. only the data from January 1st, 2015 to the present. Notice that truncate() returns a new DataFrame and leaves the original unmodified, which is good because you'll most likely want all the data intact for later plots and analysis.

spy.truncate(before='2015-01-01')['Close'].plot(figsize=(16, 12))
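A quick way to convince yourself that truncate() leaves the original alone is to compare row counts before and after. The sketch below does that on a small date-indexed DataFrame of made-up values (not real SPY prices) and also shows that a label slice with .loc selects the same rows. Keep in mind that truncate() expects a sorted index, which is one more reason for the sort step earlier.

```python
import pandas as pd

# Made-up closing values on a sorted date index; not real SPY prices.
dates = pd.to_datetime(['2014-12-30', '2014-12-31', '2015-01-02', '2015-01-05'])
df = pd.DataFrame({'Close': [1.0, 2.0, 3.0, 4.0]}, index=dates)

recent = df.truncate(before='2015-01-01')  # new DataFrame, df is untouched
same = df.loc['2015-01-01':]               # equivalent label-based slice

print(len(df), len(recent))
print(recent.equals(same))
```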
