Gina Schmalzle -- geodesygina.com -- An Introduction to Plotting and Mapping in Python (2015-04-14)<div class="section" id="tutorial-on-matplotlib-and-basemap">
<h2><strong>Tutorial on Matplotlib and Basemap</strong></h2>
<p>On January 29, 2015 <a class="reference external" href="https://www.linkedin.com/pub/mark-blunk/5a/574/222">Mark Blunk</a> and I prepared a workshop on <a class="reference external" href="http://ipython.org/notebook.html">IPython Notebooks</a>, <a class="reference external" href="http://matplotlib.org/">Matplotlib</a> and <a class="reference external" href="http://matplotlib.org/basemap/">Basemap</a> held at <a class="reference external" href="http://adadevelopersacademy.org/">Ada Developers Academy</a> and sponsored by <a class="reference external" href="http://www.meetup.com/Seattle-PyLadies/">PyLadies Seattle</a>. This blog goes over the Matplotlib and Basemap components of the workshop. The code, contained within IPython notebooks, is located in <a class="reference external" href="https://github.com/ginaschmalzle/pyladies_matplotlib_ipython_notebooks">this Github Repo</a>.</p>
<p>The Matplotlib/Basemap part of the workshop focuses on:</p>
<p><a class="reference internal" href="#getting-to-the-basics-data-structures">1. Getting to the Basics -- Data Structures</a> -- Brief overview of the data structures used in this workshop.</p>
<p><a class="reference internal" href="#prepare-the-data">2. Prepare the data</a> -- Prepare our data for plotting.</p>
<p><a class="reference internal" href="#time-to-plot-general-scatter-plots">3. Time to Plot! General Scatter Plots</a> -- Make some simple scatter plots and learn how to change their attributes.</p>
<p><a class="reference internal" href="#histograms">4. Histograms!</a> -- Make some simple histograms and learn about how to extract information from them.</p>
<p><a class="reference internal" href="#mapping">5. Mapping</a> -- Make some maps, and let's throw some data on them too!</p>
<div class="section" id="the-data">
<h3>The Data</h3>
<p>I thought it would be fun to work with real data instead of some randomly generated data. The data we will use are <a class="reference external" href="https://raw.githubusercontent.com/ginaschmalzle/pyladies_matplotlib_ipython_notebooks/master/target_day_20140422.dat">modeled weather forecasts at weather stations across the United States</a>. This information was collected from the <a class="reference external" href="http://openweathermap.org/">OpenWeatherMap project</a>, which provides an API service to download weather forecasts but, unfortunately, does not keep a historical record of the forecasts (actual observations, yes, but modeled forecasts no). <a class="reference external" href="https://brannerchinese.com/">David Branner</a> and I were curious about how accurate the forecasts were, and wanted to keep the forecasts to see how well they perform over time. Hence, we created a <a class="reference external" href="https://github.com/WeatherStudy/weather_study">database</a> that collects the weather forecasts for these stations. The file target_day_20140422.dat that is in the <a class="reference external" href="https://github.com/ginaschmalzle/pyladies_matplotlib_ipython_notebooks">Github repo for this workshop</a> was extracted from our database and contains weather forecasts for each station in the United States for the 'target day' of April 22, 2014. The stations themselves are defined by their latitude and longitude, and the file contains forecasts that were made 0 to 7 days out, where day zero is the forecast made on April 22, 2014. Hence a forecast made one day out was made on April 21, two days out on April 20, etc.</p>
</div>
</div>
<div class="section" id="getting-to-the-basics-data-structures">
<h2>1. Getting to the Basics -- Data Structures</h2>
<p>A basic understanding of data structures is useful when playing with and visualizing data. If you are already familiar with data structures you can skip ahead to <a class="reference internal" href="#prepare-the-data">2. Prepare the data</a>.</p>
<p>In computer science, a data structure is a way of organizing data in a computer so that it can be used efficiently. Three basic data structures are used in this workshop: <em>lists</em>, <em>tuples</em> and <em>dictionaries</em>.</p>
<div class="section" id="lists">
<h3>Lists</h3>
<p>Lists represent a sequence of values. In Python, a list is designated with square brackets []. The following are examples of lists:</p>
<pre class="literal-block">
a = []
b = ['a', 'b', 'c']
c = [4,1,6,9,2,10]
d = [[1,2,3],['a','n','q']]
</pre>
<p>The items in these lists are called elements or items. You can find out how many elements a list contains by asking for its length:</p>
<pre class="literal-block">
print (len(d))
</pre>
<p>The example d above has two lists as elements. d is called a list of lists.</p>
<p>So how do you retrieve an element of a list? Each element is assigned a number, starting at 0, that represents where it sits in the list. For example, element 0 of b is 'a'. It can be retrieved like this:</p>
<pre class="literal-block">
b[0]
</pre>
<p>Now you try -- What is c[4]? How about d[2]?</p>
<p>Lists are great because they are very simple to understand and take up relatively little memory. However, they do have limitations. Say you have a long list of values and you want to check whether a certain value is in it: you potentially have to read through every item in the list before you find out. Hence, membership tests on long lists can be computationally slow.</p>
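<p>Putting indexing, length, and the membership test together in one quick sketch (using the example lists above):</p>
<pre class="literal-block">
b = ['a', 'b', 'c']
c = [4, 1, 6, 9, 2, 10]

print(b[0])    # 'a' -- the first element
print(c[4])    # 2 -- the fifth element
print(len(c))  # 6 elements

# A membership test scans the list from the front, so on a very
# long list this can be slow:
print(10 in c)  # True, found only after checking every earlier element
</pre>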
</div>
<div class="section" id="tuples">
<h3>Tuples</h3>
<p>Tuples are similar to lists in that they also represent a sequence of values, but they have one very special property: they are immutable. This means that once they are created they cannot be changed. They are represented by parentheses () rather than square brackets. So, in Python, you could define a tuple like this:</p>
<pre class="literal-block">
a = ()
b = (32, 41)
c = ('x', 'y')
</pre>
<p>Similar to lists, you can access a specific element like so:</p>
<pre class="literal-block">
b[1]
</pre>
<p>This would produce the output of 41.</p>
<p>Tuples seem a lot more restrictive than a list, so you may ask, why would you ever use a tuple? Tuples are useful when you would like to describe something that needs multiple values to make sense, and these values cannot change. For example, you can create a tuple of a location on the surface of the earth that contains a latitude and longitude. The location would not make sense if one of those values were wrong or missing. Hence, having an immutable property that describes its location is appropriate in this case.</p>
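<p>As a quick sketch of that lat/lon idea, here is a coordinate tuple (the values are the sample station used later in this post), along with what happens if you try to change it:</p>
<pre class="literal-block">
location = (40.51218, -111.47435)  # (latitude, longitude)
print(location[0])  # 40.51218

# Tuples are immutable -- assigning to an element raises a TypeError:
try:
    location[0] = 0.0
except TypeError as err:
    print('Cannot modify a tuple:', err)
</pre>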
</div>
<div class="section" id="dictionaries">
<h3>Dictionaries</h3>
<p>Also known as associative arrays, maps, symbol tables or hash tables, this data structure is computationally fast but uses a lot of memory. A dictionary consists of key-value pairs, where the keys are all unique and each refers to a specific value. Different keys can map to identical values, however. Dictionaries are designated with curly brackets {}. Here are examples of dictionaries:</p>
<pre class="literal-block">
dict_a = {}
dict_b = {'Hello beautiful': 'Ew, Gross', 'Goodbye Gorgeous':'Finally'}
dict_c = {'Bad Pickup Lines': {'example 1': 'Did it hurt when you fell from heaven?',
                               'example 2': 'Do you always wear your shoes over your socks?'}}
</pre>
<p>For dict_b, you can think of a bad pickup line as the 'key' to your response, or 'value'. For example, if someone said:</p>
<pre class="literal-block">
dict_b['Hello beautiful']
</pre>
<p>the response would be:</p>
<pre class="literal-block">
'Ew, Gross'
</pre>
<p>For dict_c, we have a dictionary of dictionaries. Here we have a dictionary of bad pickup lines that contain examples. To get to a nested dictionary, say you want the value for 'example 2', you would type:</p>
<pre class="literal-block">
dict_c['Bad Pickup Lines']['example 2']
</pre>
<p>Get it? If you need more help, I've put together a <a class="reference external" href="http://geodesygina.com/dict.html">post on dictionaries here</a>.</p>
<p>The great thing about dictionaries is that even with a lot of data, if we know the key we can very quickly get the associated value. If this information were in a list, it <em>could</em> take a long time to read through the list to find the value you want. The downside is that dictionaries can take up a lot of memory, but that's not a problem for this exercise on most modern computers.</p>
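<p>To see that lookup speed in action (dict_b is from the examples above; the 'Hey there' key and 'No comment' default are made up for illustration):</p>
<pre class="literal-block">
dict_b = {'Hello beautiful': 'Ew, Gross', 'Goodbye Gorgeous': 'Finally'}

# Direct lookup by key is fast no matter how big the dictionary gets:
print(dict_b['Hello beautiful'])  # 'Ew, Gross'

# A missing key raises a KeyError; .get() returns a default instead:
print(dict_b.get('Hey there', 'No comment'))  # 'No comment'
</pre>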
</div>
</div>
<div class="section" id="prepare-the-data">
<h2>2. Prepare the data</h2>
<div class="section" id="retrieving-the-data">
<h3>Retrieving the data</h3>
<p>In this section we focus on reading in data and putting it into an appropriate data structure. These <a class="reference external" href="https://raw.githubusercontent.com/ginaschmalzle/pyladies_matplotlib_ipython_notebooks/master/target_day_20140422.dat">'data'</a> are modeled weather forecasts for individual weather stations across the United States. (I put quotes on data because these are modeled solutions, not actual observations). The file that will be read contains the forecast for one day (April 22, 2014) for 0 to 7 days prior, where the 0th day is the forecast on April 22nd:</p>
<pre class="literal-block">
# Read file
filename='target_day_20140422.dat'
f = open(filename, 'r')
contents = f.readlines()
</pre>
<p>Where contents looks like this:</p>
<pre class="literal-block">
['Lat, Lon, days_out, MaxT, MinT \n',
'38.576698 -92.173523 0 18.71 6.97\n',
'38.576698 -92.173523 1 21.03 8.7\n',
'38.576698 -92.173523 2 20.67 9.72\n',
'38.576698 -92.173523 3 19.01 7.23\n',
'38.576698 -92.173523 4 22.08 9.07\n',
'38.576698 -92.173523 5 21.68 9.53\n',
'38.576698 -92.173523 6 22.33 10.22\n',
'38.576698 -92.173523 7 16.18 12.14\n',
'34.154179 -117.344208 0 17.37 6.16\n',
'34.154179 -117.344208 1 19.66 7.48\n',
'34.154179 -117.344208 2 21.24 6.27\n',
'34.154179 -117.344208 3 21.71 5.5\n',
'34.154179 -117.344208 4 18.34 8.88\n', ...]
</pre>
<p>A couple of things here -- we have a list of strings, where the end of each string is marked with a '\n' (newline) character. This marker indicates the end of a line in the file and will need to be accounted for when we ingest the data into a usable form.</p>
<p>Let's make a dictionary of values, where lat, long are the keys (in tuple form). The values are also dictionaries, where the number of days out are the keys, and MaxT and MinT are the values:</p>
<pre class="literal-block">
forecast_dict = {}
for line in range(1, len(contents)):
    line_split = contents[line].split(' ')
    try:
        forecast_dict[line_split[0], line_split[1]][line_split[2]] = {'MaxT': float(line_split[3]),
                                                                      'MinT': float(line_split[4][:-1])}
    except KeyError:
        forecast_dict[line_split[0], line_split[1]] = {}
        forecast_dict[line_split[0], line_split[1]][line_split[2]] = {'MaxT': float(line_split[3]),
                                                                      'MinT': float(line_split[4][:-1])}
</pre>
<p>Here forecast_dict looks like this:</p>
<pre class="literal-block">
{('19.068609', '-155.764999'): {'0': {'MaxT': 25.67, 'MinT': 24.45},
'1': {'MaxT': 25.88, 'MinT': 24.66},
'2': {'MaxT': 25.17, 'MinT': 24.49},
'3': {'MaxT': 25.67, 'MinT': 24.37},
'4': {'MaxT': 25.35, 'MinT': 23.76},
'5': {'MaxT': 24.57, 'MinT': 23.27},
'6': {'MaxT': 24.26, 'MinT': 23.33},
'7': {'MaxT': 24.71, 'MinT': 23.78}},
('19.43083', '-155.237778'): {'0': {'MaxT': 25.38, 'MinT': 23.41},
'1': {'MaxT': 25.39, 'MinT': 22.47},
'2': {'MaxT': 24.77, 'MinT': 23.35},
'3': {'MaxT': 25.38, 'MinT': 22.45},
'4': {'MaxT': 24.36, 'MinT': 22.5},
'5': {'MaxT': 23.92, 'MinT': 22.57},
'6': {'MaxT': 23.21, 'MinT': 22.45},
'7': {'MaxT': 23.56, 'MinT': 22.68}},...
</pre>
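<p>As an aside, the try/except dance above can also be written with dict.setdefault, which fetches the per-station dictionary or creates it if it's missing. This is just a sketch against a couple of lines from the file (note that split() with no argument also swallows the trailing newline, so no slicing is needed):</p>
<pre class="literal-block">
contents = ['Lat, Lon, days_out, MaxT, MinT \n',
            '38.576698 -92.173523 0 18.71 6.97\n',
            '38.576698 -92.173523 1 21.03 8.7\n']

forecast_dict = {}
for line in contents[1:]:  # skip the header line
    lat, lon, day, maxt, mint = line.split()
    station = forecast_dict.setdefault((lat, lon), {})
    station[day] = {'MaxT': float(maxt), 'MinT': float(mint)}

print(forecast_dict[('38.576698', '-92.173523')]['1']['MaxT'])  # 21.03
</pre>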
<p>So now we have, for each site (defined by its latitude and longitude), the Maximum Temperature (MaxT) and Minimum Temperature (MinT) for each forecast made from the day of (day '0') to 7 days prior. It's pretty easy to retrieve the stations (and hence the latitudes and longitudes) by typing:</p>
<pre class="literal-block">
forecast_dict.keys()
</pre>
<p>which gives:</p>
<pre class="literal-block">
[('37.224239', '-95.708313'),
('27.53587', '-82.561211'),
('32.709301', '-96.008301'),
('42.09808', '-88.28286'),
('36.424229', '-89.057007'),
('36.98801', '-121.956627'),
('43.02496', '-108.380096'),
('41.802601', '-71.88591'),
('37.99548', '-122.332748'),
('43.416679', '-86.35701'),
('41.85371', '-71.758118'),...
</pre>
<p>And you can extract values for a random station by selecting one of these keys, e.g.:</p>
<pre class="literal-block">
forecast_dict[('40.51218', '-111.47435')]
</pre>
<p>gives you:</p>
<pre class="literal-block">
{'0': {'MaxT': 17.45, 'MinT': 2.04},
'1': {'MaxT': 17.95, 'MinT': 5.84},
'2': {'MaxT': 18.33, 'MinT': 7.99},
'3': {'MaxT': 18.16, 'MinT': 7.7},
'4': {'MaxT': 13.75, 'MinT': 3.62},
'5': {'MaxT': 14.58, 'MinT': 9.23},
'6': {'MaxT': 14.58, 'MinT': 9.23},
'7': {'MaxT': 13.08, 'MinT': -2.99}}
</pre>
<p>The output above shows the forecasted MaxT and MinT values for 0-7 days prior for a specific station at Latitude 40.51218N, Longitude 111.47435W.</p>
</div>
<div class="section" id="prepare-our-data-for-plotting">
<h3>Prepare our data for Plotting</h3>
<p>The plot will be Max T vs. day out for this one station. It will be a simple plot, but first, we need to make some lists that matplotlib can use to do the plotting. We will need a list of days, and a list of corresponding Max T values:</p>
<pre class="literal-block">
# First retrieve the days
day_keys = forecast_dict[('40.51218', '-111.47435')].keys()
</pre>
<p>day_keys gives you:</p>
<pre class="literal-block">
['1', '0', '3', '2', '5', '4', '7', '6']
</pre>
<p>Dictionary keys are not stored in alphabetical or numerical order, so let's sort them:</p>
<pre class="literal-block">
day_keys.sort()
</pre>
<p>which sorts day_keys in place (sort() itself returns None), leaving:</p>
<pre class="literal-block">
['0', '1', '2', '3', '4', '5', '6', '7']
</pre>
<p>Matplotlib plots lists of one thing against another. So, let's make our lists:</p>
<pre class="literal-block">
# First define the variables as lists
day_list = []; maxt_list = []
# Then populate the lists
for day_key in day_keys:
    day_list.append(float(day_key))
    maxt_list.append(float(forecast_dict[('40.51218', '-111.47435')][day_key]['MaxT']))
</pre>
<p>Now, for a given index, the element in one list corresponds to the element in the other. For example, day_list[0] corresponds to maxt_list[0].</p>
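<p>The fetch-sort-populate steps above can also be condensed with sorted(), which returns a new sorted list of the keys and leaves the dictionary alone (a sketch with a trimmed-down station dictionary standing in for forecast_dict):</p>
<pre class="literal-block">
station = {'1': {'MaxT': 17.95}, '0': {'MaxT': 17.45}, '2': {'MaxT': 18.33}}

day_list = []
maxt_list = []
for day_key in sorted(station):  # iterates over the keys in sorted order
    day_list.append(float(day_key))
    maxt_list.append(station[day_key]['MaxT'])

print(day_list)   # [0.0, 1.0, 2.0]
print(maxt_list)  # [17.45, 17.95, 18.33]
</pre>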
</div>
</div>
<div class="section" id="time-to-plot-general-scatter-plots">
<h2>3. Time to Plot! General Scatter Plots</h2>
<p>First let's import everything we will need:</p>
<pre class="literal-block">
# In IPython or an IPython notebook only:
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import numpy as np
</pre>
<p>Our most simple scatter plot can be made by typing:</p>
<pre class="literal-block">
plt.scatter(day_list, maxt_list)
# Let's add a line --
plt.plot(day_list, maxt_list)
</pre>
<p>This gives you:</p>
<img alt="simple_scatter" class="align-right" src="/images/simple_scatter.png" style="width: 400.0px; height: 300.0px;" />
<p>Now let's jazz it up a bit -- let's make the lines red and dashed, change the markers to stars, make them green, and change their size. Also, how is one to know what you just plotted? Let's add axis labels and a title:</p>
<pre class="literal-block">
plt.plot(day_list, maxt_list, '.r--')
plt.scatter(day_list, maxt_list, s = 400, color='green', marker='*')
plt.ylabel ('Forecasted Max Temperature, Deg C')
plt.xlabel ('Days from Target day April 22, 2014')
plt.title ('Forecasted Max Temperature')
plt.show()
</pre>
<p>This will give you:</p>
<img alt="fancy_scatter" class="align-right" src="/images/fancy_scatter.png" style="width: 400.0px; height: 300.0px;" />
<p>Click <a class="reference external" href="http://matplotlib.org/api/markers_api.html">here for more marker fun</a>, and more <a class="reference external" href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot">info on pretty-ing up lines can be found here</a>.</p>
<p>Getting the idea?</p>
<p>Let's do another plot and this time look at all of the Max Temperature forecasts 2 days out, and plot them with respect to Latitude. We will need to pick out from forecast_dict all the Max T values for all of the weather stations made 2 days before April 22, 2014. First, we will need to get all the Latitudes and Longitudes for each site, then we will need to pick out all the Max T values for each of the stations for that day.</p>
<p>We will keep in mind that maybe in the future you might want to look at Min T, or a different day:</p>
<pre class="literal-block">
# Get keys of forecast_dict (lats and lons):
keys = forecast_dict.keys()
# Cycle through all the keys to get the values for the 2nd-day maximum temperature
# and the corresponding lats and lons
day_out = '2' # 0-7
temp = 'MaxT' # MaxT or MinT
temperature = []; lat = []; lon = []
for key in keys:
    temperature.append(float(forecast_dict[key][day_out][temp]))
    lat.append(float(key[0]))
    lon.append(float(key[1]))
# Now that those are collected, let's see what the temperature as a function of latitude is:
plt.scatter(temperature, lat)
</pre>
<p>This will give you:</p>
<img alt="blue_t_v_lon" class="align-right" src="/images/blue_t_v_lon.png" style="width: 400.0px; height: 300.0px;" />
<div class="section" id="coloring-points-in-a-scatter-plot">
<h3>Coloring Points in a Scatter Plot</h3>
<p>Let's try again, but this time, color according to Longitude. Again, let's keep in mind we may want to color by something else. You can try playing with these:</p>
<pre class="literal-block">
color_by = lon
label = 'Long' # Need to rename if 'color_by' is changed
max_color_by = max(color_by)
min_color_by = min(color_by)
fig, ax = plt.subplots()
s = ax.scatter(temperature, lat,
               c=color_by,
               s=200,
               marker='o',            # Plot circles
               # alpha = 0.2,
               cmap=plt.cm.coolwarm,  # Color palette
               vmin=min_color_by,     # Min value
               vmax=max_color_by)     # Max value
cbar = plt.colorbar(mappable = s, ax = ax) # Mappable 'maps' the values of s to an array of RGB colors defined by a color palette
cbar.set_label(label)
plt.xlabel('{0} in Deg C, forecasted {1} days out'.format(temp,day_out))
plt.ylabel('Latitude, Deg N')
plt.title('{0} forecasted {1} Days out from target day April 22, 2014'.format(temp,day_out))
plt.show()
</pre>
<p>And now you have color:</p>
<img alt="color_t_v_lon" class="align-right" src="/images/color_t_v_lon.png" style="width: 400.0px; height: 300.0px;" />
<p><a class="reference external" href="http://matplotlib.org/users/colormaps.html">Click here for more color mapping fun</a>.</p>
<p>Any ideas what the blue blobs are? (Hint: they are not part of the contiguous United States!)</p>
</div>
</div>
<div class="section" id="histograms">
<h2>4. Histograms!</h2>
<p>Let's take a step back and work on a histogram.
What we are going to plot is the distribution of forecasted temperatures.
Let's start with a very simple histogram of the temperature we left off with:</p>
<pre class="literal-block">
plt.hist(temperature)
plt.ylabel ('Counts')
plt.xlabel(temp)
plt.show()
</pre>
<p>This gives you a very simple histogram that looks like this:</p>
<img alt="simple_hist" class="align-right" src="/images/simple_hist.png" style="width: 400.0px; height: 300.0px;" />
<p>Now let's try again and jazz it up... Let's increase the number of bins (the bin size is the difference between the min and max values divided by the number of bins). Let's also change the color of the bars and make them a little translucent.</p>
<img alt="green_hist" class="align-right" src="/images/green_hist.png" style="width: 400.0px; height: 300.0px;" />
<p>Matplotlib histograms also hand back some information about themselves. Let's explore:</p>
<pre class="literal-block">
n, bins, patches = plt.hist(temperature, 10, color='green', alpha=0.2)
</pre>
<p>Note that I've fattened up the bins again for this example...
n holds the count in each bin:</p>
<pre class="literal-block">
[ 69., 322., 1078., 1732., 2243., 2285., 2421., 1267., 275., 38.]
</pre>
<p>bins holds the bin edges -- note there are 11 edges bounding the 10 bins:</p>
<pre class="literal-block">
[ 0.91 , 4.425, 7.94 , 11.455, 14.97 , 18.485, 22., 25.515, 29.03 , 32.545, 36.06 ]
</pre>
<p>And patches is a list of the matplotlib rectangle shapes that draw the bars.</p>
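<p>Since bins holds the 11 edges that bound the 10 bars, the center of each bar is the midpoint of consecutive edges. A small sketch using the values printed above:</p>
<pre class="literal-block">
n = [69, 322, 1078, 1732, 2243, 2285, 2421, 1267, 275, 38]
bins = [0.91, 4.425, 7.94, 11.455, 14.97, 18.485, 22.0,
        25.515, 29.03, 32.545, 36.06]

# Midpoint of each pair of consecutive edges:
centers = [(left + right) / 2.0 for left, right in zip(bins[:-1], bins[1:])]

print(len(centers))  # 10 -- one center per count in n
print(sum(n))        # 11730 forecasts binned in total
</pre>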
</div>
<div class="section" id="mapping">
<h2>5. Mapping</h2>
<p>Now that we have the basics down, let's start with mapping!
We will be using Matplotlib's basemap: <a class="reference external" href="http://matplotlib.org/basemap/">http://matplotlib.org/basemap/</a>.</p>
<p>Let's make a simple Mercator Projection Map. The code in the next cell is straight from the Basemap example section -- <a class="reference external" href="http://matplotlib.org/basemap/users/merc.html">http://matplotlib.org/basemap/users/merc.html</a>:</p>
<pre class="literal-block">
# Define the projection, scale, the corners of the map, and the resolution.
m = Basemap(projection='merc',llcrnrlat=-80,urcrnrlat=80,\
llcrnrlon=-180,urcrnrlon=180,lat_ts=20,resolution='c')
# Draw the coastlines
m.drawcoastlines()
# Color the continents
m.fillcontinents(color='coral',lake_color='aqua')
# draw parallels and meridians.
m.drawparallels(np.arange(-90.,91.,30.))
m.drawmeridians(np.arange(-180.,181.,60.))
# fill in the oceans
m.drawmapboundary(fill_color='aqua')
plt.title("Mercator Projection")
plt.show()
</pre>
<p>llcrnrlat,llcrnrlon,urcrnrlat,urcrnrlon are the lat/lon values of the lower left and upper right corners of the map.
lat_ts is the latitude of true scale.
resolution = 'c' means use crude resolution coastlines.</p>
<p>And here is the result:</p>
<img alt="default_map" class="align-right" src="/images/default_map.png" style="width: 400.0px; height: 300.0px;" />
<p>Now let's change this map to do what we need. Let's:</p>
<ol class="arabic simple">
<li>Change the area to the continental United States</li>
<li>Increase the resolution to intermediate ('i')</li>
<li>Remove the horrific ocean/land colors provided above</li>
</ol>
<pre class="literal-block">
m = Basemap(projection='merc',llcrnrlat=20,urcrnrlat=50,\
llcrnrlon=-130,urcrnrlon=-60,lat_ts=20,resolution='i')
m.drawcoastlines()
m.drawcountries()
#m.drawstates()
# draw parallels and meridians.
parallels = np.arange(-90.,91.,5.)
# Label the meridians and parallels
m.drawparallels(parallels,labels=[False,True,True,False])
# Draw Meridians and Labels
meridians = np.arange(-180.,181.,10.)
m.drawmeridians(meridians,labels=[True,False,False,True])
m.drawmapboundary(fill_color='white')
plt.title("Forecast {0} days out".format(day_out))
plt.show()
</pre>
<p>Now the map looks like this:</p>
<img alt="us" class="align-right" src="/images/us.png" style="width: 500.0px; height: 300.0px;" />
<p>Awesome, now we have the area of our interest -- a map of the contiguous United States. Let's put some data on this map. First, let's just start by putting the points on the map. Here I am just going to make some small changes to the code in the previous code block -- namely, I am going to take the latitudes and longitudes from our dataset and convert them into the map's projection. In this case, they will be converted into the Mercator projection I've defined:</p>
<pre class="literal-block">
m = Basemap(projection='merc',llcrnrlat=20,urcrnrlat=50,\
llcrnrlon=-130,urcrnrlon=-60,lat_ts=20,resolution='i')
m.drawcoastlines()
m.drawcountries()
# draw parallels and meridians.
parallels = np.arange(-90.,91.,5.)
# Label the meridians and parallels
m.drawparallels(parallels,labels=[False,True,True,False])
# Draw Meridians and Labels
meridians = np.arange(-180.,181.,10.)
m.drawmeridians(meridians,labels=[True,False,False,True])
m.drawmapboundary(fill_color='white')
plt.title("Forecast {0} days out".format(day_out))
x,y = m(lon, lat) # This is the step that transforms the data into the map's projection
m.plot(x,y, 'bo', markersize=5)
plt.show()
</pre>
<p>Now we have a map with the location of the weather stations mapped:</p>
<img alt="us" class="align-right" src="/images/blue_us.png" style="width: 500.0px; height: 300.0px;" />
<p>This is nice and all, but it would be great if we could color each of the points by their forecasted maximum temperature -- so let's do that! Here we have to define what points we want to color, and what we want to color them by:</p>
<pre class="literal-block">
m = Basemap(projection='merc',llcrnrlat=20,urcrnrlat=50,\
llcrnrlon=-130,urcrnrlon=-60,lat_ts=20,resolution='i')
m.drawcoastlines()
m.drawcountries()
# draw parallels and meridians.
parallels = np.arange(-90.,91.,5.)
# Label the meridians and parallels
m.drawparallels(parallels,labels=[True,False,False,False])
# Draw Meridians and Labels
meridians = np.arange(-180.,181.,10.)
m.drawmeridians(meridians,labels=[True,False,False,True])
m.drawmapboundary(fill_color='white')
plt.title("Forecast {0} days out".format(day_out))
# Define a colormap
jet = plt.cm.get_cmap('jet')
# Transform points into Map's projection
x,y = m(lon, lat)
# Color the transformed points!
sc = plt.scatter(x,y, c=temperature, vmin=0, vmax =35, cmap=jet, s=20, edgecolors='none')
# And let's include that colorbar
cbar = plt.colorbar(sc, shrink = .5)
cbar.set_label(temp)
plt.show()
</pre>
<p>And finally, now we have a map with colored points:</p>
<img alt="us" class="align-right" src="/images/color_us.png" style="width: 500.0px; height: 300.0px;" />
<p>Interested in playing with this more on your own? Here are a few exercises you can try:</p>
<blockquote>
<ol class="arabic simple">
<li>In the first graph -- include the weather forecast through time for multiple stations. Color each set of lines differently for each weather station. Also color the points differently for each.</li>
<li>In the second graph -- Try creating a figure with subplots and show the forecasted Max Temperature and forecasted Min Temperature as a function of Latitude side by side.</li>
<li>In the histogram -- Try overlaying the distribution of Max T values for day 2 with the distribution of Min T values for the same day.</li>
<li>For the map -- Create a figure with multiple maps, where each map shows the forecasted distribution of temperature for each day out. Change the location of labels.</li>
<li>What is the difference between the temperature forecast made on April 22, 2014 and the forecasts made on previous days? Can you map the differences?</li>
</ol>
</blockquote>
<p>That's it for this workshop! Hope you had fun, and I would love to see what you come up with!</p>
</div>
<div class="section" id="more-info-on-my-code">
<h2><strong>More Info on My Code</strong></h2>
<p>Interested in using the notebooks? Check out my <a class="reference external" href="https://github.com/ginaschmalzle/pyladies_matplotlib_ipython_notebooks">Github page</a> which includes the codes, data and instructions on how to use them. Any comments or suggestions are welcome!</p>
</div>
<div class="section" id="acknowledgements">
<h2><strong>Acknowledgements</strong></h2>
<p>Thanks to <a class="reference external" href="http://www.meetup.com/Seattle-PyLadies/">PyLadies Seattle</a>, specifically <a class="reference external" href="http://www.erinshellman.com/">Erin Shellman</a> and <a class="reference external" href="https://www.linkedin.com/pub/wendy-grus/12/1a6/8ba">Wendy Grus</a> for organizing this fun little workshop! Also many thanks to <a class="reference external" href="http://adadevelopersacademy.org/">Ada Developers Academy</a> for providing the space.</p>
</div>
The Million Song Database and Recommendation Systems (2014-07-27)<div class="section" id="building-recommendation-systems">
<h2><strong>Building Recommendation Systems</strong></h2>
<p>Recommender systems filter information to predict how much a user would like a given item. Companies like Netflix and Tivo use these types of filtering algorithms to try to figure out what a person will want. Unfortunately, these systems are not perfect, and sometimes can go horribly wrong, as elegantly described by Patton Oswalt on the Conan O'Brien show:</p>
<div class="youtube youtube-16x9"><iframe src="https://www.youtube.com/embed/tdzIXkj1OfA?start=195&end=272&version=3" allowfullscreen seamless frameBorder="0"></iframe></div><p>Yes, bad Tivo.</p>
<p>So how do we improve recommender systems? Companies as well as academics are trying hard to figure this out. Fortunately, some groups have released large datasets so that anyone can play with them and try to solve these issues. One such publicly available dataset is <a class="reference external" href="http://labrosa.ee.columbia.edu/millionsong/">The Million Song Dataset</a> -- a perfect dataset for building recommender systems! So, I thought I would give it a try.</p>
<p>For this project, I focused on the <a class="reference external" href="http://labrosa.ee.columbia.edu/millionsong/tasteprofile">Taste Profile subset</a> provided by Echonest, which includes information on user play lists, to build the recommenders located on my <a class="reference external" href="https://github.com/ginaschmalzle/million_song">Github page</a>. I built two recommenders: one that figures out what songs a user would like given an input of a selected song, and another that recommends songs based on what the user already has in their play list.</p>
<p>Both recommenders use a combination of <a class="reference external" href="http://en.wikipedia.org/wiki/Collaborative_filtering">collaborative filtering techniques</a> with vote counting. Collaborative filtering makes recommendations by collecting taste preferences and comparing them to other users. Here we assume that others who have the same song in their play list have similar tastes. Therefore, songs in those other users' play lists would be good ones to recommend. In these recommenders I ultimately get to a list of songs that were provided by other users. I then count up how many times a song appears in other people's play lists (vote counting) and spit out the top counted songs as the top recommended songs. In this blog I briefly describe the approach for both the simple, single-song recommender and the slightly more complex recommender for users with a play list.</p>
</div>
<div class="section" id="the-data">
<h2><strong>The Data</strong></h2>
<p>The <a class="reference external" href="http://labrosa.ee.columbia.edu/millionsong/tasteprofile">Taste Profile subset</a> contains over a million users with over 380,000 unique songs. I only use a very small subset of data that includes:</p>
<ol class="arabic simple">
<li>A unique user ID</li>
<li>All the songs in the user's play list, including:</li>
</ol>
<blockquote>
<ul class="simple">
<li>Song name and id</li>
<li>Artist name and id</li>
<li>The number of times the song was played by the user</li>
</ul>
</blockquote>
</div>
<div class="section" id="the-simple-recommender">
<h2><strong>The Simple Recommender</strong></h2>
<p>For my simple recommender I don't know anything about the person selecting the song. All I know is the selected artist and song. The steps for this recommender include:</p>
<ol class="arabic simple">
<li>Find all users that have the song in their play list</li>
<li>Make a list of all songs from each person's play list</li>
<li>Count how many times a unique song appears in the list</li>
<li>Print out the songs, excluding the original input song, in order of most counts</li>
</ol>
<p>Easy cheesy, right?</p>
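<p>The four steps can be sketched in a few lines with collections.Counter. The play lists below are made-up toy data, not the actual Taste Profile records:</p>
<pre class="literal-block">
from collections import Counter

# Hypothetical play lists: user id -> list of song ids
playlists = {
    'u1': ['yeah', 'burn', 'confessions'],
    'u2': ['yeah', 'burn', 'hey_ya'],
    'u3': ['hey_ya', 'milkshake'],
}

def recommend_from_song(song, playlists, top_n=3):
    # 1. Find all users that have the song in their play list
    fans = [user for user, songs in playlists.items() if song in songs]
    # 2. and 3. Pool those users' other songs and count the votes
    votes = Counter(s for user in fans for s in playlists[user] if s != song)
    # 4. Most-counted songs first, with the input song excluded
    return [s for s, count in votes.most_common(top_n)]

print(recommend_from_song('yeah', playlists))  # 'burn' gets two votes and tops the list
</pre>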
<p>To illustrate the outcome of this recommender, here is a plot of the top 10 most counted songs from other people's play lists given the song Yeah! by Usher (keep in mind these are the counts for my much smaller subset of data):</p>
<img alt="top10" class="align-center" src="/images/top10.jpg" style="width: 700.0px; height: 700.0px;" />
</div>
<div class="section" id="adding-user-play-list-into-a-recommender">
<h2><strong>Adding User Play List into a Recommender</strong></h2>
<p>Adding a user play list into a recommender is slightly more complex. Here, I want to know what other users are most similar to the recommendee (for lack of a better term, I define the recommendee as the person who is going to get the recommendation), then suggest songs from the similar users' play lists. The steps for this recommender include:</p>
<ol class="arabic simple">
<li>For each song in the recommendee play list, make a list of all users that also have that song in their play list.</li>
<li>Count the number of times a unique user is in the list. The user with the most counts is the most similar to the recommendee.</li>
<li>Pick the most similar users and concatenate a list of songs that were not in the recommendee's play list.</li>
<li>Count the number of times a song shows up in the list</li>
<li>Print out the songs in order of most counted</li>
</ol>
<p>Slightly more complicated than the simple recommender, but generally the same idea.</p>
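<p>A sketch of this version too, again with made-up play lists; the similarity step is just another round of counting:</p>
<pre class="literal-block">
from collections import Counter

# Hypothetical play lists: user id -> set of song titles.
play_lists = {
    'me':     {'Song A', 'Song B'},
    'user_1': {'Song A', 'Song B', 'Song C'},
    'user_2': {'Song A', 'Song D'},
    'user_3': {'Song E'},
}

def recommend_for_user(user, play_lists, top_users=2, top_n=10):
    """Recommend songs from the play lists of the most similar users."""
    mine = play_lists[user]
    # Steps 1-2: count how many of my songs each other user shares.
    overlap = Counter()
    for other, songs in play_lists.items():
        if other != user:
            overlap[other] = len(mine.intersection(songs))
    # Step 3: gather songs from the most similar users that I do not have.
    votes = Counter()
    for other, shared in overlap.most_common(top_users):
        if shared:  # Skip users with nothing in common
            votes.update(play_lists[other] - mine)  # Step 4: count them
    return [song for song, n in votes.most_common(top_n)]  # Step 5
</pre>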
</div>
<div class="section" id="pitfalls">
<h2><strong>Pitfalls</strong></h2>
<p>There are issues with these simple approaches. They work well for the small data set that I downloaded, but as the dataset gets larger, the lists and dictionaries that I make in my code also get larger. So, this approach will take up increasing amounts of memory to make my lists, and increasing amounts of time to sort the lists and count the number of songs. <a class="reference external" href="http://en.wikipedia.org/wiki/Collaborative_filtering">Model-based approaches</a> help to minimize these issues. Another issue is making recommendations based on new songs, or songs that very few people have listened to. In these cases other information about the song, such as genre, would be needed to make recommendations.</p>
</div>
<div class="section" id="more-info-on-my-code">
<h2><strong>More Info on My Code</strong></h2>
<p>Interested in using my recommenders? Check out my <a class="reference external" href="https://github.com/ginaschmalzle/million_song">Github page</a> which includes the codes, instructions on how to use them, and some more information on how the codes work. Any comments or suggestions are welcome!</p>
</div>
<div class="section" id="acknowledgements">
<h2><strong>Acknowledgements</strong></h2>
<p>Thanks to <a class="reference external" href="http://www.linkedin.com/pub/stella-rowlett/0/797/118">Stella Rowlett</a>, <a class="reference external" href="http://jasongowans.net/">Jason Gowans</a> and <a class="reference external" href="http://www.linkedin.com/in/manjudotorg">Manju Muthukumaresan</a> for suggesting this project!</p>
</div>
My big fat shoe-shopping adventure: Iterative sampling in R2014-07-27T14:56:00-04:00Gina Schmalzle and Craig Fauncetag:geodesygina.com,2014-07-27:sampling.html<div class="section" id="r-helped-me-figure-out-how-many-shoes-i-can-buy">
<h2><strong>R helped me figure out how many shoes I can buy</strong></h2>
<p>One of the things I love about coding and data science is that I get to work on a lot of interesting problems. One of my good friends, <a class="reference external" href="https://www.linkedin.com/pub/craig-h-faunce/66/789/1ba">Craig Faunce</a>, approached me over a beer with a problem. It seems he had been asked to determine how many items he could buy given a certain budget. Ok, if each and every item costs the same this is simple math, which had me puzzled as to why he was asking. Of course it’s not that easy, since each and every item has a different cost. Ok, still not that difficult. It only becomes something that I think you would be interested in when he gets to the next part, where he says: "I'm asked to sample one population of items at a given rate, and then with my left-over money, determine at what rate I can afford to sample a second, totally different population of items with totally different costs per item."</p>
<p>Ok! We have an interesting little sampling project. Since Craig works for a large employer, he can't really divulge every gory detail about this issue, and obviously getting the real data isn't going to happen here. Besides, it sounded pretty boring to me, so I thought about something that I can relate to - shoes!</p>
<img alt="if_the_shoe_fits" class="align-center" src="/images/shoefits.jpg" style="width: 600.0px; height: 500.0px;" />
<p><a class="reference external" href="http://www.kulfoto.com/funny-pictures/49597/if-the-shoe-fits-buy-it-in-every-color">Figure 1</a> Ahh, too cute...</p>
<p>So I reframed the questions.</p>
<p>My first question is: If this year (hopefully during a big Sale) I were to blindly have an assigned shopper (or better yet, a blind assigned shopper) randomly buy a set percentage of the store, how much money would I spend? The reason we want to sample in this exercise is due to the fact that the answer depends on which shoes are purchased, since each one has a different price. So we are interested in building a distribution of potential outcomes from shoe-shopping, so we can build a range of likely outcomes from the adventure.</p>
<p>We will need the following libraries:</p>
<pre class="literal-block">
require(plyr)
require(ggplot2)
</pre>
<p>The actual data doesn't really matter for this exercise, so let's generate some with these parameters:</p>
<pre class="literal-block">
nshoe1 <- 1000 # Number of shoes in the store in the first year.
meanprice1 <- 100 # Mean price of shoes in the first year.
pricesd1 <- 50 # Standard deviation of the price of shoes in the first year.
R <- 0.01 # The sampling rate of my shopper in the first year.
it <- 200 # The number of iterations to build our distribution of outcomes.
</pre>
<p>I created a makedata function to create a dataframe in R consisting of nshoe rows with the associated price (called bucks) generated from a known distribution (in this case the normal, but who cares?) with a mean price of meanprice1 and a standard deviation of pricesd1:</p>
<pre class="literal-block">
makedata <- function (numberofshoes, dm, sdv){
  # Assign number of shoes
  df <- data.frame(shoes = seq(1:numberofshoes))
  # Assign random # of bucks for each shoe
  df$bucks <- rnorm(n = numberofshoes, mean = dm, sd = sdv)
  return (df)
}
</pre>
<p>The function sampleme samples from the dataframe that was created from the makedata function above:</p>
<pre class="literal-block">
sampleme <- function(dataframe, samplerate){
  # Generate a subsample of shoe numbers, then take the associated
  # bucks and stick them into sdf.
  sdf <- data.frame(shoes = sample(1:nrow(dataframe), size = samplerate*nrow(dataframe)))
  sdf <- merge(sdf, dataframe, all.x = TRUE)
  return (sdf)
}
</pre>
<p>Finally, a third function storesamples enables the outcome of each random sample to be stored and appended to prior samples for later use:</p>
<pre class="literal-block">
storesamples <- function(iteration, df, sr){
  for (iter in 1:iteration){
    sdf <- sampleme(dataframe = df, samplerate = sr)
    sdf$index <- iter
    ifelse(iter == 1, allsdf <- sdf, allsdf <- rbind(allsdf, sdf))
  }
  return(allsdf)
}
</pre>
<p>Note that the function storesamples calls function sampleme.</p>
<p>Now that I have my functions, let's figure out how much money I spend if I buy 1% of the store's inventory:</p>
<pre class="literal-block">
# make a dataframe
shoesinstore1 <- makedata(nshoe1, meanprice1, pricesd1)
# calculate how much $$ you spent by buying 1% of the inventory
moneyIspent <- storesamples(it,shoesinstore1,R)
</pre>
<p>Now let's make a summary of the money I just spent and print it out:</p>
<pre class="literal-block">
summarya <- ddply(moneyIspent, .(index), summarize, Totalbucks = floor(sum(bucks)))
summary(summarya$Totalbucks)
</pre>
<p>In my last run, here are my results:</p>
<pre class="literal-block">
Min. 1st Qu. Median Mean 3rd Qu. Max.
604.0 897.8 1009.0 1010.0 1120.0 1383.0
</pre>
<p>So I can expect my blind shopper to come back with a Visa/AmEx/Mastercard charge of around a thousand bucks, but it could be as low as $604 or as high as $1383 (still within my spending limit, whew!).
Now let's plot our results using a histogram:</p>
<pre class="literal-block">
(ggplot(summarya, aes(x=Totalbucks))
+ geom_histogram()
)
</pre>
<p>This gives you:</p>
<img alt="money_I_spent" class="align-center" src="/images/moneyIspend.png" style="width: 700.0px; height: 400.0px;" />
<p>Now for my second question. The following year I am <em>given the same amount of money I spent last year</em> as my budget. <em>What percentage of the store's inventory in year 2 can I buy given the amount of money I spent last year?</em></p>
<p>Here we have reversed the sampling question from year 1: instead of sampling at a fixed rate to generate a distribution of credit card debts, we now have a distribution of available spending limits, and are asked to generate a distribution of expected percentage of the store purchased.</p>
<p>To ensure we don't go over our budget, we can't create a single sample of a given number of shoes as above; we have to select a single pair of shoes, evaluate its cost against our remaining funds, and then repeat until we have no more money. In addition, of course, we need to count the number of shoes. We select each pair of shoes and conduct our evaluation with our shoesIcanbuy function:</p>
<pre class="literal-block">
shoesIcanbuy <- function(dataframe, mypurse){
  numofshoepairs <- 0
  while (mypurse > 0) {
    Shoe.pair <- dataframe[sample(nrow(dataframe), 1), ]  # Pick a random pair of shoes
    if (mypurse >= Shoe.pair$bucks){        # As long as I have enough money in my purse
      mypurse <- mypurse - Shoe.pair$bucks  # Buy a pair of shoes and subtract their price from my budget
      numofshoepairs <- numofshoepairs + 1  # Record the number of shoes I bought
    }
    else {
      break
    }
  }
  return(numofshoepairs)  # Return the number of shoes I bought
}
</pre>
<p>However, the above function only gets us so far; our real interest lies in the summary of multiple shoe-shopping extravaganzas, which (you guessed it) we will conduct with another function:</p>
<pre class="literal-block">
how_many_shoes_in_store_I_bought <- function(dataframe, summarya, it){
  numofshoepairs <- array()  # Declare an array
  for (i in 1:nrow(summarya)) {  # Use each row in summarya as my starting budget
    mypurse <- summarya[i, 2]
    for (j in 1:it){  # Figure out how many shoes I bought with each starting budget
      numofshoepairs[j] <- shoesIcanbuy(dataframe, mypurse)
    }
    numofshoepairs.df <- data.frame(Shoes = numofshoepairs)
    ifelse(i == 1, numofshoepairs.masterdf <- numofshoepairs.df,
           numofshoepairs.masterdf <- rbind(numofshoepairs.masterdf, numofshoepairs.df))
  }
  return(numofshoepairs.masterdf)
}
</pre>
<p>Now let's make this a little more realistic by making a completely different shoe line-up in the store for year 2 (nshoe2, meanprice2 and pricesd2 are the year-2 counterparts of the parameters we defined for year 1):</p>
<pre class="literal-block">
shoesinstore2 <- makedata(nshoe2, meanprice2, pricesd2)
</pre>
<p>Now collect information on how many shoes I bought, and the corresponding percentage of how many shoes I bought in the store:</p>
<pre class="literal-block">
numofshoepairs.masterdf <- how_many_shoes_in_store_I_bought(shoesinstore2,summarya,it)
</pre>
<p>Calculate a percent of the store by taking the number of shoes I bought and dividing it by the corresponding number of shoes in the store, and multiplying by 100:</p>
<pre class="literal-block">
numofshoepairs.masterdf$Percent<-(numofshoepairs.masterdf$Shoes/nrow(shoesinstore2))*100
</pre>
<p>OK, let's see how much of the store I bought out:</p>
<pre class="literal-block">
summary(numofshoepairs.masterdf$Percent)
</pre>
<p>which gives:</p>
<pre class="literal-block">
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.2143 0.5000 0.5714 0.5736 0.6429 1.0710
</pre>
<p>and how many shoes I bought:</p>
<pre class="literal-block">
summary(numofshoepairs.masterdf$Shoes)
</pre>
<p>which gives:</p>
<pre class="literal-block">
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.000 7.000 8.000 8.031 9.000 15.000
</pre>
<p>So, I bought about 8 pairs of shoes.</p>
<p>Finally, let's plot a histogram of the percentage of shoes in the store I bought:</p>
<pre class="literal-block">
(ggplot(numofshoepairs.masterdf, aes(x=Percent))
+ geom_histogram(aes(y=..density..), fill="gray", color="black", binwidth = .1)
+ theme_bw()
+ geom_vline(xintercept = mean(numofshoepairs.masterdf$Percent), color="blue")
)
</pre>
<p>And you get:</p>
<img alt="percent_store_inventory" class="align-center" src="/images/percent_store_invent.png" style="width: 700.0px; height: 300.0px;" />
<p>And that's our shoe-shopping adventure: sampling with R's built-in sample function, where the sampling rate determined the size of each sample, and then with our own function, where we sampled individual elements of a population and evaluated each outcome against a set threshold. Sampling forwards and backwards. Have fun, and good shopping!</p>
<p>Interested in getting your hands on the code? Check it out in my <a class="reference external" href="https://github.com/ginaschmalzle/MyShoes">Github Repo</a>.</p>
</div>
SQLite3 Databases: Creating, Populating and Retrieving Data, Part 32014-07-27T13:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-07-27:ret_db.html<p>In <a class="reference external" href="http://geodesygina.com/make_db.html">Part 1 Creating a Database with SQLite3</a> we built a database. In <a class="reference external" href="http://geodesygina.com/pop_db.html">Part 2 Populating an SQLite Database using Python</a> we inserted values into TABLES within our database using Python 3.4 and SQLite3. Here we continue using the functionality of Python 3.4 to retrieve and visualize forecasts contained within our database. Again, I cannot thank <a class="reference external" href="https://github.com/brannerchinese">David Branner</a> enough for his efforts with this project!</p>
<p>Our desired end-product will be the map below, showing, for a specific calendar day, the differences between the day-of forecast and the forecasts made for that same day several days earlier.</p>
<img alt="differenced_forecasts" class="align-left" src="/images/weather_diff.png" style="width: 800.0px; height: 500.0px;" />
<p><em>Figure 1. Maps of forecasted differences (the difference between the day of forecast and the forecast for X days out).</em></p>
<div class="section" id="part-3-retrieving-data-from-an-sqlite-database-using-python">
<h2><strong>Part 3. Retrieving data from an SQLite Database using Python</strong></h2>
<p>First we need to retrieve the weather forecast data we collected in the previous posts. Our database contains the forecasted maximum temperature (maxt), minimum temperature (mint), rain and snow for a given day, with forecasts made from the day itself to fourteen days prior. So, we need to be able to extract this information from the database. This code uses the sqlite3 module in Python to extract the information:</p>
<pre class="literal-block">
import os
import sqlite3

def get_single_date_data_from_db(exact_date, db='weather_data_OWM.db'):
    """Retrieve forecasts for a single date."""
    # exact_date should be in the form YYYYMMDD
    connection = sqlite3.connect(db)
    with connection:
        cursor = connection.cursor()
        try:
            cursor_output = cursor.execute(  # This should all be old hat to you now...
                '''SELECT lat, lon, '''
                '''maxt_0, mint_0, rain_0, snow_0, '''
                '''maxt_1, mint_1, rain_1, snow_1, '''
                '''maxt_2, mint_2, rain_2, snow_2, '''
                '''maxt_3, mint_3, rain_3, snow_3, '''
                '''maxt_4, mint_4, rain_4, snow_4, '''
                '''maxt_5, mint_5, rain_5, snow_5, '''
                '''maxt_6, mint_6, rain_6, snow_6, '''
                '''maxt_7, mint_7, rain_7, snow_7, '''
                '''maxt_8, mint_8, rain_8, snow_8, '''
                '''maxt_9, mint_9, rain_9, snow_9, '''
                '''maxt_10, mint_10, rain_10, snow_10, '''
                '''maxt_11, mint_11, rain_11, snow_11, '''
                '''maxt_12, mint_12, rain_12, snow_12, '''
                '''maxt_13, mint_13, rain_13, snow_13, '''
                '''maxt_14, mint_14, rain_14, snow_14 '''
                '''FROM locations, owm_values '''
                '''ON owm_values.location_id=locations.id '''
                '''WHERE target_date=?''', (exact_date,))
        except Exception as e:  # What exceptions may we encounter here?
            print(e)
    retrieved_data = cursor_output.fetchall()  # We receive a list of simple tuples from the database.
    # Now we need some function that converts the retrieved data into a dictionary.
    composed_data = generate_dict_of_tuples(retrieved_data)
    return composed_data
</pre>
<p>Note the line:</p>
<pre class="literal-block">
composed_data = generate_dict_of_tuples(retrieved_data)
</pre>
<p>Here we need some way to make a usable form of the dataset. In this case the function generate_dict_of_tuples receives the raw data from the SQLite3 database and converts it into a more usable dictionary of tuples:</p>
<pre class="literal-block">
def generate_dict_of_tuples(retrieved_data):
    """Compose the data into a succinct dictionary of tuples."""
    # Our re-composed data type is a dictionary:
    #   key: tuple containing latitude and longitude (floats);
    #   value: list of 15 forecast tuples, each containing
    #          maxt, mint, rain, snow (floats).
    # For dates where the database contains no data, the forecast tuple
    # would be `(None, None, None, None)`, so it is replaced by `None`
    # using the `if-else` clause below.
    composed_data = {}
    for item in retrieved_data:
        lat_lon = item[0:2]
        forecasts = [subitem
                     if subitem[0] or subitem[1] or subitem[2] or subitem[3]
                     else None
                     for subitem in
                     zip(item[2::4], item[3::4], item[4::4], item[5::4])]
        composed_data[lat_lon] = forecasts
    return composed_data
</pre>
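<p>The zip of strided slices is what folds each flat database row into 4-tuples of maxt, mint, rain and snow; here is the trick on a shortened row (lat, lon, then two forecasts):</p>
<pre class="literal-block">
row = (38.576698, -92.173523, 18.71, 6.97, 0, 0, 21.03, 8.7, 0, 0)
forecasts = list(zip(row[2::4], row[3::4], row[4::4], row[5::4]))
print(forecasts)  # [(18.71, 6.97, 0, 0), (21.03, 8.7, 0, 0)]
</pre>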
<p>Now having both of these functions in place, if we run:</p>
<pre class="literal-block">
get_single_date_data_from_db(20140522)
</pre>
<p>We get a dictionary that looks like this:</p>
<pre class="literal-block">
{(38.576698, -92.173523): [(18.71, 6.97, 0, 0),
(21.03, 8.7, 0, 0),
(20.67, 9.72, 0, 0),
(19.01, 7.23, 0, 0),
(22.08, 9.07, 0, 0),
(21.68, 9.53, 0.34, 0),
(22.33, 10.22, 0, 0),
(16.18, 12.14, 1.23, 0),
(19.05, 12.02, 10.08, 0),
None,
None,
None,
None,
None,
None],
(34.154179, -117.344208): [(17.37, 6.16, 0, 0),
(19.66, 7.48, 0, 0),
(21.24, 6.27, 0, 0),
(21.71, 5.5, 0, 0),
(18.34, 8.88, 0, 0),
(20.78, 4.73, 0, 0),
(20.78, 4.73, 0, 0),
(22.96, 7.06, 0, 0),
(20.78, 4.73, 0, 0),
None,
None,
None,
None,
None,
None],
.
.
.}
</pre>
<p>The keys are the locations' latitudes and longitudes, and the values are the forecasts. In this example we have 9 forecasts: one for the day of and 8 days out (forecasts that are not present are marked as 'None').</p>
<p>Fabulous. In <em>Figure 1</em> we focus only on the maximum temperature (maxt) forecasts. We visualize the absolute differences between the maximum forecasted values for the day of and the forecasted value for that day at some time in the past. The differenced values are presented on a map of the United States using warm colors to reflect that the forecast the day of was warmer and cooler colors to reflect cooler temperatures (pun intended). With our data extracted, we need only to calculate the differences and we will plot the data using python's matplotlib with the basemap toolkit.</p>
<p>This visualization will include six subplots, one for each successive day leading up to our target date. Thinking about this another way, if our target date is April 22, 2014 (20140422), and we assign that the letter t, then we are making a subplot for the differences between t, the day-of forecast, and the forecast made at t-1, t-2, t-…n days.</p>
<p>To collect the data for our target date we run the function below, which makes lists containing the latitude, longitude and differences, and sends them off to be processed by our next function:</p>
<pre class="literal-block">
def make_map(target_date=20140422):
    '''Make a basic map of the United States'''
    # target_date is the day the forecasts were made for
    lat = []; lon = []; diff = []
    forecasts = get_single_date_data_from_db(target_date)  # Call earlier function to get dictionary
    for city in forecasts:
        # First collect the lats and lons of the cities
        lat.append(city[0])
        lon.append(city[1])
        # Collect differenced maxt values for days 1 through 8
        diff.append([forecasts[city][0][0] - forecasts[city][day][0]
                     for day in range(1, 9)])
    make_basemap(lon, lat, diff, target_date)  # Send this information to make_basemap --> our next function!
    plt.show()
</pre>
<p>The second function, which we have named "make_basemap", does the mapping work:</p>
<pre class="literal-block">
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap, cm

def make_basemap(lon, lat, diff, target_date):
    for day in range(1, 7):  # Run this for each forecasted difference
        subdiff = []
        for city in range(0, len(diff)):
            subdiff.append(diff[city][day])
        plt.subplot(3, 2, day)  # Define where the subplot will lie on the figure
        # Determine min and max differenced values
        mindiff = min(subdiff)
        maxdiff = max(subdiff)
        # Create Mercator Projection Basemap instance.
        m = Basemap(projection='merc',
                    llcrnrlat=25, urcrnrlat=50,
                    llcrnrlon=-130, urcrnrlon=-60,
                    rsphere=6371200., resolution='l', area_thresh=10000)
        # Draw coastlines, state and country boundaries, edge of map.
        m.drawcoastlines()
        m.drawstates()
        m.drawcountries()
        # Draw parallels.
        parallels = np.arange(0., 90, 10.)
        m.drawparallels(parallels, labels=[1, 0, 0, 0], fontsize=10)
        # Draw meridians.
        meridians = np.arange(180., 360., 10.)
        m.drawmeridians(meridians, labels=[0, 0, 0, 1], fontsize=10)
        # Draw circles on the map, colored by the differenced values.
        jet = plt.cm.get_cmap('jet')
        x, y = m(lon, lat)
        sc = plt.scatter(x, y, c=subdiff, vmin=mindiff, vmax=maxdiff,
                         cmap=jet, s=8, edgecolors='none')
        # Add colorbar.
        plt.colorbar(sc)
        # Add titles.
        plt.suptitle("Differenced Max Temperatures (degrees C) for day "
                     + str(target_date), fontsize=18)
        plt.title("Forecast Day 0 - Day " + str(day))
</pre>
<p>Executing make_map() we get Figure 1. Note that a subplot is created for each differenced forecast through a for loop, which also defines the subplot being created.</p>
<p>Like what you see? Stay tuned, the next step on my agenda is making an interactive website that will allow users to play with the data! Thanks for reading!</p>
</div>
SQLite3 Databases: Creating, Populating and Retrieving Data, Part 22014-07-09T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-07-09:pop_db.html<p>In <a class="reference external" href="/make_db.html">Part 1 Creating a Database with SQLite3</a> we built a database. Here we will use the functionality of Python 3.4 to help populate the database created in Part 1 with data and weather forecasts. This blog assumes you followed Part 1 and have some prior knowledge of Python. Much of this work was done with <a class="reference external" href="https://github.com/brannerchinese">David Branner</a>, who was incredibly patient in teaching me how to do this... Kudos, David!</p>
<div class="section" id="part-2-populating-an-sqlite-database-using-python">
<h2><strong>Part 2. Populating an SQLite Database using Python</strong></h2>
<p>Let's put some data into our database! First, let's fill up our <em>locations</em> TABLE. We collected and keep a list of the cities, their unique codes provided by Open Weather Map, their latitudes and longitudes and their country codes <a class="reference external" href="https://raw.githubusercontent.com/WeatherStudy/weather_study/master/data/city_lists/city_list_normalized_20140425-1923.txt">here</a>, in a file called city_list_normalized_20140425-1923.txt. This file contains information on the cities and looks like this:</p>
<pre class="literal-block">
id nm lat lon countryCode
819827 Razvilka 55.591667 37.740833 RU
524901 Moscow 55.752220 37.615555 RU
1271881 Firozpur Jhirka 27.799999 76.949997 IN
1283240 Kathmandu 27.716667 85.316666 NP
703448 Kiev 50.433334 30.516666 UA
1282898 Pokhara 28.233334 83.983330 NP
3632308 Merida 8.598333 -71.144997 VE
.
.
.
</pre>
<p>We need a way to grab this file and read the contents in python. Let's create a function that will do just that. If we are in the directory containing the file called city_list_normalized_20140425-1923.txt, we can call the file in python and read its contents:</p>
<pre class="literal-block">
def isolate_city_codes():
    filename = 'city_list_normalized_20140425-1923.txt'
    with open(filename, 'r') as f:
        contents = f.read()
    list_of_lines = [line.split('\t') for line in contents.split('\n')[1:]]
    # Latitude and longitude should be numbers
    for i in range(1, len(list_of_lines)-1):
        list_of_lines[i][2] = float(list_of_lines[i][2])
        list_of_lines[i][3] = float(list_of_lines[i][3])
    return list_of_lines
</pre>
<p>Let's break down what the function is doing. The first thing is that it defines a string called filename as 'city_list_normalized_20140425-1923.txt'. The next two lines of code are contained in a 'with statement'. A 'with statement' is a context manager, which provides a way to safely close the opened file and exit out of the python script in case of an error. The contents of the file are read and placed into the variable <em>contents</em> that looks something like this:</p>
<pre class="literal-block">
'id\tnm\tlat\tlon\tcountryCode\n819827\tRazvilka\t55.591667\t37.740833\tRU\n524901\tMoscow\t55.752220\t37.615555\tRU ...
...
\n895417\tBanket\t-17.383329\t30.400000\tZW\n'
</pre>
<p>Notice that the once tab-separated entries of the file now appear with '\t' between fields and '\n' between lines. The next line of our program defines list_of_lines, which loops through contents, splitting out each line (on '\n') and each tab-separated field (on '\t'). list_of_lines now looks like this:</p>
<pre class="literal-block">
[['819827', 'Razvilka', '55.591667', '37.740833', 'RU'],
['524901', 'Moscow', '55.752220', '37.615555', 'RU'],
['1271881', 'Firozpur Jhirka', '27.799999', '76.949997', 'IN'],
.
.
.
['895417', 'Banket', '-17.383329', '30.400000', 'ZW']
]
</pre>
<p>So, list_of_lines is a <em>list of lists</em>, where each list contains the contents within a set of square brackets. The problem with the current list_of_lines is that the latitudes and longitudes are strings and must be converted into floats, which is done with the for statement. Finally, the revised list_of_lines, with floats for the latitude and longitude, is returned.</p>
<p>Now, let's populate the TABLE <em>locations</em> in the sqlite3 database <em>weather_data_OWM.db</em>. We write another function that calls the previous function to grab the data, then it populates the <em>locations</em> TABLE with those values:</p>
<pre class="literal-block">
import sqlite3

def populate_db_w_city_codes(db='weather_data_OWM.db'):
    connection = sqlite3.connect(db)
    with connection:
        city_codes = isolate_city_codes()
        cursor = connection.cursor()
        for code in city_codes[1:-1]:
            if code == ['']:
                print('\n Empty tuple found; skipping.\n')
                continue
            cursor.execute(
                '''INSERT INTO locations VALUES''' +
                str(tuple(code)))
</pre>
<p>Note that we have to import the python module sqlite3. This module allows you to 'connect' with a specified database. Once you have a connection, you can create a cursor object that calls its execute() method to perform SQLite3 commands. In the function described above, we create a <em>connection</em> which connects to our database (db = 'weather_data_OWM.db'). Then we apply a context manager (the with statement) to:</p>
<blockquote>
<ol class="arabic simple">
<li>Collect the information contained within <em>city_list_normalized_20140425-1923.txt</em> by calling our previous function, <em>isolate_city_codes()</em>. The returned <em>list_of_lines</em> from <em>isolate_city_codes()</em> is now labeled as <em>city_codes</em>.</li>
<li>Open a <em>cursor</em> that will execute subsequent <em>SQLite3</em> commands.</li>
<li>Insert the values of <em>city_codes</em> into the <em>locations</em> TABLE.</li>
</ol>
</blockquote>
<p>Notice that the SQLite3 commands are embedded in cursor.execute. The lists within <em>city_codes</em> are already in the order we want them (the same order the database columns were set up in; see <a class="reference external" href="/make_db.html">Part 1</a>). They have been 'tuple-ized' and 'string-ified', since this is the format SQLite3 understands.</p>
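<p>String-ifying the tuple works here because the file contents are trusted, but sqlite3 also accepts ? placeholders, which sidestep quoting problems entirely. Here is a sketch of the same insert using executemany, with an in-memory database and a simplified CREATE TABLE standing in for the schema from Part 1:</p>
<pre class="literal-block">
import sqlite3

connection = sqlite3.connect(':memory:')  # In-memory stand-in for weather_data_OWM.db
with connection:
    cursor = connection.cursor()
    cursor.execute('CREATE TABLE locations (id, nm, lat, lon, countryCode)')
    city_codes = [
        ['819827', 'Razvilka', 55.591667, 37.740833, 'RU'],
        ['524901', 'Moscow', 55.752220, 37.615555, 'RU'],
    ]
    # One ? per column; sqlite3 handles the quoting and escaping.
    cursor.executemany('INSERT INTO locations VALUES (?, ?, ?, ?, ?)', city_codes)
</pre>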
<p>After executing, you can now check if they were inserted into your database by entering the sqlite3 repl:</p>
<pre class="literal-block">
sqlite3 weather_data_OWM.db
</pre>
<p>Once in the sqlite repl type:</p>
<pre class="literal-block">
SELECT * FROM locations;
</pre>
<p>The output should look something like this:</p>
<pre class="literal-block">
.
.
.
894413|Chakari|-18.062941|29.89246|ZW
894460|Centenary|-16.722891|31.11462|ZW
895057|Binga|-17.620279|27.341391|ZW
895417|Banket|-17.383329|30.4|ZW
</pre>
<p>Your new table should have data that include: <em>city id|city name|latitude|longitude|two letter country code</em>.</p>
<p>Splendid! One table down, one related table to go! The second table is a bit more complicated. It involves data downloaded through the <a class="reference external" href="http://openweathermap.org/">Open Weather Map</a> API, which gives easy access to their data products in XML and JSON formats. Since this blog focuses on building and populating databases, I assume that you already have the data downloaded in JSON format. I will not get into how to download the data here, but for more information, David developed a nifty little python script called <a class="reference external" href="https://raw.githubusercontent.com/WeatherStudy/weather_study/master/code/requests.py">requests.py</a> that lets you download data using an API key that is hidden from public access (important when allowing public access to your files on Github).</p>
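<p>I will not reproduce requests.py here, but the usual way to keep a key out of a public repo is to read it from the environment; a sketch, with OWM_API_KEY as an assumed variable name:</p>
<pre class="literal-block">
import os

def get_api_key(var='OWM_API_KEY'):
    """Read the API key from the environment rather than from source code."""
    key = os.environ.get(var)
    if key is None:
        raise RuntimeError('Set the {} environment variable first.'.format(var))
    return key
</pre>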
<p>We use the JSON formatted files to populate our database. JSON files are in the form of a dictionary, also known as an associative array. If you are not familiar with this data structure, I recommend you read <a class="reference external" href="/dict.html">this little ditty</a> before continuing. Otherwise, keep on reading!</p>
<p>Below is an example of a JSON file obtained using the Open Weather Map API that has been prettified using <a class="reference external" href="http://jsbeautifier.org/">http://jsbeautifier.org/</a>. The JSON file contains the forecasts and information for a single city:</p>
<pre class="literal-block">
{
'cod': '200',
'message': 0.005,
'city': {
'name': 'Bay Minette',
'id': 4046255,
'coord': {
'lat': 30.882959,
'lon': -87.773048
},
'population': 8044,
'country': 'US'
},
'list': [{
'weather': [{
'description': 'few clouds',
'icon': '02d',
'main': 'Clouds',
'id': 801
}],
'temp': {
'max': 27.32,
'min': 18.14,
'eve': 24.57,
'day': 27.22,
'night': 18.14,
'morn': 27.22
},
'deg': 199,
'clouds': 12,
'pressure': 1020.38,
'humidity': 42,
'dt': 1398186000,
'speed': 2.11
}, {
'weather': [{
.
.
.
}],
'cnt': 15
}
</pre>
<p>Now you see that it is just one giant dictionary, right? So if we import this into python, then we can call certain values by their keys. For example, if we call this dictionary x, then we can retrieve the latitude of the city by typing:</p>
<pre class="literal-block">
x['city']['coord']['lat']
</pre>
<p>In this JSON file the 'city' key contains the information about the city itself, and the 'list' key contains information on the weather forecasts, where the first value contains information on the weather forecasts for the day the file was downloaded. The second value in 'list' contains the forecast for the next day, etc.</p>
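<p>So, with a trimmed-down version of the prettified JSON above loaded as x, the day-of forecast values come straight out of the first element of 'list':</p>
<pre class="literal-block">
# Trimmed-down version of the sample JSON above.
x = {'city': {'name': 'Bay Minette', 'coord': {'lat': 30.882959, 'lon': -87.773048}},
     'list': [{'temp': {'max': 27.32, 'min': 18.14}, 'dt': 1398186000}]}

day_of = x['list'][0]  # First value: the forecast for the download day
print(day_of['temp']['max'], day_of['temp']['min'])  # 27.32 18.14
</pre>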
<p>You can see that the file contains the minimum and maximum temperature and, if they exist, snow and rain amounts. 'dt' is the day the forecast is for, in <a class="reference external" href="http://en.wikipedia.org/wiki/Unix_time">Unix Time</a>. The 'query date', the day the file was downloaded, is not included in these files, but it is important because it tells you which forecast is the day-of forecast. We dealt with this problem by downloading the JSON files for each city into a directory named with the download date.</p>
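<p>The 'dt' value converts to a calendar date with the standard library; for instance, the 'dt' from the sample file above (times are in UTC):</p>
<pre class="literal-block">
from datetime import datetime, timezone

dt = 1398186000  # 'dt' field from the sample JSON
target = datetime.fromtimestamp(dt, tz=timezone.utc)
print(target.strftime('%Y%m%d'))  # 20140422
</pre>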
<p>The first thing we will need to do here is extract the information we need from these JSON files. For the sake of simplicity, I assume you know the download date and specify it in the Python code (rather than extracting it from the directory name). Depending on what region you are collecting data for, you may have thousands of files for one download date, each corresponding to an individual location. We would like a function that:</p>
<blockquote>
<ol class="arabic simple">
<li>Ingests these JSON-formatted files and stores the contents as a dictionary</li>
<li>Creates a smaller dictionary called <em>forecast_dict</em> that contains only the information that we need for our database. The smaller dictionary should have keys that relate to each city_id, and values that contain the forecasted values.</li>
</ol>
</blockquote>
<p>I assume that you have the names of your files in a list called <em>files</em> that were collected on a specified <em>query_date</em>. I use an example query date of 20140422:</p>
<pre class="literal-block">
files = ['yourfile1.json', 'yourfile2.json', 'yourfile3.json']  # example files
import ast

def retrieve_data_vals(files, query_date='20140422'):
    forecast_dict = {'query_date': query_date}  # Assign query_date to dictionary
    files.sort()
    for file in files:
        forecast_list_pruned = []
        try:
            with open(file, 'r') as f:
                contents = f.read()  # Read in file as a string
        except Exception as e:
            print('Error {}\n in file {}'.format(e, file))
            continue
        if contents == '\n':
            print('File {} empty.'.format(file))
            continue
        content_dict = ast.literal_eval(contents)  # Convert to dictionary
        city_id = content_dict['city']['id']  # Assign city_id
        forecast_list_received = content_dict['list']  # Everything in 'list'
        for forecast in forecast_list_received:  # For each forecast
            if 'rain' in forecast:  # Assign rain, if it exists,
                rain = forecast['rain']  # otherwise make it 0
            else:
                rain = 0
            if 'snow' in forecast:  # Same with snow
                snow = forecast['snow']
            else:
                snow = 0
            forecast_tuple = (  # Tuple form that is SQLite3 readable (if stringified)
                forecast['dt'],
                float(forecast['temp']['max']),
                float(forecast['temp']['min']),
                float(rain),
                float(snow),
            )
            forecast_list_pruned.append(forecast_tuple)  # Collect all forecasts for that file
        forecast_dict[city_id] = forecast_list_pruned  # and assign to forecast_dict for each city
    return forecast_dict
</pre>
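<p>As an aside, the if/else blocks above that default missing 'rain' and 'snow' values to 0 could also be written with the dictionary get() method, which returns a default when a key is absent. A small sketch with a hypothetical forecast entry:</p>

```python
# Hypothetical forecast entry with no 'snow' key
forecast = {'rain': 3.25, 'temp': {'max': 27.32, 'min': 18.14}}

# dict.get(key, default) returns the default when the key is absent
rain = forecast.get('rain', 0)  # 3.25
snow = forecast.get('snow', 0)  # 0
```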
<p>Phew! Extracting the data from the JSON files and putting it into an SQLite3 friendly format is the toughest part. Now that we have forecast_dict, however, we can populate our database! Our next function will use some of the same techniques described above, which include using the sqlite3 module to make a connection with the sqlite database and execute <strong>SQLite3</strong> commands:</p>
<pre class="literal-block">
import datetime
import sqlite3

def populate_db_w_forecasts(db='weather_data_OWM.db'):
    forecast_dict = retrieve_data_vals(files)  # Run retrieve_data_vals above, which returns the forecast dictionary
    query_date = forecast_dict['query_date']  # Assign query_date
    connection = sqlite3.connect(db)  # Create the SQLite3 connection
    with connection:
        cursor = connection.cursor()
        for key in forecast_dict:
            if key == 'query_date':
                continue  # After here, "key" is a location_id.
            for i, item in enumerate(forecast_dict[key]):
                # Convert the Unix time to a human-readable string
                target_date = datetime.datetime.fromtimestamp(
                    int(item[0])).strftime('%Y%m%d')
                # forecast_dict contains dt, maxt, mint, rain and snow,
                # so we want everything past dt (hence item[1:])
                maxt, mint, rain, snow = item[1:]
                i = str(i)
                fields = ','.join([  # Question marks mark where values will be inserted later
                    'maxt_' + i + '=?',
                    'mint_' + i + '=?',
                    'rain_' + i + '=?',
                    'snow_' + i + '=?'
                ])
                try:
                    cursor.execute(  # Insert the location_id (key) and target_date
                        '''INSERT INTO owm_values '''
                        '''(location_id,target_date) '''
                        '''VALUES (?,?)''', (key, target_date))
                except sqlite3.IntegrityError:
                    pass  # Row already exists for this location and date
                cursor.execute(  # Insert forecast values
                    '''UPDATE owm_values SET ''' + fields +
                    ''' WHERE id='''
                    '''(SELECT id FROM owm_values '''
                    '''WHERE location_id=? AND target_date=?)''',
                    (maxt, mint, rain, snow, key, target_date))
</pre>
<p>Let's talk a little about <em>cursor.execute</em>. Here we use a little Python trick to insert values into the SQLite code. In cursor.execute, we state the SQLite3 commands, but we include question marks (?) as placeholders. After the SQLite commands we place a comma and then a tuple of values. These values are substituted for the question marks in the order the question marks appear. So, in the case of:</p>
<pre class="literal-block">
cursor.execute(
'''INSERT INTO owm_values '''
'''(location_id,target_date) '''
'''VALUES (?,?)''', (key, target_date))
</pre>
<p>The SQLite3 command is:</p>
<pre class="literal-block">
INSERT INTO owm_values (location_id, target_date) VALUES (key, target_date)
</pre>
<p>where 'key' is the location city id, and 'target_date' is the date the forecast is for. Note the <em>location_id</em> of the <em>owm_values</em> TABLE refers to the <em>id</em> column of the <em>locations</em> TABLE.</p>
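<p>To convince yourself that the INSERT/UPDATE pair behaved as expected, you can read the row back with a SELECT. Below is a minimal, self-contained sketch against a throwaway in-memory database, using a pared-down version of the owm_values table and hypothetical values:</p>

```python
import sqlite3

# Throwaway in-memory database with a pared-down owm_values table
connection = sqlite3.connect(':memory:')
with connection:
    cursor = connection.cursor()
    cursor.execute('''CREATE TABLE owm_values (
                          id INTEGER PRIMARY KEY AUTOINCREMENT,
                          location_id TEXT,
                          target_date INTEGER,
                          maxt_0 NUMBER,
                          UNIQUE (location_id, target_date))''')
    cursor.execute('''INSERT INTO owm_values (location_id,target_date) '''
                   '''VALUES (?,?)''', ('4046255', 20140422))
    cursor.execute('''UPDATE owm_values SET maxt_0=? '''
                   '''WHERE location_id=? AND target_date=?''',
                   (27.32, '4046255', 20140422))
    cursor.execute('SELECT location_id, target_date, maxt_0 FROM owm_values')
    rows = cursor.fetchall()
print(rows)  # [('4046255', 20140422, 27.32)]
```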
<p>There you go! You now have a relational database that has been populated with data! Now what to do with it... Stay tuned for Part 3 Retrieving data from an SQLite Database using Python.</p>
</div>
SQLite3 Databases: Creating, Populating and Retrieving Data, Part 12014-07-04T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-07-04:make_db.html<p>Structured Query Language (SQL) is a language that is used to design and manage data held in a relational database. A relational database is a database that contains multiple tables that contain related values. For example, one table may contain names of people and their ages, and another may contain names of people and their favorite color. The names of the people are the related values. SQL provides a relatively easy (and commonly used) way of extracting only the data you want from the database, which can later be analyzed or visualized.</p>
<p><a class="reference external" href="https://github.com/brannerchinese">David Branner</a>, a fabulous <strong>python</strong> coder who dabbles in creating and using <strong>SQLite</strong> databases, and knows a thing or two about the <a class="reference external" href="https://brannerchinese.com/">Chinese Language</a>, and I are working on <a class="reference external" href="https://github.com/WeatherStudy/weather_study">The Weather Project</a>, where we intend to examine the accuracy of weather forecasts. In order to do that, we need to collect weather forecasts that can later be analyzed. We decided to use weather forecasts from <a class="reference external" href="http://openweathermap.org/">Open Weather Map</a>, a website that gives open access to weather forecasts through an API key. Through the API, we are able to download JSON files that contain information on the weather forecasts at specific locations around the world. Our goal is, for each day, to collect weather forecasts for that day and from one day before out to about two weeks. We collect the maximum temperature (maxt), the minimum temperature (mint), and the snow and rain forecasts for each of the forecasts. Then we subtract the predicted value from the observation to estimate how much the forecast predicts warmer/cooler temperatures or more/less snow and rain. Hence, we need to collect a lot of information and organize it in a way that is relatively easy and consistent to retrieve. To do that, we created an SQLite3 database. This blog is the first of three, and focuses on <strong>creating a Database with SQLite3</strong>. The next blogs will cover <strong>Populating an SQLite Database with Data using Python</strong> and <strong>Retrieving data from an SQLite Database using Python</strong>.</p>
<div class="section" id="part-1-creating-a-database-with-sqlite3">
<h2><strong>Part 1: Creating a Database with SQLite3</strong></h2>
<p><strong>SQLite</strong> is a compact and self-contained relational database management system. We decided to use <strong>SQLite3</strong> (the version of SQLite bundled with Mac OS X) because</p>
<blockquote>
<ol class="arabic simple">
<li>It is included with the Mac OS X operating system (/usr/bin/sqlite3)</li>
<li>It does not require a server or an administrator</li>
<li>It requires no configuration files</li>
<li>No action is required after a system crash</li>
</ol>
</blockquote>
<p>Certainly, there are issues with <strong>SQLite</strong>, but for our humble little project <strong>SQLite</strong> provides all the functionality we wanted. If you are running Mac OS X you can use SQLite3. Be sure that /usr/bin/ is in your path (it should already be there). You can check that you have it by typing:</p>
<pre class="literal-block">
which sqlite3
</pre>
<p>Let's get started. First, a few things about sqlite3. You can enter the sqlite3 <a class="reference external" href="http://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">repl</a> by simply typing sqlite3 at the command line. Or, you can type:</p>
<pre class="literal-block">
sqlite3 mydatabase.db
</pre>
<p>to ensure your creations/populations/extractions are all for the database mydatabase.db (or whatever you want it named). If you make a sqlite3 script that is applied to mydatabase.db called myscript.sql, you can run it at the command line by typing:</p>
<pre class="literal-block">
sqlite3 mydatabase.db < myscript.sql
</pre>
<p>Our <strong>SQLite</strong> database, which we named <em>weather_data_OWM.db</em>, is set up with multiple tables. The information within those tables is related, which is what makes this a <em>relational database</em>. As previously mentioned, a relational database is set up so that there is some common information between tables that helps link them. Our database tables are linked by city id. The city id is simply a unique number assigned to each location that has a weather forecast. In one table we keep the properties of each location, such as the latitude, longitude, city name, etc. In the other, we assign the forecasts to each city id. Let's take a closer look at how this works.</p>
<p>The first thing we did was create a TABLE called <em>locations</em> which contains the id, name, latitude, longitude and country:</p>
<pre class="literal-block">
DROP TABLE IF EXISTS locations;
CREATE TABLE locations (
id TEXT PRIMARY KEY UNIQUE,
name TEXT,
lat NUMBER,
lon NUMBER,
country TEXT
);
</pre>
<p>Eeeek! The "DROP TABLE" part of this code is a little scary -- here we are saying if there is already a table in our database called <em>locations</em> then remove it! The table <em>locations</em> will be completely removed and cannot be recovered. You may ask, <em>why would you want to do that???</em> Well, this code is simply meant to provide the bones for our database. The only reason we run this script is to make a database from scratch, and if one already exists, it should be removed first. This also prevents confusing the current data with older data sets if a table called <em>locations</em> already exists. So <strong>BE CAREFUL</strong> with this command.</p>
<p>The next commands create the table with columns that are defined to contain a certain type of field. The columns that we have are id, name, lat, lon and country, and are either TEXT (strings) or NUMBER (floats). The id column is special because it also carries the PRIMARY KEY constraint, which ensures that every row in that column is uniquely identifiable. To be extra certain of this (though it may be a little redundant), we also included UNIQUE, which ensures that all values in the column are different.</p>
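<p>You can see the PRIMARY KEY/UNIQUE constraint in action from Python: inserting a second row with the same id raises an IntegrityError. A small sketch using the sqlite3 module and a throwaway in-memory database (the city values are hypothetical):</p>

```python
import sqlite3

connection = sqlite3.connect(':memory:')  # throwaway in-memory database
cursor = connection.cursor()
cursor.execute('''CREATE TABLE locations (
                      id TEXT PRIMARY KEY UNIQUE,
                      name TEXT,
                      lat NUMBER,
                      lon NUMBER,
                      country TEXT)''')
cursor.execute('INSERT INTO locations VALUES (?,?,?,?,?)',
               ('4046255', 'Bay Minette', 30.882959, -87.773048, 'US'))
duplicate_rejected = False
try:
    # A second row with the same id violates the PRIMARY KEY constraint
    cursor.execute('INSERT INTO locations VALUES (?,?,?,?,?)',
                   ('4046255', 'Duplicate City', 0.0, 0.0, 'US'))
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```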
<p>How can we tell if the table was made properly? If you entered the commands above in the repl, then type:</p>
<pre class="literal-block">
SELECT * FROM sqlite_master WHERE type='table';
</pre>
<p>What should print out is information on your new table, including its structure:</p>
<pre class="literal-block">
table|locations|locations|2|CREATE TABLE locations (
id TEXT PRIMARY KEY UNIQUE,
name TEXT,
lat NUMBER,
lon NUMBER,
country TEXT
)
</pre>
<p>The line "table|locations|locations|2|CREATE TABLE locations" is simply output that states: <em>type|name|table name|root page #|sql command used to generate the table</em>. Then the table column names are printed.</p>
<p>Very good! Now we have a table that will contain some characteristics of each city. Now let's make a second TABLE that includes the weather forecasts and will be related to the first one by the city code. We are collecting forecasts for up to 14 days before a <em>target_date</em> which we define as the day being forecasted. We want to know the forecasts for rain and snow, as well as the minimum and maximum temperatures for the <em>target_date</em>. As before, we first need to DROP any existing tables, then we create the table:</p>
<pre class="literal-block">
DROP TABLE IF EXISTS owm_values;
CREATE TABLE owm_values (
id INTEGER PRIMARY KEY AUTOINCREMENT,
location_id TEXT,
target_date INTEGER,
maxt_0 NUMBER,
mint_0 NUMBER,
rain_0 NUMBER,
snow_0 NUMBER,
maxt_1 NUMBER,
mint_1 NUMBER,
rain_1 NUMBER,
snow_1 NUMBER,
maxt_2 NUMBER,
mint_2 NUMBER,
rain_2 NUMBER,
snow_2 NUMBER,
maxt_3 NUMBER,
mint_3 NUMBER,
rain_3 NUMBER,
snow_3 NUMBER,
maxt_4 NUMBER,
mint_4 NUMBER,
rain_4 NUMBER,
snow_4 NUMBER,
maxt_5 NUMBER,
mint_5 NUMBER,
rain_5 NUMBER,
snow_5 NUMBER,
maxt_6 NUMBER,
mint_6 NUMBER,
rain_6 NUMBER,
snow_6 NUMBER,
maxt_7 NUMBER,
mint_7 NUMBER,
rain_7 NUMBER,
snow_7 NUMBER,
maxt_8 NUMBER,
mint_8 NUMBER,
rain_8 NUMBER,
snow_8 NUMBER,
maxt_9 NUMBER,
mint_9 NUMBER,
rain_9 NUMBER,
snow_9 NUMBER,
maxt_10 NUMBER,
mint_10 NUMBER,
rain_10 NUMBER,
snow_10 NUMBER,
maxt_11 NUMBER,
mint_11 NUMBER,
rain_11 NUMBER,
snow_11 NUMBER,
maxt_12 NUMBER,
mint_12 NUMBER,
rain_12 NUMBER,
snow_12 NUMBER,
maxt_13 NUMBER,
mint_13 NUMBER,
rain_13 NUMBER,
snow_13 NUMBER,
maxt_14 NUMBER,
mint_14 NUMBER,
rain_14 NUMBER,
snow_14 NUMBER,
UNIQUE (location_id, target_date),
FOREIGN KEY (location_id) REFERENCES locations(id)
);
</pre>
<p>In this table, each forecast is given its own unique id (called id). In addition, it contains a location_id, which refers to <em>id</em> in our first TABLE, <em>locations</em>. These values 'link' the two tables, creating a relational database. The FOREIGN KEY statement defines this relationship, stating that the location_id of the TABLE <em>owm_values</em> is REFERENCED to the id of TABLE <em>locations</em>. We also created columns in our TABLE that will store forecasts from the day of (*_0) to 14 days out (*_14). UNIQUE ensures that each (location_id, target_date) pair appears only once in this table (i.e., every city has at most one row per target date).</p>
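<p>The link between the two tables can be exercised with a JOIN: given a forecast row, the matching city name comes from the <em>locations</em> table via location_id. A minimal sketch in Python against an in-memory database with hypothetical values:</p>

```python
import sqlite3

connection = sqlite3.connect(':memory:')  # throwaway in-memory database
cursor = connection.cursor()
cursor.execute('CREATE TABLE locations (id TEXT PRIMARY KEY, name TEXT)')
cursor.execute('''CREATE TABLE owm_values (
                      id INTEGER PRIMARY KEY AUTOINCREMENT,
                      location_id TEXT,
                      target_date INTEGER,
                      maxt_0 NUMBER,
                      FOREIGN KEY (location_id) REFERENCES locations(id))''')
cursor.execute('INSERT INTO locations VALUES (?,?)',
               ('4046255', 'Bay Minette'))
cursor.execute('''INSERT INTO owm_values (location_id, target_date, maxt_0) '''
               '''VALUES (?,?,?)''', ('4046255', 20140422, 27.32))
# JOIN the forecast row back to its city name through location_id
cursor.execute('''SELECT locations.name, owm_values.maxt_0
                  FROM owm_values
                  JOIN locations ON owm_values.location_id = locations.id''')
row = cursor.fetchone()
print(row)  # ('Bay Minette', 27.32)
```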
<p>Now if you type into the repl:</p>
<pre class="literal-block">
SELECT * FROM sqlite_master WHERE type='table';
</pre>
<p>Two tables should print out -- the first one being the <em>locations</em> table, the second your brand new <em>owm_values</em> table.</p>
<p>Congratulations! You have now set up a database in SQLite3 that contains two tables. Now for <a class="reference external" href="/pop_db.html">Part 2 Populating an SQLite Database using Python</a> coming soon...</p>
</div>
The Dictionary Data structure2014-07-01T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-07-01:dict.html<div class="section" id="dictionaries">
<h2><strong>Dictionaries</strong></h2>
<p>This page briefly reviews a <strong>dictionary</strong>, also known as an <strong>associative array</strong>, a <strong>map</strong> or a <strong>symbol table</strong>. A dictionary is composed of a collection of keys and values, and each key appears only once in a collection. JSON files, human-readable files that are commonly used for web applications, contain data objects in the form of a dictionary.</p>
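<p>That correspondence is easy to see with Python's built-in json module, which parses a JSON string directly into a dictionary. A minimal sketch (dictionary syntax is explained below):</p>

```python
import json

# JSON proper requires double quotes around keys and string values
text = '{"Hello": "How are you?"}'
a = json.loads(text)  # parse the JSON string into a Python dictionary
print(a['Hello'])     # How are you?
```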
<p>Let's demonstrate how a dictionary is defined in <strong>python</strong>. Open a <strong>python</strong> REPL by typing python at the command line. Once in your REPL, type:</p>
<pre class="literal-block">
a = { 'Hello': 'How are you?'}
</pre>
<p>'a' is a dictionary. Notice how it was defined with the curly brackets. This dictionary contains one <em>key</em> on the left of the colon, and one <em>value</em> on the right side of the colon. You can retrieve the <em>value</em> by typing in the repl:</p>
<pre class="literal-block">
a['Hello']
</pre>
<p>and pressing enter, which returns 'How are you?'. You can retrieve the list of keys by typing:</p>
<pre class="literal-block">
a.keys()
</pre>
<p>and pressing enter, which returns 'Hello'. Now let's make this dictionary a little more complicated:</p>
<pre class="literal-block">
a = { 'Hello': 'How are you?', 'Goodbye': 'See you Later!' }
</pre>
<p>Here you now have two keys that have their own associated values. Here you can still run:</p>
<pre class="literal-block">
a['Hello']
</pre>
<p>which will still return 'How are you?', but now you can type:</p>
<pre class="literal-block">
a['Goodbye']
</pre>
<p>which returns 'See you Later!' A list of keys can be obtained by typing:</p>
<pre class="literal-block">
a.keys()
</pre>
<p>which returns the keys in a list form: ['Hello', 'Goodbye'] (in Python 3, keys() returns a view object, so wrap it in list() to index it). These keys can be retrieved individually by specifying their element indices starting at 0. For example:</p>
<pre class="literal-block">
list(a.keys())[0]
</pre>
<p>returns 'Hello'. Making sense? OK, let's add one more layer of abstraction which is that the values (of the key/value pair) can be a dictionary. Changing our dictionary again, type:</p>
<pre class="literal-block">
a = { 'Hello': 'How are you?', 'Goodbye': {'a':'See you Later!','b':'Later Gator' }}
</pre>
<p>into the python repl. Notice that the key 'Goodbye' now points to a dictionary. If you type into the python repl:</p>
<pre class="literal-block">
a['Goodbye']
</pre>
<p>you now get the dictionary {'a': 'See you Later!', 'b': 'Later Gator'}. You can choose a particular value of this sub-dictionary by specifying:</p>
<pre class="literal-block">
a['Goodbye']['b']
</pre>
<p>which returns 'Later Gator'.</p>
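<p>One last tip: when a key might be absent from a (nested) dictionary -- like the optional 'rain' and 'snow' fields in the weather forecast JSON files discussed elsewhere on this site -- the get() method returns a default instead of raising a KeyError. A small sketch using the dictionary above:</p>

```python
a = {'Hello': 'How are you?',
     'Goodbye': {'a': 'See you Later!', 'b': 'Later Gator'}}

# Direct indexing raises a KeyError when a key is missing;
# get() returns a default value instead
print(a['Goodbye']['b'])                     # Later Gator
print(a.get('Goodbye', {}).get('c', 'n/a'))  # n/a
```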
<p><a class="reference external" href="/pop_db.html">Return to Populating a Database</a>!</p>
</div>
The Cascadia Subduction Zone gets Creepier and Creepier...2014-05-23T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-05-23:SSEs.html<div class="section" id="cascadia-subduction-zone-creep">
<h2><strong>Cascadia subduction zone creep</strong></h2>
<p>This blog is a continuation of the original <a class="reference external" href="http://geodesygina.com/Cascadia.html">'Why the Cascadia Subduction Zone is Creepy'</a> blog posted a few weeks ago, and many of the terms used in this post are defined there. My collaborators on this project are <a class="reference external" href="http://web.pdx.edu/~pdx07343/">Rob McCaffrey</a> at <a class="reference external" href="http://www.pdx.edu/">Portland State University</a> and <a class="reference external" href="http://www.ess.washington.edu/dwp/people/profile.php?name=creager--ken">Ken Creager</a> at the <a class="reference external" href="http://www.washington.edu/">University of Washington</a>. The paper this blog is based on is found <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">here</a>.</p>
<p>The Cascadia subduction zone is not just creepy, but it is creepy on many different levels (<em>Figure 1</em>).</p>
<img alt="cat" class="align-center" src="/images/touch_my_tail.jpg" style="width: 300.0px; height: 300.0px;" />
<p><em>Figure 1. I had to include this</em> <a class="reference external" href="http://cheezburger.com/1384231168">figure</a>. <em>So funny, I laughed for hours, and I'm not even a fan of pet photos.</em></p>
<p>No, I don't mean that kind of creepy. What I mean is that the tectonic plates that make up the Cascadia subduction zone between major earthquakes are in some places stuck together, but in others are partially slipping (aka, creeping). My previous <a class="reference external" href="http://geodesygina.com/Cascadia.html">blog</a> talks about regions on the subduction fault that are stuck (or 'locked'), and regions undergoing persistent fault creep between major earthquakes, where persistent fault creep means just that -- between earthquakes the plates are constantly, slowly slipping. However, there is yet another slip phenomenon that periodically occurs between major earthquakes, called slow slip, which is the topic of this blog.</p>
<p><em>Figure 2</em> is a cross section of the Cascadia subduction zone that shows the Juan de Fuca plate subducting beneath the North America plate. On the up-dip portion of the fault, the plates are stuck in between large earthquakes. This region is expected to be where the next megathrust earthquake (magnitude ~9) will occur. Much further down-dip, the plates slide freely past each other. Between these two regions, however, it was fairly recently discovered using continuously recording GPS that the two plates periodically slip over a period of weeks to months (<a class="reference external" href="http://www.sciencemag.org/content/292/5521/1525">Dragert et al., 2001</a>), and in doing so accumulate enough slip to be equivalent to moment magnitude 6-7 earthquakes! Interestingly, you can't physically feel these periodic slow slip events (or SSEs) because they happen so slowly compared to an earthquake, which can last for a few seconds to minutes. Major slow slip events happen every 10 to 24 months, depending on where you are observing along the Cascadia subduction zone. We will talk about how often these events occur a little later in the blog.</p>
<img alt="xsection" class="align-center" src="/images/xsection_w_sse.png" style="width: 700.0px; height: 500.0px;" />
<p><em>Figure 2. Cross-sectional view of the Cascadia Subduction Zone. Image from</em> <a class="reference external" href="http://ooi.washington.edu/rsn/jrd/">John Delaney</a>. <em>White oval indicates region that experiences slow slip and non-volcanic tremor. The</em> <a class="reference external" href="http://peartreedesigns.blogspot.com/2011/11/devil-wallpapers.html">little devil guy</a>, <em>that is courtesy of too much coffee. Thanks to Aaron Wech for giving me the idea (BTW -- that is not Aaron Wech in the photo, though maybe it should be...).</em></p>
<p>Not too much time after the discovery of SSEs, periodic bursts of noise were observed at nearly the same time among multiple local seismometers (<em>Figure 3</em>).</p>
<img alt="seismo_map" class="align-center" src="/images/seimograph_map.jpg" style="width: 500.0px; height: 300.0px;" />
<img alt="seismo_data" class="align-center" src="/images/seismo_tremor.jpg" style="width: 500.0px; height: 300.0px;" />
<p><em>Figure 3. Figures modified from the</em> <a class="reference external" href="http://www.earthquakescanada.nrcan.gc.ca/pprs-pprp/re/ETS-eng.php">Natural Resources Canada webpage</a>. <em>(A) Map of seismometer network. (B) Example seismic records for corresponding seismometers located in (A).</em></p>
<p>Soon scientists realized what they were seeing wasn't noise at all -- it was actually a seismic signal generated from these periodic SSEs that became known as non-volcanic tremor, sometimes referred to simply as tremor. <em>Figure 4</em> demonstrates how well the non-volcanic tremor correlates in time with GPS detection of periodic slow slip. The blue dots in <em>Figure 4</em> are the east component positions of a GPS site near Victoria, British Columbia. The time series produces a saw-tooth pattern. Each drop indicates that the motion of the station temporarily reverses (indicating an SSE). Non-volcanic tremor activity is also plotted in <em>Figure 4</em> and shows that the non-volcanic tremor peaks during these GPS detected SSEs.</p>
<img alt="ets" class="align-center" src="/images/ETS.jpg" style="width: 500.0px; height: 300.0px;" />
<p><em>Figure 4. Modified from the</em> <a class="reference external" href="http://www.earthquakescanada.nrcan.gc.ca/pprs-pprp/re/ETS-eng.php">Natural Resources Canada webpage</a>. <em>Blue circles are daily east position time series of a GPS site near Victoria. The green line is the long term eastward motion of the site (with respect to North America), and the red saw-tooth line shows the motion of the site between events is faster than the long-term motion. The bottom black line shows the number of hours of tremor activity observed on southern Vancouver Island.</em></p>
<p>The combination of periodic slow slip and non-volcanic tremor together was coined by the Geological Survey of Canada as 'Episodic Tremor and Slip (ETS)' (<a class="reference external" href="http://www.pnsn.org/tremor/rogers_ETS.pdf">Rogers and Dragert, 2003</a>). Intriguingly, non-volcanic tremor and SSEs are not observed together, or at all, at every subduction zone, but that is a topic for another blog.</p>
<p>Subsequent studies have shown that in Cascadia ETS recurrence varies along strike of the subduction zone. <em>Figure 5</em>, (from <a class="reference external" href="http://www.intl-geology.geoscienceworld.org/content/35/10/907.abstract">Brudzinski and Allen, 2007</a>) color codes select continuously recording GPS (squares) and broadband seismometers (triangles) by how often they detect periodic slow slip and tremor, respectively. Warmer colors indicate the site detected them more often. What <a class="reference external" href="http://www.intl-geology.geoscienceworld.org/content/35/10/907.abstract">Brudzinski and Allen, 2007</a> found is that ETS recurrence seems to be segmented along the margin, with ETS events happening every ~10 months in northern CA, ~24 months in central to northern Oregon, and about every 14 months in Washington (<em>Figure 5</em>).</p>
<img alt="Brudzinski_Allen" class="align-center" src="/images/Brudzinski_Allen_fig.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 5. Map of the Cascadia subduction zone modified from</em> <a class="reference external" href="http://www.intl-geology.geoscienceworld.org/content/35/10/907.abstract">Brudzinski and Allen, 2007</a>. <em>Squares and triangles represent locations of high precision GPS and broadband seismometers, respectively, and are colored by how often slip and tremor are detected.</em></p>
<p>So the recurrence of these events is not the same along the margin, but does that mean that the amount of tremor and slip along the margin also differs? First, let's look at the tremor. The <a class="reference external" href="http://www.pnsn.org/">Pacific Northwest Seismic Network</a>, operated out of the <a class="reference external" href="http://www.washington.edu/">University of Washington</a>, keeps a continuously updating catalog of tremor along the entire margin. For some interactive tremor fun, you might want to check out their <a class="reference external" href="http://www.pnsn.org/tremor">tremor mapping tool</a>. <em>Figure 6</em> is a tremor density map -- in other words, it takes how many tremors were detected over a specified region (the squares on the map) and applies that number to a color scale that is then used to color the region. Dark blue colors indicate regions where the tremor counts are higher. Consistent with how often tremor and SSEs are detected, tremor counts for the time period of August 2009-August 2013 (2009.6-2013.6) are elevated where the recurrence time is shorter and lower where the tremor and SSEs are detected less often (<em>Figures 5 and 6</em>).</p>
<img alt="Brudzinski_Allen" class="align-center" src="/images/tremor.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 6. Non-volcanic tremor density map of the Cascadia subduction zone. Tremors from August 2009-August 2013 are used. Tremor counts larger than 400 are colored blue. Tremor locations from the</em> <a class="reference external" href="http://www.pnsn.org/">Pacific Northwest Seismic Network</a> <em>tremor catalog.</em> Solid red line marks the 10 mgal gravity anomaly from <a class="reference external" href="http://courses.washington.edu/ess502/BlakelyGeology2005.pdf">Blakely et al., 2005</a>.</p>
<p>So what about periodic SSEs? The total amount of slip on a fault due to periodic SSEs over time is a little more difficult to estimate because our observations are on the surface of the earth, but we really want to know what is going on down on the fault. In order to figure that out, we will need to build a mechanical model, but we will get to that part in a minute. For now, let's take a look at the data. In <em>Figure 4</em> the east component GPS time series of a site near Victoria, Canada is shown. The GPS east position time series in this figure has a slope: the position starts at about -5 mm in 1996 and ends at about 28 mm in 2004, so the slope (calculated by eye) is (28 mm - -5 mm)/(2004 - 1996) = 33 mm/8 yr = 4.125 mm/yr. This slope marks the long-term velocity of the time series, which is illustrated in <em>Figure 4</em> as a green line. Notice that in between slow slip events the slope is larger (red line); this is the inter-SSE velocity, which in Cascadia seems to be pretty consistent between SSEs. To better visualize the GPS offsets from SSEs along the Cascadia subduction zone, the inter-SSE velocity is simply subtracted from the time series. <em>Figure 7</em> displays the time series from select sites from Canada down to northern California. Note that the SSEs (marked by jumps in the time series) are well defined and fairly frequent in the north, reduce in amplitude and recurrence as we enter Oregon, then pick up again as we move into southern Oregon and northern California. South of about 40 degrees latitude, SSEs are not detected with GPS.</p>
<img alt="deforming_plates" class="align-center" src="/images/time_series.png" style="width: 700.0px; height: 500.0px;" />
<p><em>Figure 7. Map of inter-SSE GPS velocities (black arrows) with select GPS monuments labeled (a). East component of detrended GPS position time series (red dots) with model fit (black line) for sites labeled on the map (b). The site name, latitude of the site (Lat), and the east and north velocity components (Ve and Vn, respectively) are given. Figure from</em> <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<p>Now, let's get into how we figure out what is going on at the fault during the SSEs. As mentioned previously, the GPS position time series are observations on the surface of the earth, but we would like to know how much periodic slow slip is occurring on the fault. Similar to my previous <a class="reference external" href="http://geodesygina.com/Cascadia.html">blog</a>, we use a mechanical model. To briefly review, a mechanical model mathematically mimics the behavior of the earth, and the math behind these models is based on what we think the earth is doing. In the previous <a class="reference external" href="http://geodesygina.com/Cascadia.html">blog</a>, we used a mechanical block model to explore how much the tectonic plates are stuck in between earthquakes, but in this blog we are interested in seeing how much and where the plates are slipping during SSEs. We again use the block modeling software TDEFNODE, which breaks up the region of interest into tectonic blocks (<em>Figure 8</em>). Instead of using the long-term pre-estimated GPS velocities with the model, we use the GPS time series directly. What I want you to take away here is that the model mimics how much the tectonic plates are stuck between the SSEs, and how much they slip during the SSEs. It estimates slip for 16 SSEs that occurred between 2005.5 and 2011 throughout the Cascadia subduction zone. The goal here is to add up all the SSE slip from that time period and see how it changes as we go from north to south. For details on the modeling, please refer to <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<img alt="deforming_plates" class="align-center" src="/images/block_model.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 8. Geometry of the three dimensional block model. Thick black lines mark block boundaries, dots the three dimensional subduction interface. Block names are labeled. Figure modified from</em> <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<p>Now let's look at the results! The black lines in the <em>Figure 7b</em> GPS position time series are the modeled east positions over time for points that colocate with the observed GPS monuments. <em>Figure 9</em> shows examples of model-estimated fault slip patterns for two SSEs in 2007, plotted next to tremor detected during the same time period.</p>
<img alt="deforming_plates" class="align-center" src="/images/ets_examples.png" style="width: 700.0px; height: 900.0px;" />
<p><em>Figure 9. Slip distributions for two SSEs in 2007 estimated using the block model. The left-hand images show slip patterns (colors) overlain with estimated GPS displacement vectors for that event (red arrows). The images to the right show non-volcanic tremor locations that occurred in the same time period as the SSEs. Blue dots are tremor from the</em> <a class="reference external" href="http://www.pnsn.org/">Pacific Northwest Seismic Network</a>, <em>and red dots are tremors from the</em> <a class="reference external" href="http://miamioh.edu/">Miami University</a> <em>catalog, courtesy of</em> <a class="reference external" href="http://www.units.miamioh.edu/geology/people/brudzinski.html">M. Brudzinski</a>. <em>Figure modified from the Supplementary material of</em> <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<p>As expected, <em>Figure 9</em> shows that the regions experiencing non-volcanic tremor are largely the same regions where the model detects slip for a given time period. Phew. So now, let's add up all the slip from all the slow slip events and see what we get. <em>Figure 10a and b</em> show cumulative GPS displacements and modeled cumulative slow slip on the fault, respectively, for the time period between 2005.5 and 2011. <em>Figure 10c</em> plots the cumulative tremor counts (blue line) and the sum of slow slip estimated at each node for each down-dip row of nodes as a function of latitude. The non-volcanic tremor data used in this plot span from 2009.8 to 2013.0, whereas the estimated slip is from all SSEs between 2005.5 and 2011, so some discrepancies are apparent. Note, however, that in both cases more tremor and slow slip occur in northern California and Washington. Both are suppressed between ~42-43 and 46 degrees latitude.</p>
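The latitude binning behind the cumulative tremor counts can be sketched in a few lines of NumPy. This is a toy illustration with made-up epicenter latitudes and an assumed 111 km-per-degree conversion; the real counts in <em>Figure 10c</em> come from the Pacific Northwest Seismic Network tremor catalog:

```python
import numpy as np

def binned_tremor_counts(latitudes, lat_min=40.0, lat_max=50.0, bin_km=50.0):
    """Count tremor epicenters in latitude bins roughly bin_km wide
    (one degree of latitude is about 111 km)."""
    bin_deg = bin_km / 111.0
    edges = np.arange(lat_min, lat_max + bin_deg, bin_deg)
    counts, _ = np.histogram(latitudes, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])  # bin midpoints for plotting
    return centers, counts
```

Plotting `counts` against `centers` gives a tremor-count-versus-latitude curve like the blue line in the figure.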
<img alt="deforming_plates" class="align-center" src="/images/cumulative_SSEs.png" style="width: 750.0px; height: 500.0px;" />
<p><em>Figure 10. Figure from</em> <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>. <em>(a) Sum of all SSE displacements detected at GPS (red and black vectors) from 2005.5-2011. Red vectors indicate sites that were in operation for 90% of the study. (b) Summed plate interface slow-slip from 2005.5 to 2011. Black vectors are North America relative convergence rates and directions. Thick, solid black lines mark the 10 mgal gravity anomaly contour of</em> <a class="reference external" href="http://courses.washington.edu/ess502/BlakelyGeology2005.pdf">Blakely et al., 2005</a>. <em>(c) Cumulative node depth profile interface slow-slip from 2005.5 to 2011 (red line) and 50 km binned cumulative tremor counts from 2009.8 to 2013.0 acquired from the</em> <a class="reference external" href="http://www.pnsn.org/">Pacific Northwest Seismic Network tremor catalog</a> <em>(blue line). Thick black line represents latitudes with high gravity anomalies.</em></p>
<p>Let's recap our observations:</p>
<ol class="arabic simple">
<li><a class="reference external" href="http://www.intl-geology.geoscienceworld.org/content/35/10/907.abstract">Brudzinski and Allen, 2007</a> demonstrate using GPS and seismometers that SSEs and non-volcanic tremor detection times are segmented (<em>Figure 5</em>). In other words, ETS occurs about once every 10-11 months between 40 and about 43 degrees north, about every 24 months between 43 and 46-47 degrees north, and about every 14 months north of 47 degrees north.</li>
<li>Tectonic tremor counts are increased south of 43 degrees N and north of 47 degrees N (<em>Figure 6</em> and <em>Figure 10c</em>, blue line).</li>
<li>Slow slip peaks in northern California and Washington, but is suppressed in Oregon (<em>Figure 10</em>).</li>
</ol>
<p>These observations sound awfully reminiscent of the observations in my last <a class="reference external" href="http://geodesygina.com/Cascadia.html">blog post</a>. The observations there were:</p>
<ol class="arabic simple" start="4">
<li>Reduced uplift between major subduction zone earthquakes along the coast between 43 and 46 degrees latitude (<a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>).</li>
<li>Reduced paleoseismically derived subsidence for multiple Cascadia earthquakes between 43.5 and 46 degrees latitude (<a class="reference external" href="http://bulletin.geoscienceworld.org/content/122/11-12/2079.abstract">Leonard et al., 2010</a>).</li>
</ol>
<p>In the last <a class="reference external" href="http://geodesygina.com/Cascadia.html">blog post</a>, I talk about how observations 4 and 5 could be explained by persistent fault creep; in other words, if the fault is slipping in between large earthquakes, then it is slowly relieving stress that would have built up if the plates were stuck together. This results in less subsidence during an earthquake, and less coastal uplift between earthquakes. We take this idea one step further and suggest that the Siletzia terrane may be the culprit behind the persistent fault creep. The Siletzia terrane is a dense, accreted basalt that can be mapped with gravity surveys (<a class="reference external" href="http://courses.washington.edu/ess502/BlakelyGeology2005.pdf">Blakely et al., 2005</a>). The 10 mgal gravity anomaly is plotted as a thick red line in <em>Figure 6</em> and a thick black line in <em>Figure 10b</em>. We suggest something similar to the conceptual model presented by <a class="reference external" href="http://www.sciencedirect.com/science/article/pii/S0012821X09001836">Reyners and Eberhart-Phillips, 2009</a>, where the Siletzia terrane, if impermeable (i.e., water cannot pass through it), increases pore fluid pressures at the fault by not allowing water to percolate into the overriding crust. High pore fluid pressures at or near the plate interface encourage creep, since these conditions are thought to promote fault slip (<a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/95JB02403/abstract">Segall and Rice, 1995</a>; <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/2005JB003872/abstract">Hillers and Miller, 2006</a>).</p>
<p><a class="reference external" href="http://www.intl-geology.geoscienceworld.org/content/35/10/907.abstract">Brudzinski and Allen, 2007</a> noted that the thickest accumulations of Siletzia terrane near the coast were also the regions that experience major slow slip and non-volcanic tremor events less often. In <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a> we see that the total amount of tremor and the total amount of slow slip are also reduced in the region. <em>But what does that mean???</em></p>
<p>Similar to the arguments posed for locking, <a class="reference external" href="http://seismo.berkeley.edu/~paudet/Downloads_files/AudetJGR-2010.pdf">Audet et al., 2010</a> suggest that fluids trapped beneath a seal at the plate boundary increase pore fluid pressures. Although the plates in the region of slow slip are stuck most of the time, they suggest that the increased pore fluid pressures allow the plates to slip with small changes in stress. They suggest that once the fault begins to slip, the pore fluid pressure decreases and the plates become stuck again, stopping the slow slip and reinforcing the new seal. You can imagine then, that variations in the permeability of the upper crust could influence the occurrence of periodic slow slip. If the Siletzia terrane is less permeable, then it may offer a stronger seal than surrounding regions, producing higher pore fluid pressures, which may encourage more of a partial fault creep environment than one that periodically slips.</p>
<img alt="deforming_plates" class="align-center" src="/images/locking_tremor.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 11. Map of the Cascadia subduction zone, with the Gamma-style locking model of the subduction fault presented in my previous</em> <a class="reference external" href="http://geodesygina.com/Cascadia.html">post</a>. <em>Red indicates areas that are completely stuck, blue areas that are freely slipping. Colors between red and blue indicate regions that are partially creeping. White line marks the region where 95% of non-volcanic tremor occurred between 2009-2012.</em></p>
<p><em>Figure 11</em> is a map that contains the results from the Gamma-style locking model described in my previous <a class="reference external" href="http://geodesygina.com/Cascadia.html">post</a>, where red represents areas that are estimated to be completely stuck, and blue represents areas that are freely slipping. The colors in between represent regions that are partially creeping. Also plotted is an outline of where 95% of the tremor occurred between 2009 and 2012. Together, it shows that partial fault creep is up-dip of the tremor. The conundrum here is: <em>if persistent partial fault creep is occurring up-dip of the zone of tremor and slow slip, then wouldn't this increase the stress on the region of tremor and periodic slow slip and foster more slow slip events?</em> If so, then why do we see the opposite -- less tremor and slow slip where we have more persistent fault creep! We suggest that the partial fault creep must extend into the zone of non-volcanic tremor and slow slip. Both the locking and the periodic slow slip are thought to be promoted by high pore fluid pressures. So, as fluid pressure increases due to a better seal (maybe the Siletzia?), perhaps persistent partial fault creep is the dominant mode of slip. If correct, then it is possible that the makeup of the overriding crust may determine if the fault slips as ETS or persistent fault creep (<a class="reference external" href="http://www.nature.com/ngeo/journal/v3/n9/abs/ngeo940.html">Peng and Gomberg, 2010</a>).</p>
<p>For a deeper discussion of the observations and hypotheses presented in this blog, please read <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<p>Thanks for reading and keep in touch! Contents of this blogsite are updated at <a class="reference external" href="http://geodesygina.com/">http://geodesygina.com/</a>. See other contact information below.</p>
<p>Acknowledgments:
This work was funded by the National Science Foundation (NSF) Postdoctoral Fellowship Program, award 0847985 (Schmalzle), NSF award EAR-1062251 (McCaffrey), and the USGS National Earthquake Hazards Reduction Program, award G12AP20033 (Schmalzle and Creager). Some of the figures I made myself using Generic Mapping Tools (GMT), but some figures I took from elsewhere on the web; for any of those images I note where the figure was taken from. Many thanks to Reed Burgette and an anonymous reviewer for their thoughtful comments and suggestions that greatly improved this research. Thanks to Mike Brudzinski and Aaron Wech for providing their tremor catalogs. Thanks to Rick Blakely for providing gravity data. Thanks to PBO and PANGA for providing access to GPS data products. Craig H. Faunce, Bruce Nelson, Steve Malone, Justin Sweet, David Schmidt, Aaron Wech, Tom Pratt, Brian Atwater, Sarah Minson, Lorraine Wolf, and Aimee Schmalzle provided useful comments and insight.</p>
</div>
Why the Cascadia Subduction Zone is Creepy2014-05-01T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-05-01:Cascadia.html<div class="section" id="cascadia-subduction-zone-creep">
<h2><strong>Cascadia subduction zone creep</strong></h2>
<p>On April 29, 2014, I presented a talk for <a class="reference external" href="http://www.meetup.com/Data-Rave/events/177359692/">Data Rave, NYC</a> at EBay. My collaborators on this project are <a class="reference external" href="http://web.pdx.edu/~pdx07343/">Rob McCaffrey</a> at <a class="reference external" href="http://www.pdx.edu/">Portland State University</a> and <a class="reference external" href="http://www.ess.washington.edu/dwp/people/profile.php?name=creager--ken">Ken Creager</a> at the <a class="reference external" href="http://www.washington.edu/">University of Washington</a>. This blog covers the talk, and the original slides can be found in my <a class="reference external" href="https://github.com/ginaschmalzle/Cascadia">github repo</a>. The paper for which the talk and this blog are based is found <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">here</a>. Like what you are reading? Check out my follow-up blog on <a class="reference external" href="http://geodesygina.com/SSEs.html">slow slip and tremor</a>!</p>
<p>A key question that geophysicists try to answer is “How much are tectonic plates ‘stuck’ together?” The answer to this question has major implications for seismic hazard, since it is thought that the more the plates are stuck, the larger the earthquake will be. This blog will discuss how much the North American plate is stuck (or not stuck) to the Explorer, Juan de Fuca and Gorda plates within the Cascadia subduction zone, which resides in the Pacific Northwest corner of the United States.</p>
<p>First, I want to share with you the story of <a class="reference external" href="http://www.nature.com/ki/journal/v62/n5/fig_tab/4493262f1.html">the blind men and the elephant</a>, because this story pretty much sums up my experience as a scientist, and my experience with this study (<em>Figure 1</em>). In the story, a group of blind men decided to figure out what an elephant was really like, having never encountered one before. They all approached the elephant from different angles and examined it. One blind man approached a tusk and, examining it, declared: hey, an elephant is like a spear! But then another came up to its wriggling trunk and decided it was more like a snake. In fact, each blind man drew his own conclusion about what an elephant was really like, but each of them had only observed a small piece of the elephant. And so the blind men all bickered about what an elephant really was, and in reality they were all right, because the elephant does have all of the properties the blind men said it had; but at the same time, they were all wrong, because none of them could see the full picture.</p>
<img alt="deforming_plates" class="align-center" src="/images/elephant.gif" style="width: 700.0px; height: 500.0px;" />
<p><em>Figure 1. Cartoon of the blind men and the elephant. G. Renee Guzlas, artist, source:</em> <a class="reference external" href="http://www.nature.com/ki/journal/v62/n5/fig_tab/4493262f1.html">http://www.nature.com/ki/journal/v62/n5/fig_tab/4493262f1.html</a></p>
<p>In this study I bring together interdisciplinary data sets, observed by myself and by other scientists in order to get a fuller picture of what is going on with the tectonic plates in the Cascadia Subduction zone. To my knowledge, this is the first study that brings together these datasets in a comprehensive way to better understand subduction zone mechanics.</p>
<p>Before diving into the research, I am going to quickly review some key concepts for this blog. <em>Figure 2</em> is a map of the major tectonic plates that cover the earth. Each color represents a major plate. The plates rotate and translate with respect to each other all the time. The red arrows show how these plates are moving with respect to each other. They can move apart, which is what is observed in Iceland and at mid-oceanic ridges. The plates can move laterally past each other, like along the San Andreas fault, or they can move toward each other. When an oceanic plate collides with a continental plate, the oceanic plate moves underneath ("subducts" beneath) the continental plate. These areas are known as subduction zones and they can produce some of the largest earthquakes in the world. These earthquakes can reach moment magnitude 9 or more and can produce tsunamis. The <a class="reference external" href="http://en.wikipedia.org/wiki/2011_T%C5%8Dhoku_earthquake_and_tsunami">2011 Japan earthquake</a> and the <a class="reference external" href="http://en.wikipedia.org/wiki/2004_Indian_Ocean_earthquake_and_tsunami">2004 Sumatra Earthquake</a> were examples of megathrust earthquakes (i.e., large, magnitude ~9 earthquakes) that occurred in subduction zones.</p>
<img alt="deforming_plates" class="align-center" src="/images/TectonicPlates.jpg" style="width: 700.0px; height: 500.0px;" />
<p><em>Figure 2. Major tectonic plates of the world. Image from</em> <a class="reference external" href="http://www.sanandreasfault.org/Tectonics.html">http://www.sanandreasfault.org/Tectonics.html</a></p>
<p>This blog focuses on the Cascadia subduction zone, found in the Pacific Northwest of the United States. <em>Figure 3</em> is a zoomed-in view of the Cascadia subduction zone. The Explorer, Juan de Fuca and Gorda plates are currently being subducted beneath the North America plate. The last major earthquake occurred in 1700, and we know precisely when because of records of a tsunami in Japan <a class="reference external" href="http://pubs.usgs.gov/pp/pp1707/">(Atwater et al., 2005)</a>! Scientists have estimated that this quake ruptured approximately 1000 km of the fault and the plates slipped about 20 m <a class="reference external" href="http://activetectonics.asu.edu/lipi/Lecture24_Tsunami/Satake_etal_2003JB002521.pdf">(Satake et al., 2003)</a>! Yikes! The estimated moment magnitude for this quake was approximately 9.</p>
<img alt="deforming_plates" class="align-center" src="/images/Cascadia.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 3. Close up map view of the Cascadia Subduction Zone. Topography data from</em> <a class="reference external" href="http://www.ngdc.noaa.gov/mgg/global/global.html">ETOPO1 Topography Model</a>. <em>Figure made with</em> <a class="reference external" href="http://geodesygina.com/GMT.html">GMT</a>. <em>Red circles outline oceanic plate names.</em></p>
<p>Let’s look a little deeper as to what is going on here. <em>Figure 4</em> is a cross-section of the Cascadia subduction zone. You can see the Olympic Peninsula and Puget Sound. Below is an artist’s rendition of the Juan de Fuca oceanic plate subducting beneath the North America plate. The shallow, up-dip area is where the plates are thought to be stuck, or locked together. Further down-dip, the plates transition from fully locked, to fully creeping, where creeping is a measurement of how much the plates are slipping between large earthquakes. So, in the regions where the plates are stuck, lots of stress is building up, and is where megathrust earthquakes are thought to occur.</p>
<img alt="deforming_plates" class="align-center" src="/images/csz_cross.png" style="width: 700.0px; height: 500.0px;" />
<p><em>Figure 4. Profile cross-sectional view of the Cascadia Subduction Zone. Image from</em> <a class="reference external" href="http://ooi.washington.edu/rsn/jrd/">John Delaney</a>.</p>
<p>So, what happens when the plates are stuck? The two plates are moving toward each other. In order to accommodate that motion, the two plates that are stuck together must begin to bend and deform. The continental crust begins to shorten and the ground near the coast begins to uplift.</p>
<p>When an earthquake happens, the two plates quickly slide past each other. The continental plate suddenly expands and subsides near the coast, and uplifts offshore. You can imagine the dire consequences of this – the uplifting crust shifts the entire water column up, possibly generating a massive wave which will eventually propagate to shore, but the shoreline has also gone down, allowing the tsunami wave, once it hits, to reach farther inland and be more destructive. As an example, Japan experienced about 0.5-1 meter of subsidence during the 2011 quake (<a class="reference external" href="http://blogs.agu.org/mountainbeltway/2011/03/15/new-gps-vectors/">http://blogs.agu.org/mountainbeltway/2011/03/15/new-gps-vectors/</a>), which also generated a tsunami that reached 33 ft high (<a class="reference external" href="http://en.wikipedia.org/wiki/2011_T%C5%8Dhoku_earthquake_and_tsunami">http://en.wikipedia.org/wiki/2011_T%C5%8Dhoku_earthquake_and_tsunami</a>). Yikes.</p>
<img alt="deforming_plates" class="align-center" src="/images/leonard.jpg" style="width: 500.0px; height: 500.0px;" />
<p><em>Figure 5. Cartoon of crustal deformation due to fault locking between earthquakes (top) and during an earthquake (bottom). Figure from</em> <a class="reference external" href="http://gsabulletin.gsapubs.org/content/116/5-6/655.abstract">Leonard et al., 2003</a>.</p>
<p>The punch line of this study is that the amount of locking changes along the Cascadia Subduction zone--the plates are more stuck off the coast of Washington, southern Oregon and California, and less stuck in northern and central Oregon. This conclusion was reached by bringing together observations from a variety of cross-disciplinary studies, and like the blind men mentioned earlier, I attempt to piece together these data sets to make a simple, consistent story that explains all of them.</p>
<p>Let’s dive into the first data set – high-precision Global Positioning System (GPS) data. A GPS satellite transmits signals on two wavelengths, along with other information that helps determine the distance from the satellite to a receiver on the ground (say, your smart phone). Using two different wavelengths is important because it allows corrections for distortions the signal picks up while passing through the ionosphere. In the most simplistic view of how distance is calculated, one can take the time difference between the emission of the signal from the satellite and the detection of the signal at the ground receiver, and multiply that time difference by the speed of light. That gives the line-of-sight distance to the satellite. To convert that to a three-dimensional position, one needs the calculated range from at least 4 different satellites. There are currently 32 healthy GPS satellites in orbit, which means that any place on the earth, except maybe at the poles, can see at least 4 satellites at any given point in time.</p>
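The travel-time arithmetic above can be sketched in Python. This is a toy Gauss-Newton trilateration, not real GPS processing (which must also handle ionospheric delay, satellite orbits, and much more); the satellite geometry in any usage is invented, and the fourth unknown solved for is the receiver clock bias, which is why 4 satellites are the minimum:

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def pseudorange(t_emit, t_receive):
    """Distance implied by signal travel time: (t_receive - t_emit) * c."""
    return C * (t_receive - t_emit)

def solve_position(sat_positions, ranges, n_iter=20):
    """Gauss-Newton solve for receiver position (x, y, z) and clock-bias
    distance b from >= 4 satellite ranges: rho_i = |sat_i - pos| + b."""
    x = np.zeros(4)  # initial guess: earth's center, zero clock bias
    for _ in range(n_iter):
        diffs = sat_positions - x[:3]
        dists = np.linalg.norm(diffs, axis=1)
        residuals = ranges - (dists + x[3])
        # Jacobian of predicted range w.r.t. (x, y, z, b)
        J = np.hstack([-diffs / dists[:, None], np.ones((len(ranges), 1))])
        x += np.linalg.lstsq(J, residuals, rcond=None)[0]
    return x[:3], x[3]
```

Given four or more satellite positions and measured pseudoranges, `solve_position` iteratively linearizes the range equations and converges on the receiver location.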
<img alt="deforming_plates" class="align-center" src="/images/GPS_sat.png" style="width: 400.0px; height: 500.0px;" />
<p><em>Figure 6. Horizontal arrow points to an image of a GPS satellite from</em> <a class="reference external" href="http://www.geosoft-gps.de/english/gps_infos/info_2_e.html">http://www.geosoft-gps.de/english/gps_infos/info_2_e.html</a>. <em>Vertical arrow points to a picture of the Death Star. GPS satellites and the Death Star should not be confused.</em></p>
<p>Back here on earth, we have permanently installed GPS monuments. These guys are usually installed in bedrock, if possible, or some other sturdy structure. You may have seen some of these monuments; the top left corner of <em>Figure 7</em> is an example of what one may look like. Below that is a Trimble 5700 GPS and a Zephyr Geodetic antenna – a little larger than your smart phone. The antenna is usually set up on top of a tripod that is centered over the monument. The right-hand photo of <em>Figure 7</em> shows the antenna on top of a tripod with a protective cover that helps keep snow off. The antenna detects the signals from the satellites, which are then sent to the connected receiver, which records that information.</p>
<img alt="deforming_plates" class="align-center" src="/images/GPS_stuff.png" style="width: 700.0px; height: 400.0px;" />
<p><em>Figure 7. Upper left: photo of Geodetic monument from</em> <a class="reference external" href="http://en.wikipedia.org/wiki/Survey_marker">http://en.wikipedia.org/wiki/Survey_marker</a>. <em>Lower left: photo of a Trimble 5700 GPS and a Zephyr Geodetic antenna from</em> <a class="reference external" href="http://facility.unavco.org/">http://facility.unavco.org/</a>. <em>Right: Picture of an operating GPS from</em> <a class="reference external" href="https://earthdata.nasa.gov/featured-stories/featured-research/looking-mud">https://earthdata.nasa.gov/featured-stories/featured-research/looking-mud</a>.</p>
<p>Daily positions of the GPS can be estimated. <em>Figure 8</em> is an example of a GPS position time series for its three components – north, east and vertical. The blue dots mark the daily position estimate, and the vertical black lines the uncertainties. Interestingly, at this particular site a small earthquake occurred nearby, which caused the jump in the position time series. But you can imagine that, ignoring the earthquake, we can calculate the rate at which this monument is moving by taking the slope of the time series for each component.</p>
<img alt="deforming_plates" class="align-center" src="/images/BEMT.png" style="width: 400.0px; height: 500.0px;" />
<p><em>Figure 8. GPS position time series for site BEMT, taken from</em> <a class="reference external" href="http://cws.unavco.org:8080/cws/modules/GPStimeseriesCA/">UNAVCO website</a>.</p>
<p>Focusing on the horizontal velocities, we can estimate by eye that this site moved about 30 mm over 6 years, or 5 mm/yr. Similarly, we can estimate by eye that the north component moves at about 8 mm/yr. By taking the square root of the sum of the squares of these velocities we can calculate a magnitude, and we can calculate the direction of motion by taking the arctangent of the ratio of the two components. This gives you an idea of how a velocity can be calculated by eye. Calculating the time series velocities for this study is a little more rigorous, however, since other signals, such as earthquakes and seasonal effects, convolute the velocity estimate. Using the least squares method, velocities in this study are calculated by fitting the time series to the linear equation:</p>
<img alt="deforming_plates" class="align-center" src="/images/equation.png" style="width: 800.0px; height: 50.0px;" />
<dl class="docutils">
<dt>where</dt>
<dd><em>p</em> = position,
<em>po</em> = initial position,
<em>v</em> = velocity,
<em>t</em> = time,
<em>H</em> = Heaviside function (step function) for earthquakes or equipment changes,
<em>A</em> = amplitude of offset, and
<em>U1-4</em> = constants for seasonal variations.</dd>
</dl>
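The by-eye velocity arithmetic and the least-squares fit above can both be sketched with NumPy. This is a minimal illustration of the design matrix (intercept, velocity, one Heaviside offset per event, and annual plus semi-annual seasonal terms), not the actual processing used in the study:

```python
import numpy as np

def velocity_vector(ve, vn):
    """Horizontal speed and azimuth (degrees clockwise from north)
    from east and north velocity components (mm/yr)."""
    speed = np.hypot(ve, vn)                      # sqrt(ve**2 + vn**2)
    azimuth = np.degrees(np.arctan2(ve, vn)) % 360.0
    return speed, azimuth

def fit_position_series(t, p, t_eq=()):
    """Least-squares fit of the equation above:
    p(t) = po + v*t + sum_k A_k * H(t - t_eq_k)
           + U1*sin(2*pi*t) + U2*cos(2*pi*t) + U3*sin(4*pi*t) + U4*cos(4*pi*t)
    with t in decimal years. Returns [po, v, A_1..A_k, U1, U2, U3, U4]."""
    cols = [np.ones_like(t), t]
    for te in t_eq:                               # Heaviside offset per event
        cols.append((t >= te).astype(float))
    w = 2.0 * np.pi * t
    cols += [np.sin(w), np.cos(w), np.sin(2 * w), np.cos(2 * w)]
    G = np.column_stack(cols)                     # design matrix
    m, *_ = np.linalg.lstsq(G, p, rcond=None)
    return m
```

For the by-eye example, `velocity_vector(5, 8)` gives a speed of about 9.4 mm/yr; feeding `fit_position_series` a daily position series plus the epochs of known earthquakes or equipment changes recovers the velocity <em>v</em> with the offsets and seasonal wiggles modeled out.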
<p>Another data set used was tide and leveling data from <a class="reference external" href="http://cascadiageo.org/documentation/literature/cascadia_papers/burgette_etal_2009_interseis_uplift_orygun.pdf">Burgette et al., 2009</a>. Remember, in between major earthquakes the region near the shoreline uplifts, which makes it look as though sea level is falling. This apparent change can be measured over time, yielding an estimate of how much vertical land movement occurred.</p>
<img alt="deforming_plates" class="align-center" src="/images/TideGauge.jpg" style="width: 200.0px; height: 300.0px;" />
<p><em>Figure 9. Photo of a tide station. Photo from</em> <a class="reference external" href="http://www.oco.noaa.gov/tideGauges.html">http://www.oco.noaa.gov/tideGauges.html</a>.</p>
<p>Let’s look at the data! In <em>Figure 10</em>, the map on the left has horizontal GPS velocities that are estimated from daily position time series from 1997 to 2013. These velocities are referenced to stable North America, so you could imagine standing in Nebraska, looking longingly to the west coast, and watching the plates move as indicated by these arrows. The arrows here originate at the GPS monument, are sized according to their magnitude, and point in the direction of motion. Note the reference scale arrow in black is 5 mm/yr. Now let’s look at the vertical data set. For better illustration, I’ve color coded them so that warm colors represent more uplift. The key thing to notice about this data set is that there is more uplift in the north and in the south, and a reduced amount of uplift in central and northern Oregon.</p>
<img alt="deforming_plates" class="align-center" src="/images/GPSvelos.png" style="width: 500.0px; height: 500.0px;" />
<p><em>Figure 10. Maps of GPS horizontal velocities (left) and the combined GPS vertical velocities with tide and leveling uplift rates (right). Vertical rates colored according to their magnitude. Warm colors indicate uplift.</em></p>
<p>Geophysicists try to figure out how the world works by applying geophysical data to a mechanical model. What I mean is, we think we know some basic concepts behind how the world works, so we build a mechanical model that will actually mimic what the earth is doing based on these concepts. One such model is called a block model. This type of model divides the corner of the earth you are working on into tectonic blocks that can move and rotate, strain and bend due to fault motions. We can use these models, along with the GPS data, to estimate how much the plates are stuck together. The modeling program that I use is called TDEFNODE and it is a massive, wonderful beast of a code, written in Fortran! Yes, Fortran. It is based on the models presented in <a class="reference external" href="http://www.web.pdx.edu/~mccaf/pubs/mccaffrey_pnw_gji_2007.pdf">McCaffrey et al., 2007</a>. <em>Figure 11</em> is a map of the Cascadia subduction zone with the block model geometry overlaid (solid black lines). The dots represent the interface between the subducting oceanic plate and the continental plate. It looks flat here, but really the fault interface is going down into the page.</p>
<img alt="deforming_plates" class="align-center" src="/images/block_model.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 11. Geometry of three dimensional block model. Thick black lines mark block boundaries, dots the three dimensional subduction interface. Block names are labeled.</em></p>
<p>OK -- We have our data, and we have our model. Only we have a big problem – the locking, which is what we are trying to solve for, is mostly offshore, where we don’t have any data to constrain the model! This means that the model is heavily reliant on the user's assumptions. Hence, I've described the locking in two ways -- the first I call the Gaussian model, which assumes that the locking is distributed down-dip in a Gaussian way, where it is minimal at the trench, crescendos to a maximum, and then tapers off down-dip. The second, which I call the Gamma model, assumes that the fault is completely locked from the trench to some distance down-dip before it begins to taper off.</p>
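To make the two shapes concrete, here is a toy sketch of the two down-dip profiles in Python. This is illustrative only: the peak location, width, and taper values are invented, the Gamma model stand-in uses a simple exponential taper, and none of this is TDEFNODE's actual parameterization:

```python
import numpy as np

def gaussian_locking(x, x_peak=0.4, width=0.2):
    """Toy Gaussian-style profile: small at the trench (x = 0), peaks
    mid-fault, tapers off down-dip. x is normalized down-dip distance."""
    return np.exp(-((x - x_peak) ** 2) / (2.0 * width ** 2))

def gamma_locking(x, x_lock=0.3, taper=0.2):
    """Toy Gamma-style stand-in: fully locked (fraction 1.0) from the
    trench to x_lock, then tapering off down-dip."""
    return np.where(x <= x_lock, 1.0, np.exp(-(x - x_lock) / taper))
```

Plotting both functions over `x = np.linspace(0, 1, 101)` shows the key difference: the Gaussian profile starts small at the trench, while the Gamma-style profile is fully locked there.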
<p><em>Figure 12</em> shows maps of locking distributions for the Gaussian (a) and Gamma (b) models. The green lines mark the modeled block geometry, and the colors are the locking fraction – where red indicates that the plates are stuck together more, and the cooler colors mean that the plates are less stuck. The residuals for each model are nearly identical in both cases, even though at first glance these models seem very different. But let’s take a closer look here. Both models show much more intense locking offshore in Washington and in northern California and southern Oregon. The other distinguishing feature is that there is a wide transition zone between about 43 and 46 degrees north in central and northern Oregon. So, for these models, locking must be reduced in order to fit the reduced GPS, tide and leveling uplift rates in this region.</p>
<img alt="deforming_plates" class="align-center" src="/images/locking.png" style="width: 500.0px; height: 500.0px;" />
<p><em>Figure 12. Locking distributions for the Gaussian (a) and Gamma (b) locking distribution models. Green lines mark block model boundaries, warm colors indicate regions that are more locked.</em></p>
<p>Up until now we have been talking about what happens in between major earthquakes. Let’s change gears a bit and think about what happens during an earthquake. Remember that during an earthquake, the continental crust uplifts offshore, potentially displacing the water column and producing a tsunami. Near the coast the ground subsides, allowing tsunami waters to inundate the shore line much further than in between earthquakes, bringing with it sediment and debris that would eventually settle out of the water column and form a geologic layer. These tsunami deposits can be seen in the geologic record. From these geologic layers, paleoseismologists can deduce how much subsidence occurred. Diatoms and other organic matter can help date when these layers were formed.</p>
<p><a class="reference external" href="http://bulletin.geoscienceworld.org/content/122/11-12/2079.abstract">Leonard et al., 2010</a> compiled subsidence records from a plethora of studies that include earthquakes from the past 6500 years in Cascadia. <em>Figure 13</em> shows a subset of subsidence data from some of these Cascadia earthquakes. The figure displays subsidence as a function of latitude, ranging from 50 degrees latitude (Canada) to 40 degrees latitude (northern California). What <a class="reference external" href="http://bulletin.geoscienceworld.org/content/122/11-12/2079.abstract">Leonard et al., 2010</a> observed is that for multiple past earthquakes, subsidence was reduced between 43.5 and 46 degrees north latitude. In their study, they state that reduced subsidence in central Cascadia is a persistent feature of Cascadia subduction earthquakes.</p>
<img alt="deforming_plates" class="align-center" src="/images/Leonard_eq.png" style="width: 500.0px; height: 700.0px;" />
<p><em>Figure 13. Subsidence records compiled in</em> <a class="reference external" href="http://bulletin.geoscienceworld.org/content/122/11-12/2079.abstract">Leonard et al., 2010</a>. Reduced subsidence is observed between ~43.5 and 46 degrees North.</p>
<p>Hmmph...</p>
<p>So let’s recap what we have so far – in the same region, at about 43-46 degrees north, we have both reduced inter-earthquake uplift as well as reduced subsidence due to earthquakes!</p>
<p>Now we come to <em>my elephant</em> -- that is, my interpretation of these observations. One way we can explain these observations is by fault creep in central Cascadia. In the locked scenario, the two plates push together, creating uplift, which we see in Washington and California. This builds up a lot of stress which is later released in a big earthquake (<em>Figure 14</em>). Where the plates are partially creeping, the two plates actually slide past each other in between major earthquakes and stress doesn’t accumulate to the same extent – this means that we are less likely to see as much uplift in between earthquakes, and when an earthquake does happen the slip is expected to be less, since much of it was already accommodated between earthquakes (<em>Figure 14</em>).</p>
<img alt="deforming_plates" class="align-center" src="/images/creep.png" style="width: 700.0px; height: 400.0px;" />
<p><em>Figure 14. Subduction zone locking and creep scenarios. The top row shows the expected deformation for a locked subduction zone -- the continental crust uplifts near the coast in between earthquakes, and subsides a lot during an earthquake. Alternatively (bottom row), if the subduction zone is creeping then the two plates release stress between earthquakes, so that when an earthquake happens less slip is expected.</em></p>
<p>So, now we have our theory, based on interdisciplinary research using GPS, tide gauge, leveling and paleoseismic datasets. The theory, however, doesn't explain <em>why</em> the plates are creeping in central Cascadia. <a class="reference external" href="http://cascadiageo.org/documentation/literature/cascadia_papers/burgette_etal_2009_interseis_uplift_orygun.pdf">Burgette et al., 2009</a> present a model similar to the Gamma model above, but they enforce a narrow locking transition width. In order to fit the reduced uplift rates in central Cascadia, their model shifted the locked region offshore. They note that the locking pattern in their model correlates well with the location of a dense, Eocene age (~50 Ma) accreted basalt known as the Siletzia terrane, and suggest it may influence the locking. The Siletzia terrane itself is pretty rigid -- it has few earthquakes within its body (<a class="reference external" href="http://pubs.usgs.gov/pp/pp1661d/">Parsons et al., 2005</a>). Some studies suggest that it may also be less permeable (<a class="reference external" href="http://stephanerondenay.com/Materials/pdf/Calkins_etal_JGR_2011.pdf">Calkins et al., 2011</a>). Seismic surveys indicate that the Siletzia terrane is thickest in coastal central Oregon, where it extends as much as ~35 km offshore. In Washington, the Siletzia terrane is not present in large quantities in the Olympics, but is observed further east in the Puget Sound region (<a class="reference external" href="http://pubs.usgs.gov/pp/pp1661d/">Parsons et al., 2005</a>).</p>
<p>Because the Siletzia terrane is dense, it can be mapped in gravity surveys. <a class="reference external" href="http://earthweb.ess.washington.edu/brown/downloads/ESS403/Cascadia/BlakelyGeology2005.pdf">Blakely et al., 2005</a> present gravity data sets that map out the extent of the Siletzia terrane. We use the 10 mgal contour line of this data set to map out the thickest accretions of the Siletzia terrane (<em>Figure 15</em>). The outlined gravity anomaly shows that the largest block extends from about 44 to 46 degrees north latitude, through our region of reduced interseismic uplift and reduced coseismic subsidence. The outline is mapped on top of the Gaussian locking distribution model in <em>Figure 15</em>, but please note that there is no preference between the model solutions.</p>
<img alt="deforming_plates" class="align-center" src="/images/lock_grav.png" style="width: 300.0px; height: 500.0px;" />
<p><em>Figure 15. Gaussian locking distribution model plotted with 10 mgal contour line (white line) from</em> <a class="reference external" href="http://earthweb.ess.washington.edu/brown/downloads/ESS403/Cascadia/BlakelyGeology2005.pdf">Blakely et al., 2005</a>.</p>
<p>This study builds on a conceptual model from <a class="reference external" href="http://www.sciencedirect.com/science/article/pii/S0012821X09001836">Reyners and Eberhart-Phillips, 2009</a>. In their study of locking distributions in the Hikurangi Subduction Zone (HSZ) under the North Island of New Zealand, they note that the less permeable Rakaia terrane also seems to influence the locking. They suggest that the impermeable Rakaia terrane prevents water generated from the basalt-eclogite transition from percolating up into the overriding crust. Instead, pore fluid pressures increase at or near the plate interface and influence the locking. Intriguingly, their study suggests that the Rakaia terrane is <em>more</em> locked than its surroundings. We suggest a similar conceptual model for Cascadia, where the Siletzia terrane, if impermeable, increases pore fluid pressures by not allowing water to percolate into the overriding crust. We suggest, however, that the high pore fluid pressures at or near the plate interface encourage creep, since these conditions are thought to favor stable sliding (<a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/95JB02403/abstract">Segall and Rice, 1995</a>; <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/2005JB003872/abstract">Hillers and Miller, 2006</a>).</p>
<p>So, here is the recap of this work:</p>
<ol class="arabic simple">
<li>We observe reduced interseismic uplift rates in central Cascadia determined with GPS, tide and leveling data sets. Models of plate interface locking indicate that locking has to be reduced with a wide transition zone (this study) or shifted offshore in order to fit these data.</li>
<li>Paleoseismologists observe reduced coseismic subsidence from past great earthquakes in central Cascadia.</li>
<li>The above observations can be explained by creep in central Cascadia.</li>
</ol>
<p>For a deeper discussion of the observations and hypotheses presented in this blog, please read <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Schmalzle et al., 2014</a>.</p>
<p>Thanks for reading!</p>
<p>Like what you read? Check out my follow-up blog on <a class="reference external" href="http://geodesygina.com/SSEs.html">periodic slow slip and tremor</a>!</p>
<p>Acknowledgments:
This work was funded by the National Science Foundation (NSF) Postdoctoral Fellowship Program, award 0847985 (Schmalzle), NSF award EAR-1062251 (McCaffrey), and USGS National Earthquake Hazards Reduction Program, Award G12AP20033 (Schmalzle and Creager). Some of the figures I made myself using Generic Mapping Tools (GMT), but some figures I took from random places on the web. For those images I note where they were taken. Many thanks to Reed Burgette and an anonymous reviewer for their thoughtful comments and suggestions that greatly improved this research. Thanks to Rick Blakeley for providing gravity data. Craig H. Faunce, Bruce Nelson, Steve Malone, Justin Sweet, David Schmidt, Aaron Wech, Tom Pratt, Brian Atwater, Sarah Minson, Lorraine Wolf, and Aimee Schmalzle provided useful comments and insight. Thanks to PBO and PANGA for providing access to GPS data products. Special thanks to David Branner, Ruby Childs and Nick Collins for organizing <a class="reference external" href="http://www.meetup.com/Data-Rave/events/177359692/">Data Rave</a> and inviting me to give a talk.</p>
</div>
Movie of March 11, 2011 Japan Earthquake and its aftershocks2014-04-18T16:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-04-18:japan_eq.html<div class="section" id="movie-of-the-2011-japan-earthquake-and-its-aftershocks">
<h2><strong>Movie of the 2011 Japan earthquake and its aftershocks</strong></h2>
<p>On March 11, 2011 a magnitude 9 earthquake occurred off the coast of Japan, generating a massive tsunami that caused major damage to infrastructure and loss of life. The magnitude 9 earthquake was followed by thousands of smaller earthquakes, known as aftershocks, in the hours that followed. The magnitude of aftershocks is known to decrease exponentially after the main shock, a pattern that can be seen in the data set, as shown in <em>Figure 1</em>.</p>
<img alt="deforming_plates" class="align-center" src="/images/Japan_eq.png" style="width: 750.0px; height: 500.0px;" />
<p><em>Figure 1. Screen shot of the</em> <a class="reference external" href="http://geodesygina.com/JapanEarthquake/index.html">earthquake movie</a> <em>. The map on the left plots earthquakes at the end of March 11, 2011. Circles are sized by magnitude and colored by depth in kilometers. The plots on the right show the earthquake magnitudes plotted over time, colored by depth. The top figure spans the entire day, whereas the lower figure shows a zoomed-in view of the 5th through the 10th hour of the day.</em></p>
<p>Interestingly, the earthquake magnitudes fluctuate over time. <a class="reference external" href="http://www.nature.com/srep/2013/130717/srep02218/full/srep02218.html">Omi et al. [2013]</a> note this behavior and model it in their paper "Forecasting large aftershocks within one day after the main shock".</p>
<p>Using earthquake data provided by the <a class="reference external" href="http://quake.geo.berkeley.edu/anss/catalog-search.html">ANSS database</a>, Ville Juutilainen, Gayane Petrosyan, Sean Mathew Lawrence and I worked on the <a class="reference external" href="http://geodesygina.com/JapanEarthquake/index.html">Japan Earthquake Movie</a>. It allows the user to choose the plot they want to see while the movie plays. The default is the plot shown above, which displays Magnitude vs. Time. The other option, Depth vs. Profile Length, lets the user select two points to define a profile; earthquakes within a user-defined distance of that profile are then plotted on the map and on the plot. The code is written in Javascript, taking advantage of the D3 framework. You can access the code in my <a class="reference external" href="https://github.com/ginaschmalzle/tohoku_eq">github repo</a>. The code has been tested in Chrome, but has yet to be tested in other browsers.</p>
</div>
Elastic half space model of a vertical strike slip fault2014-04-18T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-04-18:ehalf.html<div class="section" id="id1">
<h2><strong>Elastic Half Space Model of a Vertical Strike Slip Fault</strong></h2>
<p>In between major earthquakes, the ground deforms due to movement of tectonic plates. For strike-slip faults, such as the San Andreas Fault, the ground deforms in an 'S' shape that can be modeled as an arctangent. To better understand what is observed at the surface, imagine a fence built perpendicularly across a strike-slip fault. When the fence is first built it is nice and straight, but over time it starts to deform and look kind of like an "S". When the earthquake occurs the ground (and the fence) will snap, and the two sides of the fence will become straight again some time after the earthquake, although displaced.</p>
<img alt="deforming_plates" class="align-center" src="/images/elastichs.jpg" style="width: 170.0px; height: 250.0px;" />
<p><em>Figure 1. Deformation of tectonic plates between major earthquakes. D is the locking depth (the depth at which the fault is stuck). Figure modified from http://geologycafe.com/california/pp1515/chapter7.html</em></p>
<p>A simple model of this movement for vertical strike slip faults in a homogeneous elastic half space [Weertman and Weertman, 1964; Savage and Burford, 1973] can be described as:</p>
<pre class="literal-block">
V(x) = (Vo/pi) * arctan(x/D)
</pre>
<p>where V(x) is the velocity of points estimated along a perpendicular profile across the fault, Vo is the far field velocity, x is the distance from the fault, and D is the locking depth.</p>
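<p>As a quick sanity check, the model can be evaluated in a few lines of NumPy. This is just a sketch; the Vo = 34 mm/yr and D = 15 km values match the parameter file used later in this post:</p>

```python
import numpy as np

def v_profile(x, vo, d):
    """Fault-parallel surface velocity (mm/yr) at distance x (km) from a vertical strike slip fault."""
    return (vo / np.pi) * np.arctan(x / d)

x = np.linspace(-150.0, 150.0, 301)
v = v_profile(x, vo=34.0, d=15.0)
# The profile is antisymmetric about the fault and flattens toward +/- vo/2 far from it
print(v[0], v[150], v[-1])
```

<p>Note the "S" shape: each side of the fault asymptotically recovers half of the far field velocity.</p>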
<p>High precision GPS can be used to observe the deformation around fault systems. The velocity, position and uncertainty estimates can be compared directly to the model to estimate the goodness of fit of the model. The chi2 statistic is given by:</p>
<pre class="literal-block">
chi2 = SUM ((dataR -vel)/(sig))**2
</pre>
<p>where chi2 = the chi2 statistic, dataR = the GPS estimated rate for a position along the profile, vel = the model calculated velocity, and sig = the GPS velocity uncertainty. SUM denotes the sum over all GPS data.</p>
<p>The reduced chi2 is also calculated and is given by:</p>
<pre class="literal-block">
reduced chi2 = chi2 / (N-v-1)
</pre>
<p>where N = number of data and v = number of variable parameters (in this case 2: the fault rate and the locking depth). An ideal reduced chi2 should equal 1.</p>
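<p>Both statistics reduce to a couple of NumPy expressions. Here is a sketch with made-up numbers (the arrays below are hypothetical, not the GPS data used in this post):</p>

```python
import numpy as np

# Hypothetical GPS rates, model velocities, and 1-sigma uncertainties (mm/yr)
dataR = np.array([15.1, 10.2, 0.8, -8.6, -15.3])
vel   = np.array([14.8,  9.9, 0.0, -8.0, -15.6])
sig   = np.array([ 3.6,  1.5, 1.6,  1.2,  1.2])

chi2 = np.sum(((dataR - vel) / sig) ** 2)

N = len(dataR)  # number of data
v = 2           # number of variable parameters (fault rate and locking depth)
redchi2 = chi2 / (N - v - 1)
print(chi2, redchi2)  # roughly 0.61 and 0.30 for these numbers
```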
<p>In this blog, I use high precision GPS velocities from Schmalzle et al. [2006], and compare them to the model described above. I use the chi2 statistic and the reduced chi2 to determine the goodness of fit of the model to the data. My <a class="reference external" href="https://github.com/ginaschmalzle/elastichalfspace">github repo</a> has versions of how to do this in <strong>Fortran</strong> with visualization using <strong>Generic Mapping Tools (GMT)</strong>, and how to do it in <strong>Python</strong>. This blog will cover only the techniques used in the Python version of the code.</p>
<p>I have a few external files that are read by the code and are also available in my <a class="reference external" href="https://github.com/ginaschmalzle/elastichalfspace">github repo</a>. The file param.py holds the profile range and increment, the far field velocity and locking depth for a given model run, and the parameter ranges for the gridsearch, and looks like this:</p>
<pre class="literal-block">
xmin=-150.
xmax=150.
int=1.
Vo=34.
d=15.
dmin=1.
dmax=100.
Vmin=0.
Vmax=100.
</pre>
<p>The file data.py contains the GPS data in the form of x (in km), Rate (in mm/yr) and uncertainty (in mm/yr; note the uncertainties are stored as negative values) and looks like this:</p>
<pre class="literal-block">
-92.60814 15.0918 -3.561012106
-90.65163 15.4416 -3.592941176
-92.60814 15.1681 -1.42072184
-71.08653 14.3386 -1.661513002
-69.13002 16.4146 -1.773867193
-47.60841 10.1976 -1.511861585
-60.65181 14.0536 -1.95124564
-70.43436 14.4541 -2.901815856
-2.282595 0.82942 -1.622786425
-27.39114 8.70825 -1.672086072
0.65217 0.18103 -1.550890689
7.499955 -6.89121 -1.799743676
19.5651 -11.9668 -1.632284566
-17.60859 4.82287 -1.62481939
12.39123 -8.5554 -1.223772558
33.91284 -15.2699 -1.190422007
65.86917 -15.6106 -1.351439143
122.60796 -15.6614 -6.776253439
3.26085 -5.95536 -1.524006894
2.60868 -2.39876 -1.179208687
6.5217 -5.81965 -1.62421975
-13.0434 5.08107 -1.23548631
</pre>
<p>Now for the coding! First, import the following modules:</p>
<pre class="literal-block">
import numpy as np
import math
import matplotlib.pyplot as plt
</pre>
<p>and import the param.py file:</p>
<pre class="literal-block">
import param
</pre>
<p>I collect the information from the param file and compute the surface velocities like this:</p>
<pre class="literal-block">
f = open('vel.txt', 'w')
listx = []
listVel = []
x = param.xmin
while (x <= param.xmax):
    Vel = -((param.Vo / np.pi) * math.atan(x / param.d))
    print(x, Vel, file=f)
    listx.append(x)
    listVel.append(Vel)
    x = x + param.int
f.close()
</pre>
<p>This calculates the predicted velocity for a defined increment along a profile of a strike slip fault.
I keep the x's and calculated velocities in lists that will be used later in the program for plotting purposes.</p>
<p>Now let's open the GPS file and read its contents:</p>
<pre class="literal-block">
g=np.loadtxt('data.py')
gx = g[:,0]
gVel = g[:,1]
gsig = g[:,2]
</pre>
<p>Now calculate the expected velocity at each GPS position:</p>
<pre class="literal-block">
VelC = -((param.Vo / np.pi) * np.arctan(gx / param.d))
</pre>
<p>and calculate the chi2 and reduced chi2</p>
<pre class="literal-block">
chi = ((gVel - VelC) / gsig)**2
chi2 = np.sum(chi)
redchi = chi2 / (len(gVel) - 3)
</pre>
<p>Now you have the model fit to the data for a modeled fault rate and locking depth! The model fit to the data looks like <em>Figure 2</em>.</p>
<img alt="gridsearch" class="align-right" src="/images/lineGPS.png" style="width: 800.0px; height: 700.0px;" />
<p><em>Figure 2. Modeled velocities across a vertical strike slip fault (solid lines) compared to GPS velocities (triangles) with velocity uncertainty error bars. Both gridsearch estimated and inversion estimated low misfit rates are shown for a locking depth of 15km. The reduced chi2 is given.</em></p>
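<p>The plotting code for a figure like <em>Figure 2</em> isn't shown above, so here is one way to sketch it with Matplotlib. This is a self-contained illustration, not the exact code in the repo: it hard-codes the param.py values, uses just a handful of rows from data.py, and writes to a hypothetical file name:</p>

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

Vo, d = 34.0, 15.0  # far field rate (mm/yr) and locking depth (km), as in param.py

# A few rows of data.py: distance (km), rate (mm/yr), uncertainty (mm/yr)
gps = np.array([[-92.60814, 15.0918, -3.561012106],
                [-47.60841, 10.1976, -1.511861585],
                [ -2.282595, 0.82942, -1.622786425],
                [ 33.91284, -15.2699, -1.190422007],
                [122.60796, -15.6614, -6.776253439]])
gx, gVel, gsig = gps[:, 0], gps[:, 1], gps[:, 2]

x = np.linspace(-150.0, 150.0, 301)
vel = -(Vo / np.pi) * np.arctan(x / d)  # same sign convention as the code above

fig, ax = plt.subplots()
ax.plot(x, vel, 'b-', label='model (Vo = 34 mm/yr, D = 15 km)')
# The uncertainties in data.py are stored as negative numbers; errorbar wants magnitudes
ax.errorbar(gx, gVel, yerr=np.abs(gsig), fmt='k^', label='GPS')
ax.set_xlabel('Distance from fault (km)')
ax.set_ylabel('Velocity (mm/yr)')
ax.legend()
fig.savefig('lineGPS_sketch.png')
```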
<p>But suppose you want to know which combination of modeled fault rate and locking depth gives you the best fit to the data. One way you can do this is by running a whole suite of models that include different combinations of fault rate and locking depth values. This is called a gridsearch approach, and is perhaps the simplest (although most time consuming) method. The param.py file contains user input values for a range of modeled parameters. Grabbing those values, we can then use nested while loops to step through those ranges:</p>
<pre class="literal-block">
dmin = param.dmin
dmax = param.dmax
Vmin = param.Vmin
Vmax = param.Vmax
gridredchi = np.empty((0, 3))
c = open('chi.py', 'w')
d = dmin
while (d <= dmax):
    V = Vmin
    while (V <= Vmax):
        gridVelC = -((V / np.pi) * np.arctan(gx / d))
        gridchi = ((gVel - gridVelC) / gsig)**2
        gridchisum = np.sum(gridchi)
        gridrchi = gridchisum / (len(gVel) - 3)
        newrow = [V, d, gridrchi]
        gridredchi = np.vstack([gridredchi, newrow])
        print(V, d, gridrchi, file=c)
        plt.scatter(V, d, c=[gridrchi], marker='s', lw=0, s=40, vmin=0, vmax=10)
        V = V + param.int
    d = d + param.int
c.close()
</pre>
<p>By performing the gridsearch, you can contour the estimated reduced chi2 values as a function of the modeled fault rate and locking depth, as shown in <em>Figure 3</em>.</p>
<img alt="gridsearch" class="align-right" src="/images/gridsearch.png" style="width: 800.0px; height: 700.0px;" />
<p><em>Figure 3. Contour plot of the chi2 statistic (colors, cooler colors indicate lower misfit) given modeled values of fault rate and locking depth. The white star marks the low misfit model.</em></p>
<p>Performing a gridsearch can take a long time, but it has the advantage of being a straightforward method to estimate the low misfit model. By imaging the chi2 distribution, as in <em>Figure 3</em>, we can also easily see if there are other minima that could provide an alternative model that fits the data just as well within our given parameter ranges. The downside is that this method is slow, especially for more complicated models that require longer computation times.</p>
<p>An alternative method is to linearly invert the data with a little bit of matrix algebra. A great book that clearly describes this technique is:</p>
<p>Aster, R., Borchers, B., Thurber, C., Parameter Estimation and Inverse Problems, 301 pp, Elsevier Academic Press, 2004.</p>
<p>I highly recommend this book for further reading on this subject. I am not going to go over these concepts in this blog, but these methods are used in the scripts in my <a class="reference external" href="https://github.com/ginaschmalzle/elastichalfspace">github repo</a>. I use a linear inverse approach which is only valid for linear parameters, hence I can use it to estimate the best fitting rate, but not the locking depth. Using this method, the model is run for a locking depth of 15 km to find the best fit model in <em>Figure 2</em>.</p>
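<p>To make the idea concrete, here is a minimal sketch of that inversion, assuming the locking depth is fixed at 15 km. With d fixed, the model is linear in the rate: V(x) = Vo * g(x), with g(x) = -arctan(x/d)/pi, so weighted least squares gives the best fitting Vo directly. The data rows below are a subset of data.py; this is my illustration, not the exact code in the repo:</p>

```python
import numpy as np

d = 15.0  # fixed locking depth (km); the nonlinear parameter is not inverted for

# A subset of data.py: distance (km), rate (mm/yr), uncertainty (mm/yr)
gps = np.array([[-92.60814, 15.0918, -3.561012106],
                [-47.60841, 10.1976, -1.511861585],
                [ 33.91284, -15.2699, -1.190422007],
                [122.60796, -15.6614, -6.776253439]])
gx, gVel, gsig = gps[:, 0], gps[:, 1], np.abs(gps[:, 2])

# One-column design matrix, since Vo is the only free parameter
G = (-np.arctan(gx / d) / np.pi).reshape(-1, 1)

# Weight each row by 1/sigma, then solve the weighted least squares problem
Gw = G / gsig[:, None]
dw = gVel / gsig
Vo_hat = np.linalg.lstsq(Gw, dw, rcond=None)[0][0]
print('best fitting far field rate: %.1f mm/yr' % Vo_hat)
```

<p>For these four stations the estimate lands in the mid-30s of mm/yr, close to the rate used in the forward model above, which is a nice consistency check.</p>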
</div>
Mapping and Plotting data with Generic Mapping Tools (GMT)2014-04-14T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-04-14:GMT.html<div class="section" id="generic-mapping-tools">
<h2><strong>Generic Mapping Tools</strong></h2>
<p><strong>Update, Feb. 7, 2015</strong>: <em>A comment made by Joseph below alerted me that I had been referring to GMT as General Mapping Tools, which was incorrect! The correct name is Generic Mapping Tools and has since been updated. Many thanks to Joseph for pointing this out, and my apologies for my gaffe! BTW -- I love the commentary -- it provides great feedback for improving my blog. Please don't be shy!</em></p>
<p><strong>Original (Revised) Text:</strong></p>
<p>Before my time here at <strong>Hacker School</strong> I put together a short, hands-on course on how to use <a class="reference external" href="http://gmt.soest.hawaii.edu/">Generic Mapping Tools (GMT)</a>. <strong>GMT</strong> is a very powerful <strong>mapping</strong> and <strong>data visualization</strong> package. This class was built around <strong>GMT</strong> 4. <strong>GMT</strong> 5 has slightly different syntax and functionality and will not work with the scripts presented here. I intend to update the scripts to <strong>GMT</strong> 5 in the near future.</p>
<p>The class was taught at the University of Washington in the Department of Earth and Space Sciences, so it has an earth science theme -- <strong>earthquakes</strong>! Here, students learn how to make a map with layers that include <strong>topography</strong> and <strong>earthquake locations</strong>. A subset of data is extracted along a profile line (red line in the map below) so that <strong>earthquake</strong> depths can be plotted as a function of distance. The end goal of the class is to produce the following map and transect:</p>
<img alt="Cascadia_earthquakes" class="align-right" src="/images/cascadia_seis.jpg" style="width: 800.0px; height: 1100.0px;" />
<p><em>Figure 1. Map of the Cascadia subduction zone in the northwestern United States overlain with the ETOPO1 topography model (relief map), and earthquake locations from the ANSS catalog (circles) colored according to earthquake depth. The red line marks the profile from which data are extracted and plotted in the depth vs. longitude graph below the map.</em></p>
<p>All files are contained in my <a class="reference external" href="https://github.com/ginaschmalzle/GMT_DataViz">Github repo</a> except for the ETOPO1 topography model dataset. Unfortunately, this is a gigantic file that I could not push to the github repo without errors. The grd file can be downloaded on the original website, however, at <a class="reference external" href="http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/data/bedrock/grid_registered/netcdf/ETOPO1_Bed_g_gmt4.grd.gz">http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/data/bedrock/grid_registered/netcdf/ETOPO1_Bed_g_gmt4.grd.gz</a>.</p>
<p>The file that made the above figure (cascadia_seis.com) is also in the repo, but I am also including a very verbose version below that goes through and explains each step, including some basics of <strong>bash scripting</strong>:</p>
<pre class="literal-block">
#!/bin/bash
# This file is for the first class on data visualization with GMT
# Written by Gina Schmalzle
# This code is a bash shell script (http://en.wikipedia.org/wiki/Bash_(Unix_shell)) that calls GMT files.
# awk is used to manipulate data files (http://en.wikipedia.org/wiki/AWK).
# A unix shell is a "command line interpreter" that allows definition of variables,
# and execution of commands that could be done via the command line. A nice shell script
# will make it easier to change variables, and easier to implement several lines of code.
# Shell scripts are commonly used with GMT programs.
# Most if not all of the commands in this document have associated 'man' (manual) pages. To access them type:
# man whatever_your_command_is
# If you cannot access your man pages through your command prompt, an alternative would be to type man command in
# google.
# To make this file executable, you will have to change the mode of the file (ie, read, write and/or execute)
# In your directory you will need to type:
# chmod u+x ./cascadia_seis.com
# The '#' marks to the left indicates a comment. Anything written after them is not read when the file is executed.
# This file will create a map of Cascadia that includes a grid of topography data from ETOPO1 (ETOPO1_Bed_g_gmt4.grd) and
# seismicity data. These data will be applied in "layers", very similar to how GIS packages have layers. The layers may be
# turned off or on by commenting/uncommenting lines.
# The grd file is already in GMT format. Generating and using grids is another class in itself, but here I will introduce
# you to using GMT formatted grid files.
# Also included on the map are earthquake locations color coded by depth from the ANSS catalog for 2000 to 2012 (anss_eq_2000_2012.dat)
#MAKE A MAP!
# Define the names of the input and output files
out=cascadia_seis.ps # This will be the name of your map generated by this file
seis_data=anss_eq_2000_2012.dat # ANSS earthquake catalog
topo=./ETOPO1_Bed_g_gmt4.grd # ETOPO1 topography grid
# Define map characteristics
# Define your area
north=50
south=40
east=-118
west=-132
# Define your map boundary annotation
# Here we define tick marks every 2 degrees and we print the degree on the West and South sides of the plot
# and keep the ticks (but don't label) on the east and north sides
tick='-B2/2WSen'
# Define Map Projection
# Here we define a Mercator Projection of size = 15
proj='-JM15'
#Start with GMT commands with embedded definitions....
# Help with any of these commands can be obtained by looking at the 'man' files. Simply type at the command line: man gmt_command
# If the man files are not properly installed you can also type in man gmt_command (e.g., man psbasemap) in google and it will come up.
#This line sets up the 'basemap' meaning here you will define the region, boundary annotations and projections.
#You can accomplish this also with other commands (including psxy, pscoast, etc...), but it is good many times to start with psbasemap.
psbasemap -R$west/$east/$south/$north $proj $tick -P -Y12 -K > $out
# This is your first line of GMT Code!!! Whoo-hoo! In long hand this line would look like this:
#
# psbasemap -R-132/-118/40/50 -JM15 -P -Y12 -K > cascadia_seis.ps
# What the options mean:
# psbasemap = plots postscript basemaps
# -R -- defines the area of your map (note that we defined north, south, east and west above and they are inserted into the -R option.
# The Projection (-JM) and tick marks (-B) were defined above.
# Note that when you call a defined variable, you must include a '$' before the variable name
# -P Sets the figure to "Portrait" mode. No -P is landscape.
# -Y Orients the figure vertically (-X orients it horizontally).
# -K means that there will be more 'stuff' appended to the postscript file.
# '>' means that the command output, which would normally print to screen will be directed into your new file (cascadia_seis.ps, shown here as $out)
# In addition it means that it believes cascadia_seis.ps is a new file. If it is not, it will erase all existing info in the file and re-write it with
# the new information.
#plot grid
# We would like the topography to be the map background, so it needs to be the first layer. Hence, we get started with a hard part...
# Helpful hint...
#
# use grdinfo your_grd_file.grd
# to find info about your grid file, such as the min and max values
#
# You will need to make some color palettes. These are files that tell GMT what colors to use to display certain properties.
# For example, your ETOPO grid has a latitude, longitude and an elevation, and you want to color code the topography
# by elevation. The following lines will tell you how to do that...
# First, make a color palette
# Typing: makecpt
# at the command line will give you information on pre-existing color schemes
# This will make a color palette of typical, pre-defined topography colors:
makecpt -Crelief -T-8000/8000/500 -Z > topo.cpt
#makecpt = makes GMT color palette tables
#-C tells GMT what pre-defined color palette to use
#-T defines the range and increment
#-Z states that the colors will change continuously (rather than discretely)
#topo.cpt is a new file containing your color palette information that will be used later.
#This next line is not necessary, but may be used to make the image appear sharper.
#grdgradient helps to illuminate ridges in the topography from a specified angle.
#grdgradient $topo -A135 -Ne0.8 -Gshadow.grd
#grdgradient=Makes illumination shadow
#-A is the angle from which the light is shown
#-N normalizes the shadow according to equations stated in man grdgradient
#-G lists the name of your output grid
# Overlay the grid onto your map
# Here you are adding the grid as a layer to your postscript file
# This command includes a shadow grid file:
# grdimage $topo -R -J -O -K -Ctopo.cpt -Ishadow.grd >> $out
# This command omits the shadow file:
grdimage $topo -R -J -O -K -Ctopo.cpt >> $out
#grdimage = creates an image from a 2D netcdf grid file
#-R = Sets the region. Notice here I don't have to state the min and max values again.
#-J = Sets the projection. Again the type and size don't have to be restated.
#-O = Overlay. The output for this line is being appended to a previous postscript code
# i.e., you are adding another layer
#-K = You will be appending another layer
#-C = You will be using the color palette topo.cpt
# Now, back to the easy stuff..
# Add coastlines
pscoast -R -J -O -K -W2 -Df -Na -Ia -Lf-130.8/46/10/200+lkm >> $out
#pscoast = adds coastlines
#-W = Sets the line width and color. Default color = black = 0 and does not have to be explicitly stated.
#-Df = What is the resolution of the coastline dataset? f = fine
#-Na = Draws political boundaries, a = draw all the boundaries, see man pscoast for more options
#-Ia = Draw Rivers, a = draw all rivers, see man pscoast for more options
#-Lf = Draw a fancy map scale, f = fancy, centered on -130.8, 46 degrees. +200 = length, +lkm = kilometers
# Add seismic locations and color code them by depth
# Make color palette
# Ahh, another color palette...
# This time, let's make it rainbow colored and call it seis.cpt
makecpt -Crainbow -T0/50/10 -Z > seis.cpt
# Columns 4, 3 and 5 of the data file are the longitude, latitude and depth, respectively. This is the order
# your data need to be in for psxy (see man file)
awk '{print($4,$3,$5)}' $seis_data | psxy -R -J -O -K -W.1 -Sc.1 -Cseis.cpt -H15 >> $out
# psxy = Plot 2D lines, polygons and symbols on a map. Fun fact -- psxyz plots in 3D.
# -W.1 = Draws the black outline of the circles.
# -Sc.1 = Defines the shape and size; c = circle, size = 0.1
# -H = Header. The first 15 lines of the file contain header information and will not be read.
# -C = defines the color palette used to color the circles by depth. We could also make all the
# circles one color. In that case, remove the -C option and use -G instead. -G defines the fill
# color in either gray-scale (0-255) or red/green/blue format. Example colors: -G0 (black);
# -G255 (white); -G255/0/0 (red). GMT also accepts some color names, e.g. -Gblack or -Gred,
# but only a limited number of colors can be specified that way.
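# Aside (not in the original script): you can sanity-check the awk column
# reordering used with psxy above without GMT. The sample line below is made
# up (not from $seis_data); fields 4, 3 and 5 stand in for lon, lat and depth:

```shell
echo "evt01 2015 47.10 -123.50 12.0" | awk '{print($4,$3,$5)}'
# prints: -123.50 47.10 12.0
```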
# Add a scale
psscale -D0/3.2/6/1 -B10:Depth:/:km: -Cseis.cpt -O -K >> $out
# psscale = Adds a scale bar to go with your color palette
# -D = set the position and dimensions of the scale
# -B = set and annotate the scale tick marks and labels.
# -C = specify your color palette
# Now, let's take a subset of seismic data and project them onto a line....
#First, let's view the transect line
#Plot transect line
psxy center.dat -R -J -O -K -W1 -Sc.3 -G255/0/0 >> $out
psxy center.dat -R -J -O -K -W5/255/0/0 >> $out
# You should know the options by now ;-)
#This ends the map-making part of this exercise; now we move on to making a scatter plot from the seismic data.
# PROJECT DATA
# Here we use the GMT command project to take all the data within a certain region and project them onto a line
awk '{print($4,$3,$5)}' $seis_data | project -C-124/47 -A90 -W-.2/.2 -L0/4 -H15 > projection.dat
# project = projects data onto a transect
# Note that the options are different for this command
# -C = defines the center of your transect
# -A = azimuth of transect (CW from N)
# -W = Width of the transect in degrees
# -L = length of transect in degrees
# -H = Header declaration
# projection.dat = new file with the original data and the projected locations
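# Aside (not in the original script): a flat-earth sketch of what project -A90
# does. For an east-west transect the along-transect coordinate is just the
# offset from the center longitude (GMT does the real calculation on the sphere).
# The sample point (lon lat depth) below is made up:

```shell
echo "-123.5 47.1 12.0" | awk -v c=-124 '{print $1-c, $3}'
# prints: 0.5 12.0
```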
# MAKE SCATTER PLOT
#We want the scatter plot to be on the same page as the map, but just below it, so we need to redefine our
#region, projection and tick marks...
east=-120
west=-124
dmin=0
dmax=50
proj=-JX15/-5
tick=-B1:Longitude:/10:Depth:WSen
awk '{print($6,$3)}' projection.dat | psxy -R$west/$east/$dmin/$dmax $proj $tick -W1 -Sc.2 -G200 -O -K -Y-8 -P >> $out
# Columns 6 and 3 are the projected longitude and the depth, respectively
# -Y = Shift the new plot down 8 units. You can designate if you want to shift in centimeters (c), inches (i),
# meters (m), or pixels (p). Otherwise it shifts by whatever is in your gmtdefaults.
# Last, but not least, image your map!
# Common postscript viewers: gs, gv, ggv, open, gimp
# What, you don't like postscript files? That's ok, uncomment this line:
# ps2pdf $out
open $out
</pre>
</div>
Iteration and Recursion in Python2014-04-10T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-04-10:Fibo.html<div class="section" id="iteration-vs-recursion-in-python">
<h2><strong>Iteration vs. Recursion in Python</strong></h2>
<p>For the past week at Hacker School, I took a step back from making cool and awesome projects like the <a class="reference external" href="http://geodesygina.com/vectorprojector/vectorprojector.html">Vector Projector</a> or the <a class="reference external" href="http://geodesygina.com/vectorprojector/vectorprojector.html">Japan Earthquake</a> projects and looked at some good, old-fashioned computer science concepts. Much of what I did was repetitive; I found a really simple problem, solved it using some method, and then tried solving it again using a variety of other methods. One project I worked on was programming the n'th <strong>Fibonacci</strong> number using <strong>Python</strong>. In this blog I will describe <strong>iterative</strong> and <strong>recursive</strong> methods for solving this problem in Python.</p>
<p>What are <strong>Fibonacci</strong> numbers (or series or sequence)? From <a class="reference external" href="http://en.wikipedia.org/wiki/Fibonacci_number">the Fibonacci Wiki Page</a>, the Fibonacci sequence starts with either 0 and 1 or 1 and 1, and each subsequent number in the sequence is simply the sum of the prior two. Hence the Fibonacci sequence looks like:</p>
<pre class="literal-block">
1, 1, 2, 3, 5, 8, 13, 21, 34, 55,...
</pre>
<p>or:</p>
<pre class="literal-block">
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55,...
</pre>
<p>It is mathematically defined as:</p>
<pre class="literal-block">
F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1
</pre>
<p>My first approach was an <strong>iterative</strong> method -- basically, hardwire the cases for F(0) and F(1), and then iteratively build up the values for larger n:</p>
<pre class="literal-block">
def F_iter(n):
    if (n == 0):
        return 0
    elif (n == 1):
        return 1
    elif (n > 1):
        fn1 = 0  # F(i-2), starting at F(0)
        fn2 = 1  # F(i-1), starting at F(1)
        for i in range(2, n + 1):
            fn = fn1 + fn2
            fn1 = fn2
            fn2 = fn
        return fn
    else:
        return -1
</pre>
<p>OK, great, this works just fine... Now, let's try writing this recursively. You may ask, what is recursion in computer science? -- I certainly did about a week ago. <strong>Recursion</strong>, according to <a class="reference external" href="http://en.wikipedia.org/wiki/Recursion_(computer_science)">the Recursion Wiki page</a>, is a method where the solution to a problem depends on solutions to smaller instances of the same problem. Computer programs support recursion by allowing a function to call itself (Whoa! -- this concept blew my mind).</p>
<p>A <strong>recursive</strong> solution to find the nth <strong>Fibonacci</strong> number is:</p>
<pre class="literal-block">
def F(n):
    if (n == 0):
        return 0
    elif (n == 1):
        return 1
    elif (n > 1):
        return (F(n-1) + F(n-2))
    else:
        return -1
</pre>
<p>Notice that the recursive approach still defines the F(0) and F(1) cases. For cases where n > 1, however, the function calls itself. Let's take a look at what is happening here. Suppose we call F(4). F(4) will forgo the n = 0 and n = 1 cases and go to the n > 1 case, where it calls the function twice with F(3) and F(2). F(3) and F(2) then each subsequently call the function again -- F(3) calls F(2) and F(1), and F(2) calls F(1) and F(0), as shown in the tree structure below (<strong>Figure 1</strong>). The F(1) and F(0) cases are the final, terminating cases in the tree and return the value of 1 or 0, respectively. So, just as Wikipedia said, the recursive case breaks the problem down into smaller instances, and it does that by allowing the user to define a function that calls itself:</p>
<!DOCTYPE html>
<meta charset="utf-8">
<style type="text/css">
.node {
cursor: pointer;
}
.overlay{
background-color:white;
}
.node circle {
fill: #fff;
stroke: steelblue;
stroke-width: 1.5px;
}
.node text {
font-size:10px;
font-family:sans-serif;
}
.link {
fill: none;
stroke: #ccc;
stroke-width: 10px;
}
.templink {
fill: none;
stroke: red;
stroke-width: 3px;
}
.ghostCircle.show{
display:block;
}
.ghostCircle, .activeDrag .ghostCircle{
display: none;
}
</style>
<script src="http://code.jquery.com/jquery-1.10.2.min.js"></script>
<script src="http://d3js.org/d3.v3.min.js"></script>
<script src="dndTree.js"></script>
<body>
<div id="tree-container"></div>
</body>
</html>
<p><em>Figure 1. Tree structure demonstrating the recursion flow for a case where n = 4. Figure made using</em> <a class="reference external" href="http://d3js.org/">D3.js</a>.</p>
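The repeated calls in the tree can be counted directly by instrumenting the recursive function with a tally. This is a sketch, not part of the original post; the <tt>calls</tt> counter (built on the standard library's <tt>collections.Counter</tt>) is my own addition:

```python
from collections import Counter

calls = Counter()  # tallies how many times F(k) is invoked for each k

def F(n):
    calls[n] += 1
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n > 1:
        return F(n - 1) + F(n - 2)
    else:
        return -1

F(4)
print(sorted(calls.items()))  # [(0, 2), (1, 3), (2, 2), (3, 1), (4, 1)]
```

For n = 4, F(2) is computed twice and F(1) three times, exactly as the tree suggests.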
<p>Fantastic! Now we have both the iterative and recursive versions of this code, and we can see how recursive coding works. Playing around with these codes, however, I noticed that the run time of the recursive version was MUCH slower than the iterative one for large n. I included a timer (time.time()) to measure how long each method takes for a range of n. What I got is a figure that looks like this:</p>
<html><head><title>mpld3 plot</title></head><body>
<style>
</style>
<script src="http://d3js.org/d3.v3.min.js"></script>
<script src="/mpld3.v0.2git.min.js"></script>
<div id="fig91044974148646267950353"></div>
<script>
function mpld3_load_lib(url, callback){
var s = document.createElement('script');
s.src = url;
s.async = true;
s.onreadystatechange = s.onload = callback;
s.onerror = function(){console.warn("failed to load library " + url);};
document.getElementsByTagName("head")[0].appendChild(s);
}
function create_fig91044974148646267950353(){
mpld3.draw_figure("fig91044974148646267950353", {"width": 640.0, "axes": [{"xlim": [0.0, 0.035000000000000003], "yscale": "linear", "axesbg": "#FFFFFF", "texts": [{"v_baseline": "auto", "h_anchor": "middle", "color": "#000000", "text": "n", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [-0.050655241935483875, 0.5], "rotation": -90.0, "id": "9104497489616"}], "zoomable": true, "images": [], "xdomain": [0.0, 0.035000000000000003], "ylim": [0.0, 25.0], "paths": [], "sharey": [], "sharex": [], "axesbgalpha": null, "axes": [{"grid": {"gridOn": false}, "position": "bottom", "nticks": 9, "tickvalues": null, "tickformat": null}, {"grid": {"gridOn": false}, "position": "left", "nticks": 6, "tickvalues": null, "tickformat": null}], "lines": [{"color": "#FF0000", "yindex": 1, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 0, "linewidth": 10, "data": "data01", "id": "9104548429520"}, {"color": "#0000FF", "yindex": 1, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 2, "linewidth": 10, "data": "data01", "id": "9104548589584"}], "markers": [], "id": "9104497469584", "ydomain": [0.0, 25.0], "collections": [], "xscale": "linear", "bbox": [0.125, 0.53636363636363638, 0.77500000000000002, 0.36363636363636365]}, {"xlim": [0.0, 0.035000000000000003], "yscale": "linear", "axesbg": "#FFFFFF", "texts": [{"v_baseline": "hanging", "h_anchor": "middle", "color": "#000000", "text": "Time(seconds)", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [0.5, -0.13177083333333334], "rotation": -0.0, "id": "9104548443216"}, {"v_baseline": "auto", "h_anchor": "middle", "color": "#000000", "text": "Fibonacci Value", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [-0.10282258064516128, 0.5], "rotation": -90.0, "id": "9104548467664"}], "zoomable": true, "images": [], "xdomain": [0.0, 0.035000000000000003], "ylim": [0.0, 50000.0], "paths": [], "sharey": 
[], "sharex": [], "axesbgalpha": null, "axes": [{"grid": {"gridOn": false}, "position": "bottom", "nticks": 9, "tickvalues": null, "tickformat": null}, {"grid": {"gridOn": false}, "position": "left", "nticks": 6, "tickvalues": null, "tickformat": null}], "lines": [{"color": "#FF0000", "yindex": 3, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 0, "linewidth": 10, "data": "data01", "id": "9104548588944"}, {"color": "#0000FF", "yindex": 4, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 2, "linewidth": 10, "data": "data01", "id": "9104548591312"}], "markers": [], "id": "9104548430160", "ydomain": [0.0, 50000.0], "collections": [], "xscale": "linear", "bbox": [0.125, 0.099999999999999978, 0.77500000000000002, 0.36363636363636365]}], "data": {"data01": [[9.5367431640625e-07, 0.0, 9.5367431640625e-07, 0.0, 0.0], [0.0, 1.0, 9.5367431640625e-07, 1.0, 1.0], [9.5367431640625e-07, 2.0, 1.9073486328125e-06, 1.0, 0.0], [9.5367431640625e-07, 3.0, 9.5367431640625e-07, 2.0, 0.0], [2.1457672119140625e-06, 4.0, 2.1457672119140625e-06, 3.0, 3.0], [3.0994415283203125e-06, 5.0, 9.5367431640625e-07, 5.0, 5.0], [5.0067901611328125e-06, 6.0, 1.9073486328125e-06, 8.0, 8.0], [9.059906005859375e-06, 7.0, 1.9073486328125e-06, 13.0, 13.0], [1.5020370483398438e-05, 8.0, 2.1457672119140625e-06, 21.0, 21.0], [2.288818359375e-05, 9.0, 2.1457672119140625e-06, 34.0, 34.0], [3.790855407714844e-05, 10.0, 1.9073486328125e-06, 55.0, 55.0], [6.198883056640625e-05, 11.0, 2.1457672119140625e-06, 89.0, 89.0], [9.894371032714844e-05, 12.0, 2.86102294921875e-06, 144.0, 144.0], [0.00015997886657714844, 13.0, 3.0994415283203125e-06, 233.0, 233.0], [0.0002570152282714844, 14.0, 2.86102294921875e-06, 377.0, 377.0], [0.00041413307189941406, 15.0, 3.0994415283203125e-06, 610.0, 610.0], [0.0006690025329589844, 16.0, 2.86102294921875e-06, 987.0, 987.0], [0.0010819435119628906, 17.0, 2.86102294921875e-06, 1597.0, 1597.0], [0.001809835433959961, 18.0, 
3.0994415283203125e-06, 2584.0, 2584.0], [0.0029108524322509766, 19.0, 2.1457672119140625e-06, 4181.0, 4181.0], [0.004680156707763672, 20.0, 2.86102294921875e-06, 6765.0, 6765.0], [0.008035898208618164, 21.0, 2.86102294921875e-06, 10946.0, 10946.0], [0.012122869491577148, 22.0, 4.0531158447265625e-06, 17711.0, 17711.0], [0.01978015899658203, 23.0, 3.0994415283203125e-06, 28657.0, 28657.0], [0.03191089630126953, 24.0, 2.86102294921875e-06, 46368.0, 46368.0]]}, "id": "9104497414864", "toolbar": ["reset", "move"], "height": 480.0});
}
if(typeof(mpld3) !== "undefined"){
// already loaded: just create the figure
create_fig91044974148646267950353();
}else if(typeof define === "function" && define.amd){
// require.js is available: use it to load d3/mpld3
require.config({paths: {d3: "/d3"}});
require(["d3"], function(d3){
window.d3 = d3;
mpld3_load_lib("/mpld3.v0.2git.min.js", create_fig91044974148646267950353);
});
}else{
// require.js not available: dynamically load d3 & mpld3
mpld3_load_lib("http://d3js.org/d3.v3.min.js", function(){
mpld3_load_lib("/mpld3.v0.2git.min.js", create_fig91044974148646267950353);})
}
</script></body></html>
<p><em>Figure 2. Run times for calculating a range of n (0-25) using the iterative (blue) and recursive (red) approaches. Figure made with</em> <a class="reference external" href="http://mpld3.github.io/">mpld3</a>.</p>
<p>The red line represents recursive run times and the blue line iterative run times. For small values of n the two methods take roughly the same time, but for large values of n the run time really starts to lengthen for the recursive case! So, why is this? Let's take another look at <strong>Figure 1</strong>. The tree structure shows that the F(4) and F(3) cases are only calculated once, but the smaller cases are calculated multiple times! For example, the F(1) case is calculated 3 times. Hence, the recursive method calculates the same thing over and over, wasting valuable run time.</p>
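A minimal version of that timing experiment can be sketched as follows (this uses time.time() as in the post, with slightly condensed versions of the two functions; exact timings will vary by machine):

```python
import time

def F_iter(n):
    # iterative version: constant work per step
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def F(n):
    # recursive version: recomputes the same subproblems repeatedly
    if n < 2:
        return n
    return F(n - 1) + F(n - 2)

for n in (10, 20, 25):
    t0 = time.time(); F_iter(n); t_iter = time.time() - t0
    t0 = time.time(); F(n); t_rec = time.time() - t0
    print(n, t_iter, t_rec)  # t_rec grows roughly exponentially with n
```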
<p>So why use recursion? Well, the recursive code is a lot easier to read. For the n > 1 case, the mathematical equation is explicitly written out, whereas in the iterative case the programmer has to step through the script to understand what is going on. But the elephant in the room is that naive recursion in Python is REALLY slow. <strong>Memoization</strong> (pronounced like Elmer Fudd trying to say memorization) is a technique used to deal with this problem. Memoization and memorization are kind of synonymous in this case -- we want to make the program 'memorize' the results from previous runs. These memorized results are then reused for subsequent, repeated calls. We do this by storing each value in a hash table (a Python dictionary). This involves a simple modification of the code:</p>
<pre class="literal-block">
memo = {}
def F_mem(n):
    if (n == 0):
        return 0
    elif (n == 1):
        return 1
    elif (n > 1):
        if n not in memo:
            memo[n] = (F_mem(n-1) + F_mem(n-2))
        return memo[n]
    else:
        return -1
</pre>
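As an aside, modern Python can do this memoization for you with functools.lru_cache from the standard library. This is a sketch of the same idea, not part of the original workshop code:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # remembers each F(n) after its first computation
def F_cached(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n > 1:
        return F_cached(n - 1) + F_cached(n - 2)
    else:
        return -1

print(F_cached(50))  # 12586269025
```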
<p>Now, let's run our timer again, but this time use the memoized recursion:</p>
<html><head><title>mpld3 plot</title></head><body>
<style>
</style>
<div id="fig72644029038885532319697"></div>
<script>
function mpld3_load_lib(url, callback){
var s = document.createElement('script');
s.src = url;
s.async = true;
s.onreadystatechange = s.onload = callback;
s.onerror = function(){console.warn("failed to load library " + url);};
document.getElementsByTagName("head")[0].appendChild(s);
}
function create_fig72644029038885532319697(){
mpld3.draw_figure("fig72644029038885532319697", {"width": 640.0, "axes": [{"xlim": [0.0, 2.4999999999999998e-05], "yscale": "linear", "axesbg": "#FFFFFF", "texts": [{"v_baseline": "auto", "h_anchor": "middle", "color": "#000000", "text": "n", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [-0.067792338709677422, 0.5], "rotation": -90.0, "id": "7264402978640"}], "zoomable": true, "images": [], "xdomain": [0.0, 2.4999999999999998e-05], "ylim": [0.0, 200.0], "paths": [], "sharey": [], "sharex": [], "axesbgalpha": null, "axes": [{"grid": {"gridOn": false}, "position": "bottom", "nticks": 6, "tickvalues": null, "tickformat": null}, {"grid": {"gridOn": false}, "position": "left", "nticks": 5, "tickvalues": null, "tickformat": null}], "lines": [{"color": "#FF0000", "yindex": 1, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 0, "linewidth": 10, "data": "data01", "id": "7264453795664"}, {"color": "#0000FF", "yindex": 1, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 2, "linewidth": 10, "data": "data01", "id": "7264453963920"}], "markers": [], "id": "7264402954512", "ydomain": [0.0, 200.0], "collections": [], "xscale": "linear", "bbox": [0.125, 0.53636363636363638, 0.77500000000000002, 0.36363636363636365]}, {"xlim": [0.0, 2.4999999999999998e-05], "yscale": "linear", "axesbg": "#FFFFFF", "texts": [{"v_baseline": "hanging", "h_anchor": "middle", "color": "#000000", "text": "Time(seconds)", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [0.5, -0.13177083333333334], "rotation": -0.0, "id": "7264453809360"}, {"v_baseline": "auto", "h_anchor": "middle", "color": "#000000", "text": "Fibonacci Value", "coordinates": "axes", "zorder": 3, "alpha": 1, "fontsize": 12.0, "position": [-0.10434167786738351, 0.5], "rotation": -90.0, "id": "7264453837904"}], "zoomable": true, "images": [], "xdomain": [0.0, 2.4999999999999998e-05], "ylim": [0.0, 
1.8000000000000001e+41], "paths": [], "sharey": [], "sharex": [], "axesbgalpha": null, "axes": [{"grid": {"gridOn": false}, "position": "bottom", "nticks": 6, "tickvalues": null, "tickformat": null}, {"grid": {"gridOn": false}, "position": "left", "nticks": 10, "tickvalues": null, "tickformat": null}], "lines": [{"color": "#FF0000", "yindex": 3, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 0, "linewidth": 10, "data": "data01", "id": "7264453963280"}, {"color": "#0000FF", "yindex": 4, "coordinates": "data", "dasharray": "10,0", "zorder": 2, "alpha": 1, "xindex": 2, "linewidth": 10, "data": "data01", "id": "7264453965648"}], "markers": [], "id": "7264453796304", "ydomain": [0.0, 1.8000000000000001e+41], "collections": [], "xscale": "linear", "bbox": [0.125, 0.099999999999999978, 0.77500000000000002, 0.36363636363636365]}], "data": {"data01": [[1.1920928955078125e-06, 0.0, 1.1920928955078125e-06, 0.0, 0.0], [9.5367431640625e-07, 1.0, 0.0, 1.0, 1.0], [1.1920928955078125e-06, 2.0, 2.1457672119140625e-06, 1.0, 0.0], [9.5367431640625e-07, 3.0, 9.5367431640625e-07, 2.0, 0.0], [9.5367431640625e-07, 4.0, 9.5367431640625e-07, 3.0, 3.0], [1.9073486328125e-06, 5.0, 2.1457672119140625e-06, 5.0, 5.0], [2.1457672119140625e-06, 6.0, 1.9073486328125e-06, 8.0, 8.0], [1.9073486328125e-06, 7.0, 1.1920928955078125e-06, 13.0, 13.0], [1.9073486328125e-06, 8.0, 1.9073486328125e-06, 21.0, 21.0], [1.9073486328125e-06, 9.0, 2.1457672119140625e-06, 34.0, 34.0], [9.5367431640625e-07, 10.0, 1.9073486328125e-06, 55.0, 55.0], [2.1457672119140625e-06, 11.0, 2.1457672119140625e-06, 89.0, 89.0], [1.9073486328125e-06, 12.0, 1.9073486328125e-06, 144.0, 144.0], [9.5367431640625e-07, 13.0, 1.9073486328125e-06, 233.0, 233.0], [1.1920928955078125e-06, 14.0, 2.1457672119140625e-06, 377.0, 377.0], [9.5367431640625e-07, 15.0, 1.9073486328125e-06, 610.0, 610.0], [9.5367431640625e-07, 16.0, 3.0994415283203125e-06, 987.0, 987.0], [9.5367431640625e-07, 17.0, 
3.0994415283203125e-06, 1597.0, 1597.0], [9.5367431640625e-07, 18.0, 2.86102294921875e-06, 2584.0, 2584.0], [9.5367431640625e-07, 19.0, 6.198883056640625e-06, 4181.0, 4181.0], [9.5367431640625e-07, 20.0, 2.86102294921875e-06, 6765.0, 6765.0], [2.1457672119140625e-06, 21.0, 2.86102294921875e-06, 10946.0, 10946.0], [1.9073486328125e-06, 22.0, 3.0994415283203125e-06, 17711.0, 17711.0], [1.9073486328125e-06, 23.0, 3.0994415283203125e-06, 28657.0, 28657.0], [9.5367431640625e-07, 24.0, 3.814697265625e-06, 46368.0, 46368.0], [9.5367431640625e-07, 25.0, 3.0994415283203125e-06, 75025.0, 75025.0], [9.5367431640625e-07, 26.0, 4.0531158447265625e-06, 121393.0, 121393.0], [9.5367431640625e-07, 27.0, 4.0531158447265625e-06, 196418.0, 196418.0], [1.1920928955078125e-06, 28.0, 4.0531158447265625e-06, 317811.0, 317811.0], [9.5367431640625e-07, 29.0, 4.0531158447265625e-06, 514229.0, 514229.0], [9.5367431640625e-07, 30.0, 4.0531158447265625e-06, 832040.0, 832040.0], [1.1920928955078125e-06, 31.0, 4.0531158447265625e-06, 1346269.0, 1346269.0], [9.5367431640625e-07, 32.0, 3.814697265625e-06, 2178309.0, 2178309.0], [9.5367431640625e-07, 33.0, 4.0531158447265625e-06, 3524578.0, 3524578.0], [2.1457672119140625e-06, 34.0, 4.0531158447265625e-06, 5702887.0, 5702887.0], [1.9073486328125e-06, 35.0, 5.0067901611328125e-06, 9227465.0, 9227465.0], [2.1457672119140625e-06, 36.0, 5.0067901611328125e-06, 14930352.0, 14930352.0], [1.9073486328125e-06, 37.0, 3.814697265625e-06, 24157817.0, 24157817.0], [2.1457672119140625e-06, 38.0, 3.814697265625e-06, 39088169.0, 39088169.0], [1.9073486328125e-06, 39.0, 4.76837158203125e-06, 63245986.0, 63245986.0], [9.5367431640625e-07, 40.0, 5.0067901611328125e-06, 102334155.0, 102334155.0], [2.1457672119140625e-06, 41.0, 5.0067901611328125e-06, 165580141.0, 165580141.0], [1.9073486328125e-06, 42.0, 5.0067901611328125e-06, 267914296.0, 267914296.0], [9.5367431640625e-07, 43.0, 5.0067901611328125e-06, 433494437.0, 433494437.0], [9.5367431640625e-07, 44.0, 
5.9604644775390625e-06, 701408733.0, 701408733.0], [9.5367431640625e-07, 45.0, 5.9604644775390625e-06, 1134903170.0, 1134903170.0], [9.5367431640625e-07, 46.0, 5.9604644775390625e-06, 1836311903.0, 1836311903.0], [1.9073486328125e-06, 47.0, 5.9604644775390625e-06, 2971215073.0, 2971215073.0], [1.9073486328125e-06, 48.0, 5.9604644775390625e-06, 4807526976.0, 4807526976.0], [2.1457672119140625e-06, 49.0, 5.9604644775390625e-06, 7778742049.0, 7778742049.0], [9.5367431640625e-07, 50.0, 5.0067901611328125e-06, 12586269025.0, 12586269025.0], [9.5367431640625e-07, 51.0, 5.9604644775390625e-06, 20365011074.0, 20365011074.0], [9.5367431640625e-07, 52.0, 5.9604644775390625e-06, 32951280099.0, 32951280099.0], [9.5367431640625e-07, 53.0, 5.9604644775390625e-06, 53316291173.0, 53316291173.0], [1.1920928955078125e-06, 54.0, 6.198883056640625e-06, 86267571272.0, 86267571272.0], [9.5367431640625e-07, 55.0, 5.9604644775390625e-06, 139583862445.0, 139583862445.0], [9.5367431640625e-07, 56.0, 6.9141387939453125e-06, 225851433717.0, 225851433717.0], [9.5367431640625e-07, 57.0, 7.152557373046875e-06, 365435296162.0, 365435296162.0], [9.5367431640625e-07, 58.0, 6.9141387939453125e-06, 591286729879.0, 591286729879.0], [2.1457672119140625e-06, 59.0, 6.9141387939453125e-06, 956722026041.0, 956722026041.0], [1.9073486328125e-06, 60.0, 6.9141387939453125e-06, 1548008755920.0, 1548008755920.0], [9.5367431640625e-07, 61.0, 7.152557373046875e-06, 2504730781961.0, 2504730781961.0], [9.5367431640625e-07, 62.0, 6.9141387939453125e-06, 4052739537881.0, 4052739537881.0], [9.5367431640625e-07, 63.0, 6.198883056640625e-06, 6557470319842.0, 6557470319842.0], [9.5367431640625e-07, 64.0, 6.9141387939453125e-06, 10610209857723.0, 10610209857723.0], [9.5367431640625e-07, 65.0, 7.152557373046875e-06, 17167680177565.0, 17167680177565.0], [9.5367431640625e-07, 66.0, 6.9141387939453125e-06, 27777890035288.0, 27777890035288.0], [2.1457672119140625e-06, 67.0, 7.152557373046875e-06, 44945570212853.0, 
44945570212853.0], [1.9073486328125e-06, 68.0, 6.9141387939453125e-06, 72723460248141.0, 72723460248141.0], [1.9073486328125e-06, 69.0, 7.152557373046875e-06, 117669030460994.0, 117669030460994.0], [2.1457672119140625e-06, 70.0, 6.9141387939453125e-06, 190392490709135.0, 190392490709135.0], [9.5367431640625e-07, 71.0, 7.152557373046875e-06, 308061521170129.0, 308061521170129.0], [9.5367431640625e-07, 72.0, 8.106231689453125e-06, 498454011879264.0, 498454011879264.0], [9.5367431640625e-07, 73.0, 7.867813110351562e-06, 806515533049393.0, 806515533049393.0], [9.5367431640625e-07, 74.0, 8.106231689453125e-06, 1304969544928657.0, 1304969544928657.0], [9.5367431640625e-07, 75.0, 8.106231689453125e-06, 2111485077978050.0, 2111485077978050.0], [9.5367431640625e-07, 76.0, 7.867813110351562e-06, 3416454622906707.0, 3416454622906707.0], [1.1920928955078125e-06, 77.0, 8.106231689453125e-06, 5527939700884757.0, 5527939700884757.0], [9.5367431640625e-07, 78.0, 8.106231689453125e-06, 8944394323791464.0, 8944394323791464.0], [1.9073486328125e-06, 79.0, 8.821487426757812e-06, 1.447233402467622e+16, 1.447233402467622e+16], [9.5367431640625e-07, 80.0, 9.059906005859375e-06, 2.3416728348467684e+16, 2.3416728348467684e+16], [9.5367431640625e-07, 81.0, 9.059906005859375e-06, 3.78890623731439e+16, 3.78890623731439e+16], [1.9073486328125e-06, 82.0, 9.059906005859375e-06, 6.130579072161159e+16, 6.130579072161159e+16], [2.1457672119140625e-06, 83.0, 7.867813110351562e-06, 9.91948530947555e+16, 9.91948530947555e+16], [9.5367431640625e-07, 84.0, 8.821487426757812e-06, 1.605006438163671e+17, 1.605006438163671e+17], [9.5367431640625e-07, 85.0, 9.059906005859375e-06, 2.596954969111226e+17, 2.596954969111226e+17], [9.5367431640625e-07, 86.0, 1.0013580322265625e-05, 4.2019614072748966e+17, 4.2019614072748966e+17], [3.0994415283203125e-06, 87.0, 9.059906005859375e-06, 6.798916376386122e+17, 6.798916376386122e+17], [9.5367431640625e-07, 88.0, 9.059906005859375e-06, 1.1000877783661019e+18, 
1.1000877783661019e+18], [9.5367431640625e-07, 89.0, 1.0013580322265625e-05, 1.7799794160047142e+18, 1.7799794160047142e+18], [9.5367431640625e-07, 90.0, 1.0013580322265625e-05, 2.880067194370816e+18, 2.880067194370816e+18], [9.5367431640625e-07, 91.0, 1.0013580322265625e-05, 4.66004661037553e+18, 4.66004661037553e+18], [2.1457672119140625e-06, 92.0, 1.0013580322265625e-05, 7.540113804746346e+18, 7.540113804746346e+18], [9.5367431640625e-07, 93.0, 1.0967254638671875e-05, 1.2200160415121877e+19, 1.2200160415121877e+19], [9.5367431640625e-07, 94.0, 1.0013580322265625e-05, 1.974027421986822e+19, 1.974027421986822e+19], [1.9073486328125e-06, 95.0, 1.0013580322265625e-05, 3.19404346349901e+19, 3.19404346349901e+19], [1.9073486328125e-06, 96.0, 9.775161743164062e-06, 5.168070885485833e+19, 5.168070885485833e+19], [2.1457672119140625e-06, 97.0, 1.0013580322265625e-05, 8.362114348984843e+19, 8.362114348984843e+19], [9.5367431640625e-07, 98.0, 1.0013580322265625e-05, 1.3530185234470674e+20, 1.3530185234470674e+20], [9.5367431640625e-07, 99.0, 1.0013580322265625e-05, 2.1892299583455517e+20, 2.1892299583455517e+20], [1.1920928955078125e-06, 100.0, 1.0967254638671875e-05, 3.542248481792619e+20, 3.542248481792619e+20], [9.5367431640625e-07, 101.0, 1.0967254638671875e-05, 5.731478440138171e+20, 5.731478440138171e+20], [9.5367431640625e-07, 102.0, 1.0967254638671875e-05, 9.27372692193079e+20, 9.27372692193079e+20], [9.5367431640625e-07, 103.0, 1.1920928955078125e-05, 1.500520536206896e+21, 1.500520536206896e+21], [9.5367431640625e-07, 104.0, 1.0967254638671875e-05, 2.427893228399975e+21, 2.427893228399975e+21], [1.1920928955078125e-06, 105.0, 1.0967254638671875e-05, 3.928413764606871e+21, 3.928413764606871e+21], [9.5367431640625e-07, 106.0, 1.1920928955078125e-05, 6.356306993006847e+21, 6.356306993006847e+21], [1.9073486328125e-06, 107.0, 1.2159347534179688e-05, 1.0284720757613718e+22, 1.0284720757613718e+22], [2.1457672119140625e-06, 108.0, 1.1920928955078125e-05, 
1.6641027750620564e+22, 1.6641027750620564e+22], [9.5367431640625e-07, 109.0, 1.1920928955078125e-05, 2.692574850823428e+22, 2.692574850823428e+22], [2.1457672119140625e-06, 110.0, 1.1920928955078125e-05, 4.356677625885484e+22, 4.356677625885484e+22], [9.5367431640625e-07, 111.0, 1.1920928955078125e-05, 7.049252476708912e+22, 7.049252476708912e+22], [9.5367431640625e-07, 112.0, 1.3113021850585938e-05, 1.1405930102594397e+23, 1.1405930102594397e+23], [1.1920928955078125e-06, 113.0, 1.2874603271484375e-05, 1.8455182579303308e+23, 1.8455182579303308e+23], [9.5367431640625e-07, 114.0, 1.3113021850585938e-05, 2.9861112681897705e+23, 2.9861112681897705e+23], [9.5367431640625e-07, 115.0, 1.2159347534179688e-05, 4.831629526120102e+23, 4.831629526120102e+23], [1.1920928955078125e-06, 116.0, 1.2874603271484375e-05, 7.817740794309872e+23, 7.817740794309872e+23], [9.5367431640625e-07, 117.0, 1.3113021850585938e-05, 1.2649370320429975e+24, 1.2649370320429975e+24], [9.5367431640625e-07, 118.0, 1.2874603271484375e-05, 2.0467111114739846e+24, 2.0467111114739846e+24], [9.5367431640625e-07, 119.0, 1.2874603271484375e-05, 3.311648143516982e+24, 3.311648143516982e+24], [9.5367431640625e-07, 120.0, 1.4066696166992188e-05, 5.358359254990966e+24, 5.358359254990966e+24], [2.1457672119140625e-06, 121.0, 1.4066696166992188e-05, 8.670007398507949e+24, 8.670007398507949e+24], [1.9073486328125e-06, 122.0, 1.3828277587890625e-05, 1.4028366653498915e+25, 1.4028366653498915e+25], [2.1457672119140625e-06, 123.0, 1.4066696166992188e-05, 2.2698374052006864e+25, 2.2698374052006864e+25], [9.5367431640625e-07, 124.0, 1.4066696166992188e-05, 3.672674070550578e+25, 3.672674070550578e+25], [9.5367431640625e-07, 125.0, 1.4066696166992188e-05, 5.9425114757512645e+25, 5.9425114757512645e+25], [1.1920928955078125e-06, 126.0, 1.4066696166992188e-05, 9.615185546301842e+25, 9.615185546301842e+25], [9.5367431640625e-07, 127.0, 1.4066696166992188e-05, 1.5557697022053105e+26, 1.5557697022053105e+26], 
[2.1457672119140625e-06, 128.0, 1.4066696166992188e-05, 2.517288256835495e+26, 2.517288256835495e+26], [9.5367431640625e-07, 129.0, 1.5020370483398438e-05, 4.073057959040806e+26, 4.073057959040806e+26], [2.1457672119140625e-06, 130.0, 1.5020370483398438e-05, 6.590346215876301e+26, 6.590346215876301e+26], [9.5367431640625e-07, 131.0, 1.4066696166992188e-05, 1.0663404174917106e+27, 1.0663404174917106e+27], [9.5367431640625e-07, 132.0, 1.5020370483398438e-05, 1.7253750390793406e+27, 1.7253750390793406e+27], [9.5367431640625e-07, 133.0, 1.5020370483398438e-05, 2.7917154565710513e+27, 2.7917154565710513e+27], [9.5367431640625e-07, 134.0, 1.5974044799804688e-05, 4.517090495650392e+27, 4.517090495650392e+27], [9.5367431640625e-07, 135.0, 1.5974044799804688e-05, 7.308805952221443e+27, 7.308805952221443e+27], [1.1920928955078125e-06, 136.0, 1.5974044799804688e-05, 1.1825896447871835e+28, 1.1825896447871835e+28], [1.9073486328125e-06, 137.0, 1.5974044799804688e-05, 1.9134702400093278e+28, 1.9134702400093278e+28], [1.9073486328125e-06, 138.0, 1.5020370483398438e-05, 3.0960598847965113e+28, 3.0960598847965113e+28], [2.1457672119140625e-06, 139.0, 1.5974044799804688e-05, 5.009530124805839e+28, 5.009530124805839e+28], [1.9073486328125e-06, 140.0, 1.6927719116210938e-05, 8.105590009602351e+28, 8.105590009602351e+28], [2.1457672119140625e-06, 141.0, 1.5974044799804688e-05, 1.3115120134408189e+29, 1.3115120134408189e+29], [1.9073486328125e-06, 142.0, 1.5974044799804688e-05, 2.122071014401054e+29, 2.122071014401054e+29], [1.9073486328125e-06, 143.0, 1.5974044799804688e-05, 3.433583027841873e+29, 3.433583027841873e+29], [1.1920928955078125e-06, 144.0, 1.5974044799804688e-05, 5.555654042242927e+29, 5.555654042242927e+29], [1.9073486328125e-06, 145.0, 1.6927719116210938e-05, 8.9892370700848e+29, 8.9892370700848e+29], [9.5367431640625e-07, 146.0, 1.71661376953125e-05, 1.4544891112327727e+30, 1.4544891112327727e+30], [9.5367431640625e-07, 147.0, 1.6927719116210938e-05, 
2.3534128182412526e+30, 2.3534128182412526e+30], [9.5367431640625e-07, 148.0, 1.71661376953125e-05, 3.8079019294740253e+30, 3.8079019294740253e+30], [9.5367431640625e-07, 149.0, 1.6927719116210938e-05, 6.161314747715278e+30, 6.161314747715278e+30], [1.9073486328125e-06, 150.0, 1.71661376953125e-05, 9.969216677189303e+30, 9.969216677189303e+30], [9.5367431640625e-07, 151.0, 1.6927719116210938e-05, 1.613053142490458e+31, 1.613053142490458e+31], [1.1920928955078125e-06, 152.0, 1.71661376953125e-05, 2.6099748102093883e+31, 2.6099748102093883e+31], [9.5367431640625e-07, 153.0, 1.7881393432617188e-05, 4.223027952699846e+31, 4.223027952699846e+31], [9.5367431640625e-07, 154.0, 1.811981201171875e-05, 6.833002762909235e+31, 6.833002762909235e+31], [9.5367431640625e-07, 155.0, 1.7881393432617188e-05, 1.1056030715609081e+32, 1.1056030715609081e+32], [9.5367431640625e-07, 156.0, 1.811981201171875e-05, 1.7889033478518318e+32, 1.7889033478518318e+32], [1.1920928955078125e-06, 157.0, 1.811981201171875e-05, 2.89450641941274e+32, 2.89450641941274e+32], [9.5367431640625e-07, 158.0, 1.7881393432617188e-05, 4.683409767264571e+32, 4.683409767264571e+32], [9.5367431640625e-07, 159.0, 1.8835067749023438e-05, 7.577916186677312e+32, 7.577916186677312e+32], [9.5367431640625e-07, 160.0, 1.9073486328125e-05, 1.2261325953941882e+33, 1.2261325953941882e+33], [9.5367431640625e-07, 161.0, 1.9073486328125e-05, 1.9839242140619194e+33, 1.9839242140619194e+33], [1.1920928955078125e-06, 162.0, 1.8835067749023438e-05, 3.2100568094561075e+33, 3.2100568094561075e+33], [9.5367431640625e-07, 163.0, 2.002716064453125e-05, 5.193981023518027e+33, 5.193981023518027e+33], [9.5367431640625e-07, 164.0, 2.002716064453125e-05, 8.404037832974135e+33, 8.404037832974135e+33], [9.5367431640625e-07, 165.0, 2.002716064453125e-05, 1.3598018856492163e+34, 1.3598018856492163e+34], [9.5367431640625e-07, 166.0, 2.002716064453125e-05, 2.2002056689466297e+34, 2.2002056689466297e+34], [9.5367431640625e-07, 167.0, 
1.9073486328125e-05, 3.5600075545958458e+34, 3.5600075545958458e+34], [2.1457672119140625e-06, 168.0, 1.9073486328125e-05, 5.760213223542476e+34, 5.760213223542476e+34], [9.5367431640625e-07, 169.0, 2.002716064453125e-05, 9.32022077813832e+34, 9.32022077813832e+34], [2.1457672119140625e-06, 170.0, 2.002716064453125e-05, 1.5080434001680798e+35, 1.5080434001680798e+35], [1.9073486328125e-06, 171.0, 2.09808349609375e-05, 2.440065477981912e+35, 2.440065477981912e+35], [1.9073486328125e-06, 172.0, 2.002716064453125e-05, 3.948108878149992e+35, 3.948108878149992e+35], [2.1457672119140625e-06, 173.0, 2.002716064453125e-05, 6.3881743561319036e+35, 6.3881743561319036e+35], [1.9073486328125e-06, 174.0, 2.09808349609375e-05, 1.0336283234281895e+36, 1.0336283234281895e+36], [2.1457672119140625e-06, 175.0, 2.002716064453125e-05, 1.67244575904138e+36, 1.67244575904138e+36], [1.9073486328125e-06, 176.0, 2.09808349609375e-05, 2.7060740824695694e+36, 2.7060740824695694e+36], [2.1457672119140625e-06, 177.0, 2.09808349609375e-05, 4.3785198415109494e+36, 4.3785198415109494e+36], [1.9073486328125e-06, 178.0, 2.193450927734375e-05, 7.084593923980518e+36, 7.084593923980518e+36], [1.9073486328125e-06, 179.0, 2.09808349609375e-05, 1.1463113765491467e+37, 1.1463113765491467e+37], [2.1457672119140625e-06, 180.0, 2.193450927734375e-05, 1.8547707689471987e+37, 1.8547707689471987e+37], [1.9073486328125e-06, 181.0, 2.193450927734375e-05, 3.0010821454963454e+37, 3.0010821454963454e+37], [2.1457672119140625e-06, 182.0, 2.193450927734375e-05, 4.855852914443544e+37, 4.855852914443544e+37], [1.9073486328125e-06, 183.0, 2.2172927856445312e-05, 7.856935059939889e+37, 7.856935059939889e+37], [1.9073486328125e-06, 184.0, 2.193450927734375e-05, 1.2712787974383434e+38, 1.2712787974383434e+38], [2.1457672119140625e-06, 185.0, 2.2172927856445312e-05, 2.0569723034323324e+38, 2.0569723034323324e+38], [1.9073486328125e-06, 186.0, 2.288818359375e-05, 3.3282511008706755e+38, 3.3282511008706755e+38], 
[1.9073486328125e-06, 187.0, 2.3126602172851562e-05, 5.385223404303008e+38, 5.385223404303008e+38], [2.1457672119140625e-06, 188.0, 2.288818359375e-05, 8.713474505173684e+38, 8.713474505173684e+38], [9.5367431640625e-07, 189.0, 2.2172927856445312e-05, 1.409869790947669e+39, 1.409869790947669e+39], [9.5367431640625e-07, 190.0, 2.288818359375e-05, 2.2812172414650375e+39, 2.2812172414650375e+39], [9.5367431640625e-07, 191.0, 2.4080276489257812e-05, 3.691087032412707e+39, 3.691087032412707e+39], [9.5367431640625e-07, 192.0, 2.4080276489257812e-05, 5.972304273877745e+39, 5.972304273877745e+39], [1.1920928955078125e-06, 193.0, 2.4080276489257812e-05, 9.66339130629045e+39, 9.66339130629045e+39], [9.5367431640625e-07, 194.0, 2.288818359375e-05, 1.5635695580168194e+40, 1.5635695580168194e+40], [9.5367431640625e-07, 195.0, 2.4080276489257812e-05, 2.5299086886458645e+40, 2.5299086886458645e+40], [9.5367431640625e-07, 196.0, 2.4080276489257812e-05, 4.093478246662684e+40, 4.093478246662684e+40], [9.5367431640625e-07, 197.0, 2.384185791015625e-05, 6.623386935308548e+40, 6.623386935308548e+40], [1.1920928955078125e-06, 198.0, 2.384185791015625e-05, 1.0716865181971233e+41, 1.0716865181971233e+41], [9.5367431640625e-07, 199.0, 2.384185791015625e-05, 1.734025211727978e+41, 1.734025211727978e+41]]}, "id": "7264402903888", "toolbar": ["reset", "move"], "height": 480.0});
}
if(typeof(mpld3) !== "undefined"){
// already loaded: just create the figure
create_fig72644029038885532319697();
}else if(typeof define === "function" && define.amd){
// require.js is available: use it to load d3/mpld3
require.config({paths: {d3: "/d3"}});
require(["d3"], function(d3){
window.d3 = d3;
mpld3_load_lib("/mpld3.js", create_fig72644029038885532319697);
});
}else{
// require.js not available: dynamically load d3 & mpld3
mpld3_load_lib("/d3.js", function(){
mpld3_load_lib("/mpld3.js", create_fig72644029038885532319697);})
}
</script></body></html>
<p><em>Figure 3. Run times for calculating a range of n (0-200) using the iterative (blue) and memoized recursive (red) approaches. Figure made with</em> <a class="reference external" href="http://mpld3.github.io/">mpld3</a>.</p>
<p>Whoa! As <a class="reference external" href="http://maryrosecook.com/">Mary Rose Cook</a> would say, "Now we're cookin' with gas." Run times for the memoized recursive method now outpace the iterative method! So, by memoizing your recursive function you can make your code run REALLY fast! The downside, however, is that you use extra memory by storing results in the hash table.</p>
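<p>For concreteness, here is a minimal sketch of the two approaches being timed, using the Fibonacci sequence as the example computation (function and variable names here are illustrative, not the notebook's actual code):</p>

```python
import time

def fib_iterative(n):
    """nth Fibonacci number, computed with a simple loop."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

_cache = {}  # the hash table that stores already-computed values

def fib_memoized(n):
    """nth Fibonacci number, computed recursively with memoization."""
    if n not in _cache:
        _cache[n] = n if n < 2 else fib_memoized(n - 1) + fib_memoized(n - 2)
    return _cache[n]

# Time both approaches for a single n
for fib in (fib_iterative, fib_memoized):
    start = time.time()
    result = fib(199)
    print(fib.__name__, time.time() - start, result)
```

<p>Each subproblem is computed once and then looked up, so the recursion does O(n) work instead of the exponential blow-up of the naive recursive version; the price is the memory held by <tt>_cache</tt>.</p>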
<p>Hope you liked my blog! 'Till next time!</p>
</div>
Projecting GPS velocity vectors onto a profile2014-03-19T14:56:00-04:00Gina Schmalzletag:geodesygina.com,2014-03-19:VectorProj.html<div class="section" id="global-positioning-systems">
<h2><strong>Global Positioning Systems</strong></h2>
<p><strong>Global Positioning Systems</strong> (<strong>GPS</strong>) are used to measure the three dimensional position of a point over time. High precision GPS is used to measure tectonic plate motion by tracking the position of a permanently installed geodetic monument over time. The GPS instruments are either permanently installed over the monument, continuously recording its position, or the GPS monuments are periodically measured. With either method, three dimensional position estimates are made over time.</p>
<p>This is an image of a high precision GPS antenna, whose image I took from the <a class="reference external" href="http://www.unavco.org/projects/major-projects/pbo/pbo.html">UNAVCO website</a>:</p>
<img alt="UNAVCO GPS antenna" class="align-right" src="/images/gps_site.jpg" style="width: 400.0px; height: 200.0px;" />
<p>And this is an example of the GPS site BEMT position 3D time series taken from UNAVCO:</p>
<img alt="UNAVCO GPS antenna" class="align-right" src="http://cws.unavco.org:8080/cws/modules/GPStimeseriesCA/versions/version2011may/BEMT_2011.png" style="width: 400.0px; height: 400.0px;" />
<p>Blue dots are daily position estimates in the north (top), east (middle) and vertical (bottom) components. This site experienced an offset due to an earthquake in 2010. For more information on how GPS works, please visit: <a class="reference external" href="http://www.unavco.org/edu_outreach/teachers/teachers.html">http://www.unavco.org/edu_outreach/teachers/teachers.html</a></p>
</div>
<div class="section" id="gps-velocities">
<h2><strong>GPS Velocities</strong></h2>
<p>The rate at which a geodetic monument moves can be estimated by taking the slope of the time series for each component. Take for example the time series shown above. This is going to be a rough estimate, but between 2004 and 2010 the monument's north component moved from -15 mm to +15 mm, totalling 30 mm of displacement over 6 yrs, indicating that it is moving north at 5 mm/yr.</p>
<img alt="North Component" class="align-right" src="/images/NGPS.png" style="width: 200.0px; height: 200.0px;" />
<p>Using the same logic, the east component moved about -50 mm over the same 6 years, giving the east component a rate of about 8.3 mm/yr to the west.</p>
<img alt="East Component" class="align-right" src="/images/EGPS.png" style="width: 200.0px; height: 200.0px;" />
<p>The magnitude and direction of the horizontal GPS velocity vector can now be calculated:</p>
<img alt="Horizontal Components" class="align-right" src="/images/ENGPS.png" style="width: 400.0px; height: 400.0px;" />
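<p>The back-of-the-envelope arithmetic above is easy to check in a few lines of Python (the displacements are the rough values read off the BEMT time series, so treat this as a sketch):</p>

```python
import math

# Rough displacements read off the BEMT time series (mm over ~6 yr)
v_north = 30.0 / 6.0   # +30 mm north over 6 yr  ->  5.0 mm/yr
v_east = -50.0 / 6.0   # -50 mm east over 6 yr   -> ~8.3 mm/yr to the west

magnitude = math.hypot(v_east, v_north)              # mm/yr
azimuth = math.degrees(math.atan2(v_east, v_north))  # degrees clockwise from north

print("magnitude: %.1f mm/yr, azimuth: %.1f deg" % (magnitude, azimuth))
```

<p>With these numbers the horizontal velocity comes out to roughly 9.7 mm/yr toward the northwest (a negative azimuth means west of north).</p>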
</div>
<div class="section" id="vector-projection">
<h2><strong>Vector Projection</strong></h2>
<p>In map view small variations in GPS velocities may be difficult to see, hence it is sometimes useful to plot GPS velocities along a profile. The profile line can follow a fault line, and, if it does, one can calculate the fault parallel and perpendicular components of motion. Fault parallel motion will give you an idea of lateral motion across the fault (as in strike-slip fault systems), and fault perpendicular motion will tell you if the two sides of the fault are separating or converging. In this section, we will talk about deriving the profile parallel and perpendicular components of the GPS vectors using <strong>vector projection</strong>.</p>
<img alt="Horizontal Components" class="align-right" src="/images/vector_projection.png" style="width: 400.0px; height: 200.0px;" />
<p>Here the fault perpendicular velocity is R<sub>perp</sub> = R sin(t),
and the fault parallel velocity is R<sub>par</sub> = R cos(t), where t is the angle between the velocity vector R and the profile line.</p>
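<p>A sketch of that projection in Python (the function name is illustrative; t is the angle between the GPS velocity vector and the profile line):</p>

```python
import math

def project_onto_profile(r, t_deg):
    """Split a velocity of magnitude r into profile-parallel and
    profile-perpendicular components, where t_deg is the angle (degrees)
    between the velocity vector and the profile line."""
    t = math.radians(t_deg)
    return r * math.cos(t), r * math.sin(t)  # (R par, R perp)

# Example: a 10 mm/yr velocity oriented 30 degrees from the profile
r_par, r_perp = project_onto_profile(10.0, 30.0)
print("parallel: %.2f mm/yr, perpendicular: %.2f mm/yr" % (r_par, r_perp))
```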
</div>
<div class="section" id="the-vector-projector">
<h2><strong>The Vector Projector</strong></h2>
<p><strong>Stuart Sandine</strong>, <strong>Andrea Fey</strong>, <strong>Thomas Ballinger</strong> and I created a web app called <strong>The Vector Projector</strong> that calculates the magnitude and the transect parallel and transect perpendicular components of GPS velocities along a profile. In this app, you can choose from several GPS velocity fields, all calculated with respect to stable North America. You can set your profile width and filter the data by their uncertainties (i.e., data with uncertainties larger than the value you specify are not used). This beta version does not plot uncertainties, which we plan to change in the future. Give it a try! <a class="reference external" href="http://geodesygina.com/vectorprojector/vectorprojector.html">Go to the Vector Projector</a>.</p>
</div>
About Me2014-03-13T12:40:00-04:00Gina Schmalzletag:geodesygina.com,2014-03-13:MyBio.html<img alt="Gina Schmalzle" class="align-right" src="/images/Gina.jpg" style="width: 200.0px; height: 200.0px;" />
<div class="section" id="hello-i-m-gina">
<h2>Hello! I'm Gina.</h2>
<p>I am a geodesist who received her PhD from the <a class="reference external" href="http://www.rsmas.miami.edu/">University of Miami Rosenstiel School of Marine and Atmospheric Sciences</a> in Miami, FL. Since then I have been a postdoctoral scholar and research scientist at the <a class="reference external" href="http://www.washington.edu/">University of Washington</a> in Seattle, WA, and, along with Scott Baker and Batuhan Osmanoglu, started a geodetic services company called <a class="reference external" href="http://bostechnologies.com/">BOS Technologies LLC</a>. We specialize in high precision Global Positioning Systems (GPS) and Interferometric Synthetic Aperture Radar (InSAR). I am currently an Assistant Scientist at the <a class="reference external" href="http://www.miami.edu/">University of Miami</a> working remotely from Seattle, WA. I have an extensive background studying the tectonics of the west coast of the United States. I started my geophysics career studying the San Andreas fault, and I am currently studying the <a class="reference external" href="http://geodesygina.com/Cascadia.html">Cascadia Subduction Zone</a> and Southern California.</p>
<p>I love data. I love visualizing and analyzing it, and just getting my hands dirty with it. Recently I have been working on developing interactive websites for geophysical datasets. Many geophysical datasets are publicly available, but it is difficult for the general public to use and play with these data. Presented on this website are two interactive websites that I built with a little help from my friends at <a class="reference external" href="http://www.hackerschool.com">Hacker School, NYC</a>. The <a class="reference external" href="http://geodesygina.com/vectorprojector/vectorprojector.html">Vector Projector</a> is a nifty little tool we made that visualizes high precision GPS velocity fields that measure how much tectonic plates move over time. The <a class="reference external" href="http://geodesygina.com/JapanEarthquake/index.html">Japan Earthquake Movie</a> is an animation showing the locations of the main shock and aftershocks of the 2011 Japan earthquake alongside synchronized charts of magnitude versus time.</p>
<p>Thanks for reading my website! I'd love to hear your thoughts, so <a class="reference external" href="mailto:ginaschmalzle@gmail.com">keep in touch</a>!</p>
</div>
Setting up Custom Domain Names with Github Pages2014-03-11T13:40:00-04:00Gina Schmalzletag:geodesygina.com,2014-03-11:GeodesyGina.html<p>Hello World!</p>
<p>This is my first blog post generated by pelican and hosted by <strong>github</strong> with my snazzy new domain name <strong>geodesygina.com</strong>! That is pronounced:
GEE-ODD-ESS-Y-GEE-NA. Ha! I'm such a geek.</p>
<p>Anyway, thanks to Amy Hanlon who posted directions on how to set up a blog with <strong>Pelican</strong> at <a class="reference external" href="http://mathamy.com/">http://mathamy.com/</a>.</p>
<p>I purchased my <strong>domain name</strong> (<strong>geodesygina.com</strong>) with <strong>godaddy.com</strong>, for $12.19 for 2 years. Getting the <strong>domain name</strong> to point to <strong>github</strong> was a little tricky. <strong>Github</strong> posted a "Setting up a custom domain with Pages" site (<a class="reference external" href="https://help.github.com/articles/setting-up-a-custom-domain-with-pages">https://help.github.com/articles/setting-up-a-custom-domain-with-pages</a>) that goes through how to set up your <strong>domain name</strong> in your repository. This part is pretty clear until you need to set up your DNS. I can only speak for my experience at godaddy.com, but this is how you set up your DNS through them:</p>
<ol class="arabic simple">
<li>Log into your godaddy account.</li>
<li>Go to the My Account page and click on the green launch button to the right of where it says domains.</li>
<li>On the Domains page, click on the domain name. On the page it brings you to, click the tab that says DNS Zone File, then click Edit.</li>
<li>Go to the A(host) section. There should be one record with @ as the host. Github uses two different IP addresses, so you need two records. Edit the current one: keep @ as the host and replace the IP address with 192.30.252.153. Once that is in, click "Quick add" right underneath and, for the new record, enter @ as the host again and the IP address 192.30.252.154.</li>
</ol>
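<p>After step 4, the A(host) section of the zone file should end up with two records, one per Github IP address (sketched below in zone-file style; the exact layout of godaddy's editor may differ):</p>

```
@   A   192.30.252.153
@   A   192.30.252.154
```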
<p>That should be it! It may take a little time for your website to update.</p>
My Publications2014-03-10T02:45:00-04:00Gina Schmalzletag:geodesygina.com,2014-03-10:MyPubs.html<p><strong>Schmalzle, G. M.</strong>, McCaffrey, R., Creager, K., <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1002/2013GC005172/abstract">Central Cascadia Subduction Zone Creep</a>, Geochemistry, Geophysics, Geosystems, doi: 10.1002/2013GC005172, 2014.</p>
<p>Karimzadeh, S., Cakir, Z., Osmanoglu, B., <strong>Schmalzle, G. M.</strong>, Miyajima, M., Amiraslanzadeh, R., and Djamour, Y., <a class="reference external" href="https://www.researchgate.net/publication/235926502_Interseismic_strain_accumulation_across_the_North_Tabriz_Fault_(NW_Iran)_deduced_from_InSAR_time_series">Interseismic strain accumulation across the North Tabriz Fault (NW Iran) deduced from InSAR time series</a>, Journal of Geodynamics, 66, doi: 10.1016/j.jog.2013.02.003, 2013.</p>
<p>Gourmelen, N., Dixon, T.H., Amelung, F., <strong>Schmalzle, G. M.</strong>, <a class="reference external" href="http://www.sciencedirect.com/science/article/pii/S0012821X10007119">Acceleration and Evolution of Faults: An Example from the Hunter Mountain-Panamint Valley Fault Zone, Eastern California</a>, Earth and Planetary Science Letters, 10.1016/j.epsl.2010.11.016, 2010.</p>
<p>Fulton, P., <strong>Schmalzle, G. M.</strong>, Harris, R., Dixon, T. H., <a class="reference external" href="http://www.sciencedirect.com/science/article/pii/S0012821X10006679">Reconciling patterns of interseismic strain accumulation with thermal observations across the Carrizo segment of the San Andreas Fault</a>, Earth and Planetary Science Letters, 10.1016/j.epsl.2010.10.024, 2010.</p>
<p><strong>Schmalzle, G. M.</strong>, <a class="reference external" href="http://scholarlyrepository.miami.edu/oa_dissertations/177/">The Earthquake Cycle of Strike-Slip Faults</a>, PhD Thesis, University of Miami, Miami, FL, 211 pp., 2008.</p>
<p>Biggs, J., Burgmann, R., Freymueller, J., Lu, Z., Parsons, B., Ryder, I., <strong>Schmalzle, G. M.</strong>, Wright, T., <a class="reference external" href="http://gji.oxfordjournals.org/content/176/2/353.abstract?sid=034a1429-fe9e-464a-8593-2616ee43ac4a">The postseismic response to the 2002 M7.9 Denali Fault earthquake: constraints from InSAR</a>, Geophysical Journal International, 175 (3), 10.1111/j.1365-246X.2008.03932.x, 2008.</p>
<p><strong>Schmalzle, G. M.</strong>, Dixon, T.H., Malservisi, R., Govers, R., <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/2005JB003843/full">Strain accumulation across the Carrizo segment of the San Andreas Fault, California: Impact of laterally varying crustal properties</a>, Journal of Geophysical Research, B, Solid Earth and Planets, 111, doi:10.1029/2005JB003843, 2006.</p>
<p>Sabburg, J., Kimlin, M. G., Rives, J. E., Meltzer, R. S., Taylor, T. E., <strong>Schmalzle, G. M.</strong>, Zheng, S., Huang, N., Wilson, A. R., Udelhofen, P. M., <a class="reference external" href="http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=892841">Comparisons of corrected daily integrated erythemal UVR from the U.S. EPA/UGA network of Brewer spectroradiometers with model and satellite data</a>, Proceedings of SPIE, 4482, doi:10.1117/12.452955, 2002.</p>
<p>Sabburg, J., Rives, J. E., Meltzer, R. S., Taylor, T. E., <strong>Schmalzle, G. M.</strong>, Zheng, S., Huang, N., Wilson, A. R., Udelhofen, P. M., <a class="reference external" href="http://onlinelibrary.wiley.com/doi/10.1029/2001JD001565/abstract">Comparisons of corrected daily integrated erythemal UVR data from the U.S. EPA/UGA network of Brewer spectroradiometers with model and TOMS-inferred data</a>, Journal of Geophysical Research, 107, doi:10.1029/2001JD001565, 2002.</p>
<p>REFERENCE MANUALS</p>
<p><strong>Schmalzle, G. M.</strong> (2005), <a class="reference external" href="/papers/Survival_Guide_Schmalzle.pdf">Survival Guide for the Geodesy Lab</a>, edited, p. 30, University of Miami, Rosenstiel School of Marine and Atmospheric Sciences Miami, FL.</p>
<p>Thomas, T., and <strong>Schmalzle, G. M.</strong> (2000), <a class="reference external" href="http://www.esrl.noaa.gov/gmd/grad/neubrew/docs/uga/Site_Operator_Procedure34100.pdf">The Site Operator's Standard Operating Procedure for the Brewer Spectrophotometer</a>, edited, NOAA.</p>