Computer modeling of earth processes and Visual Basic for Applications.

** Lecture Index:
Introductory comments. / Stella
software models. / VBA. / An example: the data density function.
/ An example - the dipping beds stratigraphic
thickness calculator. / An example
- calculation of the Z statistic for the Runs Test**.

**Exercise goal**: In this portion of the course we will look at how the ability to create custom functions in Excel, also known as UDFs - User Defined Functions. This allows you to develop customized Excel sheets for analysis or modeling purposes, a powerful tool to have in your toolbox.

We will focus in on **Visual Basic for Applications** **(VBA)** because it has many advantages. It can be learned
to some degree of sophistication in a few weeks time of intense
effort, is powerful, and can be used to write "macros"
and modules in many other programs, such as Excel and Access.
Another major advantage is that wherever you have access to Excel
you typically (not always) have access to this option. As you might guess, VBA shares
a lot with Visual Basic, which is a programming language
where you can create your own stand alone applications. Also,
many other people will be able to easily use your products. **C++** and Java are other powerful languages often used that you could invest
your time learning about.

**Significance of computer models in geosciences**: In addition to laboratory simulations, numerical
simulations (and in their modern form, computer models), have given
significant insight into earth processes. Two high profile examples these
days are **global climate models** and **models
for mantle convection**, but many others exist. These are of course beyond the scope of this course, but are accessible with time and effort. Many models/programs are available as freeware. One of particular interest and value to structural geology is the fault/fold software by Allmendinger, which models the folds that occur above faults in a tri-shear zone geometry - http://www.geo.cornell.edu/geology/faculty/RWA/programs.html . Dr. Allmendinger has contributed a great service to the geologic community by making it, and a suite of other tools available. Another notable softeare package is **GPLATES**, which models plate movements on the earth as a sphere (this one is sophisticated enough that there is a couple of hour learning curve before useful products can come out of it). Remember that we have
already done some very simple modeling of earth processes and
features in this course just using Excel. For example, we looked at a model for
crustal thickness and roots as a function of isostatic equilibrium
given some mountain topography. As always, understanding the geology behind the model is crucial.

Modeling comes in many different forms, from **data
modeling** to **system modeling**. Data modeling is where
sample data is used to estimate some trait of the sampled system. You might think of a process or __system model__
for an earth system (or any complex system) as being composed
of a __qualitative framework with quantitative links__. The
framework consists of the components of the system and includes
what are important component properties and how the various components
are related. It is a conceptual model. A simple example is the
hydrologic cycle as depicted in many textbooks. Then using the
language of mathematics, and principles of physics and chemistry,
rules or equations can be written for the flux of water from one
reservoir to another to provide for the possibility of more precise
predictions and descriptions. Laws describing evaporation as function
of temperature would, for example, govern how water moves from
a surface reservoir into the atmosphere.

One can confuse a __quantitative model__
as being inherently __more precise and accurate__ than a qualitative
model. In reality what it does is provide more specific predictions
that are much more useful in testing - __it is more precise__.
__It may or may not be more accurate__, and it is useful to remember the difference. Most quantitative models have an underlying qualitative conceptual model underlying it. The saying garbage in, garbage out applies here.

The software package Stella is quite useful
for modeling system behavior in a very accessible yet powerful
way. You build your system model from the following components:
a) **reservoirs**, b) **transfer processes** (inputs and
outputs, c) **variables** that influence transfer processes
and each other, d) **links** between these various elements,
and e) **rules** that govern the inputs and outputs. With initial
values input you can then watch graphically how the reservoirs
and variables change with time. By the way, in linked systems
like this non-linearity rules.

Below is an example of a simple Stella diagram that models how a population that is dependent on a resource that is provided at a constant rate varies with time given some initial value. The boxes are reservoirs, the spigots are transfer processes (inputs and outputs) and the circle is a variable.

userate = 1, (1 unit needed per individual)

input = 10,000 (10,000 units added per unit time).

use = population * userate

birth rate = population * .1 * (resource/population)

death rate = population * .09 *( population/resource)^2

Example of Stella modeling of mountain erosion.

Link to Carleton College site with examples and background material on systems thinking and systems dynamics.

**Object oriented versus code/command oriented
programming:** Early programming languages
were command/code oriented, where the program consisted of a list
of commands in the programming language. You can think of a formula
in an Excel cell as one line of code. The examples of VBA modules
below are examples. In object oriented programming one selects
and links icons that represent elements of the program. Typically,
you set the 'properties' of that element. A scroll bar is a simple
example of such icon and element. Can you think of an example
of object oriented programming?? Stella!

**User Defined Functions - UDFs**: In Excel a function takes variables, operates upon them, and then returns output. A whole suite of functions are provided, many or which you have used in the course already. You can also create custom functions, also known as UDFs, and that is the focus of this weeks exercise.

**How do you get to VBA in Excel?** In Excel 10 **alt F11** will open up the VBA window. Chose module from the Insert tab, and the window that you will insert your code for your UDF in will appear. You can also add the **Developer** option to the ribbon of options up top if it is not already there. To do this under **File**, choose **Options**. Then select the** Customize Ribbon** option and under **Main Tabs** to the right check the box for Developer in the drop-down menu. In Excel 2007 the path is - Tools/Macro/Visual Basic editor/Insert/Module. You can switch
back and forth between Visual Basic and Excel. __Again, for different
versions of Excel the Visual Basic macro editor is in different
places.__ You may have to hunt for it. You may also have to
add in the Visual Basic macro editor (like you may have had to
do with the Analysis Toolpak). Remember, as always to ask questions.

**Visual Basic Application Language**: Below is a short list of some of the most critical
language components of a VBA module that creates a UDF (your reading does a nice job with elaborating on
these).

**Function Name(variables)**- this is a required part that names the functions and defines a list of variables to be passed from the Excel sheet into the VBA function. Special language is required if you are passing an array (a suite of cells). See the examples below.**Visual Basic functions**- (these include the normal gamut of possibilities, e.g. Sin, Cos, Tan, Log).**Dim**- these declare a variable as a certain type, e.g. integer, or floating point variable (double), etc.. VBA has a default type if the variable type is not specified, but sometimes it is wise to define your variables.**If-Then-Else**and**End If**- this allows conditional operations and is very useful for sorting through data.**For i = 1 to 20 .... Next i**- this structure is used a lot in programming in order to move through and manipulate number arrays.**FunctionName =**- at some point before the end you have to tell the module what value to return from the function by setting the function name equal to that value.**End Function**- just like it sounds. This form of the End statement is unique to VBA, but all modules and programs need a terminate instruction of some type. The newer versions of VBA add this automatically as soon as you insert the Function langauge in your code line.

One of the best ways to learn this language is to review examples. Three are provided below. Most of these were written for Excel 2003, and the language for passing arrays has evolved. The tutorials below can help bring you up to date.

- Good intro to VBA Function for Excel.
- On-line description of how to create a Excel Add-in.
- On-line tutorial on working with arrays in VBA functions.
- On-line tutorial that shows another way an array can be worked with.

The basic idea is to have a function that given an x, y array of position values and an x, y position returns the number of data points in the array that are at a distance R or less from the point. This function could be used to model the distribution and specifically the density of some point feature in space.

**Example of data density function**:
The lines ingreen are comments and provide internal documentation
for the function program. __It is crucial that you document your
programs very thoroughly!__ These documentation lines are identified in the VBA code window by a single quotation mark in the beginning. The program will automatically turn that text into green to distinguish it easily from code lines.

*Screen shot of VBA UDF code in the Module window. This is for the Mac version and the PC version may look a bit different. *

**Link to a PowerPoint with more explanation for this example****. **

Using this function you could then map in a column-row grid the data density of x, y data, i.e. of point location data. That in turn can be contoured in Surfer.

Below is another example of a function I've used in work, using the GPS position data taken in the field along a uniformly dipping stratigraphic thickness. The question is if you have a 3-D position for two points, and the strike and dip of layers is known and constant, what is the bedding perpendicular thickness ( the stratigraphic thickness) between those two points?

- Function stratthick(x1, y1, z1, x2, y2, z2, strike, dip)
*'Compute the stratigraphic thickness given x,y,z positions on the top and bottom of unit*- '
*and given strike and dip, which are assumed to be constant in the area.* *' Strike and dip need to be in Radians.*- k = Cos(strike) * ((x1 - x2) + (y1 - y2) * Tan(strike))
- '
*The above calculates strike-perpendicular distance between strike lines between the 2 points.* - k1 = (z1 - z2) / Tan(dip)
- k2 = k + k1
*' The above calculates the vertical distance between two dip lines passing through 2 points.*- t = k2 * Sin(dip)
- stratthick = t
*' The above calculates the perpendicular stratigraphic thickness and then assigns it to function name.*- End Function

This statistical test is described in Swan and Sandilands. It defines a run as a sequence of numbers where the numbers consistently increase or decrease. A sequence of numbers then has within it a certain number of runs. The Z statistic simply measures how far a given sequence departs from the number of runs expected in a random sequence. A higher negative number means there are fewer runs than would be expected. A higher positive number means that there are more runs than would be expected. This can be useful for looking to see if there is a robust trend in your data, or if there might be some oscillatory behavior. There are shorter ways to code this program.

- Function RT(datal)
*'Function to return Z statistic for a runs test**' U is the number of runs (the number of times switch from positive to negative).**'a value of 400 is an end of data list*- U = 1
*'need to count the first sequence as a run.* - n1 = 0
- n2 = 0
- Count = 0
- track = 0
*'track keeps track of whether the last change was + or -.* - If datal(1) < datal(2) Then n1 = 1 Else n2 = 1
- If datal(1) < datal(2) Then track = 5 Else track = 6
*'n1 is the number of time an increase exists.**'n2 is the number of times a decrease exists.*- For z = 1 To 1000
- If datal(z) = 400 Then Count = z - 1
- If datal(z) = 400 Then GoTo 200
- Next z
*' The above loop simply counts how many data points in the array.*- 200 '
*loop out* - For x = 2 To Count - 1
- If datal(x + 1) > datal(x) And track =
6 Then U = U + 1
*'if - to + then have a run.* - If datal(x + 1) < datal(x) And track =
5 Then U = U + 1
*'if + to - then have run.* - If datal(x + 1) > datal(x) Then n1 = n1 + 1
- If datal(x + 1) > datal(x) Then track
= 5
*'an increase gets counted and kept track of* - If datal(x + 1) < datal(x) Then n2 = n2 + 1
- If datal(x + 1) < datal(x) Then track
= 6
*'a decrease gets counted* - Next x
- EV = (((2 * n1 * n2) * ((2 * n1 * n2) - n1 - n2)) / (((n1 + n2) ^ 2) * (n1 + n2 - 1))) ^ 0.5
*'EV=expected variance*- 'UE=
*expected number of runs* - UE = 1 + ((2 * n1 * n2) / (n1 + n2))
- RT = (U - UE) / EV
*'runtest should be the Z statistic*- End Function

**Part A:** The objective
is to write a function for Excel that helps model some geoscience
phenomenon. This will introduce you to the instruction-language
(code) part of Visual Basic, which is very much the original Basic
I learned some 30 years ago. ** The idea is to pass variables
or an array of numbers of some geologic significance to a function
of your devising and then have the function return a value or
array that models some aspect of a physical phenomena**.
Some specific suggestions are provided below for you to work on.

**What to model:**

Geophysics text books are full of models and a good place to peruse for one. One whole suite of possibilities is to model gravity anomaly values over a body of specified, shape, size and position. Other places to look for something to model is in geochemistry text books. Some of the exponential and power law functions we discussed in the previous week could also be used.

**Example: modeling the dispersion associated
with a random walk. **Think of the question,
can dispersion occur by a random walk mechanism? Write a function,
that given the process of incrementally moving a unit distance,
but in a random direction each time, for x increments, that yields
the straight line distance from the starting point to the final
point for that particular random walk. Then you can plot these
results versus x (or time) to see if, on average, dispersion occurs.
This one is fun, and teaches you a lot about random walks.

**Example: modeling gravity profiles over
a buried sphere**. gz in milligals =
.02794 * density contrast in gm/cc * radius in meters cubed *
depth in meters all divided by (x^2+z^2)^1.5, where x is horizontal
distance from center of sphere in meters (formula from Lillie,
1999, Whole Earth Geophysics, Prentice Hall)

**Example: modeling shortening for a mountain
belt assuming isostatic equilibrium and ignoring erosion**. You can start with the model or isostasy we explored
earlier. Input to the module would be an array of column elevations
for a cross section across a mountain belt, a standard column
width, the thickness of continental crust at sea level (representing the original thickness), and the output would be
the amount of shortening.

Other examples will be provided in class.

Create the function and employ it on some real data in order to test it. You should assume that someone else may use your sheet and function, and thus you should make it user friendly (add documentation). Consider also the appropriate units. The Visual Basic Editor is associated with the Tools/Macro window for most versions of Excel.

You should hand in the following:

- a paragraph description of what you function is attempting to model.
- a copy of your literature source for the equations and system you are modeling (textbook(s), publications).
- a copy of the working Excel file, with the example application data.
- if at all appropriate a chart/plot of results (e.g. of a gravity profile over a buried sphere).

Copyright by Harmon D. Maher Jr.. This material may be used for non-profit educational purposes if proper attribution is given. Otherwise please contact Harmon D. Maher Jr.