Computer modeling of earth processes and Visual Basic for Applications.

Lecture Index: Introductory comments. / Stella software models. / VBA. / An example: the data density function. / An example - the dipping beds stratigraphic thickness calculator. / An example - calculation of the Z statistic for the Runs Test. / Exercise 11: Data modeling in VBA.


Reading:

  • Site that introduces you to the basics with two simple examples - http://office.microsoft.com/en-us/excel-help/create-custom-functions-in-excel-2007-HA010218996.aspx .
  • Chapt. 8 User-defined functions ­ Liengme, B. V., A guide to Microsoft Excel for Scientists and Engineers; Arnold Press, p. 86-108. Some of the language in this section may not work with newer versions of Excel, but it provides a good overview of the basic framework for developing functions.
  • Chapt. 9 Modeling 1 - Liengme, B. V., A guide to Microsoft Excel for Scientists and Engineers; Arnold Press, p. 108-121. Breeze through this to get a general idea for using Excel as a modeling environment.
  • Site with tutorials on developing Excel Macros in VBA - http://www.excel-vba.com/ .

  • Introductory comments

    Exercise goal: In this portion of the course we will look at how the ability to create custom functions in Excel, also known as UDFs - User Defined Functions. This allows you to develop customized Excel sheets for analysis or modeling purposes, a powerful tool to have in your toolbox.

    We will focus in on Visual Basic for Applications (VBA) because it has many advantages. It can be learned to some degree of sophistication in a few weeks time of intense effort, is powerful, and can be used to write "macros" and modules in many other programs, such as Excel and Access. Another major advantage is that wherever you have access to Excel you typically (not always) have access to this option. As you might guess, VBA shares a lot with Visual Basic, which is a programming language where you can create your own stand alone applications. Also, many other people will be able to easily use your products. C++ and Java are other powerful languages often used that you could invest your time learning about.

    Significance of computer models in geosciences: In addition to laboratory simulations, numerical simulations (and in their modern form, computer models), have given significant insight into earth processes. Two high profile examples these days are global climate models and models for mantle convection, but many others exist. These are of course beyond the scope of this course, but are accessible with time and effort. Many models/programs are available as freeware. One of particular interest and value to structural geology is the fault/fold software by Allmendinger, which models the folds that occur above faults in a tri-shear zone geometry - http://www.geo.cornell.edu/geology/faculty/RWA/programs.html . Dr. Allmendinger has contributed a great service to the geologic community by making it, and a suite of other tools available. Another notable softeare package is GPLATES, which models plate movements on the earth as a sphere (this one is sophisticated enough that there is a couple of hour learning curve before useful products can come out of it). Remember that we have already done some very simple modeling of earth processes and features in this course just using Excel. For example, we looked at a model for crustal thickness and roots as a function of isostatic equilibrium given some mountain topography. As always, understanding the geology behind the model is crucial.

    Modeling comes in many different forms, from data modeling to system modeling. Data modeling is where sample data is used to estimate some trait of the sampled system. You might think of a process or system model for an earth system (or any complex system) as being composed of a qualitative framework with quantitative links. The framework consists of the components of the system and includes what are important component properties and how the various components are related. It is a conceptual model. A simple example is the hydrologic cycle as depicted in many textbooks. Then using the language of mathematics, and principles of physics and chemistry, rules or equations can be written for the flux of water from one reservoir to another to provide for the possibility of more precise predictions and descriptions. Laws describing evaporation as function of temperature would, for example, govern how water moves from a surface reservoir into the atmosphere.

    One can confuse a quantitative model as being inherently more precise and accurate than a qualitative model. In reality what it does is provide more specific predictions that are much more useful in testing - it is more precise. It may or may not be more accurate, and it is useful to remember the difference. Most quantitative models have an underlying qualitative conceptual model underlying it. The saying garbage in, garbage out applies here.


    Stella software models

    The software package Stella is quite useful for modeling system behavior in a very accessible yet powerful way. You build your system model from the following components: a) reservoirs, b) transfer processes (inputs and outputs, c) variables that influence transfer processes and each other, d) links between these various elements, and e) rules that govern the inputs and outputs. With initial values input you can then watch graphically how the reservoirs and variables change with time. By the way, in linked systems like this non-linearity rules.

    Below is an example of a simple Stella diagram that models how a population that is dependent on a resource that is provided at a constant rate varies with time given some initial value. The boxes are reservoirs, the spigots are transfer processes (inputs and outputs) and the circle is a variable.

    Program objects in Stella for a resource-population dependency.

    Input and rules below

    initial population = 10,000. initial resource = 50,000 units
    userate = 1, (1 unit needed per individual)
    input = 10,000 (10,000 units added per unit time).
    use = population * userate
    birth rate = population * .1 * (resource/population)
    death rate = population * .09 *( population/resource)^2

    Output below

    Example of Stella modeling of mountain erosion.

    Link to Carleton College site with examples and background material on systems thinking and systems dynamics.


    VBA and UDFs, and programming environments

    Object oriented versus code/command oriented programming: Early programming languages were command/code oriented, where the program consisted of a list of commands in the programming language. You can think of a formula in an Excel cell as one line of code. The examples of VBA modules below are examples. In object oriented programming one selects and links icons that represent elements of the program. Typically, you set the 'properties' of that element. A scroll bar is a simple example of such icon and element. Can you think of an example of object oriented programming?? Stella!

    User Defined Functions - UDFs: In Excel a function takes variables, operates upon them, and then returns output. A whole suite of functions are provided, many or which you have used in the course already. You can also create custom functions, also known as UDFs, and that is the focus of this weeks exercise.

    How do you get to VBA in Excel? In Excel 10 alt F11 will open up the VBA window. Chose module from the Insert tab, and the window that you will insert your code for your UDF in will appear. You can also add the Developer option to the ribbon of options up top if it is not already there. To do this under File, choose Options. Then select the Customize Ribbon option and under Main Tabs to the right check the box for Developer in the drop-down menu. In Excel 2007 the path is - Tools/Macro/Visual Basic editor/Insert/Module. You can switch back and forth between Visual Basic and Excel. Again, for different versions of Excel the Visual Basic macro editor is in different places. You may have to hunt for it. You may also have to add in the Visual Basic macro editor (like you may have had to do with the Analysis Toolpak). Remember, as always to ask questions.

    Visual Basic Application Language: Below is a short list of some of the most critical language components of a VBA module that creates a UDF (your reading does a nice job with elaborating on these).

    One of the best ways to learn this language is to review examples. Three are provided below. Most of these were written for Excel 2003, and the language for passing arrays has evolved. The tutorials below can help bring you up to date.


    An example: the data density function

    The basic idea is to have a function that given an x, y array of position values and an x, y position returns the number of data points in the array that are at a distance R or less from the point. This function could be used to model the distribution and specifically the density of some point feature in space.

    Example of data density function: The lines ingreen are comments and provide internal documentation for the function program. It is crucial that you document your programs very thoroughly! These documentation lines are identified in the VBA code window by a single quotation mark in the beginning. The program will automatically turn that text into green to distinguish it easily from code lines.

    Screen shot of VBA UDF code in the Module window. This is for the Mac version and the PC version may look a bit different.

    Link to a PowerPoint with more explanation for this example.

    Using this function you could then map in a column-row grid the data density of x, y data, i.e. of point location data. That in turn can be contoured in Surfer.


    An example - the dipping beds stratigraphic thickness calculator

    Below is another example of a function I've used in work, using the GPS position data taken in the field along a uniformly dipping stratigraphic thickness. The question is if you have a 3-D position for two points, and the strike and dip of layers is known and constant, what is the bedding perpendicular thickness ( the stratigraphic thickness) between those two points?


    An example - calculation of the Z statistic for the Runs Test.

    This statistical test is described in Swan and Sandilands. It defines a run as a sequence of numbers where the numbers consistently increase or decrease. A sequence of numbers then has within it a certain number of runs. The Z statistic simply measures how far a given sequence departs from the number of runs expected in a random sequence. A higher negative number means there are fewer runs than would be expected. A higher positive number means that there are more runs than would be expected. This can be useful for looking to see if there is a robust trend in your data, or if there might be some oscillatory behavior. There are shorter ways to code this program.


    Exercise 11: Data modeling in VBA

    Part A: The objective is to write a function for Excel that helps model some geoscience phenomenon. This will introduce you to the instruction-language (code) part of Visual Basic, which is very much the original Basic I learned some 30 years ago. The idea is to pass variables or an array of numbers of some geologic significance to a function of your devising and then have the function return a value or array that models some aspect of a physical phenomena. Some specific suggestions are provided below for you to work on. This exercise requires that you go through your reading carefully. You should make your function user friendly, so that it is clear what number should be put where, what the units are, and what the output is.

    What to model:

    Geophysics text books are full of models and a good place to peruse for one. One whole suite of possibilities is to model gravity anomaly values over a body of specified, shape, size and position. Other places to look for something to model is in geochemistry text books. Some of the exponential and power law functions we discussed in the previous week could also be used.

    Example: modeling the dispersion associated with a random walk. Think of the question, can dispersion occur by a random walk mechanism? Write a function, that given the process of incrementally moving a unit distance, but in a random direction each time, for x increments, that yields the straight line distance from the starting point to the final point for that particular random walk. Then you can plot these results versus x (or time) to see if, on average, dispersion occurs. This one is fun, and teaches you a lot about random walks.

    Example: modeling gravity profiles over a buried sphere. gz in milligals = .02794 * density contrast in gm/cc * radius in meters cubed * depth in meters all divided by (x^2+z^2)^1.5, where x is horizontal distance from center of sphere in meters (formula from Lillie, 1999, Whole Earth Geophysics, Prentice Hall)

    Example: modeling shortening for a mountain belt assuming isostatic equilibrium and ignoring erosion. You can start with the model or isostasy we explored earlier. Input to the module would be an array of column elevations for a cross section across a mountain belt, a standard column width, the thickness of continental crust at sea level (representing the original thickness), and the output would be the amount of shortening.

    Other examples will be provided in class.

    Create the function and employ it on some real data in order to test it. You should assume that someone else may use your sheet and function, and thus you should make it user friendly (add documentation). Consider also the appropriate units. The Visual Basic Editor is associated with the Tools/Macro window for most versions of Excel.

    You should hand in the following:


    Copyright by Harmon D. Maher Jr.. This material may be used for non-profit educational purposes if proper attribution is given. Otherwise please contact Harmon D. Maher Jr.