Exercise 2: Analyzing sample populations.

The idea behind this exercise is to familiarize yourself with the basic statistical descriptors of a sample population, to explore population comparisons a little, and finally to think about what it all means. You should be able to use Excel to complete all of this exercise as demonstrated in class, or you are free to complete it with other software (e.g. SPSS) or by hand (long, tedious, and not recommended). Don't hesitate to come see me with questions. If you have been struggling with some aspect for more than half an hour, it is time to get help. The exercise can be done in half an hour, but may take up to two hours to complete. The most difficult step may be the very first: obtaining your data.

1) The first step is to obtain a data set. Various possibilities are described on the data set links page, or you can go looking for your own. What you will need are two data sets of the same type of data, each with 50 or more entries. The data sets should have some sort of geoscience significance. Many of the web databases are huge, and you will want to extract two subsets for comparison purposes. These could be data from two different locations for the same time period, or from two different time periods for the same area. Describe below the nature of your two data sets and give a reference for their source.

2) Look at the numbers in each of your populations.

What are the units? ________

What is the maximum value for population 1? ______

What is the maximum value for population 2? ______

What is the minimum value for population 1? _______

What is the minimum value for population 2? _______

What is n, the sample size for population 1? ________

What is n, the sample size for population 2? ________

4) Shift the starting point of the initial histogram interval by one half interval and construct a histogram for one of your two populations. In other words, if an interval ten units wide has boundaries at 10, 20, 30, ..., shift them so the boundaries are at 5, 15, 25, 35, ... Label the histogram clearly and attach it to this page.
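For those working outside Excel, the half-interval shift can be sketched in Python as below. This is only an illustration under stated assumptions: the function name histogram_counts and the sample values are hypothetical, and the bins are assumed to be half-open (a value on a boundary falls in the bin to its right).

```python
import math


def histogram_counts(values, start, width):
    """Count values in half-open bins [start + k*width, start + (k+1)*width).

    Returns a dict mapping each occupied bin's left boundary to its count.
    """
    counts = {}
    for v in values:
        k = math.floor((v - start) / width)  # index of the bin containing v
        left = start + k * width             # left boundary of that bin
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical values for illustration only (not real data).
sample = [12, 14, 18, 21, 25, 27, 33, 35, 38, 41]

original = histogram_counts(sample, start=10, width=10)  # boundaries 10, 20, 30, ...
shifted = histogram_counts(sample, start=5, width=10)    # boundaries 5, 15, 25, ...

print(original)
print(shifted)
```

The same ten values give different bar heights under the two sets of boundaries, which is the effect this step asks you to look at.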

5) Double the size of the histogram interval for the same population used in step 4, and construct a histogram. Label the histogram clearly and attach it to this page.
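If the starting point is held fixed, doubling the interval width amounts to adding together the counts of adjacent pairs of the narrower bins. A minimal Python sketch of that idea follows; the function name double_bin_width and the counts are hypothetical, chosen only for illustration.

```python
def double_bin_width(counts):
    """Merge adjacent pairs of equal-width bin counts into bins twice as wide."""
    merged = [counts[i] + counts[i + 1] for i in range(0, len(counts) - 1, 2)]
    if len(counts) % 2:
        # Odd number of narrow bins: the last one has no partner, so it
        # becomes a wide bin on its own (its upper half is empty).
        merged.append(counts[-1])
    return merged


# Hypothetical counts for bins 10-20, 20-30, 30-40, 40-50 (not real data).
narrow = [3, 3, 3, 1]
wide = double_bin_width(narrow)  # counts for bins 10-30 and 30-50

print(wide)
```

Wider bins smooth the histogram but hide detail, so it is worth comparing the two plots side by side.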

6) For each histogram plot, describe the form of the population distribution evident in the plot, using the terms presented in lecture and the readings. Also address whether one could use a Chi Square test to decide whether the population is likely to have come from a given expected distribution, and if not, why not. (2 bonus points for actually conducting the Chi Square test against the expected distribution.)
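For the bonus, the Chi Square statistic itself is simple to compute once you have observed and expected counts per histogram bin. The sketch below assumes the usual goodness-of-fit form, the sum over bins of (observed - expected)^2 / expected. The function names and the counts are hypothetical, and the minimum expected count of 5 per bin is a common rule of thumb rather than something stated in this handout.

```python
def chi_square_statistic(observed, expected):
    """Goodness-of-fit statistic: sum over bins of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts_ok(expected, minimum=5):
    """Rule-of-thumb check (an assumption here): every expected count >= minimum."""
    return all(e >= minimum for e in expected)


# Hypothetical counts per bin, for illustration only (not real data).
observed = [8, 12, 18, 12]
expected = [10, 15, 15, 10]

stat = chi_square_statistic(observed, expected)
usable = expected_counts_ok(expected)

# With 4 bins and no parameters estimated from the data there are 3 degrees
# of freedom; the tabulated 5% critical value for 3 degrees of freedom is 7.815.
print(stat, usable, stat > 7.815)
```

Compare the statistic with the tabulated critical value for your own number of bins, remembering that each parameter estimated from the data (such as a mean or standard deviation) removes one more degree of freedom.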

7) Fill out the table below:

                        population 1    population 2
median                  ________        ________
mean                    ________        ________
range                   ________        ________
variance                ________        ________
standard deviation      ________        ________
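If you are working outside Excel, the descriptors in the table can be computed with Python's standard statistics module, as in the sketch below. The function name describe and the numbers are hypothetical. Note the assumption that the sample (n - 1) forms of variance and standard deviation are wanted: statistics.variance and statistics.stdev use n - 1, while statistics.pvariance and statistics.pstdev divide by n.

```python
import statistics


def describe(values):
    """Return the basic descriptors of one sample population."""
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),        # sample variance (n - 1)
        "standard deviation": statistics.stdev(values),  # sample form (n - 1)
    }


# Hypothetical values for illustration only (not real data).
population_1 = [2, 4, 4, 4, 5, 5, 7, 9]

summary = describe(population_1)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Whichever tool you use, check which form of the variance it reports, since the sample and population versions differ noticeably for small n.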

8) In fewer than one hundred words, describe the geologic significance of all the above. In other words, what might it mean?