Monday, January 29, 2007

Ch. 2, Gotelli and Ellison

Ch. 2 discussed random variables and probability distributions. I found this chapter really helpful. In the last several stats classes I've taken, we've discussed and/or used various types of probability distributions (Poisson, log-normal, etc.), but I've never really understood "what" they are/what they're based upon, nor in what cases they should be used. I've marked the pages with tables 2.3 and 2.4 - they have good, concise summaries of when and how to use each distribution.

On a side note, I really like the way these guys write. It's very accessible and easy to understand, unlike other stats texts I've read (e.g., Sokal and Rohlf is a great example of an unreadable book!). They're pretty funny, too - I like the offhand comments like "In our continuing series of profiles of dead white men in wigs..." (p. 26). Keeps your attention.

The discussion of discrete vs. continuous variables reminded me of a paper I read for Phylogenetics class tonight, on the use of qualitative and quantitative variables in phylogenetic/cladistic analysis. Stevens (1991) argued that most traits used in cladistic analysis are quantitative/continuous, even some of those that were classed as discrete. Often, the classes/character states are determined arbitrarily by the analyst - and, no big surprise here, are defined in such a way as to fit their preconceptions of what the phylogenetic order should be. We like to think of science as objective, yet it's becoming more and more clear to me just how much our backgrounds and preconceptions influence our thinking, often without us even realizing (re: Journal Club today). Anyway, back to the topic of discrete variables - I think it's difficult to find discrete classes when measuring character states within a population, because there's so much individual variation. Even things seemingly quite discrete like color can vary - is that flower pale pink or white with a slight tinge? But when you're dealing with outcomes of trials, it is possible to have discrete variables - e.g., a coin toss can only be heads or tails (well, unless it lands on its edge, which is extremely unlikely - but hey, when you're dealing with Ecology, it seems there's an exception for every rule). Interesting stuff...

Wednesday, January 24, 2007

Independent Project ideas

I've been kicking a few ideas around for the independent project. Ideally, I was hoping to be able to analyze some of my own data. However, the data I collected last summer has already been analyzed as fully as I believe it could/should be. I'll be collecting more data over spring break, but I'm concerned that will be too late to begin the project.

Instead, I think I'll work with Tom's niche breadth and complementarity dataset, which I began working with last fall but which was put on hold while we waited for some samples from Tom's former advisor. The dataset includes information on foraging height, arthropod species consumed, habitat, maneuvers used, substrate, etc. for about a dozen species of rainforest-dwelling insectivorous birds in Costa Rica (mostly in or near La Selva). We're testing to see whether there is an inverse relationship between habitat and food specialization in this guild, resulting in niche differentiation. That is, we expect that species that consume similar diets will be separated spatially (by habitat, height, etc.) and vice versa. We intend to combine this with a phylogenetic analysis (perhaps phylogenetic independent contrasts? need to look into this further) to identify which differentiation (habitat vs. food) developed first - the prediction is that similar species will differentiate spatially fairly rapidly, while specialized food habits will take longer to evolve.

We plan to test this using Mantel tests, which can be run with an add-on package for R that you can download from CRAN. I haven't looked into this yet, but I had good success with the vegan package, which I used last summer, and I expect this will be similar.
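
To get my head around what a Mantel test actually does, here's a rough sketch of the permutation logic. It's in plain Python rather than R, it is not the CRAN package's code, and the function names (pearson, upper_triangle, mantel) and the tiny distance matrices are all made up for illustration. The idea: correlate the pairwise distances in two matrices, then shuffle the species labels of one matrix many times to see how often a correlation that strong turns up by chance. I've assumed a two-sided test here, since we could be looking for either a positive or an inverse relationship.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Distances above the diagonal, with rows and columns taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, permutations=999, seed=0):
    """Two-sided permutation Mantel test (illustrative sketch).

    d1 and d2 are square, symmetric distance matrices over the same species.
    Returns the observed correlation and a permutation p-value.
    """
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))

    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel species: rows and columns move together
        r_perm = pearson(x, upper_triangle(d2, order))
        if abs(r_perm) >= abs(r_obs) - 1e-12:
            hits += 1

    # Count the observed arrangement as one of the permutations.
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value


# Hypothetical distances among four species (not real data).
diet_dist = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
habitat_dist = [[0, 3, 2, 1], [3, 0, 1, 2], [2, 1, 0, 3], [1, 2, 3, 0]]

r, p = mantel(diet_dist, habitat_dist)
print("r =", r, "p =", p)
```

With only four species there are just 24 possible relabelings, so the p-value here means very little; a dozen species gives the test far more to work with.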

Monday, January 22, 2007

Introduction to Probability

The first chapter of Gotelli and Ellison discusses probability. Probability is defined as the expected frequency with which events occur. However, this itself is dependent upon the scale of the investigation - the more you understand about a process, the less "stochasticity" there is. They make an interesting point on p. 11 - random/stochastic events are, in reality, processes that we are unable or unwilling to measure. Under this view, everything can be measured if you look closely enough - yet this doesn't seem to agree with quantum physics, where the closer you get to a particle, the more random its behavior and very existence seem to be. Hmmm....

Complex events = sequences of simple events (in coding, = OR; probabilities additive, as long as the events are mutually exclusive)
Shared events = multiple simultaneous occurrences of simple events (= AND; probabilities multiplicative, as long as the events are independent)
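
A quick way to convince myself of the OR/AND rules is to work them out for dice and check against brute-force counting. This is just my own sketch (the fair-die setup and variable names are assumptions for illustration, not an example from the book):

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_face = Fraction(1, 6)  # each face of a fair die

# Complex event (OR): rolling a 1 or a 2 on one die.
# The two outcomes are mutually exclusive, so the probabilities add.
p_one_or_two = p_face + p_face

# Shared event (AND): rolling a 1 on each of two dice.
# The two dice are independent, so the probabilities multiply.
p_double_one = p_face * p_face

# Check both rules by counting outcomes in the full sample space.
count_or = sum(1 for f in faces if f in (1, 2))
count_and = sum(1 for a, b in product(faces, repeat=2) if a == 1 and b == 1)
enum_or = Fraction(count_or, 6)
enum_and = Fraction(count_and, 36)

print(p_one_or_two, enum_or)   # both 1/3
print(p_double_one, enum_and)  # both 1/36
```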

This chapter also discusses conditional probabilities, e.g. P(A|B), and Bayes' theorem, which provides a way of calculating conditional probabilities. Essentially, conditional probability and Bayesian analysis involve narrowing down all possible outcomes of an event based on prior knowledge. So, if there are 10 different possible outcomes of A, and 10 different outcomes of B, but only 3 outcomes of A that also include outcome X of B, then once you know X has occurred you're limited to those 3 possibilities for A. This seems to me a legitimate method of narrowing down possible outcomes - I don't fully understand the controversy re: Bayesian analysis (provided, of course, that the data you're introducing into the analysis are relevant and complete).
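
Here's the "narrowing down" idea as a small sketch with two dice. The events are my own invention for illustration (A = the dice sum to 8, B = the first die is even). It counts outcomes directly to get P(A|B), then checks that Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B), gives the same answer:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered rolls of two fair dice.
space = list(product(range(1, 7), repeat=2))


def in_a(roll):
    """Event A (hypothetical example): the two dice sum to 8."""
    return roll[0] + roll[1] == 8


def in_b(roll):
    """Event B (hypothetical example): the first die is even."""
    return roll[0] % 2 == 0


def prob(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(sum(1 for roll in space if event(roll)), len(space))


p_a = prob(in_a)
p_b = prob(in_b)
p_a_and_b = prob(lambda roll: in_a(roll) and in_b(roll))

# Conditional probability: restrict attention to the outcomes where B happened.
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Bayes' theorem recovers P(A|B) from P(B|A), P(A), and P(B).
bayes_a_given_b = p_b_given_a * p_a / p_b

print(p_a, p_a_given_b, bayes_a_given_b)  # 5/36 1/6 1/6
```

Knowing B trims the sample space from 36 rolls to 18, and 3 of those 18 sum to 8, which is where the 1/6 comes from.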

My First Post

Well, I've been reading blogs for several years now. During that time I've thought and talked about starting my own blog, even went so far as to get a domain space last summer (manakinescapades.com), but never found the time to actually post anything. Now, thanks to Mike Guill's Biostats and Experimental Design class, I'm actually going to begin regularly keeping a blog. For now I'll probably mostly post about readings for Biostats, dissertation ideas, Spanish assignments, and occasional random thoughts on (liberal) politics, New Orleans, and life in general.

I spent the last half an hour messing with the settings on this site, and still can't get the "oh-so-easy" Hello/Picasa picture-sharing software to work...maybe later. For now I'm off to class...later, a real post for biostats...

Ciao,
N