Wednesday, March 30, 2011

When math doesn't speak English

Okay, so maybe for you advanced people out there, this is just "plain as day"
but for me, this is pretty hard to interpret, and I bet other people also have to sit there when they see this kind of stuff and say "uhm, math, please, in ENGLISH?"

So what I want to say is (so that I never waste time googling this again)... that "upside-down A" means "for all"

So let's read this in English.
Line one: the function of y_1 and y_2 is greater than or equal to zero FOR ALL values of y_1 and y_2
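
In symbols, that reading would look something like the line below. This is only my reconstruction from the English version; the name f is a placeholder I made up, since I only describe it as "the function".

```latex
f(y_1, y_2) \ge 0 \quad \forall\, y_1, y_2
```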

... Line 2 later.. just realized I'm late to class. Egads.

Thursday, March 10, 2011

Kriging

Kriging is a special form of spatial interpolation. Really, it's a fancy and statistically specific way of taking data that you have in "point" form (like plot data) and putting it in "map" form (like an image). Fundamentally, I find it easiest to think about in terms of X, Y, and Z coordinates. In space, you normally describe elevation as "Z". But really that Z doesn't have to describe elevation; it could describe anything: number of trees, family income, biodiversity, etc. With Kriging, you know the value of this "Z" coordinate at only some places in space, but you know the values of X and Y (just your locations) everywhere. Using the distances between the points where Z is known, as well as the variance between those Z values, we can interpolate between them in space.

Now there's some freedom when it comes to how this interpolation is done, but I would say generally to be safe it's best to try a few different models. These models describe how the variance between points changes with the distance between them. Spherical models, for example, indicate that the variance between two points levels off at a maximum at some certain distance. Exponential models indicate that the variance keeps creeping up toward its maximum but never quite reaches it. Oscillating or "hole effect" models indicate that your pattern may be repeating itself, etc. I generally think spherical is a good "first guess" but that's me. Well, actually, that's also ArcGIS.
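
If you want to see what the spherical and exponential shapes look like, here is a rough MATLAB sketch using the textbook formulas. The names nugget, sill, and a (the range) and their values are my own made-up illustration, not anything ArcGIS reports, and the factor of 3 in the exponential is just one common convention.

```matlab
% Hypothetical parameters, chosen only for illustration
nugget = 0;    % variance at zero distance
sill   = 1;    % maximum variance
a      = 10;   % range: distance where the spherical model levels off
h      = 0:0.5:30;   % distances between pairs of points

% Spherical model: rises, then sits exactly at the sill for h >= a
hr = min(h, a) / a;
gammaSph = nugget + (sill - nugget) * (1.5*hr - 0.5*hr.^3);

% Exponential model: approaches the sill but never quite reaches it
% (the factor 3 makes "a" the distance where it is about 95% of the way)
gammaExp = nugget + (sill - nugget) * (1 - exp(-3*h/a));

plot(h, gammaSph, 'r-', h, gammaExp, 'b--')
legend('spherical', 'exponential')
xlabel('distance between points')
ylabel('variance')
```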

So you've got some data and you've joined it to your locations (see my joins and relates post, below). Now what you will want to do first-- and no one ever tells you this but it's REALLY important--
CONVERT YOUR POINTS TO A FEATURE FILE-- a point feature file.

Okay, now the key to doing this successfully is... whatever you are displaying in your locations (also a feature file, strangely enough), that is what is going to convert. So if you have 9 joins, make sure you are displaying the one you want. Now go to the Data Management Toolbox and select Feature to Point. Convert that feature to a point file named whatever you want.
Good! Okay! Now wait a long time, hope it doesn't crash.
Now, you've got some points that hold your data. Go to the Spatial Analyst and choose Kriging

Okay! So the key here is the second entry box. In the first, just choose your file. In the second, MAKE SURE YOU CHOOSE THE DATA YOU WANT TO KRIG. It defaults to the name of your locations (objectID or something). And Kriging can run slowly, so you don't want to discover the mistake afterwards!
It WILL sometimes switch back on you just to be evil. BE CAREFUL!!!
Now you can select your Krig shape if you want.... spherical I have chosen here.
Now it will run for a long time, and slowly.
Oh god! So ugly! That's because we need to extract it (sigh of relief).
Forgive me Kolmogorov, for I have BINNED.
Now we extract using a mask of something the size that we want. I can't remember if I've written a post on Mask Extraction. It's in the Spatial Analyst Toolbox. It works like a cookie cutter: you place the mask (the "cookie", an image of the size you want) on top of your badly sized image, and it cuts out a piece of exactly that size.


Okay! KRIGED!

Tuesday, March 8, 2011

Concatenation and Matlab-- what no one tells you

Now this is sort of a basic thing about Matlab, but until recently I hadn't really thought of it.

Concatenation.
Everyone needs to do it.

What is concatenation? Simply, it's when you stack two vectors on top of one another.
For example, if you have

2
3
4

and

5
6
7

the concatenated vector is

2
3
4
5
6
7

You might use this a lot in data processing, etc.

To put vectors together vertically it's pretty easy, you just use the format

newvector = [x; y]

The semicolon means put them together vertically.

To put vectors next to one another horizontally, you just say:

newvector = [x y]

No semicolon means they should sit next to one another. Note that for column vectors this will ONLY work if they are the same length. One of my favorite uses for horizontal "joins" (if you will) is actually to use this property to "test" if my vectors are the same length. If I have a list of indices to line up, this serves nicely.
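
Here is a small sketch of both kinds of concatenation, using the 2-3-4 and 5-6-7 vectors from above, plus the "same length" test. The variable names (stacked, sideBySide, z, sameLength) are just ones I picked for illustration.

```matlab
x = [2; 3; 4];
y = [5; 6; 7];

stacked    = [x; y];   % vertical: one 6-by-1 column (2 3 4 5 6 7)
sideBySide = [x y];    % horizontal: a 3-by-2 array

% The "same length" test: a horizontal join of column vectors
% with different lengths throws an error
z = [8; 9];            % hypothetical shorter vector
try
    bad = [x z];
    sameLength = true;
catch
    sameLength = false;
end
```

After running this, sameLength is false, because x has 3 rows and z has only 2.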

You can also do more complex things, such as joining rows in an array, like this:

newarray = [oldarray(1:3,:); oldarray(5:8,:); oldarray(10:245,:)]
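
To try the row-joining idea on something small, here is a sketch with a made-up 10-row array (my real one had 245 rows). It keeps rows 1 to 3, 5 to 8, and 10, dropping rows 4 and 9. The last line gets the same result by indexing with a list of row numbers, which I believe is the more usual way to write it.

```matlab
% Hypothetical 10-by-2 array, just for illustration
oldarray = [(1:10)', 10*(1:10)'];

% Join the row blocks vertically with semicolons
newarray = [oldarray(1:3,:); oldarray(5:8,:); oldarray(10,:)];

% Same result: concatenate the row indices instead of the rows
sameThing = oldarray([1:3 5:8 10], :);
```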

Okay, so not the most exciting post today, but trust me--- if you don't ever see this written by someone else, I think it's hard to figure out!

Sunday, March 6, 2011

MatLab grouping

So the other day I was making a class project in MatLab for a "group" and I dumbly called my file group.

BAD KS. Never name your files after a MatLab command. Little did I know I would need that command the next day.

What does "group" do?
It's great.

Recently I've been working on a carbon project where I've been asked to subset 10332 points (10077 of these points have consistent data) into two groups, one with about 1480 members and the other with the remainder. The smaller group is representative of a certain carbon pattern and the larger group representative of another pattern.

Well, the other day I needed to perform PCA on my subsets. Performing the PCAs separately would have defeated the point, because I wanted to know about whole-site characteristics, but I still wanted to be able to pick out the members of each group within the whole site.

In short, the output I wanted was:
Look closely and you'll see the red peeking out. That's the smaller set, and what's important is that it's peeking out in the left corner where the blue is not, indicating that it is driven by low values of components 1 and 2, whereas the blue is driven primarily by component 2 and is somewhat insensitive to component 1 (the blue bar across the middle). I think... I'm still interpreting this one. Unfortunately, PCA is very useful for visualization but hard to interpret without more data on other factors than we have.

Anyway, how is this done?
It's not too hard...
First, import your data and name your columns. Standardize your data by dividing each column by its standard deviation, so that variables measured in different units are comparable (princomp subtracts the column means for you).

Call PCA as
[coeff, scores, latent] = princomp(yourmatrix)

Also label your variables!
See below to start:

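% Standardize each variable (column 2 of each sorted table) by its standard deviation
avgdem = sortdem(:,2)./std(sortdem(:,2));
avgwind = sortwind(:,2)./std(sortwind(:,2));
avgslope = sortslope(:,2)./std(sortslope(:,2));

% Put the three standardized variables side by side, one per column
x = [avgdem, avgwind, avgslope];

% Run the PCA and label the variables for plotting later
[coeff, scores, latent] = princomp(x);
vbls = {'elevation','wind','slope'};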



Okay, you got this.
Now what you have is a nice PCA. Coeff holds your eigenvectors: each column is the set of coefficients for one principal component. Scores is your data projected onto those components, and latent is the variance explained by each component (the eigenvalues), not a cumulative total. If you want to know the percent variation explained, you can just divide latent by sum(latent); use cumsum on that if you want the running total.
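
As a quick sketch of the percent-explained arithmetic: the latent values below are made up so the example runs on its own, and pctExplained and cumExplained are just names I picked. With real data you would use the latent that princomp returns.

```matlab
% Hypothetical eigenvalues, standing in for the latent output of princomp
latent = [2.1; 0.6; 0.3];

% Percent of the total variance explained by each component
pctExplained = 100 * latent / sum(latent);

% Running total: percent explained by the first k components
cumExplained = cumsum(pctExplained);
```

With these made-up numbers the first component explains about 70 percent of the variance, and the running total ends at 100.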

The code for making the plot is below.
If you want to be more creative, biplot on its own is not very good for this scenario, so I use scatter3 for the points to allow more flexibility and then lay biplot on top for the variable axes. Note that my groups are created from the "scores", since that is what I am plotting here.
The 0.3 makes the dots really little. 'r.' and 'b.' make them red and blue.



figure(3)
clf
% Split the scores into the two groups by row
group1 = scores(1:1480,:);
group2 = scores(1481:10077,:);
hold all
% Plot each group in its own color (0.3 = tiny dots)
scatter3(group1(:,1),group1(:,2),group1(:,3),0.3,'r.')
scatter3(group2(:,1),group2(:,2),group2(:,3),0.3,'b.')
% Lay the variable axes on top, scaled by 4 so they are visible
biplot(coeff(:,1:3)*4,'varlabels',vbls,'LineWidth',2,'Color',[0 0 0]);


hold off
title('LB and ND subsets')
xlabel('component1')
ylabel('component2')
zlabel('component3')
view(2)   % look straight down on components 1 and 2
legend('LB','ND')


I hope that this helps you all out! It took me a while to figure out how to present this okay, but I'm happy with the results.
Well, for the graph itself. The data, seriously... that data is mystery data.

Saturday, March 5, 2011

Manuscript p. 2

So our manuscript was accepted on Feb. 28 with a minor revision (a few superscripts in an equation were off).

Hooray!

Peterson and Straka!

Just to recap, that's about a 6-month to 1-year process for pubs.
Now for some people it doesn't take so long.
But for me it does.

So if you're looking to push pubs out the door as a n00b Ph.D., expect a lag time of around a year to be safe.
By the way, I'd rate Manuscript Central a 4.0/5.0 overall as a submission system. They do a good job of informing you, the log-on is not annoying, and they get back to you with help. The one thing that would make it better would be if they emailed you when your status changed so you wouldn't have to check. I understand they email the corresponding author, but we (lead authors) want to know too!