Request

To request a blog written on a specific topic, please email James@StatisticsSolutions.com with your suggestion. Thank you!

Tuesday, December 30, 2008

Statistical Analysis for your Dissertation and Thesis

What types of statistical analysis are appropriate for a dissertation or thesis?

Multivariate statistics are usually appropriate but not exclusively used. I help graduate students every day with dissertations and theses that use simple linear regressions, correlations, and t-tests; however, most institutions and committees want to see multivariate statistics used by their graduate students. That said, here is a very short list of the common ones.

Multiple Regression

Multiple regression for your dissertation or thesis simply includes more than one predictor. The advantage of using this statistical test for your dissertation or thesis is that you can include multiple variables in the model predicting your variable of interest. Very rarely – if ever – is only one variable responsible for the values of another variable. I like an example using the Super Bowl. I may be able to predict a good percentage of Super Bowl victories with salaries, but we all know there are many more factors involved in predicting Super Bowl victories, such as injuries, weather, experience, and strength of schedule. Including multiple predictors makes for a more accurate model. Get help with using multiple regression for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.
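To make the Super Bowl idea concrete, here is a minimal multiple regression sketch in Python. The data are made up for illustration (the salary and injury numbers are assumptions, not real NFL figures); the point is simply that two predictors enter one model.

```python
import numpy as np

# Hypothetical data: predict wins from payroll and injuries (illustrative numbers only)
rng = np.random.default_rng(0)
n = 50
salary = rng.normal(100, 20, n)        # payroll in $millions (made up)
injuries = rng.normal(10, 3, n)        # player-games lost to injury (made up)
wins = 5 + 0.08 * salary - 0.5 * injuries + rng.normal(0, 1.5, n)

# Design matrix: intercept column plus the two predictors
X = np.column_stack([np.ones(n), salary, injuries])
coefs, *_ = np.linalg.lstsq(X, wins, rcond=None)
intercept, b_salary, b_injuries = coefs

predicted = X @ coefs
r_squared = 1 - np.sum((wins - predicted) ** 2) / np.sum((wins - wins.mean()) ** 2)
print(f"b_salary={b_salary:.3f}, b_injuries={b_injuries:.3f}, R^2={r_squared:.3f}")
```

Adding the second predictor (injuries) is exactly the step that turns a simple regression into a multiple regression; each coefficient is then interpreted holding the other predictor constant.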

Logistic Regression

The logic behind multiple regression also applies to logistic regression, except that logistic regression models the odds of a dichotomous outcome variable, and its coefficients are typically interpreted as odds ratios. Get help with using logistic regression for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.
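A minimal sketch of that idea, assuming made-up data: the code below fits a logistic regression by plain gradient ascent on the log-likelihood (a hand-rolled fit, not a particular library's routine) and converts the slope to an odds ratio.

```python
import numpy as np

# Hypothetical dichotomous outcome (e.g., win/lose) predicted from one score (made up)
rng = np.random.default_rng(1)
n = 200
x = rng.normal(0, 1, n)
true_logit = 0.5 + 2.0 * x
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

X = np.column_stack([np.ones(n), x])   # intercept + predictor
w = np.zeros(2)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ w))       # predicted probabilities
    w += 0.1 * X.T @ (y - p) / n       # gradient ascent on the log-likelihood

odds_ratio = np.exp(w[1])              # odds multiplier per one-unit increase in x
print(f"slope={w[1]:.2f}, odds ratio={odds_ratio:.2f}")
```

An odds ratio above 1 means the odds of the outcome increase as the predictor increases; below 1, they decrease.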

n-way ANOVA (Analysis of Variance) or Factorial ANOVA (Analysis of Variance)

The n in this case simply refers to the virtually limitless number of independent variables that can be used in an ANOVA. A factorial ANOVA is just an ANOVA with two or more independent variables, and a two-way ANOVA – the equivalent of conducting two ANOVAs or t-tests in one test – is simply a factorial ANOVA with two independent variables. An n-way ANOVA or factorial ANOVA could have three, four, five, or more independent variables. This method also allows not just testing of differences between the groups but also testing of interactions between the independent variables. Get help with n-way ANOVA or factorial ANOVA for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.
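For the curious, here is a sketch of a balanced two-way (factorial) ANOVA computed from first principles on made-up data. The sums of squares for each main effect, the interaction, and the within-cells error are the standard textbook partition; only the data are invented.

```python
import numpy as np
from scipy import stats

# Hypothetical balanced 2x2 design: factor A x factor B, n observations per cell
rng = np.random.default_rng(2)
a_levels, b_levels, n = 2, 2, 15
y = rng.normal(50, 5, (a_levels, b_levels, n))  # y[i, j, :] is one cell
y[1] += 4          # built-in main effect of A
y[:, 1] += 3       # built-in main effect of B

grand = y.mean()
mean_a = y.mean(axis=(1, 2))           # marginal means of A
mean_b = y.mean(axis=(0, 2))           # marginal means of B
cell = y.mean(axis=2)                  # cell means

ss_a = b_levels * n * np.sum((mean_a - grand) ** 2)
ss_b = a_levels * n * np.sum((mean_b - grand) ** 2)
ss_ab = n * np.sum((cell - mean_a[:, None] - mean_b[None, :] + grand) ** 2)
ss_w = np.sum((y - cell[:, :, None]) ** 2)

df_a, df_b = a_levels - 1, b_levels - 1
df_ab, df_w = df_a * df_b, a_levels * b_levels * (n - 1)

f_a = (ss_a / df_a) / (ss_w / df_w)    # F for main effect of A
f_b = (ss_b / df_b) / (ss_w / df_w)    # F for main effect of B
p_a = stats.f.sf(f_a, df_a, df_w)
p_b = stats.f.sf(f_b, df_b, df_w)
print(f"A: F={f_a:.2f}, p={p_a:.4f};  B: F={f_b:.2f}, p={p_b:.4f}")
```

In practice you would let statistical software do this partition, but the sketch shows how both independent variables (and their interaction) are tested against one common error term.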

Mixed ANOVA (Analysis of Variance)

Again, the logic behind this test is the same as the n-way ANOVA or factorial ANOVA, but it is the equivalent of conducting:

  • a dependent samples t-test or paired samples t-test and an independent samples t-test or two sample t-test at the same time.
  • a repeated measures ANOVA and simple ANOVA at the same time. 

The complexity of the test depends completely on the number of variables involved in the statistical analysis. The effect of conducting a mixed ANOVA is the increase in power from conducting multiple statistical tests in one test, while protecting your alpha in the process. Get help with using mixed ANOVA for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.

MANOVA (Multivariate Analysis of Variance)

The same logic again, except this time we are analyzing multiple dependent variables. For example, we may want to test for significant differences in GPA, SAT scores, and ACT scores, by religious affiliation. We can do all of these comparisons at the same time in the same test with the MANOVA or multivariate analysis of variance. This is a favorite of many a professional researcher and committee. Get help with using MANOVA or multivariate analysis of variance for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.

ANCOVA (Analysis of Covariance) and MANCOVA (Multivariate Analysis of Covariance)

These have the same benefits and accomplish the same thing as their siblings without the "C" or "Co," but add the capability of statistically excluding the influence of variables that could somehow invalidate your results. To do this, the ANCOVA and the MANCOVA utilize a control variable. For example, if I wanted to know whether there is a significant difference in GPA between college students, any number of factors could cause the difference. By isolating the effect those factors have on my test, I am able to test for real differences. In this case I might control for socioeconomic status and the number of extracurricular activities. Utilizing the control variable will do a great deal to silence the critics of your research who may attribute the differences you found to some extraneous, unidentified, and unaccounted-for variable. Get help with using ANCOVAs (analysis of covariance) or MANCOVAs (multivariate analysis of covariance) for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.
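One common way an ANCOVA's group test can be computed is as a comparison of nested regression models: a full model with the covariate plus group versus a reduced model with the covariate only. The sketch below uses invented GPA data and a single made-up covariate (extracurricular hours); the variable names are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical ANCOVA: GPA difference between two student groups,
# controlling for hours of extracurricular activity (all data made up)
rng = np.random.default_rng(3)
n = 40
group = np.repeat([0, 1], n // 2)              # two student groups
hours = rng.normal(10, 3, n)                   # covariate (control variable)
gpa = 2.5 + 0.05 * hours + 0.4 * group + rng.normal(0, 0.3, n)

def sse(X, y):
    """Residual sum of squares from a least-squares fit."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ coefs) ** 2)

full = np.column_stack([np.ones(n), hours, group])   # covariate + group
reduced = np.column_stack([np.ones(n), hours])       # covariate only
sse_full, sse_red = sse(full, gpa), sse(reduced, gpa)

df_num = 1                       # one parameter dropped (group)
df_den = n - full.shape[1]       # residual df of the full model
f_val = ((sse_red - sse_full) / df_num) / (sse_full / df_den)
p_val = stats.f.sf(f_val, df_num, df_den)
print(f"group effect after controlling for hours: F={f_val:.2f}, p={p_val:.4f}")
```

If the F-test is significant, the groups differ on GPA even after the covariate's share of the variance has been removed, which is the whole point of adding the "C."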

Doubly Multivariate Analysis of Covariance

I just thought I would throw this in here to get you thinking a little about what's possible. If your head's spinning at this point, click here and I will be more than happy to help you with your statistics for your dissertation or thesis.

Monday, December 29, 2008

Bivariate Correlations Continued

To this point I have been writing about what I thought people might be interested in reading, but I thought I would start taking requests… Debussy's Clair de Lune, 50 Cent, Michael Bolton, Britney Spears… okay maybe not Michael Bolton. Post what you would like to see a blog entry about and I will do my best to comply.

It seems a great number of you are interested in bivariate correlation, so here is another entry on this favorite of statistical tests. I also think I could be a bit more comprehensive on the assumptions of ANOVA, so look for that in the very near future. I also plan on covering the assumptions of bivariate correlation. In the meantime…

What is bivariate correlation?

A bivariate correlation is a statistical test that measures the association or relationship between two continuous/interval/ordinal level variables. This test uses probability and tells the researcher the nature of the relationship between the two variables, but not causality…but I digress.

How to interpret bivariate correlation

To understand how to interpret a bivariate correlation, we have to first understand what the possible results are. If the correlation is significant, our correlation coefficient will be either positive or negative.  

Positive Correlation Coefficients

A positive correlation coefficient means that the relationship between the two variables is positive and that the variables move in the same direction. It also means that an increase in say… height, corresponds with an increase in weight. Stated another way, "As height increases, weight also increases, or as weight increases, height also increases."  

Negative Correlation Coefficients

A negative correlation coefficient means that the relationship between the two variables is negative and means that the variables move in opposite directions. Using the same example, we would say, "As height increases, weight decreases, or as weight increases, height decreases." 

The sign of the correlation coefficient tells us the nature of the relationship – one variable decreasing as the other increases, or both variables increasing or decreasing together – but it does not tell us how one variable affects the other.

Note that the sign of the correlation – negative or positive – can be interpreted two ways. With a positive correlation, as height increases, weight also increases, or as weight increases, height also increases. Both are correct. For a negative correlation, as weight increases, height decreases, or as height increases, weight decreases. I hope this isn't confusing. If any of you are having trouble, just post a comment and let me know. Better yet, let me do the correlations for you. Get help with how to interpret bivariate correlation coefficients for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation. 

What does the bivariate correlation coefficient mean?

Correlation coefficients range from -1 to +1. If the bivariate correlation coefficient is -1, the relationship between the two variables is perfectly negative, and if the bivariate correlation coefficient is +1, the relationship between the two variables is perfectly positive. The closer the correlation coefficient is to -1 or +1, the stronger the relationship. The closer the correlation coefficient is to 0, the weaker the relationship.  

What is r²?

Often the correlation is interpreted in terms of the amount of variance explained in one variable by another variable – just remember, though, that the correlation is bidirectional and can be interpreted either way. If you are conducting a Pearson correlation, in your output you will get a Pearson correlation coefficient, or product-moment correlation coefficient. Squaring this gives you…r². If I have a correlation coefficient of 0.4, then r² = 0.16, which would be interpreted as, "…16% of the variance in height is explained by weight," and vice versa. Get help with interpreting r² for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.
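Here is the whole chain – r, r², and the degrees of freedom for the write-up – in a few lines of Python, using made-up height and weight numbers rather than any real dataset:

```python
import numpy as np

# Hypothetical height/weight data with a built-in positive relationship
rng = np.random.default_rng(4)
n = 100
height = rng.normal(170, 10, n)                 # cm (made up)
weight = 0.5 * height + rng.normal(0, 8, n)     # kg (made up)

r = np.corrcoef(height, weight)[0, 1]   # Pearson product-moment correlation
r_squared = r ** 2                      # proportion of variance explained
df = n - 2                              # degrees of freedom for the report
print(f"r({df}) = {r:.2f}, r^2 = {r_squared:.2f}")
```

Note that with a sample size of 100 the reported degrees of freedom are 98, which is exactly the number that appears in parentheses in the write-up below.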

How to report a Pearson correlation or a Pearson product-moment correlation?

Here is the gem of the entire post and free of charge.  

There was a significant, positive relationship between height and weight, r(98) = 0.40, p < 0.01, indicating that as height increases, weight also increases. Height accounted for 16% of the variance in weight.  

Of course there is more to it than this, such as determining which variable is going to be your dependent variable and which variable is going to be your independent variable, as well as the appropriateness of the test and how it relates to the rest of your thesis or dissertation. Click here for help with writing bivariate correlations for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation.

What are the degrees of freedom for a bivariate correlation?

The degrees of freedom for a bivariate correlation are n – 2, where n is the sample size. This is also the number in the parentheses above – note that the number in parentheses is not the sample size. In the report above, r(98) means there were 98 degrees of freedom, so the sample size was 100.

How do I use the bivariate correlation?

If you were interested merely in the relationship of two variables, you would use the bivariate correlation. If you are interested in the effect of one variable on another variable, you would use a regression. The regression is the same as the correlation, but will tell you the specific impact one variable has on another variable in terms of the unstandardized beta coefficient and the standardized beta coefficient. It will also tell you the equation for the best fit line. We'll cover regressions very soon. Sometimes knowing the relationship of the two variables is enough, however, and if this is the case then bivariate correlation is the statistical test for you. Get help with how to use bivariate correlation for your Master's thesis, Master's dissertation, Ph.D. thesis, or Ph.D. dissertation. We even provide customized videos of your bivariate correlations being conducted.


Tuesday, December 23, 2008

The Dependent Samples t-test or the Paired Samples t-test

What is a dependent samples t-test or a paired samples t-test?

One of the most common statistical tests, the dependent samples t-test, or paired samples t-test, is used to find significant mean differences between two groups on a particular measure such as SAT scores, ACT scores, GPA, height, or weight. In the case of the dependent samples t-test or paired samples t-test, the groups of interest are related somehow, for example as siblings or in a pretreatment vs. posttreatment setting. Either way, the two groups being compared are related. Get help with dependent samples t-tests.


What is the difference between a dependent samples t-test or a paired samples t-test and an independent samples t-test?

Both tests are used to find significant differences between groups, but the independent samples t-test assumes the groups are not related to each other, while the dependent samples t-test or paired samples t-test assumes the groups are related to each other.

If you're familiar with the tests, a dependent samples t-test or paired samples t-test would be used to find differences within subjects, while the independent samples t-test would be used to find differences between subjects. Get help with dependent samples t-tests or independent samples t-tests.


In a dependent samples t-test or a paired samples t-test, what is the independent variable and what is the dependent variable?

The independent variable and the dependent variable are the same in both the dependent samples t-test and the independent samples t-test. The measured variable, or variable of interest, is the dependent variable, and the grouping variable is the independent variable. Get help with dependent samples t-tests.


Example of dependent samples t-test or a paired samples t-test

The most common use of the dependent samples t-test is in a pretreatment vs. posttreatment scenario where the researcher wants to test the effectiveness of a treatment.

  1. The participants are tested pretreatment, to establish some kind of a baseline measure
  2. The participants are then exposed to some kind of treatment
  3. The participants are then tested posttreatment, for the purposes of comparison with the pretreatment scores

Having both the pretreatment scores and the posttreatment scores for the same participants allows us to measure the effectiveness of the treatment, ceteris paribus. Get help with dependent samples t-tests.
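The pretreatment vs. posttreatment steps above can be sketched in a few lines with SciPy, using invented scores for 20 hypothetical participants. For contrast, the same data are also run through an independent samples t-test, which ignores the pairing:

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post scores for the same 20 participants (made up)
rng = np.random.default_rng(5)
pre = rng.normal(100, 15, 20)           # step 1: baseline measure
post = pre + rng.normal(5, 5, 20)       # steps 2-3: treatment adds ~5 points

# Dependent (paired) samples t-test: same participants measured twice
t_paired, p_paired = stats.ttest_rel(pre, post)

# For contrast, an independent samples t-test treats the groups as unrelated
t_ind, p_ind = stats.ttest_ind(pre, post)

print(f"paired: t={t_paired:.2f}, p={p_paired:.4f}")
print(f"independent: t={t_ind:.2f}, p={p_ind:.4f}")
```

Because the pairing removes the large person-to-person variability, the paired test typically yields a much smaller p-value on data like these than the independent test would, which is exactly why the dependent samples t-test is the right choice for pre/post designs.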


Tuesday, December 9, 2008

Analysis of Variance (ANOVA)

An analysis of variance (ANOVA) is a statistical test conducted to examine differences in a continuous variable by a categorical variable. Let’s talk about:

  1. The variables in ANOVA,
  2. The assumptions of ANOVA,
  3. The logic of ANOVA, and
  4. What the ANOVA results indicate.

I am going to limit the conversation to a one-way ANOVA (i.e., an ANOVA with just 1 independent variable).

Variables in ANOVA

The variables in ANOVA: there are two variables in an ANOVA—a dependent variable and an independent variable. For example, let’s imagine we want to examine differences in SAT scores by gender. SAT scores are the ANOVA dependent variable (i.e., the scores depend on the participants), and it’s a continuous variable because the scores range from 200 to 800. Gender is the ANOVA independent variable (i.e., the designation of male or female is independent of the participant). Further, the independent variable is categorical—you are either male or female. (For more on variables, look here.)

The Assumptions of ANOVA

The assumptions of ANOVA: when an ANOVA is conducted, there are three assumptions. The first ANOVA assumption is that of independence—in this example, the males’ scores are unrelated to and unaffected by the females’ scores. This ANOVA assumption cannot be violated; if it is, then a different test needs to be conducted. The second ANOVA assumption is normality—that is, the distribution of each group’s scores is not dissimilar from a normal bell curve.

The third ANOVA assumption is homogeneity of variance. This ANOVA assumption essentially assesses whether the standard deviations of the males’ and females’ scores are similar (or homogeneous); that is, that the males’ standard deviation of 118.15 is not dissimilar from the females’ standard deviation of 101.03 (Table 1). The homogeneity of variance assumption can be assessed with the Levene test. Table 2 shows the resulting Levene test statistic, where a non-significant result (i.e., Sig. > .05) indicates no difference between the standard deviations, and the assumption is met.


Table 1.

Descriptives: sat

          N     Mean       Std. Deviation
male      13    608.5385   118.15288
female    13    506.7692   101.03148
Total     26    557.6538   119.55415


Table 2.

Test of Homogeneity of Variances: sat

Levene Statistic   df1   df2   Sig.
.062               1     24    .806
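A Levene test like the one in Table 2 takes just one call in SciPy. The scores below are generated to roughly mimic the group means and standard deviations in Table 1, but they are simulated, not the actual data behind these tables:

```python
import numpy as np
from scipy import stats

# Illustrative SAT-style scores for two groups of 13 (simulated, not the real data)
rng = np.random.default_rng(6)
male = rng.normal(608, 118, 13)
female = rng.normal(507, 101, 13)

# Levene's test: a non-significant result (p > .05) means the
# homogeneity-of-variance assumption is met
stat, p = stats.levene(male, female)
print(f"Levene statistic = {stat:.3f}, p = {p:.3f}")
assumption_met = p > 0.05
```

With the actual data, this call is what produces the Levene statistic of .062 with Sig. = .806 reported in Table 2.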

The Logic of ANOVA

The logic of ANOVA: the logic of ANOVA is to test whether the males’ mean score (M = 608.54) differs from the females’ mean score (M = 506.77).

What the ANOVA Results Indicate

What the ANOVA results indicate: the ANOVA (Table 3) shows the resulting F-value (F = 5.571) with a significance level of .027, indicating that an F-value this large would occur by chance fewer than 3 times in 100. We can then say there is a statistically significant difference between the male and female scores, with males achieving a higher average score than females.


Table 3.

ANOVA: sat

                 df    F       Sig.
Between Groups   1     5.571   .027
Within Groups    24
Total            25

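The one-way ANOVA itself is also a single SciPy call. As above, the scores here are simulated to resemble the groups in Table 1 rather than being the actual data:

```python
import numpy as np
from scipy import stats

# Illustrative SAT-style scores for 13 males and 13 females (simulated)
rng = np.random.default_rng(7)
male = rng.normal(608, 118, 13)
female = rng.normal(507, 101, 13)

# One-way ANOVA; with two groups this is equivalent to an
# independent samples t-test (F = t squared)
f_val, p_val = stats.f_oneway(male, female)
df_between = 1                          # number of groups minus 1
df_within = len(male) + len(female) - 2
print(f"F({df_between}, {df_within}) = {f_val:.3f}, p = {p_val:.3f}")
```

On the actual data this call is what yields the F(1, 24) = 5.571, Sig. = .027 reported in Table 3.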

For customized, confidential help with ANOVA and/or conducting your statistical analysis, please email us at James@StatisticsSolutions.com or call Statistics Solutions Inc. at (877) 437-8622 for a free 30-minute consultation.