Thursday, January 24, 2013
Checking the Additional Assumptions of a MANOVA
- Absence of multivariate outliers
- Linearity
- Absence of multicollinearity
- Equality of covariance matrices
Thursday, August 13, 2009
Analysis of Variance (ANOVA)
Statistics Solutions is the country's leader in analysis of variance and statistical consulting. Contact Statistics Solutions today for a free 30-minute consultation.
Analysis of variance (ANOVA) requires an interval- or ratio-scaled dependent variable and one or more categorical independent variables. In ANOVA, the categorical independent variables are generally called factors, and a particular combination of factor levels or categories is designated a treatment.
An analysis of variance (ANOVA) that involves only one categorical independent variable is called a one way analysis of variance (ANOVA). If the analysis involves two or more categorical independent variables, or factors, the technique is called an n way analysis of variance (ANOVA), where 'n' denotes the number of factors.
Thus, there are two broad forms of analysis of variance (ANOVA). One way analysis of variance (ANOVA) can be used to understand the variation in brand evaluations across groups exposed to different types of commercials. It can also be used to understand differences in attitudes among retailers, wholesalers, and agents toward the distribution policy of a particular firm.
So, in general, one way analysis of variance (ANOVA) is a useful technique for testing the equality of several means at one time by using their respective variances. It is for this reason that analysis of variance (ANOVA) has its name.
The F test statistic used in analysis of variance (ANOVA) is the ratio of two sample variance estimates: the variability between groups and the variability within groups. The test assesses whether the between-group variability is statistically significant relative to the within-group variability.
The n-way analysis of variance (ANOVA) can be used to understand the variation in consumers' intentions to buy a particular brand with respect to different levels of price and different levels of distribution. In the field of psychology, n-way analysis of variance (ANOVA) can be used to understand the effect of consumption of a particular brand in terms of a person's educational level and age. This technique also helps in understanding the interaction between the levels of advertisement and the price level of the brand.
There are major assumptions that the researcher must satisfy when conducting analysis of variance (ANOVA): the samples drawn from the population must be independent of each other, each population sampled is assumed to be normally distributed, and the variances of the groups should be homogeneous.
The following are the steps involved in conducting analysis of variance (ANOVA):
The first and foremost step in analysis of variance (ANOVA) is to identify the dependent and independent variables. The next step is to decompose the total variation, the third is to measure the effects, the fourth is to test their statistical significance, and the last step is to interpret the results.
Wednesday, June 3, 2009
ANOVA
For a free consultation on ANOVA or statistical methods, click here.
Assumptions in ANOVA:
1. Normality: The first assumption in ANOVA is that the data should be normally distributed. Many statistical tests can be applied to check the distribution of the data; most commonly, researchers use the Kolmogorov-Smirnov test, the Shapiro-Wilk test, or a histogram to assess normality.
2. Homogeneity: The second important assumption in ANOVA is homogeneity of variance: the variance across the groups should be the same. In SPSS, Levene's test is applied to test the homogeneity of the data. (A short Python sketch of these first two checks follows this list.)
3. Independence of case: The third assumption in ANOVA is independence of cases, meaning that the observations should be independent of each other and there should not be any pattern between the cases.
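As a complement to SPSS, the first two checks can be run in Python. The sketch below is a minimal illustration, assuming three simulated groups; the data and group labels are invented.

```python
import numpy as np
from scipy import stats

# Invented example data: three simulated groups.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=50, scale=5, size=30)
group_b = rng.normal(loc=55, scale=5, size=30)
group_c = rng.normal(loc=53, scale=5, size=30)

# 1. Normality: Shapiro-Wilk test per group (p > .05 is consistent with normality).
for name, g in [("A", group_a), ("B", group_b), ("C", group_c)]:
    w, p = stats.shapiro(g)
    print(f"Shapiro-Wilk, group {name}: W = {w:.3f}, p = {p:.3f}")

# 2. Homogeneity: Levene's test across the groups (p > .05 suggests equal variances).
w, p = stats.levene(group_a, group_b, group_c)
print(f"Levene: W = {w:.3f}, p = {p:.3f}")
```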
In research, ANOVA is the second most commonly used technique after regression. It is used in business, medicine, and psychology research. For example, in business, ANOVA is used to examine differences in sales across regions. A psychology researcher can use ANOVA to compare the behavior of different groups of people. A medical researcher can use ANOVA in a drug trial to test whether or not the drug cures the illness.
Procedure of ANOVA:
Set up hypotheses: To perform an ANOVA, a researcher has to set up the null and alternative hypotheses.
Calculation of MSB, MSW and F ratio: After setting up the hypotheses, the researcher calculates the variance between the samples. First, we calculate the grand mean from all the samples. Then we take the deviation of each sample mean from the grand mean, square it, weight it by the sample's size, sum these across the samples, and divide by the degrees of freedom (the number of samples minus one). This is called MSB, which stands for the mean sum of squares between the samples. The second component is the variance within the samples. To calculate it, take the deviation of each observation from its own sample mean, square it, sum across all observations, and divide by the degrees of freedom (the total number of observations minus the number of samples). This is called MSW, which stands for the mean sum of squares within the samples. The ratio of MSB to MSW is called the F ratio.
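To make the arithmetic concrete, here is a minimal Python sketch of the MSB, MSW, and F ratio computation just described; the three samples are invented for illustration.

```python
import numpy as np

# Invented example data: three samples.
groups = [np.array([4.0, 5.0, 6.0, 5.5]),
          np.array([7.0, 8.0, 6.5, 7.5]),
          np.array([5.0, 6.0, 5.5, 6.5])]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)          # number of samples
n_total = all_values.size

# MSB: squared deviations of each sample mean from the grand mean,
# weighted by sample size, summed, then divided by (k - 1) degrees of freedom.
ssb = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
msb = ssb / (k - 1)

# MSW: squared deviations of each observation from its own sample mean,
# summed, then divided by (n_total - k) degrees of freedom.
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
msw = ssw / (n_total - k)

f_ratio = msb / msw
print(f"MSB = {msb:.3f}, MSW = {msw:.3f}, F = {f_ratio:.3f}")
```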
Testing of hypothesis in ANOVA: In ANOVA, the calculated F ratio is compared with the critical value from the F table. If the calculated F ratio is greater than the table value, we reject the null hypothesis and conclude that the means of the groups differ. If the calculated value is less than the table value, we fail to reject the null hypothesis and conclude that there is no evidence that the group means differ.
ANOVA and SPSS: Manual calculation of ANOVA statistics is a lengthy procedure, and these days almost all statistical software can calculate ANOVA. In SPSS, ANOVA can be performed through the Analyze menu by choosing the Compare Means option and then selecting One-Way ANOVA. The p-value reported by SPSS is then used to decide whether to reject the null hypothesis.
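For readers working outside SPSS, a rough Python equivalent of the one-way ANOVA is SciPy's f_oneway. The sketch below uses simulated sales figures for three regions, echoing the business example above; the numbers are invented.

```python
import numpy as np
from scipy import stats

# Invented example data: simulated sales figures for three regions.
rng = np.random.default_rng(1)
region_a = rng.normal(loc=100, scale=10, size=25)
region_b = rng.normal(loc=110, scale=10, size=25)
region_c = rng.normal(loc=105, scale=10, size=25)

f_stat, p_value = stats.f_oneway(region_a, region_b, region_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
# If p < .05, reject the null hypothesis that all region means are equal.
```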
Monday, April 6, 2009
Analysis of Variance
Analysis of variance (ANOVA) can also be used for two-sample comparisons, and the results will be the same as those of a t-test; for example, if we want to compare income by gender group, the t-test and analysis of variance (ANOVA) results will agree. In the case of more than two groups, we could run a series of t-tests, but this procedure would be long. Thus, the analysis of variance (ANOVA) technique is the best choice when the independent variable has more than two groups. Before performing the analysis of variance (ANOVA), we should consider some basics and the assumptions on which this test is performed:
Assumptions:
1. Independence of case: The independence assumption means that the observations of the dependent variable should be independent of one another and the sample should be selected randomly. There should not be any pattern in the selection of the sample.
2. Normality: The distribution of each group should be normal. The Kolmogorov-Smirnov or the Shapiro-Wilk test may be used to confirm the normality of each group.
3. Homogeneity: Homogeneity means that the variance across groups should be the same. Levene's test is used to test the homogeneity of variances between groups.
If the data meet the above assumptions, then analysis of variance (ANOVA) is the best technique to compare the means of two or more populations. Analysis of variance (ANOVA) has three types.
One way analysis of variance (ANOVA): When we compare two or more groups on a single factor variable, it is said to be a one way analysis of variance (ANOVA). For example, if we want to compare whether or not the mean output of three workers is the same based on their working hours, it is a one way analysis of variance (ANOVA).
Two way analysis of variance (ANOVA): When there are two factor variables, it is said to be a two way analysis of variance (ANOVA). For example, based on both working condition and working hours, we can compare whether or not the mean output of three workers is the same; in this case, it is a two way analysis of variance (ANOVA).
K way analysis of variance (ANOVA): When there are k factor variables, it is said to be a k way analysis of variance (ANOVA).
Key terms and concepts:
Sum of squares between groups: For the sum of squares between groups, we calculate the mean of each group and the grand mean of all observations; for each group, we take the deviation of the group mean from the grand mean, square it, and weight it by the group's sample size. Finally, we sum these quantities across all groups.
Sum of squares within groups: To get the sum of squares within groups, we take the deviation of each observation from its own group mean, square it, and sum the squared deviations across all observations in all groups.
F-ratio: To calculate the F-ratio, the sum of squares between groups is first divided by its degrees of freedom to give the mean square between groups, the sum of squares within groups is divided by its degrees of freedom to give the mean square within groups, and the F-ratio is the first mean square divided by the second.
Degrees of freedom: The degrees of freedom for the sum of squares between groups is the number of groups minus one. The degrees of freedom for the sum of squares within groups is the total number of observations minus the number of groups.
BSS df = (g - 1), where BSS is the between-groups sum of squares, g is the number of groups, and df is the degrees of freedom.
WSS df = (N - g), where WSS is the within-groups sum of squares and N is the total sample size.
Significance: At a predetermined level of significance (usually 5%), we compare the calculated F value with the critical table value. Today, however, computers can automatically calculate the probability (p-value) for the F-ratio. If the p-value is less than the predetermined significance level, we conclude that the group means differ; if the p-value is greater, we conclude that there is no difference between the group means. (The sketch after this list works through these quantities on a small example.)
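The following Python sketch works through the key terms above on a tiny invented three-group dataset: both sums of squares, their degrees of freedom, the F-ratio, and the p-value.

```python
import numpy as np
from scipy import stats

# Invented example data: three small groups.
groups = [np.array([12.0, 15.0, 14.0]),
          np.array([18.0, 20.0, 19.0]),
          np.array([13.0, 14.0, 16.0])]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
g, n = len(groups), all_values.size

# Between-groups sum of squares: group means versus the grand mean.
bss = sum(x.size * (x.mean() - grand_mean) ** 2 for x in groups)
# Within-groups sum of squares: observations versus their own group mean.
wss = sum(((x - x.mean()) ** 2).sum() for x in groups)

df_between = g - 1          # BSS df = (g - 1)
df_within = n - g           # WSS df = (N - g)

f_ratio = (bss / df_between) / (wss / df_within)
p_value = stats.f.sf(f_ratio, df_between, df_within)
print(f"BSS = {bss:.2f}, WSS = {wss:.2f}, F = {f_ratio:.2f}, p = {p_value:.4f}")
```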
Analysis of variance (ANOVA) in SPSS: In SPSS, analysis of variance (ANOVA) can be performed in several ways. We can perform a one-way test by clicking on the "One-Way ANOVA" option, available under "Compare Means." For a two-way or higher-order analysis of variance (ANOVA), we can use the "Univariate" option available in the GLM menu. SPSS will give additional results as well, such as the partial eta square, observed power, the regression model, post hoc tests, and a homogeneity test. A post hoc test is performed when there is a significant difference between groups and we want to know exactly which groups have means that differ significantly from the others.
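Outside SPSS, a two-way analysis of variance (ANOVA) with an interaction can be sketched in Python with statsmodels' formula interface, broadly analogous to the Univariate GLM option described above. The data frame and the column names output, condition, and hours are invented, mirroring the worker example from the earlier post.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented data: output measured under two working conditions and two shift lengths.
df = pd.DataFrame({
    "output":    [23, 25, 28, 30, 22, 27, 31, 29, 24, 26, 32, 28],
    "condition": ["good", "good", "poor", "poor"] * 3,
    "hours":     ["short", "long"] * 6,
})

# Two factors plus their interaction term.
model = ols("output ~ C(condition) * C(hours)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```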
Extension of analysis of variance (ANOVA):
MANOVA: Analysis of variance (ANOVA) is performed when we have one metric dependent variable and one nominal independent variable. However, when we have more than one dependent variable and one or more independent variables, we use multivariate analysis of variance (MANOVA).
ANCOVA: The analysis of covariance (ANCOVA) test is used to determine whether or not certain factors have an effect on the outcome variable after removing the variance attributable to quantitative predictors (covariates).
For information on statistical consulting, click here.
Tuesday, January 20, 2009
T-test
A t-test is a statistical technique for comparing the means of two samples or populations. There are other techniques for comparing means, the other popular one being the z-test. However, the z-test is typically used where the sample size is relatively large, with the t-test being the standard for samples where the size 'n' is 30 or smaller. Another key feature of the t-test is that it can be used to compare no more than two samples; for more groups, ANOVA is the appropriate alternative. The t-test was developed in the early 20th century by an Englishman, W. S. Gosset. It is commonly known as Student's t-test because Gosset's employer, the Guinness brewery, considered the use of statistical analysis a trade secret, forcing him to publish under the pen name 'Student' instead of his real name.
In conducting a t-test, certain key assumptions have to be valid, including the following:
- Data have to be normally distributed; in a normal distribution there are no extreme outliers, and the mean, median, and mode are the same. In the event that the data are not normal, they can often be normalized by a transformation such as converting to logarithmic form. The variance of each sample dataset should also be equal.
- Sample(s) may be dependent or independent, depending on the hypothesis. Where the samples are dependent, repeated measures are typically used. An example of dependent samples is observations taken before and after a treatment.
- For help assessing the assumptions of a t-test click here
T-tests are widely used in hypothesis testing for comparison of sample means, to determine whether or not they are statistically different from each other. For instance, a t-test may be used to:
- Determine whether a sample belongs to a certain population
- Determine whether two different samples belong to the same population or two different populations.
- Determine whether the correlation between two samples or two different variables is statistically significant.
- Determine whether, in the case of dependent samples, the treatment has had a statistically significant effect.
In order to conduct a t-test, we need to follow these steps:
- Set up a hypothesis for which the t-test is being conducted. The hypothesis is simply a statement of our expectation about the existing sample(s), and it determines how the result of the t-test will be interpreted.
- Select the level of significance and the critical or 'alpha' region. Most often, a 95% confidence level (5% significance) is used in non-clinical applications, while clinical applications use a 99% or higher confidence level. The balance is the alpha region, which determines our hypothesis rejection zone or range.
- Calculation: we obtain the t value by taking the difference between the sample mean and the population mean and dividing it by the standard error of the mean, that is, the sample standard deviation divided by the square root of the number of observations (n): t = (x̄ - μ) / (s / √n). (A short sketch of this computation follows these steps.)
- Hypothesis testing: this step involves evaluating the hypothesis from step 1 using the obtained t value. The idea is to compare the p-value associated with the calculated t statistic against our level of significance or 'alpha.' For instance, if the test is conducted at 95% confidence, we reject the null hypothesis when the p-value is lower than 5% or .05, and our hypothesis of a difference holds. If not, we fail to reject the null hypothesis.
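As a minimal sketch of the calculation step, the Python code below computes a one-sample t by hand using the formula above and cross-checks it against SciPy; the sample values and the hypothesized mean are invented.

```python
import numpy as np
from scipy import stats

# Invented sample and hypothesized population mean.
sample = np.array([5.1, 4.9, 5.6, 5.2, 4.8, 5.4, 5.0, 5.3])
mu0 = 5.0

# t = (sample mean - population mean) / (s / sqrt(n))
n = sample.size
t_manual = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))

# Cross-check against SciPy, which also reports the p-value.
t_scipy, p_value = stats.ttest_1samp(sample, popmean=mu0)
print(f"t (manual) = {t_manual:.3f}, t (SciPy) = {t_scipy:.3f}, p = {p_value:.3f}")
# Reject the null hypothesis at the 5% level only if p < .05.
```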
While it is a very useful tool in data analysis, the t-test is not without limitations. For one thing, it is traditionally reserved for small samples of 30 observations or fewer; in large data analysis projects, the z-test or ANOVA is generally preferred. In addition, the t-test is a parametric test, which implies that in a non-normal distribution it cannot be applied without making changes to the dataset. In reality, few datasets are perfectly normal without such changes, so a non-parametric test can often be applied more effectively, such as the Mann-Whitney U test (for independent samples) or the sign (binomial) or signed-rank test (for related or dependent samples).
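As a brief illustration of the non-parametric route, here is a Mann-Whitney U test in Python on two invented independent samples.

```python
import numpy as np
from scipy import stats

# Invented independent samples.
sample_a = np.array([3.1, 4.2, 2.8, 5.0, 3.7, 4.4])
sample_b = np.array([5.9, 6.3, 5.1, 7.0, 6.4, 5.8])

u_stat, p_value = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```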
Click here for assistance with conducting T-tests