Thursday, August 6, 2009

T-test

The idea behind parametric tests is to let the researcher draw statistical inferences about a population by conducting tests of statistical significance (like the t-test) on a sample drawn from that population. The t-test is based on Student’s t statistic. This statistic assumes that the variable is drawn from a normally distributed population. The population mean is specified by the null hypothesis, while the population standard deviation is unknown and is estimated from the sample. The distribution of the t statistic, called the t-distribution, has a shape similar to that of the normal distribution, i.e. a bell-shaped appearance, but with heavier tails.

The t-test is particularly useful for samples whose size is less than 30. The reason is that once the sample size exceeds about 30, the t-distribution and the normal distribution become nearly indistinguishable, so the t-test and a test based on the normal distribution give almost the same results.
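
The convergence of the t-distribution toward the normal distribution can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `t_pdf` and `max_density_gap` and the grid from -4 to 4 are choices made for this illustration, not part of any standard procedure.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def max_density_gap(df):
    """Largest |t density - standard normal density| on a grid from -4 to 4."""
    normal = NormalDist()
    grid = [i / 100 for i in range(-400, 401)]
    return max(abs(t_pdf(x, df) - normal.pdf(x)) for x in grid)


# The gap shrinks as the degrees of freedom grow.
for df in (5, 10, 30, 100):
    print(f"df = {df:>3}: max density gap = {max_density_gap(df):.4f}")
```

The printed gap shrinks as the degrees of freedom increase, which is the sense in which the two distributions become hard to tell apart for larger samples.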

The t-test is used to test hypotheses for statistical significance. There are three main types of t-test: the one sample t-test, the two independent sample t-test and the paired sample t-test.

In the case of a one sample t-test, if a researcher in the field of psychology wants to test whether the mean IQ score of students differs from a hypothesized value, he can use the t-test. (A claim such as "at least 65% of students will pass the IQ test" concerns a proportion rather than a mean, and calls for a test of proportions instead.) The one sample t-test is applied after the null and alternative hypotheses have been formulated. The test statistic is then calculated with the appropriate formula, which in this case is the t-test for a single mean: the difference between the sample mean and the hypothesized mean, divided by the standard error of the sample mean. Next, a level of significance is selected. Usually, the researcher takes 0.05 as the level of significance. The level of significance is the maximum probability of falsely rejecting the null hypothesis that the researcher is willing to accept. If the absolute value of the calculated t statistic is greater than the tabulated value, the null hypothesis is rejected at that level of significance. If it is less than the tabulated value, the null hypothesis is not rejected at that level of significance, which does not prove that the null hypothesis is true.
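
As a minimal sketch of the t-test for a single mean, the snippet below computes the t statistic and compares it with a tabulated value. The scores and the hypothesized mean of 100 are made-up values for illustration only; 2.262 is the standard two-tailed tabulated value for a 0.05 level of significance with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical IQ scores for a sample of 10 students (illustrative data only).
scores = [98, 105, 110, 102, 95, 108, 112, 101, 99, 107]
mu_0 = 100          # hypothesized population mean (assumed for this example)
t_critical = 2.262  # tabulated two-tailed value for alpha = 0.05, df = 9

n = len(scores)
df = n - 1                                      # degrees of freedom
standard_error = stdev(scores) / math.sqrt(n)   # stdev uses the n - 1 divisor
t_stat = (mean(scores) - mu_0) / standard_error

# Reject the null hypothesis only if |t| exceeds the tabulated value.
reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

In practice, a statistics library would normally be used to obtain an exact p-value instead of a table lookup; the hand calculation here is only meant to make the formula concrete.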

In two independent sample t-tests, two samples that are not at all related to each other are tested. The main idea behind the two independent sample t-test is to draw a statistical inference by comparing two independent samples of data. For example, in the field of psychology, if the researcher wants to compare the IQ level of students living in region A and region B, then a two independent sample t-test is useful. Region A and region B are not at all related to each other, i.e. they are independent of each other. The procedure for conducting this t-test is the same, except that there are now two samples instead of one. The formulas for the degrees of freedom also differ: n - 1 for the t-test for a single mean, and n1 + n2 - 2 for the two independent sample t-test when equal variances are assumed. The degrees of freedom are the number of values in the calculation that remain free to vary once the sample means have been estimated.
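
The two independent sample case can be sketched in the same way. The example below assumes equal variances in the two groups and therefore pools them; the region scores are made-up values, and 2.228 is the standard two-tailed tabulated value for a 0.05 level of significance with 10 degrees of freedom.

```python
import math
from statistics import mean, variance

# Hypothetical IQ scores from two unrelated regions (illustrative data only).
region_a = [102, 98, 110, 105, 95, 108]
region_b = [99, 94, 101, 97, 92, 100]

n_a, n_b = len(region_a), len(region_b)
df = n_a + n_b - 2  # degrees of freedom for the pooled two sample test

# Pooled variance: the two sample variances weighted by their degrees of freedom.
pooled_var = ((n_a - 1) * variance(region_a)
              + (n_b - 1) * variance(region_b)) / df
standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean(region_a) - mean(region_b)) / standard_error

t_critical = 2.228  # tabulated two-tailed value for alpha = 0.05, df = 10
reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```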

The paired sample t-test applies when the observations form matched pairs, so that each observation in one sample is linked to exactly one observation in the other. For example, if a researcher wants to compare male and female smokers, the paired sample t-test comes into play if the subjects are matched by smoking category: a male chain smoker with a female chain smoker, a male occasional smoker with a female occasional smoker, etc.
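
A paired sample t-test can be sketched as a one sample t-test on the within-pair differences. The measurements below are made-up values for eight hypothetical matched pairs; 2.365 is the standard two-tailed tabulated value for a 0.05 level of significance with 7 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical matched pairs (illustrative data only): each position holds a
# measurement for one male and one female smoker in the same smoking category.
male = [72, 65, 80, 58, 69, 75, 62, 70]
female = [70, 66, 76, 55, 68, 71, 60, 69]

differences = [m - f for m, f in zip(male, female)]
n = len(differences)
df = n - 1  # degrees of freedom: number of pairs minus one

# Test whether the mean difference is zero.
t_stat = mean(differences) / (stdev(differences) / math.sqrt(n))

t_critical = 2.365  # tabulated two-tailed value for alpha = 0.05, df = 7
reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

Working with the differences removes the variation between pairs, which is why pairing can detect an effect that an independent sample test on the same numbers might miss.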

Reference:

Mood, A.M., Graybill, F.A., and Boes, D.C. Introduction to the Theory of Statistics.

Tuesday, January 20, 2009

T-test


A t-test is a statistical technique for comparing the means of two samples or populations. Other techniques for comparing means exist, the most popular being the z-test. However, a z-test is typically used where the sample size is relatively large or the population variance is known, with the t-test being the standard choice for samples where the size or ‘n’ is 30 or smaller. Another key feature of the t-test is that it can compare no more than two samples, with ANOVA being the most appropriate alternative for three or more. The t-test was developed in the early 20th century by an Englishman, W.S. Gosset. It is also commonly known as Student’s t-test, because Guinness, Gosset’s employer, considered its use of statistical analysis a trade secret, forcing him to publish under the pen-name ‘Student’ instead of his real name.

In conducting a t-test, certain key assumptions must hold, including the following:

  • Data should be approximately normally distributed, meaning roughly symmetric with no extreme outliers, so that the mean, median and mode are close to one another. If the data are clearly not normal, a transformation such as taking logarithms can sometimes bring them closer to normality. For the standard two sample t-test, the variances of the two samples should also be approximately equal.
  • Sample(s) may be dependent or independent, depending on the hypothesis. Where the samples are dependent, repeated measures are typically used. An example of a dependent sample is where observations are taken on the same subjects before and after a treatment.

T-tests are widely used in hypothesis testing for comparison of sample means, to determine whether or not they are statistically different from each other. For instance, a t-test may be used to:

  • Determine whether a sample belongs to a certain population.
  • Determine whether two different samples belong to the same population or to two different populations.
  • Determine whether the correlation between two samples or two different variables is statistically significant.
  • Determine whether, in the case of dependent samples, the treatment has had a statistically significant effect.
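
For the correlation use case in the list above, the usual test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, where r is the Pearson correlation coefficient. The following is a minimal sketch with made-up data; 2.447 is the standard two-tailed tabulated value for a 0.05 level of significance with 6 degrees of freedom.

```python
import math
from statistics import mean

# Hypothetical paired observations of two variables (illustrative data only).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)
r = s_xy / math.sqrt(s_xx * s_yy)  # Pearson correlation coefficient

df = n - 2  # degrees of freedom for the correlation test
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)

t_critical = 2.447  # tabulated two-tailed value for alpha = 0.05, df = 6
reject_null = abs(t_stat) > t_critical
print(f"r = {r:.3f}, t = {t_stat:.3f}, reject H0 (no correlation): {reject_null}")
```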

To conduct a t-test, we need to follow these steps:

  • Set up the hypothesis for which the t-test is being conducted. The hypothesis is simply a statement of what we expect of the sample(s), usually framed as a null hypothesis of no difference together with an alternative, and it determines how the result of the t-test will be interpreted.
  • Select the level of significance and the critical or ‘alpha’ region. Most often, a 95% confidence level (alpha = 0.05) is used in non-clinical applications, whereas clinical applications use 99% or higher (alpha = 0.01 or lower). The balance, alpha, defines the rejection region for the null hypothesis.
  • Calculation: we obtain the value of the t-test by subtracting the population mean from the sample mean and dividing the difference by the standard error, which is the sample standard deviation divided by the square root of the number of observations (n). The resulting value is the t statistic.
  • Hypothesis testing: this step involves evaluating the hypothesis from step 1 using the obtained t statistic. The t statistic is converted into a p-value, which is then compared with our level of significance or ‘alpha’. For instance, if our t-test is conducted at the 95% confidence level, the p-value must be lower than 5% or .05 for the result to be statistically significant. If this is the case, we reject the null hypothesis in favor of the alternative. If not, we fail to reject the null hypothesis, which does not by itself show that the null hypothesis is true.
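
The final comparison step can be sketched as follows. This is a minimal illustration using only the Python standard library: the helper names `t_pdf` and `two_sided_p_value` are made up for this example, the p-value is approximated by numerically integrating the t density with Simpson's rule, and the t statistic of 2.50 with 9 degrees of freedom is a hypothetical result.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) with Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2.0 * (h / 3.0) * total  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


alpha = 0.05            # level of significance chosen in step 2
t_stat, df = 2.50, 9    # hypothetical result from the calculation step
p_value = two_sided_p_value(t_stat, df)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p = {p_value:.4f}: {decision}")
```

A statistics package would normally report this p-value directly; the manual integration is shown only to make clear that the quantity compared with alpha is a probability, not the t statistic itself.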

While being a very useful tool in data analysis, the t-test is not without its limitations. For one thing, its main advantage lies in small samples of 30 observations or fewer. In large data analysis projects, it offers little over a z-test, since the two give nearly identical results. In addition, the t-test is a parametric test, which implies that for clearly non-normal data, particularly in small samples, it may not be reliable unless the dataset is transformed. In reality, few datasets are exactly normal, so the assumptions should be checked rather than taken for granted. Where they fail, a non-parametric test can be applied more effectively, such as the Mann-Whitney U test (for independent samples) or the sign test or Wilcoxon signed-rank test (for related or dependent samples).
