Monday, June 29, 2009

Statistical Tests

Statistical tests are of various types, depending upon the nature of the study. They provide a method for making quantitative decisions about a population on the basis of a sample. A statistical test mainly evaluates a hypothesis by asking whether the observed sample result is statistically significant.

Statistics Solutions is the country's leader in statistical consulting and can assist with selecting and analyzing the appropriate statistical test for your dissertation. Contact Statistics Solutions today for a free 30-minute consultation.

A few key concepts help in understanding statistical tests.

Type I error: A Type I error is committed when a null hypothesis that is actually true is rejected.

Type II error: A Type II error is committed when a null hypothesis that is actually false is not rejected.
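
The two kinds of error can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a two-sided z test with a known standard deviation, and the population values (mean 100, standard deviation 15, sample size 40) and the function names are made up for the example.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-sided z test: True if H0 (population mean == mu0) is rejected."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def rejection_rate(true_mu, mu0=100.0, sigma=15.0, n=40, trials=2000, seed=1):
    """Share of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if z_test_rejects(sample, mu0, sigma):
            rejections += 1
    return rejections / trials

# H0 is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=100.0)
# H0 is false here, so every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=105.0)
print(type_1_rate, type_2_rate)
```

When the null hypothesis is true, the rejection rate should land near the chosen significance level of 0.05. The Type II error rate depends on how far the true mean is from the hypothesized one and on the sample size.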

Statistical tests can also be grouped according to the field in which they are applied. They are carried out extensively in psychology, medicine, nursing and business.

In the field of psychology, tests of significance such as the t test, z test, F test and chi-square test are carried out to test the significance of the difference between observed samples and hypothetical or expected values. For example, if a researcher wants to test whether there is a significant difference between the mean IQ levels of two groups of college students, then the researcher can perform the t test for the difference between two sample means. If one wants to test the goodness of fit of a particular assumed model, then one can use the chi-square test of goodness of fit. It is one of the most widely used tests of goodness of fit, although it is not the only one; the Kolmogorov-Smirnov test is another.
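
As a rough illustration of the two tests mentioned above, the following sketch computes a two-sample t statistic (assuming equal population variances) and a chi-square goodness-of-fit statistic by hand. The IQ scores and die-roll counts are made-up numbers, and the function names are chosen only for this example.

```python
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (mean(a) - mean(b)) / standard_error, na + nb - 2

def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic and its degrees of freedom."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Hypothetical IQ scores for two groups of students.
group_a = [102, 98, 110, 105, 95, 108]
group_b = [99, 101, 97, 104, 100, 96]
t_stat, t_df = pooled_t_statistic(group_a, group_b)

# Hypothetical counts from 60 rolls of a die, tested against a fair-die model.
rolls = [8, 12, 9, 11, 10, 10]
chi2, chi2_df = chi_square_gof(rolls, [10] * 6)  # chi2 == 1.0 with 5 degrees of freedom
print(t_stat, t_df, chi2, chi2_df)
```

Each statistic is then compared with the critical value of the t or chi-square distribution for the given degrees of freedom, taken from tables or statistical software.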

In the field of business, a widely used statistical technique is analysis of variance (ANOVA). ANOVA examines the differences in the mean values of a dependent variable across the levels of one or more controlled independent variables (factors). When the influence of uncontrolled independent variables (covariates) is also taken into account, the technique is called analysis of covariance (ANCOVA). In ANOVA, the F test is used to assess whether the differences among the group means are significant.
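
The F test in a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below shows the arithmetic on made-up sales figures for three hypothetical store layouts; it is an illustration, not output from any particular package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA, with its two degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical weekly sales under three store layouts.
sales = [[20, 22, 24], [25, 27, 29], [30, 32, 34]]
f_stat, df1, df2 = one_way_anova_f(sales)
print(f_stat, df1, df2)  # 18.75 with 2 and 6 degrees of freedom
```

A large F value relative to the critical value for (2, 6) degrees of freedom suggests that at least one group mean differs from the others.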

Statistical tests are closely tied to statistical inference. Statistical inference involves tests of hypotheses, where statistical tests play a crucial role. In the field of medicine and nursing, tests of hypotheses are conducted using various statistical tests. A mistake in calculation, or in the choice of test, can lead the researcher to a wrong conclusion. For example, the researcher might declare an ineffective drug effective (a Type I error when the null hypothesis is that the drug has no effect) or miss a real effect (a Type II error). Thus, the researcher should be cautious while performing statistical tests. In medicine and nursing, such errors can cause serious problems in people’s lives, because they affect the drugs and dosages that patients receive.

Statistical tests can be performed in software such as SPSS, where they are available from the “Analyze” menu. Different statistical tests suit different sample sizes. As a rule of thumb, the t test is used when the sample size is less than 30 and the population variance is unknown, while the z test is used when the sample size is more than 30 or the population variance is known. Statistical tests come with some general assumptions: samples should be drawn from the population in a random manner, the observations must be independent, and the observations must have the same underlying distribution. In the chi-square test in particular, observations must be grouped into categories. Many parametric tests, such as the t test and the F test, also assume that the deviations (errors) are normally distributed.

Monte Carlo Methods

Monte Carlo methods help the researcher estimate solutions to a variety of mathematical problems by running statistical sampling experiments. The term covers a collection of procedures that share the same basic approach: they repeatedly evaluate a deterministic model using random numbers as inputs and rely on the theory of probability to obtain an approximate answer to the problem.

Researchers should note that Monte Carlo methods give only an approximate outcome. Therefore, whenever a researcher uses Monte Carlo methods, the approximation error is a crucial factor and must be estimated and taken into account.

The various types of Monte Carlo methods differ in their level of accuracy, which also depends upon the type of question or problem that the investigator is addressing. Monte Carlo methods are well suited to computing difficult integrals, particularly multidimensional integrals, and to other cases where a reasonable approximation is sufficient.

There are several Monte Carlo methods.

The crude Monte Carlo method is used to estimate the integral of a function f(x) between the limits ‘a’ and ‘b.’ The researcher draws ‘N’ random samples ‘s’ uniformly from the interval between ‘a’ and ‘b’; the number of samples ‘N’ is chosen independently of the limits of integration. The investigator then evaluates the function value f(s) for each random sample ‘s,’ adds all these values, and divides the sum by ‘N,’ which gives the mean value of the function over the sample.

The investigator then multiplies this mean value by the length of the interval, (b - a), to obtain the estimate of the integral.
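
The steps of the crude Monte Carlo method can be sketched in a few lines. The example below is a minimal illustration: the test function sin(x) on the interval from 0 to pi and the function name `crude_monte_carlo` are choices made for this example, and the standard error is included because the result is only an approximation.

```python
import math
import random

def crude_monte_carlo(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform random points."""
    rng = random.Random(seed)
    values = [f(rng.uniform(a, b)) for _ in range(n)]
    mean_value = sum(values) / n
    estimate = (b - a) * mean_value
    # Standard error of the estimate, from the sample variance of f(s).
    var = sum((v - mean_value) ** 2 for v in values) / (n - 1)
    std_error = (b - a) * math.sqrt(var / n)
    return estimate, std_error

# Integral of sin(x) over [0, pi]; the exact value is 2.
estimate, std_error = crude_monte_carlo(math.sin, 0.0, math.pi, 100_000)
print(estimate, std_error)
```

The standard error shrinks in proportion to one over the square root of N, so quadrupling the number of samples roughly halves the error.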

The acceptance-rejection Monte Carlo method is a flexible technique that is simple to understand. However, it gives the researcher one of the least accurate results among the four types of Monte Carlo methods.
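
One common form of acceptance-rejection integration is the "hit-or-miss" approach, and the sketch below assumes that this is the variant meant here. Random points are scattered over a box that encloses the curve, and the share of points accepted (those falling under the curve) is scaled by the area of the box. The function name and the test function are illustrative choices.

```python
import math
import random

def hit_or_miss(f, a, b, f_max, n, seed=0):
    """Estimate the integral of a non-negative f over [a, b].

    Points are drawn uniformly in the box [a, b] x [0, f_max]; a point is
    accepted when it falls under the curve. f_max must bound f from above.
    """
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n):
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, f_max)
        if y <= f(x):
            accepted += 1
    box_area = (b - a) * f_max
    return box_area * accepted / n

# Integral of sin(x) over [0, pi]; the exact value is 2.
area = hit_or_miss(math.sin, 0.0, math.pi, 1.0, 100_000)
print(area)
```

Because each point contributes only a yes-or-no answer, this estimate generally has a larger variance than the crude method for the same number of samples.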

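A further approach, stratified sampling, divides the interval of integration into sub-intervals and samples each one separately. One reason for using this type of Monte Carlo method is that the researcher can obtain the overall variance by simply adding up the variances for each sub-interval.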

The first two Monte Carlo methods discussed above are the basic techniques that every researcher should understand, since they serve as the basis for more complex techniques.

Monte Carlo methods are widely used in disciplines like physics and chemistry, where they simulate complex reactions and interactions.

Monte Carlo methods also have a smoothing property that is useful in the case of complex problems. Approximating such problems by other means is generally very time-consuming, and Monte Carlo methods make it more tractable.

Monte Carlo methods can also be used in the field of computer vision, where they are applied to object tracking.

Friday, June 26, 2009

Path Analysis

Path analysis represents an attempt to deal with causal types of relationships. It was developed by Sewall Wright in the early 1920s. Path analysis is very useful for illustrating a number of the issues that are involved in causal analysis.

Path analysis is usually presented diagrammatically, with variables drawn as circles or boxes and arrows indicating the assumed causal links among them. The goal of path analysis is to estimate the regression weights (path coefficients) of the model. The correlations implied by the estimated weights are then compared with the observed correlation matrix, and a goodness-of-fit statistic can be calculated to judge how well the model reproduces the data.

Certain terms are used in path analysis.

The exogenous variables in path analysis are variables whose causes are outside of the model.
The endogenous variables in path analysis are variables whose causes are inside the model.

The recursive model in path analysis is a causal model that is unidirectional; in other words, it has a one-way causal flow. This model has neither feedback loops nor reciprocal effects, so a variable cannot be both a cause and an effect of another variable at the same time.

The non-recursive model in path analysis is a causal model with feedback loops and reciprocal effects.

The path coefficient in path analysis is the standardized regression coefficient that shows the direct effect of one variable on another.
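
To make the idea of a path coefficient concrete, the sketch below works through a small recursive model with three variables, X -> M, X -> Y and M -> Y. The data are made up, and the closed-form expressions used are the standard formulas for standardized regression coefficients with two predictors; a real analysis would normally be run in dedicated software.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Made-up data for a recursive model: X -> M, X -> Y, M -> Y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1, 3, 2, 5, 4, 7, 6, 9]

r_xm, r_xy, r_my = pearson(x, m), pearson(x, y), pearson(m, y)

# Path coefficients are standardized regression coefficients.
p_mx = r_xm                                    # M regressed on X
p_yx = (r_xy - r_my * r_xm) / (1 - r_xm ** 2)  # direct effect of X on Y
p_ym = (r_my - r_xy * r_xm) / (1 - r_xm ** 2)  # direct effect of M on Y

# Direct effect plus indirect effect (through M) of X on Y.
implied_r_xy = p_yx + p_ym * p_mx
print(p_mx, p_yx, p_ym, implied_r_xy, r_xy)
```

In this just-identified model the implied correlation between X and Y matches the observed one exactly. In models with fewer paths than correlations, the gap between implied and observed correlations is what a goodness-of-fit test examines.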

Path analysis is usually conducted with an add-on module of SPSS called Analysis of Moment Structures (AMOS). Other statistical software, such as SAS and LISREL, can also be used to conduct path analysis. According to Kline (1998), an adequate sample size is 10 times the number of parameters in the path analysis, and the ideal sample size is 20 times the number of parameters.

Like other kinds of statistical analysis, path analysis comes with several assumptions.

In path analysis, the relationships among the variables in the model should be linear, additive and causal in nature. The data used in path analysis should be measured on an interval scale. It is assumed that the error terms are not correlated with the variables in the model, and that the error terms are not correlated among themselves. It is also assumed that there is only a one-way causal flow.

Path analysis does have some limitations.

Path analysis can evaluate and test two or more causal hypotheses. However, its major limitation is that it cannot establish the direction of causality.

Path analysis is applicable only in those kinds of cases where relatively small numbers of hypotheses can be easily represented by a single path.

Multicollinearity

The term multicollinearity was first used by Ragnar Frisch. Multicollinearity means that there is a perfect (exact) or near-perfect linear relationship among the explanatory variables in a regression. Linear regression analysis assumes that there is no exact linear relationship among the explanatory variables. When this assumption is violated, the problem of multicollinearity occurs.

In regression analysis, multicollinearity can be classified by degree:

1. No multicollinearity: When the explanatory variables have no relationship with each other, there is no multicollinearity in the data.
2. Low multicollinearity: When there is a relationship among the explanatory variables, but it is very weak, it is a case of low multicollinearity.
3. Moderate multicollinearity: When the relationship among the explanatory variables is moderate, it is said to be moderate multicollinearity.
4. High multicollinearity: When the relationship among the explanatory variables is strong but not exact, it is said to be high multicollinearity.
5. Very high multicollinearity: When the relationship among the explanatory variables is exact, it is the problem of very high (perfect) multicollinearity, which should be removed from the data before regression analysis is conducted.

Many factors can cause multicollinearity. For example, it may arise from the data collection process or from a poor choice of model. If we take income and house size as explanatory variables in our model, then the model will have the problem of multicollinearity because income and house size are highly correlated. Multicollinearity may also occur if we include too many explanatory variables in the regression analysis.

Consequences of multicollinearity: If the data has a high (near-perfect) multicollinearity problem, then the following will be the impact of multicollinearity:

1. In the presence of multicollinearity, the variances and covariances of the coefficient estimates will be larger, which makes it difficult to reach a statistical decision about the null and alternative hypotheses.
2. In the presence of multicollinearity, the confidence intervals will be wider because of the larger standard errors. In this case, we may fail to reject a null hypothesis that should be rejected.
3. In the presence of multicollinearity, the standard error will increase, which makes the value of the t statistic smaller. Again, we may fail to reject a null hypothesis that should be rejected.
4. In the presence of multicollinearity, the R-square may remain high even though few individual coefficients are significant, so the goodness of fit of the model can give a misleading picture of the contribution of each variable.

Detection of multicollinearity: The following are the methods that show the presence of multicollinearity:

1. In regression analysis, when the R-square of the model is very high but there are very few significant t ratios, this indicates multicollinearity in the data.
2. High correlation between explanatory variables also indicates the problem of multicollinearity.
3. Tolerance and variance inflation factor (VIF): For each explanatory variable, the VIF is 1 / (1 - R²), where R² comes from regressing that variable on the other explanatory variables; with only two explanatory variables, R² is the squared correlation between them. As the correlation among the regressors increases, the VIF also increases, so a large VIF indicates multicollinearity. The inverse of the VIF is called tolerance, so the two measures carry the same information.
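
The VIF and tolerance can be computed by hand in the simplest case of two explanatory variables, where the R² in the formula is just the squared correlation between them. The sketch below uses made-up income and house size figures; with more than two explanatory variables, R² would instead come from regressing each variable on all the others.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF and tolerance when the model has exactly two explanatory variables."""
    r = pearson(x1, x2)
    vif = 1 / (1 - r ** 2)
    return vif, 1 / vif

# Made-up data: house size tracks income closely.
income = [30, 45, 50, 62, 75, 88, 95, 110]
house_size = [80, 100, 115, 130, 150, 170, 185, 210]
vif, tolerance = vif_two_predictors(income, house_size)
print(vif, tolerance)
```

For these figures the VIF is far above the common rule-of-thumb threshold of 10, which signals a multicollinearity problem between the two variables.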

Remedial measures for multicollinearity: In regression analysis, the first step is to detect multicollinearity. A rule of thumb is that a VIF greater than 10 indicates a multicollinearity problem. If multicollinearity is present in the data, then several remedies are available. One remedy is to drop one of the collinear variables, although this may introduce specification bias. Combining cross-sectional data with time series data can also reduce multicollinearity. If the multicollinearity is high, it can be reduced by transforming the variables, for example by taking first or second differences. Adding new data can also help. In multivariate analysis, multicollinearity can be removed by replacing the collinear variables with a common score; principal component analysis, as used in factor analysis, can be used to derive this common score.

Contact Statistics Solutions today for more information on multicollinearity.

Friday, June 19, 2009

Dissertation Statistics Consultants

Dissertation statistics consultants are the most efficient and affordable way to get help on your dissertation. Dissertation statistics consultants are consultants who are experts at both dissertations and at statistics, and as such, dissertation statistics consultants can help any student with his or her dissertation needs.

Statistics Solutions is the nation's leader in dissertation statistics consulting. Contact Statistics Solutions today for a free 30-minute consultation.

Dissertation statistics consultants are experts at both dissertations and statistics because dissertation statistics consultants are trained statisticians. Additionally, dissertation statistics consultants have provided help to thousands of students who need to finish a dissertation, and thus, they are also extremely well trained when it comes to working on a dissertation and ensuring that it is approved. In other words, the statistical skills of a dissertation statistics consultant combined with the experience of helping students with dissertations is the perfect combination of skills and expertise needed to help any student finish his or her dissertation.

The dissertation itself is a very lengthy and difficult process, and chances are that if you have started this long and difficult project, you know just how tedious and time-consuming it is. In fact, many students consider it to be the most challenging aspect of their entire student career. Add to all of this the pressure and stress that go along with having to finish it on time and successfully, and students simply become overwhelmed by the entire process of the dissertation. Dissertation statistics consultants can help these students, however, as they can offer invaluable guidance, support and assistance throughout the entire dissertation process. And there is no better way to alleviate stress and assuage fears than to get help and assistance. Thus, dissertation statistics consultants can be an invaluable resource when it comes to finishing a dissertation on time and with success.

Dissertation statistics consultants will make the entire process of writing and working on your dissertation more efficient. Dissertation statistics consultants will do this by making sure that a student is taking every single correct step when it comes to the dissertation and the methodology that the student is using to complete his or her dissertation. In other words, the dissertation statistics consultant will ‘break everything down’ into little steps that are manageable, understandable and doable, and this will make the process of completing a dissertation much easier. Additionally, with the help of dissertation statistics consultants, the student will not make any missteps. A misstep occurs when a student somehow makes a mistake in terms of the methodology of a dissertation. These missteps are quite common and occur in the statistical part of the dissertation. Because statistics is a science and requires students to follow procedures, rules and guidelines very precisely, there is much room for students to make mistakes. And when a mistake or misstep occurs, the student must start that section over again. And this is incredibly time consuming. Dissertation statistics consultants will not let this happen, however, as dissertation statistics consultants are trained professionals when it comes to statistics. Additionally, they know the common missteps that are likely to happen—for example, using the wrong sample size when gathering data, or using the wrong test to interpret the data that has been gathered, etc—and dissertation statistics consultants will make sure that the student does not waste time and energy by falling into the common pitfalls of a dissertation. Thus, with the help of dissertation statistics consultants, the student will be able to finish his or her dissertation on time because he or she does not need to waste any time making very costly mistakes.

There is no doubt that dissertation statistics consultants are the best way to ensure results. Whether you are at the beginning, the middle or the end of your project, you should look into hiring dissertation statistics consultants as they can guarantee that you finish your dissertation successfully, on-time, and with confidence.

Tuesday, June 16, 2009

Dissertation Statistics Consultant

While it is extremely exciting to get to the point where a student is finally ready to begin his or her dissertation, that excitement is often short-lived as the student realizes exactly what needs to be done in order to successfully complete his or her dissertation. Because the dissertation is necessary to attain a doctoral degree, that excitement can quickly turn to stress, nervousness, anxiety and fear.

There is no need to stress over a dissertation, however, as a dissertation statistics consultant can help any student that needs to finish a dissertation. Statistics Solutions, Inc. is the country's leader in dissertation statistics consulting. Contact Statistics Solutions today for a free 30-minute consultation on your dissertation statistics.

A dissertation statistics consultant is a trained professional who offers help, assistance and guidance to a student throughout the dissertation writing process. With the help of a dissertation statistics consultant, the student will quickly learn that it is not necessary to worry over the completion of his or her dissertation as a dissertation statistics consultant can ease the pressures and worries of a student.

Because the dissertation statistics consultant is trained to help students, and because the dissertation statistics consultant has helped many students in the past, the dissertation statistics consultant is very well aware of the pressure and anxiety that students feel as they work on their dissertation. For this reason, one of the first things that a dissertation statistics consultant does is to come up with a plan of action. This plan of action includes building a timeline with the student so the student knows exactly what needs to be done and when it needs to be done. This timeline that is created with the dissertation statistics consultant will detail specific targets and due-dates. Thus, with the timeline created by the dissertation statistics consultant, the student can see exactly what needs to be done. This is one sure way to alleviate the stress and pressures that students who work on their dissertation feel. This is because building a timeline lessens the ‘mystery’ of what needs to be done next.

Not only will the dissertation statistics consultant build a reliable and easy-to-follow timeline, the dissertation statistics consultant will also tell the student exactly how to do everything on that timeline. Thus, the student who has the help of the dissertation statistics consultant will have the tools necessary to carry out the procedures and methodology of his or her dissertation.

One of the most time consuming aspects of the dissertation is the obtaining of valid and useful statistics. The statistics will eventually prove the dissertation, or a student’s point, so it is incredibly important to get accurate and valid statistics. With the help of a dissertation statistics consultant, this too is easy and manageable. Because a dissertation statistics consultant is a trained and expert statistician, the dissertation statistics consultant knows how to perform statistical analysis. And while everything in statistical analysis can be time-consuming and difficult, this is not true with the help of a dissertation statistics consultant. Further the dissertation statistics consultant will explain every single aspect of the statistics. This is incredibly helpful because most doctoral students are not adequately trained in statistics. While they are thoroughly trained and knowledgeable in their field of study, they are not trained properly in statistics. Thus, one of the most beneficial things that a dissertation statistics consultant provides is instruction. This instruction, provided by the dissertation statistics consultant, goes a long way in ensuring that the student succeeds with his or her dissertation. This is true because the student must orally defend his or her dissertation, and it is important for the student to have a working knowledge of what has been completed statistically in his or her dissertation. The dissertation statistics consultant will provide that working knowledge and the dissertation statistics consultant will guarantee that a student finishes his or her dissertation with success.

Monday, June 15, 2009

Continuous Probability Distribution

A continuous probability distribution is the probability distribution of a continuous random variable, that is, a variable that can take any value in an interval rather than a countable set of values.

A continuous random variable is described by a probability density function, called the pdf. Because the values of the variable are not countable, probability is measured with respect to density: the probability that the variable falls in an interval is the integral of the pdf over that interval.

The normal distribution is a continuous probability distribution whose range is -∞ to +∞. Its parameters are µ (called the mean) and σ² (called the variance). The probability density function (pdf) of this continuous probability distribution is given by the following:
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
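
As a quick check on the normal density, with its normalizing constant 1/(σ√(2π)), the sketch below evaluates the pdf and integrates it numerically with the trapezoidal rule. The helper names `normal_pdf` and `integrate` are chosen for this illustration only.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, lo, hi, steps=20_000):
    """Approximate the integral of f over [lo, hi] with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h

total_area = integrate(normal_pdf, -10, 10)   # close to 1
within_one_sd = integrate(normal_pdf, -1, 1)  # about 0.68
print(total_area, within_one_sd)
```

The total area under the curve is 1, as it must be for any pdf, and roughly 68% of the probability lies within one standard deviation of the mean.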

The normal distribution has an important role in statistical theory for several reasons.

Distributions like the binomial distribution, the Poisson distribution and the hypergeometric distribution can be approximated by the normal distribution.

The normal distribution is widely applied in Statistical Quality Control (SQC).

The normal distribution is used in large sample theory, where the sampling distributions of sample statistics are studied with the help of normal curves.

The theory on which significance tests (like the t test or the F test) are based assumes that the parent population follows a normal distribution.

The gamma distribution is a continuous probability distribution whose range is 0 to ∞. It has the parameter d > 0. The probability density function (pdf) of this continuous probability distribution is given by the following:

$$f(x) = \frac{e^{-x} x^{d-1}}{\Gamma(d)}, \quad 0 < x < \infty$$

This continuous probability distribution has a property called the additive property: the sum of independent gamma variables is itself a gamma variable, whose parameter is the sum of the individual parameters.
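
The additive property of the gamma distribution can be checked informally by simulation. The sketch below draws two independent gamma samples with shape parameters 2 and 3 (arbitrary choices for this example) and unit scale, using `random.gammavariate` from the Python standard library, and looks at the mean and variance of their sum.

```python
import random
from statistics import mean, variance

rng = random.Random(42)
n = 50_000
# random.gammavariate(alpha, beta): shape alpha, scale beta (scale 1 here).
x = [rng.gammavariate(2.0, 1.0) for _ in range(n)]
y = [rng.gammavariate(3.0, 1.0) for _ in range(n)]
total = [a + b for a, b in zip(x, y)]

# A gamma variable with shape d and unit scale has mean d and variance d,
# so the sum should have mean and variance close to 2 + 3 = 5.
sum_mean, sum_var = mean(total), variance(total)
print(sum_mean, sum_var)
```

Matching the first two moments does not prove that the sum is gamma distributed, but it is consistent with the sum following a gamma distribution with parameter 5.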

The beta distribution of the first kind is a continuous probability distribution whose range is between 0 and 1. Its parameters are µ > 0 and v > 0. The probability density function (pdf) of this type of continuous probability distribution is given by the following:

$$f(x) = \frac{1}{B(\mu, v)} x^{\mu-1} (1-x)^{v-1}, \quad 0 < x < 1$$

The beta distribution of the second kind is a continuous probability distribution whose range is 0 to ∞. Its parameters are µ > 0 and v > 0. The probability density function (pdf) of this continuous probability distribution is given by the following:

$$f(x) = \frac{1}{B(\mu, v)} \cdot \frac{x^{\mu-1}}{(1+x)^{\mu+v}}, \quad 0 < x < \infty$$

The Weibull distribution is a continuous probability distribution whose range is µ to ∞. It has three parameters, namely c (> 0), α (> 0) and µ. The probability density function (pdf) of this continuous probability distribution is given by the following:

$$f(x; c, \alpha, \mu) = \frac{c}{\alpha}\left(\frac{x-\mu}{\alpha}\right)^{c-1} \exp\left(-\left(\frac{x-\mu}{\alpha}\right)^{c}\right), \quad x > \mu$$

The logistic distribution is a continuous probability distribution with the parameters α and β. It is widely used as a growth function in population and other demographic studies. It can also be regarded as a mixture of extreme value distributions.

The Cauchy distribution is a continuous probability distribution with the parameters l > 0 and µ. Its range is -∞ to +∞. The probability density function (pdf) of this continuous probability distribution is given by the following:

$$f(x) = \frac{l}{\pi\left(l^2 + (x-\mu)^2\right)}, \quad -\infty < x < \infty$$

Friday, June 12, 2009

Statistical Consultants

If you are struggling with statistics, you are most definitely wasting valuable time! This is because there are trained statisticians who can help you with all of your statistical needs. What’s more, because these statistical consultants are trained statisticians, they can perform statistics efficiently and accurately. Thus, there is no need to struggle with statistics when it is cost effective and relatively inexpensive to hire statistical consultants.

Click here for a free 30-minute consultation with the country's leader in statistical consulting.

Statistical consultants are available to help anyone who needs statistical help. And because statistics has become increasingly important in today’s world, the demand for statistical consultants continues to increase. The reason for this is simple: proper statistics increases productivity, efficiency, sales and profit. Any business, then, can benefit from the use of statistics and from the people who can accurately obtain those statistics. Statistical consultants are the perfect fit for businesses as statistical consultants can be hired to analyze the efficiency of a business. Statistical consultants can see what does and does not work in terms of product price, product placement, product location, etc. Additionally, statistical consultants can provide statistics on every single aspect of business.

Statistical consultants can be especially beneficial to small businesses as small businesses often do not have people who are trained in statistics. Instead, small businesses rely on intuition, something that can be very costly for a business owner. Instead of relying on intuition, however, small businesses should seek the help of statistical consultants to provide them with the information that they need to maximize profit.

Just as statistical consultants can help small business owners, statistical consultants can be helpful to students who need to use statistics to write a dissertation. Because not all students are well versed in statistics, statistical consultants can step in and fill in the gaps in terms of what needs to be done to produce accurate statistics. Statistical consultants, then, can gather the data necessary on which to base all of the dissertation statistics. This data collecting must be done very methodically and precisely; otherwise the data will be skewed and the entire dissertation will be void, for it is impossible to have an accurate dissertation without accurate data. Once the data is collected, statistics can be produced based on that data. Statistical consultants can then interpret the data so that the student can apply it properly to his or her dissertation. Statistical consultants, then, can be the missing link in a student’s quest to complete his or her dissertation.

Statistical consultants are essential to anyone needing to work with statistics. Additionally, statistical consultants are cost effective. Compared to the time and money that would be spent attempting to get accurate statistics, hiring a statistical consultant is not expensive.
It is important, however, that the person hiring the statistical consultants choose the right statistical consulting firm. Proper research must be done on the statistical consulting firm before it is hired. This research does not take long and helps ensure that the statistical consultants will provide what the client needs. This research can be as simple as inquiring whether the statistical consultant has a PhD. It is important to hire a statistical firm whose consultants hold a PhD because a PhD knows how to properly obtain and interpret statistics. Additionally, if you are a student and you need to write a dissertation, someone who has been there and done it before (a PhD) can prove invaluable as you face the same challenges and hardships that the PhD has faced in the past. In other words, because the PhD has “been there and done that,” they know the unique circumstances of writing a dissertation. Thus, their help can be all the more precise and informative. For information on Statistics Solutions and the PhD qualifications, click here.

Whatever your statistical needs, statistical consultants can provide easy, effective, and inexpensive solutions to your statistical questions and problems.

Logistic Regression

Logistic regression is an extension of multiple linear regression to the case where the dependent variable is binary in nature. Logistic regression predicts a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or of any other type. Logistic regression is also an alternative to discriminant analysis, which likewise predicts the group membership of the dependent variable. However, discriminant analysis assumes that the predictors are normally distributed, linearly related, and of equal variance across groups, and these assumptions are often not met. Logistic regression makes no assumption about the normal distribution, linear relationship, or equal variance of the predictors (linearity is assumed only on the log-odds scale). As in multiple linear regression, there may be many independent variables.

Statistics Solutions can help with logistic regression and additional dissertation statistics. Click here for a free 30-minute consultation.

The model:

In logistic regression, the dependent variable is dichotomous. It takes the value 1 with the probability of success q, or the value 0 with the probability of failure 1 - q. When there are two dependent variable categories, it is said to be binary logistic regression. When there are more than two dependent variable categories, it is a form of multinomial logistic regression. Symbolically, the probability of the dependent variable can be measured by using the following formula:

$$q = P(Y = 1) = \frac{e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k}}{1 + e^{\alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k}}$$

Where α = the constant of the equation and β_i = the coefficient of the predictor variable x_i. An alternative form of logistic regression can be represented as the following:

$$\operatorname{logit}(q) = \ln\left(\frac{q}{1 - q}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

Logistic regression has two main uses. The first use of logistic regression is that it predicts group membership. Second, logistic regression tells us about the relationship and strengths among the variables.
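
As a minimal sketch of the model described above, the following Python snippet computes the probability of success q from a constant and a set of coefficients, and then recovers the log odds from q. The coefficient values, the predictor values, and the 0.5 cutoff used for group membership are hypothetical and chosen only for illustration.

```python
import math


def logistic_probability(alpha, betas, xs):
    """Return q = P(Y = 1) for constant alpha, coefficients betas, predictors xs."""
    linear = alpha + sum(b * x for b, x in zip(betas, xs))
    return math.exp(linear) / (1.0 + math.exp(linear))


def logit(q):
    """Return the log odds ln(q / (1 - q))."""
    return math.log(q / (1.0 - q))


# Hypothetical constant, coefficients, and predictor values.
alpha = -1.5
betas = [0.8, 0.3]
xs = [2.0, 1.0]

q = logistic_probability(alpha, betas, xs)
predicted_group = 1 if q >= 0.5 else 0  # assumed cutoff of 0.5

print(round(q, 4))         # probability of success
print(round(logit(q), 4))  # equals alpha + sum(beta_i * x_i)
print(predicted_group)
```

Taking the logit of q returns the linear combination of the predictors, which is why the two forms of the model are equivalent.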

Test statistics in logistic regression:

1. Wald statistic: In logistic regression, the Wald statistic is used to test the significance of each variable. The Wald statistic is simply the Z statistic, the estimated coefficient divided by its standard error:

$$Z = \frac{\hat{\beta}}{SE(\hat{\beta})}$$

After squaring, the Z value follows the chi-square distribution with one degree of freedom. In the case of a small sample size, the likelihood ratio test is more suitable than the Wald statistic in logistic regression.
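
The Wald test described above can be sketched in a few lines of Python. The coefficient estimate and standard error below are hypothetical values, not output from a fitted model, and the p-value uses the fact that a chi-square variable with one degree of freedom is a squared standard normal variable.

```python
import math


def wald_test(beta_hat, std_error):
    """Return (Z, Z squared, p-value) for a single coefficient."""
    z = beta_hat / std_error
    wald_chi_square = z ** 2
    # Upper-tail probability of a chi-square with 1 degree of freedom,
    # which equals the two-sided normal tail probability of |Z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, wald_chi_square, p_value


# Hypothetical estimate and standard error.
z, wald_chi_square, p_value = wald_test(beta_hat=0.8, std_error=0.4)

print(z, wald_chi_square, round(p_value, 4))
```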

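2. Likelihood ratio: The likelihood ratio test compares the maximized value of the likelihood function for a reduced model with the maximized value for the full model. Symbolically it is as follows:

$$G = -2 \ln\left(\frac{L_0}{L_1}\right)$$

Where L_0 = the maximized likelihood of the reduced model and L_1 = the maximized likelihood of the full model.

After the log transformation, the likelihood ratio test follows the chi-square distribution. In logistic regression, it is suggested that the likelihood ratio test is used for significance when we are using backward stepwise elimination.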
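
A minimal sketch of the likelihood ratio test follows. It assumes that the reduced model drops exactly one predictor from the full model, so the statistic is compared with a chi-square distribution with one degree of freedom. The two log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Return (G, p-value) for a full model with one extra predictor.

    G = -2 * (ln L_reduced - ln L_full). The p-value formula below is
    valid only for 1 degree of freedom (one predictor dropped).
    """
    g = -2.0 * (loglik_reduced - loglik_full)
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical maximized log-likelihoods of the two nested models.
g, p_value = likelihood_ratio_test(loglik_reduced=-68.3, loglik_full=-65.1)

print(round(g, 2), round(p_value, 4))
```

For nested models the full model never fits worse than the reduced one, so G is not negative. Tests that drop more than one predictor need a chi-square distribution with more degrees of freedom, which this sketch does not cover.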

3. Goodness of fit: In logistic regression, goodness of fit is measured by the Hosmer-Lemeshow test statistic. This statistic compares the observed and predicted numbers of observations to assess how well the model fits.

Logistic regression and statistical software: Most statistical software, such as SPSS, STATA, SAS, and MATLAB, has the option of performing logistic regression. In SAS, the LOGISTIC procedure performs logistic regression. SPSS is GUI software and it has menu options to perform logistic regression. To perform logistic regression in SPSS, open the Analyze menu, choose Regression, and select “Binary Logistic.” If the dependent variable has more than two categories, select “Multinomial Logistic” from the Regression option. If the categories of the dependent variable are ordered, select “Ordinal” from the Regression option. In the dialog box, enter the binary variable as the dependent variable and enter the predictors, whether continuous or dichotomous, as the independent variables. After selecting the dependent and independent variables, select the method for logistic regression. The user can choose between backward and forward stepwise methods.

Thursday, June 11, 2009

Dispersion

In statistics, a measure of central tendency gives a single value that represents the whole data set. But central tendency alone cannot describe the observations fully. A measure of dispersion helps us to study the variability of the items. In a statistical sense, dispersion has two meanings: first, it measures the variation of the items among themselves, and second, it measures the variation around the average. If the differences between the values and the average are large, then dispersion will be high. Otherwise it will be low. According to Dr. Bowley, “dispersion is the measure of the variation between items.” Researchers use measures of dispersion because they determine the reliability of the average. Dispersion also helps a researcher in comparing two or more series. Dispersion is also a building block for many other statistical techniques like correlation, regression, structural equation modeling, etc. Measures of dispersion are of two types. The first is the absolute measure, which expresses dispersion in the same unit as the data. The second is the relative measure, which expresses dispersion as a unit-free ratio. In statistics, there are many techniques that are applied to measure dispersion.

Range: Range is the simplest measure of dispersion. It is defined as the difference between the largest value and the smallest value. Mathematically, the absolute and the relative measure of range can be written as the following:

$$R = L - S$$

$$\text{Coefficient of range} = \frac{L - S}{L + S}$$

Where R = range, L = largest value, S = smallest value
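
As a small illustration of the range, the sketch below computes the absolute measure (largest value minus smallest value) and the relative measure, the coefficient of range (L - S) / (L + S). The data values are hypothetical.

```python
def value_range(values):
    """Absolute measure: largest value minus smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure: (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


marks = [12, 18, 25, 31, 40]  # hypothetical observations

print(value_range(marks))                     # 28
print(round(coefficient_of_range(marks), 4))  # 0.5385
```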

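Quartile deviation: This is a measure of dispersion based on the quartiles. In this method, the difference between the upper quartile and lower quartile is taken and is called the interquartile range. The quartile deviation is half of the interquartile range. Symbolically it is as follows:

$$\text{Interquartile range} = Q_3 - Q_1$$

$$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2}$$

$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Where Q3 = upper quartile, Q1 = lower quartile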
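
The quartile-based measures can be sketched as follows. Several conventions exist for locating quartiles in a finite sample. This example assumes the "inclusive" method of Python's `statistics.quantiles` (available from Python 3.8), so other software may give slightly different quartiles for the same data. The data values are hypothetical.

```python
import statistics

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]  # hypothetical observations

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

interquartile_range = q3 - q1
quartile_deviation = (q3 - q1) / 2
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(q1, q3)               # 6.0 14.0
print(interquartile_range)  # 8.0
print(quartile_deviation)   # 4.0
print(coefficient_of_qd)    # 0.4
```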

Mean Deviation: Mean deviation is a measure of dispersion, which is also known as the average deviation. Mean deviation is the arithmetic mean of the absolute deviations of the items from a measure of central tendency, which may be the mean or the median. Symbolically, mean deviation is defined as the following:

$$\text{M.D. (from the median)} = \frac{\sum |X - M|}{N}$$

$$\text{M.D. (from the mean)} = \frac{\sum |X - \bar{X}|}{N}$$

Where M = median, $\bar{X}$ = mean, N = total number of observations
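
A minimal sketch of the mean deviation, computed once from the mean and once from the median, is shown below. The helper name `mean_deviation` and the data values are hypothetical.

```python
import statistics


def mean_deviation(values, center):
    """Average of the absolute deviations of the values from `center`."""
    return sum(abs(x - center) for x in values) / len(values)


data = [3, 6, 6, 7, 8, 11, 15]  # hypothetical observations

md_from_mean = mean_deviation(data, statistics.mean(data))      # mean is 8
md_from_median = mean_deviation(data, statistics.median(data))  # median is 7

print(round(md_from_mean, 4))    # 2.8571
print(round(md_from_median, 4))  # 2.7143
```

The mean deviation taken from the median is never larger than the one taken from the mean, which the two results illustrate.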

Standard Deviation: Among the measures of dispersion, the standard deviation is the most widely used. It was first used by Karl Pearson in 1893. Standard deviation is also known as root mean square deviation. Symbolically it is as follows:

$$\sigma = \sqrt{\frac{\sum d^2}{N}}$$

Where d = $X - \bar{X}$ = deviation of each item from the mean, σ = standard deviation, N = total number of observations.

Variance: Variance is another measure of dispersion. The term variance was first used in 1918 by R. A. Fisher. Variance is the square of the standard deviation. Symbolically, variance can be written as the following:

$$\text{Variance} = (S.D.)^2 = \sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$

If we know the standard deviation, then we can compute the variance by squaring it. If we have the variance, then we can also compute the standard deviation by using the following formula:

$$\sigma = \sqrt{\text{Variance}}$$
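
The relationship between variance and standard deviation can be sketched as follows. The snippet divides by N, so it computes the population versions of both measures. Sample versions divide by N - 1 instead. The data values are hypothetical, and the standard library functions `statistics.pvariance` and `statistics.pstdev` are used only as a cross-check.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mean = sum(data) / len(data)
deviations = [x - mean for x in data]

# Variance: mean of the squared deviations from the mean.
variance = sum(d ** 2 for d in deviations) / len(data)

# Standard deviation: square root of the variance (root mean square deviation).
std_dev = math.sqrt(variance)

print(variance)  # 4.0
print(std_dev)   # 2.0

# Cross-check against the standard library's population formulas.
print(math.isclose(variance, statistics.pvariance(data)))  # True
print(math.isclose(std_dev, statistics.pstdev(data)))      # True
```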

Standard deviation has some mathematical properties. They are as follows:

1. Standard deviation of the first n natural numbers can be found by using the following formula:

$$\sigma = \sqrt{\frac{n^2 - 1}{12}}$$

2. The sum of the squared deviations is minimal when the deviations are taken from the arithmetic mean.

3. In a symmetrical (normal) distribution, standard deviation has the following relationship with the mean:

$\bar{X} \pm 1\sigma$ includes 68.27% of the items

$\bar{X} \pm 2\sigma$ includes 95.45% of the items

$\bar{X} \pm 3\sigma$ includes 99.73% of the items.
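
The three properties above can be checked numerically with a short sketch. The helper names and data values are hypothetical. The coverage figures assume a normal distribution and are computed with the error function from Python's `math` module.

```python
import math
import statistics


def sd_first_n_naturals(n):
    """Standard deviation of 1, 2, ..., n from the closed-form formula."""
    return math.sqrt((n ** 2 - 1) / 12)


def sum_squared_deviations(values, center):
    """Sum of squared deviations of the values from `center`."""
    return sum((x - center) ** 2 for x in values)


def normal_coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Property 1: the formula agrees with a direct calculation for n = 10.
n = 10
formula_sd = sd_first_n_naturals(n)
direct_sd = statistics.pstdev(list(range(1, n + 1)))
print(math.isclose(formula_sd, direct_sd))  # True

# Property 2: the sum of squares is smallest around the mean (here 8).
data = [3, 6, 6, 7, 8, 11, 15]  # hypothetical observations
print(sum_squared_deviations(data, 8))  # 92
print(sum_squared_deviations(data, 7))  # 99
print(sum_squared_deviations(data, 9))  # 99

# Property 3: coverage within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(100 * normal_coverage(k), 2))
```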

Coefficient of variation: Coefficient of variation is the relative measure of dispersion. This method was developed by Karl Pearson. Coefficient of variation is used to compare the dispersion of two or more series. Coefficient of variation can be calculated by using the following formula:

$$C.V. = \frac{\sigma}{\bar{X}} \times 100$$

Where C.V. = coefficient of variation, σ = standard deviation and $\bar{X}$ = mean.
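
As a minimal sketch, the coefficient of variation can be used to compare two series that have different means. Both series below are hypothetical, and the population standard deviation (dividing by N) is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Return the standard deviation as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.mean(values) * 100


series_a = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
series_b = [10, 12, 14, 16, 18]      # mean 14, standard deviation about 2.83

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(round(cv_a, 2))  # 40.0
print(round(cv_b, 2))  # 20.2
```

Series B has the larger standard deviation, but relative to its mean it is the less variable of the two series, which is the comparison the coefficient of variation is designed to make.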