MCom I Semester Statistical Analysis Test Significance Large Samples Study Material Notes



MCom I Semester Statistical Analysis Test Significance Large Samples Study Material Notes: Parameter and Statistic, Objects of Sampling Theory, General Procedure of Significance Testing, Utility of Standard Error, Test of Significance, Simple Sampling of Variables, Assumptions of the Test of Significance in Large Samples, Standard Error of Mean.

Test Significance Large Samples


Test of Significance: Large Samples

The need for adequate and reliable data is ever-increasing for taking decisions in different fields of human activity, and business is no exception. There are two ways in which the required information may be obtained:

(i) Census Method;

(ii) Sampling Method


In the census method, data are collected for each and every unit of the population or universe. On the other hand, in a sample enquiry only a selected number of units are observed, and conclusions are drawn about the universe from their study. In other words, under the sampling method, out of many items we pick a few items, analyse them and draw inferences which can be applied to the universe as a whole. In the sample survey method, the enquiry is based on a small number of items chosen from a large number of items, which can yield a fairly reliable estimate of the characteristics of the population.

The use of sampling in making inference about the aggregate (population) is possibly as old as civilization itself.

It is very frequently used in practice by almost all of us. We taste only a grape or two at the fruit dealer's shop, and on that basis decide whether or not to buy the bunch. Similarly, a person, after reading the first few pages of a book, concludes whether or not he wishes to read the entire book. People give only a sample of their blood to a physician for testing purposes. Similarly, they test the temperature of the water with their toes before plunging into the ocean. All this is done on the assumption that the sample will provide a good approximation to the population parameters.

The most important aim of a sampling study is to obtain maximum information about the phenomenon under study with the least expenditure of money, time and energy. The aim of a sampling study is to obtain the best possible values of the parameters. The word parameter is used to indicate the various statistical measures, like the mean, standard deviation, correlation and coefficient of variation, occurring in the universe. Hence, sampling theory is used to arrive at concrete results which can be applied to the universe.


Explanation of the following terminology is essential to understand the sampling theory :

Universe and Sample: In statistics, the universe or population is the aggregate of objects, animate or inanimate, under study in any statistical investigation. In other words, a universe is the complete group of items about which knowledge is sought. On the other hand, the term sample refers to that part of the universe which is selected for the purpose of investigation. The theory of sampling studies the relationships that exist between the universe and the sample or samples drawn from it. For example, if an enquiry is intended to determine the average per capita income of the people in a particular city, the universe (population) will comprise all the earning people in that city, and a selection of some earning people for the study would be called a sample.

Types of Universe: The statistical population can be studied under the following heads :

(1) Finite and Infinite Universe: A universe (population) can be either finite or infinite. When the number of observations can be counted and is definite, it is known as a finite universe. For example, if we study the economic background of the students of Meerut College, all the students belonging to that college will constitute the universe and their number will be finite. That is, a finite universe is one in which all members can be counted. When the number of observations cannot be counted and is unlimited, it is known as an infinite universe. For example, the number of stars in the sky, the leaves on a tree, etc.

(2) Hypothetical and Existent Universe: A universe can be classified as existent (real) or hypothetical. A universe containing persons or concrete objects is known as an existent or real population. That is, a real population is one that actually exists. For example, the number of students in a college, the population of a city, the employees of a factory, etc. In all these cases, the existent universe refers to a population of concrete objects. As against this, a hypothetical universe, also known as a theoretical population, is one which does not consist of concrete objects. That is, a hypothetical population exists only in the imagination; we cannot count its items. Tossing a coin or throwing a die are examples of a hypothetical universe.

Parameter and Statistic

It would be appropriate to explain the meaning of two terms, viz., parameter and statistic. All the statistical measures based on all items of the universe are termed parameters, whereas statistical measures worked out on the basis of sample studies are termed sample statistics. Thus, a sample mean or a sample standard deviation is an example of a statistic, whereas the universe mean or the universe standard deviation is an example of a parameter. Obviously, parameters are functions of the population values while statistics are functions of the sample observations.
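The distinction between a parameter and a statistic can be illustrated with a short Python sketch (the universe of incomes below is invented purely for illustration):

```python
import random
import statistics

# Hypothetical universe: monthly incomes (in Rs.) of 1,000 earners.
random.seed(42)
universe = [random.gauss(5500, 800) for _ in range(1000)]

# Parameter: a measure computed from EVERY item of the universe.
population_mean = statistics.mean(universe)

# Statistic: the same measure computed from a sample of, say, 50 items.
sample = random.sample(universe, 50)
sample_mean = statistics.mean(sample)

print(round(population_mean, 2), round(sample_mean, 2))
```

Every fresh sample would yield a slightly different sample mean, while the population mean stays fixed; that is precisely the parameter-versus-statistic distinction drawn above.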

Objects of Sampling Theory

The main problem of sampling theory is the problem of the relationship between a parameter and a statistic. The theory of sampling is concerned with estimating the properties of the population from those of the sample, and also with gauging the precision of the estimate. This sort of movement from the particular (sample) towards the general (universe) is what is known as statistical induction or statistical inference. In brief, the main objectives of sampling theory are as follows:

(1) Estimation of Parameters: To obtain the estimate of a parameter from a statistic is the main objective of the sampling theory. In other words, sampling theory helps in estimating unknown population quantities, called parameters, with the help of statistical measures based on sample studies, often called statistics.

When estimating parameters of the population, the following two types of the estimates are possible :

(a) Point estimate;

(b) Interval estimate

(a) Point Estimate: An estimate by a single value of a statistic, used to approximate the parameter of an unknown population, is known as a point estimate, and the statistic is called an estimator of the parameter. For example, the sample mean, used for estimating the population mean, is an estimator.

(b) Interval Estimate: Interval estimation is a technique under which two values are estimated for the parameter, providing the limits within which the actual value lies. Of the two values, one expresses the lower limit and the other the upper limit within which the actual value falls. The limits within which a parameter value is estimated are called the fiducial limits, or confidence interval, or confidence limits. These limits vary with the degree of precision which is desired to be achieved. For example, if we ask a person about his monthly income from business, he may say that his income ranges between Rs. 5,000 and Rs. 6,000. That is, he has given an interval estimate by providing a lower limit and an upper limit.
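As a sketch of the idea, the following Python snippet computes a point estimate and a 95% interval estimate of a population mean from a hypothetical large sample (the income figures are invented):

```python
import math
import statistics

# A hypothetical large sample of 64 monthly incomes (Rs.); illustrative values.
sample = [5000 + 25 * i for i in range(64)]
n = len(sample)
x_bar = statistics.mean(sample)        # point estimate of the population mean
s = statistics.stdev(sample)

# Standard error of the mean for a large sample: S.E. = s / sqrt(n)
se = s / math.sqrt(n)

# 95% fiducial (confidence) limits: x_bar +/- 1.96 * S.E.
lower, upper = x_bar - 1.96 * se, x_bar + 1.96 * se
print(f"Point estimate: {x_bar:.2f}")
print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")
```

The point estimate names a single value, while the interval estimate supplies the lower and upper limits described above.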

(2) Testing of Hypothesis: As already discussed, sampling studies are conducted to obtain the estimates of parameter value. The main object of the sampling theory is the study of the tests of hypothesis or test of significance. Hypothesis is an assumption which may or may not be true about a population parameter. In other words, the formation of criteria that are to be used in constructing a test for any given hypothesis is the subject matter of the theory of testing of hypothesis. In all such cases where we have to find out whether the difference between statistics and parameters or the difference between two given values by two samples drawn from the same universe is significant or not, we formulate a hypothesis and then its validity is tested. In brief, the sampling theory helps in determining whether observed differences are actually due to chance or whether they are really significant.

General Procedure of Significance Testing

The following sequential steps constitute, in general, the procedure of Significance Testing:

(1) Statement of the Problem: First of all, the problem has to be stated in clear terms. It should be quite clear as to in respect of what the statistical decision has to be taken. The problem may be : whether the hypothesis is to be rejected or accepted ? Is the difference between a parameter and a statistic significant ? or the like ones.

(2) Laying Down the Hypothesis: For applying any test of significance we set up a hypothesis-a definite statement about the population parameter. This hypothesis mainly may be of the following two types :


(a) Null Hypothesis: According to R.A. Fisher, "a hypothesis which is tested for possible rejection under the assumption that it is true is called the null hypothesis". The null hypothesis asserts that there is no significant difference between the statistic and the population parameter, and that whatever difference is observed is merely due to fluctuations of sampling from the same population, and not to a difference in the population itself. A statistical hypothesis which is stated for the purpose of possible acceptance is called a null hypothesis. It is usually denoted by H₀.

For example, if we want to test whether the sample mean differs from the parameter, we set up the hypothesis that there is no significant difference, and we write H₀ : x̄ = μ.

(b) Alternative Hypothesis: Any hypothesis which is complementary to the null hypothesis is called an alternative hypothesis. It is usually denoted by H₁. It is very important to state explicitly the alternative hypothesis in respect of any null hypothesis H₀, because the acceptance or rejection of H₀ is meaningful only if it is tested against a rival hypothesis. For example, if we want to test the null hypothesis that the population has a specified mean μ₀ (say), i.e., H₀ : μ = μ₀, the alternative hypothesis could be:

(i) H₁ : μ ≠ μ₀ (i.e., μ > μ₀ or μ < μ₀);

(ii) H₁ : μ > μ₀;

(iii) H₁ : μ < μ₀.

The alternative hypothesis in (i) is known as a two-tailed alternative, and the alternatives in (ii) and (iii) are known as right-tailed and left-tailed alternatives. Accordingly, the corresponding tests of significance are called two-tailed, right-tailed and left-tailed tests respectively.
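The critical values that separate these cases come from the standard normal distribution (the large-sample reference distribution). The short Python sketch below shows why a two-tailed test at the 5% level uses 1.96 while a one-tailed test uses about 1.645:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution
alpha = 0.05

# Two-tailed test: alpha is split equally across both tails.
two_tailed = z.inv_cdf(1 - alpha / 2)   # critical value near 1.96

# One-tailed (right- or left-tailed) test: all of alpha sits in one tail.
one_tailed = z.inv_cdf(1 - alpha)       # critical value near 1.645

print(round(two_tailed, 3), round(one_tailed, 3))
```

Splitting α across two tails pushes each rejection boundary further out, which is why the two-tailed critical value is the larger of the two.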

Only one alternative hypothesis can be tested at one time against the null hypothesis. If the sample information leads us to reject H₀, then we accept the alternative hypothesis H₁. Thus, the two hypotheses are constructed so that if one is true, the other is false, and vice versa.

Difference Between Null Hypothesis and Alternative Hypothesis: As against the null hypothesis, the alternative hypothesis specifies those values that the researcher believes to hold true, and, of course, he hopes that the sample data lead to acceptance of this hypothesis as true. The alternative hypothesis may embrace a whole range of values rather than a single point. Nowadays, it is accepted common practice not to attach any special meaning to the null or alternative hypothesis but merely to let these terms represent two different assumptions about the population parameter. However, for statistical convenience it does make a difference which hypothesis is called the null hypothesis and which the alternative.

Errors of Sampling (Type I and Type II Errors): When a statistical hypothesis is tested, we may face the following four possibilities:

(1) The null hypothesis (H₀) is true but it is rejected by the test procedure. This is a Type I error, and its probability is represented by α.

(2) The null hypothesis is false but it is accepted. This is a Type II error, and its probability is represented by β.

(3) The null hypothesis is true and it is accepted; this is a correct decision.

(4) The null hypothesis is false and it is rejected; this is a correct decision. These four possibilities are expressed in the form of a table below.

Table : Decisions for a Sample

Actual          H₀ Accepted          H₀ Rejected
H₀ is true      Correct Decision     Type I Error
H₀ is false     Type II Error        Correct Decision

The choice of the test for H₀ should sensibly be made by duly taking into account the errors that one may commit while using any test. In other words, a good test should properly keep both types of errors under control. However, noting that the occurrence of an error of either type is a random event, we should modify our statement and say that a good test should keep under control the probabilities of both types of errors.

(3) Level of Significance: The maximum probability of making a Type I error specified in a test of hypothesis is called the level of significance. The commonly used levels of significance are 5% (0.05) and 1% (0.01). If we adopt the 5% level of significance, it implies that in 5 cases out of 100 we are likely to reject a correct H₀. In other words, we are 95% confident that our decision to reject H₀ is correct. The desired level of significance is always fixed in advance, before applying the test.
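The meaning of the 5% level can be checked by simulation. The sketch below (with invented population figures) repeatedly tests a true null hypothesis and counts how often it is wrongly rejected; the proportion should hover near 0.05:

```python
import random
import statistics

# Simulate repeated tests of a TRUE null hypothesis (mu = 100) and count
# how often it is wrongly rejected at the 5% level -- the Type I error rate.
random.seed(1)
MU, SIGMA, N, TRIALS = 100, 15, 50, 2000
se = SIGMA / N ** 0.5                  # standard error of the mean
rejections = 0

for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (statistics.mean(sample) - MU) / se
    if abs(z) > 1.96:                  # two-tailed test at the 5% level
        rejections += 1

print(rejections / TRIALS)             # proportion of Type I errors
```

Fixing the level of significance in advance fixes this long-run rate of wrongly rejecting a correct H₀.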

(4) Computation of the Standard Error: The standard error (S.E.) is used as a tool in testing hypotheses, i.e., in tests of significance. The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. Although the concepts of S.E. and S.D. are often used interchangeably, the two differ from each other: the standard deviation is concerned with the original values, while the S.E. is concerned with a statistic computed from samples of those original values. The S.E. measures the sampling variability due to chance or random forces. According to Wessel and Willett, the standard deviation applies to the distribution of items around their average, while the S.E. of the mean applies to the distribution of the averages of samples around the true average of the universe.

For example, if from a large universe a number of random samples are taken and their means are obtained, the frequency distribution of these means is called the sampling distribution of the mean. Suppose we draw 100 random samples from a given universe and compute their means; there will then be a series of 100 means, the frequency distribution of which is called the sampling distribution of the means. The mean of the sample means approximates the mean of the universe, and the standard deviation of the sample means is called the standard error of the mean.
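This description of the sampling distribution of the mean can be verified with a small simulation (the universe parameters are invented for illustration); the standard deviation of the 100 sample means should come close to σ/√n:

```python
import random
import statistics

# Draw 100 random samples of size n from a universe and form the
# sampling distribution of their means, as described above.
random.seed(7)
universe_mu, universe_sigma, n = 50, 10, 100

sample_means = [
    statistics.mean(random.gauss(universe_mu, universe_sigma) for _ in range(n))
    for _ in range(100)
]

# The mean of the sample means approximates the universe mean ...
print(round(statistics.mean(sample_means), 2))
# ... and their standard deviation (the standard error of the mean)
# approximates sigma / sqrt(n) = 10 / 10 = 1.
print(round(statistics.stdev(sample_means), 2))
```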

After determining the level of significance, the standard error of the statistic concerned (mean, standard deviation or any other measure) is computed. There are different formulae for computing the standard errors of different statistics; for instance, the standard error of the mean of a large sample is S.E. (x̄) = σ/√n, where σ is the standard deviation of the universe and n is the sample size.


Utility of Standard Error

The utility of the concept of standard error in statistical induction arises on account of the following reasons :

(1) Measure of the Reliability of a Sample: The standard error gives an idea of the reliability and precision of a sample. If the relationship between the standard deviation and the sample size is kept in view, one finds that the standard error is smaller than the standard deviation. The smaller the S.E., the greater the uniformity of the sampling distribution, and hence the greater the reliability of the sample. Conversely, the greater the S.E., the greater the difference between observed and expected frequencies, and in such a situation the unreliability of the sample is greater.

(2) Helpful in Tests of Significance: The standard error helps in testing whether the difference between observed and expected frequencies could arise due to chance. The criterion usually adopted is that if a difference is up to 3 times the S.E., the difference is supposed to exist as a matter of chance, and if the difference is more than 3 times the S.E., chance fails to account for it, and we conclude the difference to be significant. This criterion is based on the fact that within ±3 S.E. the normal curve covers an area of 99.73 per cent. Sometimes the criterion of 2 S.E. is also used in place of 3 S.E. Thus, the standard error is an important measure in significance tests or in examining hypotheses. If the estimated parameter differs from the calculated statistic by more than 1.96 times the S.E., the difference is taken as significant at the 5 per cent level of significance.

(3) Determination of Confidence Limits: The standard error enables us to specify the limits, maximum and minimum, within which the parameters of the population are expected to lie with a specified degree of confidence. Such an interval is usually known as a confidence interval. The degree of confidence with which it can be asserted that a particular value of the population lies within certain limits is known as the level of confidence. The following figures give the percentage of samples having their mean values within a range of the population mean ± a given multiple of the standard error: μ ± 1 S.E. covers 68.27% of sample means, μ ± 1.96 S.E. covers 95%, μ ± 2 S.E. covers 95.45%, μ ± 2.58 S.E. covers 99%, and μ ± 3 S.E. covers 99.73%.

(5) Calculation of the Significance Ratio: The significance ratio, symbolically described as z, t, F, etc., depending on the test we use, is calculated by dividing the difference between a parameter and a statistic by the standard error concerned. For example, the significance ratio for the arithmetic mean is z = (x̄ − μ) / S.E. (x̄).

(6) Interpretation: The significance ratio is then compared with the pre-determined critical value. If the ratio exceeds the critical value then the difference is taken as significant but if the ratio is less than the critical value, the difference is considered insignificant. For instance, the critical value at 5 percent level of significance is 1.96. If the computed value exceeds 1.96 then the inference would be that the difference at 5 percent level is significant and this difference is not the result of sampling fluctuations but the difference is a real one and should be understood as such.
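The whole procedure, computing the significance ratio and comparing it with the critical value, can be sketched as follows (the 32 sample observations are invented for illustration):

```python
import math
import statistics

# Worked sketch: compute z = (sample mean - assumed population mean) / S.E.
# and compare it with the 5% critical value 1.96.
sample = [68, 72, 65, 70, 74, 69, 71, 66, 73, 70,
          67, 72, 68, 75, 69, 71, 70, 66, 72, 68,
          74, 69, 70, 67, 73, 71, 68, 72, 70, 69,
          66, 74]                      # n = 32 (> 30, so a "large" sample)
mu_0 = 68                              # hypothesised population mean (H0)
n = len(sample)
x_bar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)   # S.E. of the mean

z = (x_bar - mu_0) / se                # the significance ratio
if abs(z) > 1.96:
    print(f"z = {z:.2f}: significant at the 5% level, reject H0")
else:
    print(f"z = {z:.2f}: not significant at the 5% level, accept H0")
```

With these illustrative figures the ratio exceeds the critical value, so the difference would be treated as real rather than as a sampling fluctuation.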

Thus, from the above procedure it is clear that the standard error plays an important part in hypothesis testing and in significance tests. It enables us to take rational decisions and is considered a basic concept of the sampling theory.

Test of Significance

The theory of sampling can be studied under two heads, viz., the sampling of variables and the sampling of attributes. Accordingly, we shall study the tests of significance under the following categories:

(1) Tests of significance in respect of samples concerning statistics of variables:

(a) Concerning large samples;

(b) Concerning small samples.

Though it is very difficult to draw a clear-cut line of demarcation between large and small samples, it is normally agreed amongst statisticians that a sample is to be considered large only if its size exceeds 30.

(2) Sampling of attributes.

Test of Significance in Variables for Large Samples

Here we will discuss the problems of sampling of variables such as height, weight, age etc., which may take any value. Hence, each individual of the population provides a value of the variable and the population is a frequency distribution of the variable. From the population a random sample can be drawn and the statistic is calculated.

Objects of the Test of Significance in Variables

There are three main objects in studying problems relating to a sampling of variables :

(i) To estimate from samples some characteristics of the parent population such as mean, standard deviation etc.;

(ii) To compare the observed and expected values and to see how far the deviation of one from the other can be attributed to fluctuations of sampling;

(iii) To assess the reliability of our estimates.

Simple Sampling of Variables

It should be remembered that we shall study the sampling of variables under simple sampling conditions only. The fulfilment of the following conditions is necessary for simple sampling of variables:

(1) That we are drawing our samples from precisely the same universe;

(2) That each member of our sample at each draw is drawn from the same universe; and

(3) That the drawing of each member of the sample is independent of the draws of all other members.

Assumptions of the Test of Significance in Large Samples

The tests of significance used for dealing with problems relating to large samples are different from the ones used for small samples for the reason that the assumptions that we make in case of large samples do not hold good for small samples. The following assumptions are made while studying problems relating to large samples :

(1) The random sampling distribution of a statistic is approximately normal.

(2) Sample values are sufficiently close to the population values and can be used for the calculation of the standard error of the estimate.


chetansati

Admin

https://gurujionlinestudy.com
