mb0034 set 2

510910259 Rakesh Kumar Singh


Name : RAKESH KUMAR SINGH

Roll No. : 510910259

Learning Centre : Systems Domain (2779)

Subject : Research Methodology

Assignment No. : Set – II (MB0034)

Date of Submission : 2010



MBA Semester 3

MB0034– Research Methodology Assignment Set- 2

1. Write short notes on the following: (a) Null Hypothesis (b) What is exploratory research? (c) What is Random Sampling? (d) Rank Order Correlation

Ans: (a) A null hypothesis is a hypothesis (within the frequentist context of statistical hypothesis testing) that might be falsified using a test of observed data. Such a test works by formulating a null hypothesis, collecting data, and calculating a measure of how probable that data was assuming the null hypothesis were true. If the data appears very improbable (usually defined as a type of data that should be observed less than 5% of the time) then the experimenter concludes that the null hypothesis is false. If the data looks reasonable under the null hypothesis, then no conclusion is made. In this case, the null hypothesis could be true, or it could still be false; the data gives insufficient evidence to make any conclusion. The null hypothesis typically proposes a general or default position, such as that there is no relationship between two quantities, or that there is no difference between a treatment and the control. The term was originally coined by English geneticist and statistician Ronald Fisher.

In some versions of statistical hypothesis testing (such as developed by Jerzy Neyman and Egon Pearson), the null hypothesis is tested against an alternative hypothesis. This alternative may or may not be the logical negation of the null hypothesis. The use of alternative hypotheses was not part of Ronald Fisher's formulation of statistical hypothesis testing, though alternative hypotheses are standardly used today.
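The testing procedure described above can be illustrated with a minimal sketch. It assumes a made-up experiment (a coin flipped 20 times that lands heads 16 times) and a null hypothesis that the coin is fair; the function name `binomial_two_sided_p` and the data are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value: the total probability, under the null
    hypothesis, of every outcome that is no more likely than the observed one."""

    def pmf(k: int) -> float:
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))


ALPHA = 0.05  # the conventional 5% threshold mentioned in the text

# Hypothetical data: 16 heads in 20 flips; null hypothesis: the coin is fair.
p_value = binomial_two_sided_p(successes=16, trials=20)

if p_value < ALPHA:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis (no conclusion is made)"

print(f"p-value = {p_value:.4f}: {decision}")
```

Note that a large p-value does not show the null hypothesis to be true; it only means the data give insufficient evidence against it.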

(b) Exploratory research provides insights into and comprehension of an issue or situation. It should draw definitive conclusions only with extreme caution. Exploratory research is a type of research conducted because a problem has not been clearly defined. Exploratory research helps determine the best research design, data collection method and selection of subjects. Given its fundamental nature, exploratory research often concludes that a perceived problem does not actually exist.

Exploratory research often relies on secondary research such as reviewing available literature and/or data, or qualitative approaches such as informal discussions with consumers, employees, management or competitors, and more formal approaches through in-depth interviews, focus groups, projective methods, case studies or pilot studies. The Internet allows for research methods that are more interactive in nature: for example, RSS feeds efficiently supply researchers with up-to-date information; major search engine results may be sent by email to researchers by services such as Google Alerts; comprehensive search results are tracked over lengthy periods of time by services such as Google Trends; and Web sites may be created to attract worldwide feedback on any subject.

The results of exploratory research are not usually useful for decision-making by themselves, but they can provide significant insight into a given situation. Although the results of qualitative research can give some indication as to "why", "how" and "when" something occurs, they cannot tell us "how often" or "how many."



Exploratory research is not typically generalizable to the population at large.

(c) A sample is a subset chosen from a population for investigation. A random sample is one chosen by a method involving an unpredictable component. Random sampling can also refer to taking a number of independent observations from the same probability distribution, without involving any real population. A probability sample is one in which each item has a known probability of being in the sample.

The sample usually will not be completely representative of the population from which it was drawn— this random variation in the results is known as sampling error. In the case of random samples, mathematical theory is available to assess the sampling error. Thus, estimates obtained from random samples can be accompanied by measures of the uncertainty associated with the estimate. This can take the form of a standard error, or if the sample is large enough for the central limit theorem to take effect, confidence intervals may be calculated.
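As a rough illustration of these ideas, the sketch below draws a simple random sample from a hypothetical population and attaches a standard error and an approximate 95% confidence interval to the estimate of the mean. The population values, the sample size of 200 and the fixed seed are all assumptions made for the example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 made-up measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Simple random sample without replacement: every item has the same
# known probability (n / N) of being included in the sample.
sample = random.sample(population, k=200)

sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# Approximate 95% confidence interval, relying on the central limit theorem.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

# Sampling error: how far this particular sample's mean is from the population mean.
sampling_error = sample_mean - statistics.mean(population)

print(f"sample mean = {sample_mean:.2f}, standard error = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f}), sampling error = {sampling_error:+.2f}")
```

In practice the population mean is unknown, so the sampling error cannot be computed directly; the standard error and the confidence interval are what quantify the uncertainty.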

(d) When we are dealing with data at the ordinal level, such as ranks, we must use a measure of correlation that is designed to handle ordinal data. The Spearman Rank Order Correlation Coefficient was developed by Spearman for use with this type of data. The symbol for the Spearman Rank Order Correlation Coefficient is $r_s$ (r sub s), or the Greek letter rho ($\rho$).

The formula for the Spearman Correlation Coefficient is:

$$r_s = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$

where 6 is a constant (it is always used in the formula), D refers to the difference between a subject's ranks on the two variables, and N is the number of subjects.
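The calculation can be sketched in a few lines of Python. The function below applies the standard formula r_s = 1 - 6ΣD² / (N(N² - 1)), which assumes there are no tied ranks; the function name and the two sets of ranks are hypothetical.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """Spearman's r_s = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), for ranks without ties."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both variables need exactly one rank per subject")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("at least two subjects are required")
    # D is the difference between each subject's ranks on the two variables.
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n**2 - 1))


# Hypothetical ranks given to five subjects on two variables.
ranks_first = [1, 2, 3, 4, 5]
ranks_second = [2, 1, 4, 3, 5]

r_s = spearman_rank_correlation(ranks_first, ranks_second)
print(f"r_s = {r_s:.2f}")
```

A value near +1 indicates that the two rankings largely agree, a value near -1 that they are reversed, and a value near 0 that they are unrelated.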

2. Elaborate the format of a research report touching briefly on the mechanics of writing.

Ans: The format of a research report is given below:

I. Prefatory Items

Title page
Declaration
Certificates
Preface/acknowledgements
Table of contents
List of tables
List of graphs/figures/charts
Abstract or synopsis

II. Body of the Report



Introduction
Theoretical background of the topic
Statement of the problem
Review of literature
The scope of the study
The objectives of the study
Hypothesis to be tested
Definition of the concepts
Models, if any
Design of the study
Methodology
Method of data collection
Sources of data
Sampling plan
Data collection instruments
Field work
Data processing and analysis plan
Overview of the report
Limitation of the study
Results: findings and discussions
Summary, conclusions and recommendations

III. Reference Material

Bibliography
Appendix
Copies of data collection instruments
Technical details on sampling plan
Complex tables
Glossary of new terms used

A research report is a means of communicating research experience to others. Its purpose is to communicate to interested persons the methodology and the results of the study in such a manner as to enable them to understand the research process and to determine its validity. A research report is a narrative and authoritative document on the outcome of a research effort. It presents highly specific information for a clearly designated audience. It serves as a means for presenting the problem studied, the methods and techniques used for collecting and analyzing data, and the findings, conclusions and recommendations. It serves as basic reference material for future use. It is a means for judging the quality of a research project and for evaluating the researcher’s competency. It provides systematic knowledge on the problems and issues analyzed.

Research reports take several forms. A technical report is a comprehensive, full report of the research process and its outcome; it covers all aspects of the research process. In a popular report, the reader is less interested in the methodological details and more interested in the findings of the study. An interim report narrates what has been done so far and what its outcome was; it presents a summary of the findings of that part of the analysis which has been completed. A summary report is meant for a lay audience, i.e., the general public. It is written in non-technical, simple language with pictorial charts, and contains just the objectives, the findings and their implications. It is a short report of two to three pages. A research abstract is a short summary of a technical report. It is prepared by a doctoral student on the eve of submitting his or her thesis. A research article is designed for publication in a professional journal. It must be clearly written in concise and unambiguous language.



3. Discuss the importance of case study method.

Ans: Case study research excels at bringing us to an understanding of a complex issue or object and can extend experience or add strength to what is already known through previous research. Case studies emphasize detailed contextual analysis of a limited number of events or conditions and their relationships. Researchers have used the case study research method for many years across a variety of disciplines. Social scientists, in particular, have made wide use of this qualitative research method to examine contemporary real-life situations and provide the basis for the application of ideas and extension of methods. Researcher Robert K. Yin defines the case study research method as an empirical inquiry that investigates a contemporary phenomenon within its real-life context; when the boundaries between phenomenon and context are not clearly evident; and in which multiple sources of evidence are used (Yin, 1984, p. 23).

Critics of the case study method believe that the study of a small number of cases can offer no grounds for establishing reliability or generality of findings. Others feel that the intense exposure to study of the case biases the findings. Some dismiss case study research as useful only as an exploratory tool. Yet researchers continue to use the case study research method with success in carefully planned and crafted studies of real-life situations, issues, and problems. Reports on case studies from many disciplines are widely available in the literature.

This paper explains how to use the case study method and then applies the method to an example case study project designed to examine how one set of users, non-profit organizations, make use of an electronic community network. The study examines the issue of whether or not the electronic community network is beneficial in some way to non-profit organizations and what those benefits might be.

Many well-known case study researchers such as Robert E. Stake, Helen Simons, and Robert K. Yin have written about case study research and suggested techniques for organizing and conducting the research successfully. This introduction to case study research draws upon their work and proposes six steps that should be used:

1. Determine and define the research questions
2. Select the cases and determine data gathering and analysis techniques
3. Prepare to collect the data
4. Collect data in the field
5. Evaluate and analyze the data
6. Prepare the report

Step 1. Determine and Define the Research Questions

The first step in case study research is to establish a firm research focus to which the researcher can refer over the course of study of a complex phenomenon or object. The researcher establishes the focus of the study by forming questions about the situation or problem to be studied and determining a purpose for the study. The research object in a case study is often a program, an entity, a person, or a group of people. Each object is likely to be intricately connected to political, social, historical, and personal issues, providing wide ranging possibilities for questions and adding complexity to the case study. The researcher investigates the object of the case study in depth using a variety of data gathering methods to produce evidence that leads to understanding of the case and answers the research questions. Case study research generally answers one or more questions which begin with "how" or "why." The questions are targeted to a limited number of events or conditions and their inter-relationships. To assist in targeting and formulating the questions, researchers conduct a literature review. This review establishes what research has been previously conducted and leads to refined, insightful questions about the problem. Careful definition of the questions at the start pinpoints where to look for evidence and helps determine the methods of analysis to be used in the study. The literature review, definition of the purpose of the case study, and early determination of the potential audience for the final report guide how the study will be designed, conducted, and publicly reported.

Step 2. Select the Cases and Determine Data Gathering and Analysis Techniques

During the design phase of case study research, the researcher determines what approaches to use in selecting single or multiple real-life cases to examine in depth and which instruments and data gathering approaches to use. When using multiple cases, each case is treated as a single case. Each case's conclusions can then be used as information contributing to the whole study, but each case remains a single case. Exemplary case studies carefully select cases and carefully examine the choices available from among many research tools available in order to increase the validity of the study. Careful discrimination at the point of selection also helps erect boundaries around the case. The researcher must determine whether to study cases which are unique in some way or cases which are considered typical and may also select cases to represent a variety of geographic regions, a variety of size parameters, or other parameters. A useful step in the selection process is to repeatedly refer back to the purpose of the study in order to focus attention on where to look for cases and evidence that will satisfy the purpose of the study and answer the research questions posed. Selecting multiple or single cases is a key element, but a case study can include more than one unit of embedded analysis.
For example, a case study may involve study of a single industry and a firm participating in that industry. This type of case study involves two levels of analysis and increases the complexity and amount of data to be gathered and analyzed.

A key strength of the case study method involves using multiple sources and techniques in the data gathering process. The researcher determines in advance what evidence to gather and what analysis techniques to use with the data to answer the research questions. Data gathered is normally largely qualitative, but it may also be quantitative. Tools to collect data can include surveys, interviews, documentation review, observation, and even the collection of physical artefacts. The researcher must use the designated data gathering tools systematically and properly in collecting the evidence.

Throughout the design phase, researchers must ensure that the study is well constructed to ensure construct validity, internal validity, external validity, and reliability. Construct validity requires the researcher to use the correct measures for the concepts being studied. Internal validity (especially important with explanatory or causal studies) demonstrates that certain conditions lead to other conditions and requires the use of multiple pieces of evidence from multiple sources to uncover convergent lines of inquiry. The researcher strives to establish a chain of evidence forward and backward. External validity reflects whether or not findings are generalizable beyond the immediate case or cases; the more variations in places, people, and procedures a case study can withstand and still yield the same findings, the more external validity. Techniques such as cross-case examination and within-case examination along with literature review help ensure external validity. Reliability refers to the stability, accuracy, and precision of measurement.
Exemplary case study design ensures that the procedures used are well documented and can be repeated with the same results over and over again.

Step 3. Prepare to Collect the Data

Because case study research generates a large amount of data from multiple sources, systematic organization of the data is important to prevent the researcher from becoming overwhelmed by the amount of data and to prevent the researcher from losing sight of the original research purpose and questions. Advance preparation assists in handling large amounts of data in a documented and systematic fashion. Researchers prepare databases to assist with categorizing, sorting, storing, and retrieving data for analysis. Exemplary case studies prepare good training programs for investigators, establish clear protocols and procedures in advance of investigator field work, and conduct a pilot study in advance of moving into the field in order to remove obvious barriers and problems. The investigator training program covers the basic concepts of the study, terminology, processes, and methods, and teaches investigators how to properly apply the techniques being used in the study. The program also trains investigators to understand how the gathering of data using multiple techniques strengthens the study by providing opportunities for triangulation during the analysis phase of the study. The program covers protocols for case study research, including time deadlines, formats for narrative reporting and field notes, guidelines for collection of documents, and guidelines for field procedures to be used.

Investigators need to be good listeners who can hear exactly the words being used by those interviewed. Qualifications for investigators also include being able to ask good questions and interpret answers. Good investigators review documents looking for facts, but also read between the lines and pursue corroborative evidence elsewhere when that seems appropriate. Investigators need to be flexible in real-life situations and not feel threatened by unexpected change, missed appointments, or lack of office space. Investigators need to understand the purpose of the study and grasp the issues and must be open to contrary findings. Investigators must also be aware that they are going into the world of real human beings who may be threatened or unsure of what the case study will bring.
After investigators are trained, the final advance preparation step is to select a pilot site and conduct a pilot test using each data gathering method so that problematic areas can be uncovered and corrected. Researchers need to anticipate key problems and events, identify key people, prepare letters of introduction, establish rules for confidentiality, and actively seek opportunities to revisit and revise the research design in order to address and add to the original set of research questions.

Step 4. Collect Data in the Field

The researcher must collect and store multiple sources of evidence comprehensively and systematically, in formats that can be referenced and sorted so that converging lines of inquiry and patterns can be uncovered. Researchers carefully observe the object of the case study and identify causal factors associated with the observed phenomenon. Renegotiation of arrangements with the objects of the study or addition of questions to interviews may be necessary as the study progresses. Case study research is flexible, but when changes are made, they are documented systematically.

Exemplary case studies use field notes and databases to categorize and reference data so that it is readily available for subsequent reinterpretation. Field notes record feelings and intuitive hunches, pose questions, and document the work in progress. They record testimonies, stories, and illustrations which can be used in later reports. They may warn of impending bias because of the detailed exposure of the client to special attention, or give an early signal that a pattern is emerging. They assist in determining whether or not the inquiry needs to be reformulated or redefined based on what is being observed. Field notes should be kept separate from the data being collected and stored for analysis. Maintaining the relationship between the issue and the evidence is mandatory.
The researcher may enter some data into a database and physically store other data, but the researcher documents, classifies, and cross-references all evidence so that it can be efficiently recalled for sorting and examination over the course of the study.

Step 5. Evaluate and Analyze the Data

The researcher examines raw data using many interpretations in order to find linkages between the research object and the outcomes with reference to the original research questions. Throughout the evaluation and analysis process, the researcher remains open to new opportunities and insights. The case study method, with its use of multiple data collection methods and analysis techniques, provides researchers with opportunities to triangulate data in order to strengthen the research findings and conclusions. The tactics used in analysis force researchers to move beyond initial impressions to improve the likelihood of accurate and reliable findings. Exemplary case studies will deliberately sort the data in many different ways to expose or create new insights and will deliberately look for conflicting data to disconfirm the analysis. Researchers categorize, tabulate, and recombine data to address the initial propositions or purpose of the study, and conduct cross-checks of facts and discrepancies in accounts. Focused, short, repeat interviews may be necessary to gather additional data to verify key observations or check a fact.

Specific techniques include placing information into arrays, creating matrices of categories, creating flow charts or other displays, and tabulating frequency of events. Researchers use the quantitative data that has been collected to corroborate and support the qualitative data which is most useful for understanding the rationale or theory underlying relationships. Another technique is to use multiple investigators to gain the advantage provided when a variety of perspectives and insights examine the data and the patterns. When the multiple observations converge, confidence in the findings increases. Conflicting perceptions, on the other hand, cause the researchers to pry more deeply. Another technique, the cross-case search for patterns, keeps investigators from reaching premature conclusions by requiring that investigators look at the data in many different ways. Cross-case analysis divides the data by type across all cases investigated. One researcher then examines the data of that type thoroughly. When a pattern from one data type is corroborated by the evidence from another, the finding is stronger.
When evidence conflicts, deeper probing of the differences is necessary to identify the cause or source of conflict. In all cases, the researcher treats the evidence fairly to produce analytic conclusions answering the original "how" and "why" research questions.

Step 6. Prepare the Report

Exemplary case studies report the data in a way that transforms a complex issue into one that can be understood, allowing the reader to question and examine the study and reach an understanding independent of the researcher. The goal of the written report is to portray a complex problem in a way that conveys a vicarious experience to the reader. Case studies present data in very publicly accessible ways and may lead the reader to apply the experience in his or her own real-life situation. Researchers pay particular attention to displaying sufficient evidence to gain the reader's confidence that all avenues have been explored, clearly communicating the boundaries of the case, and giving special attention to conflicting propositions.

Techniques for composing the report can include handling each case as a separate chapter or treating the case as a chronological recounting. Some researchers report the case study as a story. During the report preparation process, researchers critically examine the document looking for ways the report is incomplete. The researcher uses representative audience groups to review and comment on the draft document. Based on the comments, the researcher rewrites and makes revisions. Some case study researchers suggest that the document review audience include a journalist and some suggest that the documents should be reviewed by the participants in the study.

4. Give the importance of frequency tables and discuss the principles of table construction, frequency distribution and class intervals determination

Ans: Frequency tables provide a “shorthand” summary of data. The importance of presenting statistical data in tabular form needs no emphasis. Tables facilitate comprehending masses of data at a glance; they conserve space and reduce explanations and descriptions to a minimum. They give a visual picture of relationships between variables and categories. They facilitate summation of items and the detection of errors and omissions, and provide a basis for computations.

It is important to make a distinction between general purpose tables and special purpose tables. General purpose tables are primary or reference tables designed to include a large amount of source data in convenient and accessible form. Special purpose tables are analytical or derivative ones that demonstrate significant relationships in the data or the results of statistical analysis. Tables in government reports on population, vital statistics, agriculture, industries etc., are of the general purpose type. They represent extensive repositories of statistical information. Special purpose tables are found in monographs, research reports and articles and are used as instruments of analysis. In research, we are primarily concerned with special purpose tables.

Components of a Table

The major components of a table are:

A. Heading:
(a) Table number
(b) Title of the table
(c) Designation of units

B. Body:
1. Stub-head: Heading of all rows or blocks of stub items.
2. Body-head: Headings of all columns or main captions and their sub-captions.
3. Field/body: The cells in rows and columns.

C. Notations:
Footnotes, wherever applicable.
Source, wherever applicable.

Principles of Table Construction

There are certain generally accepted principles or rules relating to the construction of tables. They are:

1. Every table should have a title. The title should represent a succinct description of the contents of the table. It should be clear and concise. It should be placed above the body of the table.

2. A number facilitating easy reference should identify every table. The number can be centred above the title. The table numbers should run in consecutive serial order. Alternatively, tables in chapter 1 may be numbered 1.1, 1.2, 1.3, ..., those in chapter 2 as 2.1, 2.2, 2.3, ... and so on.

3. The captions (or column headings) should be clear and brief.

4. The units of measurement under each heading must always be indicated.

5. Any explanatory footnotes concerning the table itself are placed directly beneath the table, and in order to obviate any possible confusion with the textual footnotes, such reference symbols as the asterisk (*), the dagger (†) and the like may be used.

6. If the data in a series of tables have been obtained from different sources, it is ordinarily advisable to indicate the specific sources in a place just below the table.

7. Usually lines separate columns from one another. Lines are always drawn at the top and bottom of the table and below the captions.

8. The columns may be numbered to facilitate reference.

9. All column figures should be properly aligned. Decimal points and “plus” or “minus” signs should be in perfect alignment.

10. Columns and rows that are to be compared with one another should be brought close together.

11. Totals of rows should be placed in the extreme right column and totals of columns at the bottom.

12. In order to emphasize the relative significance of certain categories, different kinds of type, spacing and indentations can be used.

13. The arrangement of the categories in a table may be chronological, geographical, alphabetical or according to magnitude. Numerical categories are usually arranged in descending order of magnitude.

14. Miscellaneous and exceptional items are generally placed in the last row of the table.

15. Usually the larger number of items is listed vertically. This means that a table’s length is more than its width.

16. Abbreviations should be avoided whenever possible and ditto marks should not be used in a table.

17. The table should be made as logical, clear, accurate and simple as possible.

Text references should identify tables by number, rather than by such expressions as “the table above” or “the following table”. Tables should not exceed the page size; larger tables can be reduced to page size by photostating. Tables that are too wide for the page may be turned sideways, with the top facing the left margin or binding of the script. Where should tables be placed in a research report or thesis? Some writers place both special purpose and general purpose tables in an appendix and refer to them in the text by numbers. This practice has the disadvantage of inconveniencing the reader who wants to study the tabulated data as the text is read. A more appropriate procedure is to place special purpose tables in the text and primary tables, if needed at all, in an appendix.

Frequency Distribution and Class Intervals

Variables that are classified according to magnitude or size are often arranged in the form of a frequency table. In constructing this table, it is necessary to determine the number of class intervals to be used and the size of the class intervals.

A distinction is usually made between continuous and discrete variables. A continuous variable has an unlimited number of possible values between the lowest and highest, with no gaps or breaks. Examples of continuous variables are age, weight and temperature. A discrete variable can have a series of specified values with no possibility of values between these points. Each value of a discrete variable is distinct and separate. Examples of discrete variables are gender of persons (male/female), occupation (salaried, business, profession) and car size (800cc, 1000cc, 1200cc).

In practice, all variables are treated as discrete units, the continuous variables being stated in some discrete unit size according to the needs of a particular situation. For example, length is described in discrete units of millimetres or a tenth of an inch.

Class Intervals: Ordinarily, the number of class intervals should be not less than 5 and not more than 15, depending on the nature of the data and the number of cases being studied. After noting the highest and lowest values and the features of the data, the number of intervals can be easily determined.

For many types of data, it is desirable to have class intervals of uniform size. The intervals should be neither too small nor too large. Whenever possible, the intervals should represent common and convenient numerical divisions such as 5 or 10, rather than odd divisions such as 3 or 7. Class intervals must be clearly designated in a frequency table in such a way as to obviate any possibility of misinterpretation or confusion. For example, to present the age group of a population, the use of intervals of 1-20, 20-50, and 50 and above would be confusing, because the values 20 and 50 would each fall into two classes. This may be presented as 1-20, 21-50, and above 50.

Every class interval has a midpoint. For example, the midpoint of the interval 1-20 is 10.5 and the midpoint of the class interval 1-25 is 13. Once class intervals are determined, it is routine work to count the number of cases that fall in each interval.
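The counting step can be sketched as follows. The example assumes a hypothetical list of ages and five uniform, non-overlapping class intervals of width 20 starting at 1 (1-20, 21-40, and so on); the function name `frequency_table` is made up for the illustration.

```python
def frequency_table(values, lowest, width, n_classes):
    """Count integer values into n_classes inclusive, non-overlapping
    class intervals of equal width, and report each interval's midpoint."""
    rows = []
    for i in range(n_classes):
        lower = lowest + i * width
        upper = lower + width - 1  # inclusive upper limit, e.g. 1-20, 21-40
        count = sum(1 for v in values if lower <= v <= upper)
        midpoint = (lower + upper) / 2
        rows.append((lower, upper, midpoint, count))
    return rows


# Hypothetical ages of 16 respondents.
ages = [3, 18, 20, 21, 25, 34, 40, 41, 47, 52, 58, 60, 63, 77, 80, 95]

table = frequency_table(ages, lowest=1, width=20, n_classes=5)

print(f"{'Class interval':<16}{'Midpoint':>10}{'Frequency':>11}")
for lower, upper, midpoint, count in table:
    label = f"{lower}-{upper}"
    print(f"{label:<16}{midpoint:>10.1f}{count:>11d}")
print(f"{'Total':<16}{'':>10}{sum(row[3] for row in table):>11d}")
```

Because the upper limit of one class and the lower limit of the next never coincide, each value is counted exactly once, and the total of the frequency column equals the number of cases.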

5. Write short notes on the following: (a) Type I error and type II error. (b) One tailed and two tailed test (c) Selecting the significance level

Ans: (a) In statistics, the terms type I error (also, α error, false alarm rate (FAR) or false positive) and type II error (β error, miss rate or a false negative) are used to describe possible errors made in a statistical decision process. In 1928, Jerzy Neyman (1894-1981) and Egon Pearson (1895-1980), both eminent statisticians, discussed the problems associated with "deciding whether or not a particular sample may be judged as likely to have been randomly drawn from a certain population" (1928/1967, p. 1), and identified "two sources of error", namely:

Type I (α): reject the null hypothesis when the null hypothesis is true, and Type II (β): fail to reject the null hypothesis when the null hypothesis is false

In systems theory an additional type III error is often defined:

Type III (δ): asking the wrong question and using the wrong null hypothesis.

In 1930, they elaborated on these two sources of error, remarking that "in testing hypotheses two considerations must be kept in view, (1) we must be able to reduce the chance of rejecting a true hypothesis to as low a value as desired; (2) the test must be so devised that it will reject the hypothesis tested when it is likely to be false."
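The two error rates can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: a two-sided z-test with a known standard deviation, a sample size of 25, a true mean of 0.5 when the null hypothesis is false, and a fixed random seed. None of these values come from the text above.

```python
import random
from statistics import NormalDist, mean


def rejects_null(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25, trials=2000, seed=0):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if rejects_null(sample, mu0, sigma):
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type I error (rate close to alpha).
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)
print(type_1_rate, type_2_rate)
```

Lowering alpha in this sketch reduces the Type I rate but raises the Type II rate, which is the trade-off the two considerations quoted above describe.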

(b) When you conduct a test of statistical significance, whether it is from a correlation, an ANOVA, a regression or some other kind of test, you are given a p-value somewhere in the output. If your test statistic is symmetrically distributed, you can select one of three alternative hypotheses. Two of these correspond to one-tailed tests and one corresponds to a two-tailed test. However, the p-value presented is (almost always) for a two-tailed test. But how do you choose which test? Is the p-value appropriate for your test? And, if it is not, how can you calculate the correct p-value for your test given the p-value in your output?

What is a two-tailed test?

First let's start with the meaning of a two-tailed test. If you are using a significance level of 0.05, a two-tailed test allots half of your alpha to testing the statistical significance in one direction and half of your alpha to testing statistical significance in the other direction. This means that .025 is in each tail of the distribution of your test statistic. When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions. For example, we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A two-tailed test will test both if the mean is significantly greater than x and if the mean is significantly less than x. The mean is considered significantly different from x if the test statistic is in the top 2.5% or bottom 2.5% of its probability distribution, resulting in a p-value less than 0.05.

What is a one-tailed test?

Next, let's discuss the meaning of a one-tailed test. If you are using a significance level of .05, a one-tailed test allots all of your alpha to testing the statistical significance in the one direction of interest. This means that .05 is in one tail of the distribution of your test statistic. When using a one-tailed test, you are testing for the possibility of the relationship in one direction and completely disregarding the possibility of a relationship in the other direction. Let's return to our example comparing the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A one-tailed test will test either if the mean is significantly greater than x or if the mean is significantly less than x, but not both. Then, depending on the chosen tail, the mean is significantly greater than or less than x if the test statistic is in the top 5% of its probability distribution or bottom 5% of its probability distribution, resulting in a p-value less than 0.05. The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction.
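For a symmetrically distributed test statistic, a one-tailed p-value can be recovered from a reported two-tailed one: halve it when the observed effect lies in the predicted direction, and otherwise take one minus that half. The sketch below illustrates this with a z statistic from the standard normal distribution rather than the t-test of the example. That substitution, the value z = 1.8 and the function names are assumptions made for illustration.

```python
from statistics import NormalDist


def p_values(z):
    """Return (two_tailed, upper_tailed, lower_tailed) p-values for a z statistic."""
    upper = 1 - NormalDist().cdf(z)   # alternative: mean > x
    lower = NormalDist().cdf(z)       # alternative: mean < x
    two = 2 * min(upper, lower)       # alternative: mean != x
    return two, upper, lower


def one_tailed_from_two_tailed(p_two, effect_in_predicted_direction):
    """Convert a two-tailed p-value to a one-tailed one (symmetric statistic)."""
    if effect_in_predicted_direction:
        return p_two / 2
    return 1 - p_two / 2


two, upper, lower = p_values(1.8)   # hypothetical observed statistic
print(two, upper, lower)
print(one_tailed_from_two_tailed(two, True))    # matches `upper`
print(one_tailed_from_two_tailed(two, False))   # matches `lower`
```

With this statistic the two-tailed p-value is above 0.05 while the upper-tailed one is below it, which shows the extra power of a one-tailed test in the predicted direction.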

(c) Selecting a Significance Level: The hypothesis is tested at a pre-determined level of significance, which should therefore be specified before the test is carried out. Generally, in practice, either the 5% level or the 1% level is adopted for the purpose. The factors that affect the level of significance are:

The magnitude of the difference between sample means;

The size of the sample;

The variability of measurements within samples;

Whether the hypothesis is directional or non-directional (a directional hypothesis is one which predicts the direction of the difference between, say, means).

In brief, the level of significance must be adequate in the context of the purpose and nature of enquiry.

6. Explain Karl Pearson Co-efficient of correlation. Calculate Karl Pearson coefficient for the following data:

X (height, cm): 174 175 176 177 178 182 183 186 189 193

Y (weight, kg): 61 65 67 68 72 74 80 87 92 95

Ans: Karl Pearson’s Co-Efficient of Correlation: Karl Pearson’s Co-Efficient of Correlation is a mathematical method for measuring correlation. Karl Pearson developed the coefficient from the covariance between two sets of variables. Karl Pearson’s Co-Efficient of Correlation is denoted by the symbol r. The formula for obtaining Karl Pearson’s Co-Efficient of Correlation is:

$$r = \frac{\text{Covariance between } x \text{ and } y}{SD_x \times SD_y}$$

Direct method

Covariance between x and y:

$$\frac{\sum xy}{N} - \left(\frac{\sum x}{N} \times \frac{\sum y}{N}\right)$$

SDx = standard deviation of x series:

$$SD_x = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$$

SDy = standard deviation of y series:

$$SD_y = \sqrt{\frac{\sum y^2}{N} - \left(\frac{\sum y}{N}\right)^2}$$

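Shortcut Method using Assumed Mean

If the shortcut method is used with an assumed mean, dx and dy denote the deviations of the x and y series from their assumed means, and the formula for obtaining Karl Pearson’s Co-Efficient of Correlation is:

$$r = \frac{\dfrac{\sum dx\,dy}{N} - \left(\dfrac{\sum dx}{N} \times \dfrac{\sum dy}{N}\right)}{SD_x \times SD_y}$$

where the numerator is the covariance between x and y, and

$$SD_x = \sqrt{\frac{\sum dx^2}{N} - \left(\frac{\sum dx}{N}\right)^2}$$

$$SD_y = \sqrt{\frac{\sum dy^2}{N} - \left(\frac{\sum dy}{N}\right)^2}$$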

Steps in calculating Karl Pearson’s Correlation Coefficient using Shortcut Method

1. Assume means of the x and y series.

2. Take deviations of the x and y series from the assumed means to get dx and dy, and total them to get ∑dx and ∑dy.

3. Square the dx and dy values and find the sums of squares to get ∑dx² and ∑dy².

4. Multiply the corresponding deviations of the x and y series and total the products to get ∑dxdy.

5. Substitute these totals and N into the shortcut formula.
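The steps can be sketched in Python as follows. The function name and the small data set are hypothetical and serve only to illustrate the procedure. The sketch also shows that the choice of assumed means does not change the result.

```python
from math import sqrt


def pearson_shortcut(xs, ys, assumed_x, assumed_y):
    """Pearson's r from deviations taken about assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]                  # deviations of the x series
    dy = [y - assumed_y for y in ys]                  # deviations of the y series
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)                  # sum of squared dx
    sum_dy2 = sum(d * d for d in dy)                  # sum of squared dy
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))     # sum of products dx * dy
    covariance = sum_dxdy / n - (sum_dx / n) * (sum_dy / n)
    sd_x = sqrt(sum_dx2 / n - (sum_dx / n) ** 2)
    sd_y = sqrt(sum_dy2 / n - (sum_dy / n) ** 2)
    return covariance / (sd_x * sd_y)


# Hypothetical data, for illustration only.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 4, 8, 9]

r_a = pearson_shortcut(xs, ys, assumed_x=6, assumed_y=4)
r_b = pearson_shortcut(xs, ys, assumed_x=0, assumed_y=0)
print(r_a, r_b)   # the two values agree up to rounding error
```

Any convenient assumed mean may therefore be chosen. A value close to the middle of the series simply keeps the deviations small and the arithmetic easy.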

Shortcut Method using Arithmetic Mean

If the deviations are taken from the arithmetic (actual) means, ∑dx = 0 and ∑dy = 0, and the formula for obtaining Karl Pearson’s Co-Efficient of Correlation becomes:

$$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \times \sum dy^2}}$$

Interpreting Co-Efficient of Correlation


The Co-Efficient of Correlation measures the correlation between two variables. The value of Co-Efficient of Correlation always lies between +1 and –1. It can be interpreted in the following ways.

If the value of Co-Efficient of Correlation r is 1 it is interpreted as perfect positive correlation.

If the value of Co-Efficient of Correlation r is -1, it is interpreted as perfect negative correlation.

If the value of Co-Efficient of Correlation r is 0 < r < 0.5, it is interpreted as poor positive correlation.

If the value of Co-Efficient of Correlation r is 0.5 < r < 1, it is interpreted as good positive correlation.

If the value of Co-Efficient of Correlation r is 0 > r > -0.5, it is interpreted as poor negative correlation.

If the value of Co-Efficient of Correlation r is –0.5 > r > -1, it is interpreted as good negative correlation.

If the value of Co-Efficient of Correlation r is 0, it is interpreted as zero correlation.

Probable Error

Probable Error of Correlation coefficient is estimated to find out the extent to which the value of r is dependable. If Probable Error is added to or subtracted from the correlation coefficient, it would give such limits within which we can reasonably expect the value of correlation to vary.

If the coefficient of correlation is less than Probable Error it will not be significant. If the coefficient of correlation r is more than six times the Probable Error, correlation is definitely significant. If Probable Error is 0.5 or more, it is generally considered as significant. Probable Error is estimated by the following formula

$$PE = 0.6745 \times \frac{1 - r^2}{\sqrt{N}}$$
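A minimal sketch of the Probable Error check, assuming the standard form PE = 0.6745 (1 − r²)/√N. The function names, the use of the magnitude of r in the comparison, the "inconclusive" label for the range between the two rules, and the example values r = 0.8 and N = 25 are all assumptions for illustration.

```python
from math import sqrt


def probable_error(r, n):
    """Probable Error of a correlation coefficient r computed from n pairs."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


def significance_by_pe(r, n):
    """Apply the two rules of thumb: r below PE, or r above six times PE."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"   # assumed label; the rules do not cover this range


# Hypothetical values, for illustration only.
pe = probable_error(0.8, 25)
print(pe, 0.8 - pe, 0.8 + pe)         # PE and the limits r - PE, r + PE
print(significance_by_pe(0.8, 25))
print(significance_by_pe(0.05, 25))
```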

Calculation for the given data, taking 175 as the assumed mean (A) of X and 80 as the assumed mean (A) of Y:

X        Y        dx    dy     dx²    dy²    dxdy
174      61       -1    -19    1      361    19
A(175)   65       0     -15    0      225    0
176      67       1     -13    1      169    -13
177      68       2     -12    4      144    -24
178      72       3     -8     9      64     -24
182      74       7     -6     49     36     -42
183      A(80)    8     0      64     0      0
186      87       11    7      121    49     77
189      92       14    12     196    144    168
193      95       18    15     324    225    270
Total             63    -39    769    1417   431

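With N = 10, ∑dx = 63, ∑dy = −39, ∑dx² = 769, ∑dy² = 1417 and ∑dxdy = 431:

$$\text{Covariance between X and Y} = \frac{\sum dx\,dy}{N} - \left(\frac{\sum dx}{N} \times \frac{\sum dy}{N}\right) = \frac{431}{10} - (6.3 \times -3.9) = 43.1 + 24.57 = 67.67$$

$$SD_x = \sqrt{\frac{\sum dx^2}{N} - \left(\frac{\sum dx}{N}\right)^2} = \sqrt{76.9 - 39.69} = \sqrt{37.21} = 6.1$$

$$SD_y = \sqrt{\frac{\sum dy^2}{N} - \left(\frac{\sum dy}{N}\right)^2} = \sqrt{141.7 - 15.21} = \sqrt{126.49} \approx 11.247$$

$$r = \frac{67.67}{6.1 \times 11.247} \approx 0.986$$

The coefficient is close to +1, indicating a very high positive correlation between height and weight in this data.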
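As a cross-check on the arithmetic, the direct-method formulas can be evaluated on the height and weight data from the question. The function name is an assumption for this sketch. For this data the computation gives r of roughly 0.986.

```python
from math import sqrt

# Data from the question: heights in cm and weights in kg.
heights = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]
weights = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]


def pearson_direct(xs, ys):
    """Pearson's r by the direct method: covariance / (SDx * SDy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    sd_x = sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sd_y = sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    return covariance / (sd_x * sd_y)


r = pearson_direct(heights, weights)
print(round(r, 3))
```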
