- The respondent: This is the person who provides responses to questions asked by the researcher regarding a particular variable. The respondent can be a source of error if they fail to give a response either because they are unwilling or due to lack of knowledge in the subject area. They could also be a source of error by giving biased responses either by deliberately giving false answers or by unconscious misrepresentation.
- The situation: The circumstances under which data is collected introduce factors that may affect the results of the research. A good example is the presence of a third party during an interview. The third party may make the respondent uneasy, so that the respondent fails to give answers that truly represent their views.
- The researcher: The person conducting the research may also be a source of error if they are careless in processing the replies from respondents or if they do not capture complete and accurate information from the respondents. They could also cause errors by using the wrong mathematical formula, recording wrong entries or conducting faulty data analysis.
A non-response error is the error that occurs when some of the respondents in a sample do not provide responses. It can be avoided by:
- Avoiding short data collection periods: The survey collection period should be long enough for respondents to take part in their own time, whatever their schedule.
- Sending reminders to potential respondents: This prompts those who have not yet answered and so increases the number of respondents who provide responses.
- Ensuring confidentiality: Surveys that require confidential information from the respondents should assure them that the information will remain totally private.
- Speed: Thousands of people are able to receive questionnaires and provide responses quickly through email. The software tool then performs real-time interpretation.
- Convenience: The online software tool helps researchers to formulate questionnaires, format the email invitation and upload distribution lists.
- Simplicity: The software tools are user friendly and do not require a person who is tech savvy.
- Economical: Most vendors offer free versions with limited services or full featured versions at a low cost.
- Target population: The tools are available on a variety of sites where they are accessible to a large number of people.
The four rules that guide coding and categorization of data include:
- Getting to know your data – this helps the researcher to understand the quality of the data and to determine whether it is likely to add value to the analysis. The data should have been collected in an unbiased way.
- Focusing on the analysis – this helps the researcher to review the purpose of the evaluation and helps formulate questions that the analysis should answer. The analysis could be focused on topic, events, time, individuals or even groups.
- Categorizing the information – this helps the researcher to give codes to the themes and ideas that he/she finds and to create categories while working with the data.
- Identifying patterns between categories – this helps the researcher to sort the data into large categories and then deduce relative importance and relationships between these categories.
Characteristics to consider include:
- Good listener – the observer should be alert and able to get the concept to make proper conclusions.
- Sensitive – the observer should empathize with the respondent so as to grasp the true responses that the respondent wishes to put across.
- Ability to give open-minded feedback – the observer should not give feedback that would otherwise make the respondent feel inferior.
- Ability to accept the validity of various styles even when he/she might not totally agree with them.
- Quick decision maker – the observer should decide quickly which information is relevant and which is not.
Computer-assisted telephone interviewing (CATI) is similar to computer-administered telephone surveys in that both use the telephone to conduct the interview and obtain answers to research questions from respondents. However, CATI provides several advantages:
- More efficient in data collection since the respondent answers directly to the computer
- More accurate since the computer is able to organize the questions
- The respondent is guided to answer only relevant questions that stay on topic.
A data file might suffer from missing data when values for the variable under observation are absent, possibly because of non-response, dropouts or errors made during data entry. The techniques to handle missing data include:
- Removing all observations that have missing data
- Removing independent variables for which there are missing values
- Imputing values for the missing data, for instance by substituting the mean of the variable.
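The three techniques above can be illustrated with a minimal sketch in plain Python. The records, the variable names (`age`, `income`, `score`) and the function names are hypothetical and chosen only for illustration; `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical survey records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "score": 7},
    {"age": None, "income": 48000, "score": 6},
    {"age": 29, "income": None, "score": 8},
    {"age": 41, "income": 61000, "score": 5},
]

def drop_incomplete_rows(rows):
    """Remove every observation that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def drop_incomplete_columns(rows):
    """Remove every variable that has at least one missing value."""
    keep = [k for k in rows[0] if all(r[k] is not None for r in rows)]
    return [{k: r[k] for k in keep} for r in rows]

def impute_mean(rows):
    """Replace each missing value with the mean of the observed values."""
    means = {
        k: mean(r[k] for r in rows if r[k] is not None)
        for k in rows[0]
    }
    return [{k: (means[k] if r[k] is None else r[k]) for k in r} for r in rows]

complete_rows = drop_incomplete_rows(rows)        # two observations remain
complete_columns = drop_incomplete_columns(rows)  # only "score" remains
imputed = impute_mean(rows)                       # all four observations kept
```

Deleting observations or variables is simple but discards information, while mean imputation keeps every observation at the cost of understating the variability of the imputed variable.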
A Type 1 error occurs when a null hypothesis is rejected although it is in fact true. The implication is a false alarm: the test results suggest that a condition has been fulfilled when in reality it has not.
A Type 2 error occurs when a null hypothesis is not rejected although it is in fact false. The implication is a missed detection: the test fails to recognise a condition that really holds.
The two errors are related in that exchanging the null and alternative hypotheses causes Type 1 and Type 2 errors to switch roles.
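The two error types can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: a two-sided z-test of a population mean with known standard deviation, a significance level of 0.05, and arbitrary choices for the sample size, the true mean under the alternative (0.5), the number of trials and the random seed. None of these values come from the text above.

```python
import random
from statistics import NormalDist, mean

def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25,
                   alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += rejects_null(sample, mu0, sigma, alpha)
    return rejections / trials

# H0 is true here, so every rejection is a Type 1 error (a false alarm).
type1_rate = rejection_rate(true_mean=0.0)

# H0 is false here, so every failure to reject is a Type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type 1 error rate: {type1_rate:.3f}")
print(f"Type 2 error rate: {type2_rate:.3f}")
```

When the null hypothesis is true, the Type 1 error rate should land close to the chosen significance level. The Type 2 error rate depends on how far the true mean lies from the hypothesised one and on the sample size.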
A correlation coefficient measures the extent to which two variables are related. The variables are not manipulated but only measured, in order to find out the degree of relationship between them. The researcher should therefore interpret the coefficient on two counts: the strength of the relationship, indicated by the size of the coefficient, and the significance of the relationship, indicated by the probability level (p-value). The relationship is more significant when the p-value is smaller and stronger when the absolute value of the correlation is larger. When r = 0, there is no linear relationship between the variables. When r = 1.00 or r = -1.00, there is a perfect linear relationship.
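As a minimal sketch, the Pearson correlation coefficient can be computed directly from its definition. The function name `pearson_r` and the small data sets are hypothetical, and the significance test (p-value) is left out because it needs the t distribution, which the standard library does not provide.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]
r_positive = pearson_r(hours, [2, 4, 6, 8, 10])   # perfect positive: r = 1
r_negative = pearson_r(hours, [10, 8, 6, 4, 2])   # perfect negative: r = -1

# y = x squared: y depends entirely on x, yet the linear correlation is 0.
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

The last case shows why a coefficient of zero should be read as the absence of a linear relationship rather than the absence of any relationship.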
In a regression analysis, the r² value is the coefficient of determination, which measures the strength of a relationship as the proportion of variation in one variable that is accounted for by the other. In a regression with a single predictor, it is calculated by squaring the correlation coefficient. If, for instance, the correlation coefficient is r = 0.5, then the coefficient of determination is r² = 0.5² = 0.25. The interpretation is that the two variables share 25% of their variance.
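The link between the correlation coefficient and the coefficient of determination can be checked numerically. The sketch below assumes a simple linear regression with one predictor, fitted by least squares, and uses a small made-up data set; the function names are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Share of the variation in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient, used here for comparison."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)    # 0.8 for this data
r2 = r_squared(x, y)   # 0.64, which equals r squared
```

For this data the fitted line explains 64% of the variation in `y`, matching the square of the correlation coefficient.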
For backward elimination, the researcher starts with all predictors in the model and then removes the predictor with the highest p-value, one at a time, until all remaining p-values are small. For forward selection, the researcher starts with no variables, then computes p-values for the candidate predictors and adds them starting with the lowest p-value until no remaining predictor is significant enough to add. Stepwise regression combines forward and backward selection: at every step a variable may be added or removed.
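The selection loops can be sketched as follows. This is an outline of the control flow only: `p_values_for` is a hypothetical function supplied by the caller that refits the model on a given list of predictors and returns a p-value for each, and the threshold `alpha=0.05` is an assumed cut-off for a "small" p-value. The toy p-value table at the end is made up so that the sketch can run without a statistics package.

```python
def backward_elimination(predictors, p_values_for, alpha=0.05):
    """Start with all predictors; drop the worst until all p-values < alpha."""
    selected = list(predictors)
    while selected:
        p_values = p_values_for(selected)
        worst = max(selected, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining predictor is significant
        selected.remove(worst)
    return selected

def forward_selection(candidates, p_values_for, alpha=0.05):
    """Start with no predictors; add the best candidate while it is significant."""
    selected = []
    remaining = list(candidates)
    while remaining:
        # p-value each candidate would have if added to the current model
        trial = {
            name: p_values_for(selected + [name])[name] for name in remaining
        }
        best = min(remaining, key=lambda name: trial[name])
        if trial[best] >= alpha:
            break  # no remaining candidate is worth adding
        selected.append(best)
        remaining.remove(best)
    return selected

# Made-up p-values that ignore which other predictors are in the model.
# A real p_values_for would refit the regression on every call.
TOY_P = {"age": 0.01, "income": 0.30, "education": 0.04, "region": 0.62}

def toy_p_values(names):
    return {name: TOY_P[name] for name in names}

kept_backward = backward_elimination(list(TOY_P), toy_p_values)
kept_forward = forward_selection(list(TOY_P), toy_p_values)
```

With this toy table both procedures keep `age` and `education`. With real data the p-values change each time the model is refitted, so the two procedures can end with different sets of predictors.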
"Biochemia Medica." What we need to know when calculating the coefficient of correlation? http://biochemia-medica.com/content/what-we-need-know-when-calculating-coefficient-correlation (accessed May 17, 2014).
Dixon, Wilfrid Joseph, and Frank Jones Massey. Introduction to statistical analysis. 3d ed. New York: McGraw-Hill, 1969.
Fisher, Ronald Aylmer. Statistical methods for research workers. 13th ed. New York: Hafner, 1958.
"Measurment & Sources of Error." Management Sea. http://www.managementsea.com/measurment-sources-of-error/#.U3dhjvmSzG4 (accessed May 17, 2014).
"Type I and II Errors." Type I and II Errors. http://www.ma.utexas.edu/users/mks/statmistakes/errortypes.html (accessed May 17, 2014).
"What are the Qualities of A Good Observer." SLN. http://www.studylecturenotes.com/social-research-methodology/what-are-the-qualities-of-a-good-observer (accessed May 17, 2014).