Description of assessment methods
The administration of exams in schools and institutions rests on the ability of a given exam to measure the content learned and the learner's ability without bias or prejudice. The validity and reliability of a test are therefore central to a proper analysis of a student's understanding of a particular area of specialization. A variety of methods are used to determine the validity and reliability of a test or exam. Some of them, as advanced by Heffner (2004), rely on the ability of the type of exam to follow a specific format and criterion.
There are a number of ways of measuring validity. These include content validity, construct validity, face validity, concurrent validity and predictive validity (Heffner, 2004). Each type examines a different aspect of the test. For instance, content validity measures the ability of the test to reflect a complete grasp of the topic under testing. A test with full content validity must contain a variety of questions. For a test of intelligence, for example, this means that the test must have a math question, questions with a verbal reasoning aspect, questions that call for analytical ability, and questions on every other part of the construct of intelligence (Davison, 2007). There is no easy way to determine content validity other than for the examiner to read through the test and ensure that all the content in the course outline is fully tested across the whole exam. The test should also assess all the relevant cognitive faculties. In this way content validity can be achieved.
Construct validity, on the other hand, is present if a test thoroughly measures a theoretical, non-observable trait or construct. It is established over a period of time through the accumulation of evidence. Construct validity is determined through two methods, commonly known as convergent and divergent validation analysis (Heffner, 2004). If a test has a high correlation with another test that measures the same construct, it is said to have convergent construct validity. A low correlation with a test that measures a different construct is evidence of divergent validity. One can also use factor analysis, which is a more complex statistical method. Other methods that can be employed include internal consistency analysis, the study of developmental changes, and experimentation.
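The correlation check behind convergent and divergent validation can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name pearson_r and all the scores are invented for the example, and the Pearson coefficient is assumed as the measure of correlation, since the text does not name one.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same six students (invented for illustration).
new_test = [55, 62, 70, 74, 81, 90]         # the test being validated
same_construct = [58, 60, 72, 71, 85, 88]   # established test of the same construct
other_construct = [70, 66, 73, 64, 69, 71]  # test of an unrelated construct

convergent_r = pearson_r(new_test, same_construct)
divergent_r = pearson_r(new_test, other_construct)

print(f"convergent r = {convergent_r:.2f}")
print(f"divergent r  = {divergent_r:.2f}")
```

With these invented scores the first coefficient is high and the second is low, which is the pattern that would count as evidence of construct validity. How high or low a coefficient must be is a judgment the examiner has to make; no cut-off is implied here.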
Face validity is another way of determining the validity of an exam. Heffner (2004) argues that face validity refers to whether or not the test looks valid on superficial inspection by the people who take it. If a test has good face validity, it will look valid and pitched at the right level to the people taking it. It should not be exaggerated in content or complexity. Rather, it should have the right mixture of content and should test learners at their own level.
Concurrent validity, on the other hand, refers to the ability of an exam or test to vary with a measure of the same construct or of an opposing construct (Davison, 2007). This type of validity is determined through comparison with an established measure that is already known to be valid. Here the comparison is between the current cohort of students and a cohort from a past class: the concurrent validity of a test is determined by comparing the overall performance of the current students with the performance of the previous class. If the results deviate widely from the earlier results, whether through many more failures or many more passes, concurrent validity may be compromised. It follows that results should not deviate greatly from the established norm. Another way of determining validity is predictive validity, in which performance on the test is used to predict future performance. For example, the MCQ exams could serve as a predictive test for higher or advanced learning. The validity of the exam is then judged by how well the student's standing on the test predicts later performance.
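The comparison of the current cohort with a past class can be illustrated with a short Python sketch. It assumes one possible way of expressing a "large deviation", namely the difference between the two cohort means measured in standard deviations of the past cohort. The function name standardized_shift, the cut-off of 0.5 and all the scores are assumptions made for the example and do not come from the sources cited.

```python
from statistics import mean, stdev


def standardized_shift(current_scores, past_scores):
    """Difference between cohort means, in standard deviations of the past cohort."""
    return (mean(current_scores) - mean(past_scores)) / stdev(past_scores)


# Hypothetical exam scores (invented for illustration).
past_cohort = [52, 58, 61, 65, 67, 70, 74, 79]     # previous class, the established norm
current_cohort = [50, 57, 63, 64, 69, 71, 73, 81]  # class sitting the exam now

THRESHOLD = 0.5  # assumed cut-off for a "large" deviation, not taken from the literature

shift = standardized_shift(current_cohort, past_cohort)
large_deviation = abs(shift) > THRESHOLD

print(f"shift = {shift:.2f} standard deviations, large deviation: {large_deviation}")
```

A shift close to zero, as with these invented scores, means the current class performed much like the previous one. A shift well beyond the chosen cut-off in either direction would be the kind of deviation that calls the exam's concurrent validity into question.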
While these methods analyze the validity of a test, test reliability is determined in a number of other ways. Reliability is the consistency with which a test gives the same results when it is repeated after a short time (Heffner, 2004). The reliability of a test can thus be measured through the test-retest method. Here the subjects are given a test and then given the same test again after a while. If the test is reliable, each individual's second score should be similar to the first. In this case the MCQ exams should be re-administered to gauge whether the participants produce the same results as the first time. If they do, the exam is reliable. Other methods are inter-rater reliability and parallel forms reliability. In the inter-rater type, the exam is rated by party A and, separately, by party B in terms of validity of content or coverage (Heffner, 2004). If the test is reliable, the two parties should produce similar ratings. If they do not, the test is not reliable.
Measuring reliability and validity is a distinctive activity. Even though we advance various methods of measuring them, there are other systematic and abstract qualities that cannot be measured. These make fully quantitative measurement and evaluation difficult to guarantee.
Use of oral examinations
Oral examinations are used frequently to test individuals' oral skills (Joughin and Collom, 2003). Oral skills cannot be tested better by any other means. Even though one can ask learners to demonstrate oral skills in a written exam, the effect is not comparable to asking them to speak before the person(s) interviewing them. This is therefore the most effective method of assessing and improving language skills. For instance, when one wants to test skills such as pronunciation, oral communication and the organization of ideas in a speech, the oral examination is the preferable method. This is the case in which oral tests will be used.
Marking an oral exam can be challenging for a teacher or examiner who is not adequately prepared. However, the awarding of marks can be kept uniform, reliable and free of prejudice or bias by preparing a model on which marks are entered as the students make their presentations (Joughin and Collom, 2003). One of the best ways of doing this is to create a table on which performance is marked. The table lists the relevant skills that the learner should strive to demonstrate in the presentation. The marks for each skill, reflecting how well it is achieved, are then entered during the presentation. Joughin (2010) proposes that this form simplifies marking and makes it impartial. The students are also made aware that their mark depends on how well they use the specific presentation or oral skills.
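Such a marking table can be sketched as a small Python structure. The skills listed are those named earlier in this section (pronunciation, oral communication and organization of ideas). The maximum of 5 marks per skill, the names RUBRIC and total_mark, and the sample marks are assumptions made purely for illustration.

```python
# Marking table: each skill and the maximum marks available for it.
# The maximum of 5 per skill is an assumed value, not one given by the sources.
RUBRIC = {
    "pronunciation": 5,
    "oral communication": 5,
    "organization of ideas": 5,
}


def total_mark(awarded, rubric=RUBRIC):
    """Check one student's marks against the table and return the total."""
    missing = set(rubric) - set(awarded)
    unknown = set(awarded) - set(rubric)
    if missing or unknown:
        raise ValueError(
            f"missing skills: {sorted(missing)}, unknown skills: {sorted(unknown)}"
        )
    for skill, mark in awarded.items():
        if not 0 <= mark <= rubric[skill]:
            raise ValueError(f"mark for {skill!r} must be between 0 and {rubric[skill]}")
    return sum(awarded.values())


# Marks entered by the examiner during one hypothetical presentation.
student_marks = {
    "pronunciation": 4,
    "oral communication": 3,
    "organization of ideas": 5,
}

total = total_mark(student_marks)
print(f"total: {total} out of {sum(RUBRIC.values())}")
```

Because every student is marked against the same skills and the same maximum marks, and an entry outside the table is rejected, the examiner has less room to award marks on any other basis.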
Presentations on topics can form a platform for determining the oral skills of a student. Communication ability is also judged from their understanding and from how well they draw their points together to argue for a particular stand. Oral skills can also be tested through an interview. Here the student is asked questions that require wit in understanding their nature. The teacher or examiner leads the interview. The questions asked are meant purely to gauge the student's understanding of the topic. The learner's ability to answer the questions or handle the topic is used to rate their ability (Li, 2007). An oral test therefore relies entirely on the learners' ability to express themselves and to show an understanding of the topics of inquiry.
According to SBA Consultancy Team (2005), the reliability of the exam is maintained through consistency in the type of questions asked of all the students. The learners' responses are then rated on the same scale without bias, using the predetermined marking sheet. Once compiled and tabulated, the results show the level of understanding shown by the learners. Oral exams are very useful in determining the understanding of particular topics. A learner who is very poor at writing may be very good at speaking, so an oral exam gives them a fair ground on which to show what they know of a particular body of information (Joughin, 2010). It is therefore a very good method, and it should be graded according to what the student says in terms of concreteness or mastery of content. Bias should be avoided completely and evaluation carried out impartially.
References
Davison, C., 2007. Views From the Chalkface: English Language School-Based Assessment in Hong Kong. Language Assessment Quarterly, 4(1), 37-68.
Heffner, L.C., 2004. Research Methods. Allpsych online. Retrieved from http://allpsych.com/researchmethods/index.html-research methods
Joughin, G., 2010. A Short Guide to Oral Assessment. Leeds Metropolitan University in association with the University of Wollongong.
Joughin, G. and Collom, G., 2003. Oral Assessment. The Higher Education Academy. Retrieved August 30, 2011 from http://www.heacademy.ac.uk/resources/detail/resource_database/id433_oral_assessment
Li, V., 2007. “The New Assessment Paves Its Way.” Retrieved from http://plate.fed.cuhk.edu.hk/netter/?p=16
SBA Consultancy Team, 2005. 2007 HKCEE English examination: Introduction to the school-based assessment component. Hong Kong: HKEAA.