Types
There are two types of test scores: ''raw scores'' and ''scaled scores''. A raw score is a score without any sort of adjustment or transformation, such as the simple number of questions answered correctly. A scaled score is the result of one or more transformations applied to the raw score, such as in relative grading.
The purpose of scaled scores is to report scores for all examinees on a consistent scale. Suppose that a test has two forms, and one is more difficult than the other. It has been determined by equating that a score of 65% on form 1 is equivalent to a score of 68% on form 2. Scores on both forms can be converted to a scale so that these two equivalent scores have the same reported score. For example, they could both be reported as 350 on a scale of 100 to 500.
Two well-known tests in the United States that have scaled scores are the ACT and the SAT. The ACT's scale ranges from 1 to 36 and the SAT's from 200 to 800 (per section). Ostensibly, these two scales were selected to represent a mean and
Scoring information loss
When tests are scored ''right-wrong'', an important assumption has been made about learning. The number of ''right'' answers, or the sum of item scores where partial credit is given, is assumed to be the appropriate and sufficient measure of current performance status. In addition, a secondary assumption is made that there is no meaningful information in the ''wrong'' answers. First, a correct answer can be achieved through ''memorization'' without any profound understanding of the underlying content or conceptual structure of the problem posed. Moreover, when more than one step is required for a solution, there are often several approaches that lead to a ''correct'' result. The fact that the answer is correct does not indicate which of the possible procedures was used. When the student supplies the answer (or shows the work), this information is readily available from the original documents. Second, if the ''wrong'' answers were ''blind'' guesses, there would be no information to be found among these answers. On the other hand, if ''wrong'' answers reflect departures in interpretation from the expected one, these answers should show an ordered relationship to whatever the overall test is measuring. This departure should depend on the level of psycholinguistic maturity of the student choosing or giving the answer in the vernacular in which the test is written. In this case it should be possible to extract this order from the responses to the test items. Such extraction processes, the Rasch model for instance, are standard practice for item development among professionals. However, because the ''wrong'' answers are discarded during the scoring process, they are seldom analyzed for the information they might contain. Third, although topic-based subtest scores are sometimes provided, the more common practice is to report the total score or a rescaled version of it.
This rescaling is intended to compare these scores to a standard of some sort. This further collapse of the test results systematically removes all the information about which particular items were missed. Thus, scoring a test ''right-wrong'' loses 1) how students achieved their ''correct'' answers, 2) what led them astray towards unacceptable answers and 3) where within the body of the test this departure from expectation occurred. This commentary suggests that the current scoring procedure conceals the dynamics of the test-taking process and obscures the capabilities of the students being assessed. Current scoring practice oversimplifies these data in the initial scoring step. The result of this procedural error is to obscure diagnostic information that could help teachers serve their students better. It also prevents those who diligently prepare these tests from observing the information that would otherwise have alerted them to the presence of this error. A solution to this problem, known as Response Spectrum Evaluation (RSE), is currently being developed. It appears to be capable of recovering all three of these forms of lost information while still providing a numerical scale to establish current performance status and to track performance change. The RSE approach provides an interpretation of every answer, whether right or wrong, that indicates the likely thought processes used by the test taker. (Powell, Jay C. (2010). "Testing as Feedback to Inform Teaching". Chapter 3 in Learning and Instruction in the Digital Age, Part 1: Cognitive Approaches to Learning and Instruction.)
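The collapse described in this section can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the answer key, the two students' responses, and the helper names are invented for illustration, and the rescaling is assumed to be linear onto a 100 to 500 scale. It is not an implementation of RSE; it only shows that two different response patterns can yield the same raw and rescaled score, while a simple tally retains which items were missed and which wrong options were chosen.

```python
from collections import Counter

# Hypothetical answer key for a five-item multiple-choice test.
KEY = ["B", "D", "A", "C", "B"]

# Hypothetical responses; both students answer three items correctly.
RESPONSES = {
    "student_1": ["B", "D", "A", "A", "C"],
    "student_2": ["C", "D", "B", "C", "B"],
}


def raw_score(answers, key=KEY):
    """Right-wrong scoring: count the answers that match the key."""
    return sum(a == k for a, k in zip(answers, key))


def scaled_score(raw, n_items=len(KEY), low=100, high=500):
    """Assumed linear rescaling of a raw score onto a low..high scale."""
    return low + (high - low) * raw / n_items


def wrong_answers(answers, key=KEY):
    """Keep what right-wrong scoring discards: item number -> option chosen."""
    return {i + 1: a for i, (a, k) in enumerate(zip(answers, key)) if a != k}


def distractor_counts(responses, key=KEY):
    """Tally, per item, how often each wrong option was chosen."""
    counts = {i + 1: Counter() for i in range(len(key))}
    for answers in responses.values():
        for item, choice in wrong_answers(answers, key).items():
            counts[item][choice] += 1
    return counts


for name, answers in RESPONSES.items():
    raw = raw_score(answers)
    print(name, raw, scaled_score(raw), wrong_answers(answers))
```

In this invented data set both students receive the same total, yet one misses the last two items and the other misses the first and third. That difference is visible only in the per-item record, not in either reported score.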