Evidence on the efficacy of an assessment tool is necessary in order to justify the decisions we make based on the scores from it. Validity evidence can be collected from several sources such as the stages before and after test administration. In the present research study, validity evidence of several types on a Turkish as a Second Language (TSL) Academic Listening Test is presented in order to establish the efficacy of it. This paper presents cognitive, contextual and scoring validity (reliability) evidence from the first and second versions of the test and investigates whether the modifications made after the first administration have had a positive effect on the quality of the test. The study concludes that although the changes made in the first version of the test strengthened the validity claims in terms of cognitive and contextual requirements, the reliability scores of the test worsened in the second version. This reminded us that although it is necessary to build the foundations of a test firmly by operationalizing the necessary contextual features and cognitive processes, this will not thoroughly guarantee the technical quality of the items. Scoring validity should be established carefully as well. This study exemplifies a thorough attempt in establishing the validity of a TSL test from multiple perspectives and aims to be an exemplary study for further test development in TSL.
|Number of pages||25|
|Journal||Bogazici University Journal of Education|
|Publication status||Published - 17 Dec 2018|