The Validity Evidence of TOEFL Test as Placement Test

This study evaluates the validity of the TOEFL test as an instrument for placement. The three sections of the test (listening, structure, and reading and vocabulary) were used to measure the basic English competence of 96 freshmen before they attended courses. Based on the TOEFL results, students were classified into four classes from the highest to the lowest competency level, namely A, B, C, and D. To measure the level of correlation, two sets of scores, the TOEFL score and the students' achievement score, were compared using the Pearson product-moment correlation. The students' achievement score was derived from the final test of the Integrated Course, which tested two language skills and two language components: listening, reading, grammar, and vocabulary. The moderate, positive, and significant correlation (0.41) indicated that the TOEFL test was quite valid as an instrument for placement. However, students' characteristics and background should be considered in order to achieve a high and significant correlation.


INTRODUCTION
In order to improve the quality of education, University Kanjuruhan of Malang has, since 2015, required freshmen to take a placement test before joining the Integrated Course. The Integrated Course is a required course that students take in the first semester. The main purpose of the placement test is to place students according to their level of competence, so that students with the same proficiency level are put in the same class.
Placing students in classes of the same competence level eases the job of teaching. It is easier for lecturers to manage the class and to design the most suitable materials and teaching methods. If a problem arises in the teaching and learning process, the lecturer can choose a strategy that works for most students. Moreover, for students, it reduces peer pressure, and they are more comfortable learning with classmates who have the same competence.
The TOEFL test is commonly used as a placement test in universities, especially in English Department programs. There are two types of TOEFL test, namely the Computer Based (CB) TOEFL test and the Paper Based (PB) TOEFL test. Both the CB and PB TOEFL tests consist of four sections: listening, structure, reading, and writing. In this study, the test used for placement was the PB TOEFL test, which consisted of three sections: listening, structure, and reading.
There are several reasons for using the TOEFL test as a placement test: (1) it is a standardized test that is ready to use, so the lecturer saves time and energy; (2) it is constructed in multiple-choice format, so it is easy for the lecturer to score students' language competence; and (3) it uses a standardized scoring method, so the lecturer does not need to construct scoring criteria.
Despite its benefits, some researchers have questioned the validity of the TOEFL test as an instrument for placement. Belfield & Crosta (2012) and Scott-Clayton (2012) found a negative correlation between placement test scores and students' achievement. Additionally, Saxon & Morante (2014) criticized the practice of placement testing, noting that (1) students do little preparation before the test, (2) students lack knowledge of the material as well as the format of the test, and (3) the placement test does not measure students' potential in learning.
In light of this critique, it is necessary to review the validity of the TOEFL test as an instrument for placement. Institutions commonly classify students based solely on placement test scores, without knowing how accurate this policy is. Moreover, few studies have investigated the validity of placement test decisions. To address this research gap, this study reviews the validity of the TOEFL test as an instrument for placement.

METHODS
This research used a correlational design. There were seven research steps: (1) documenting placement test scores; (2) analyzing the placement test data using descriptive analysis and the Kolmogorov-Smirnov normality test; (3) documenting post-test results; (4) analyzing the post-test data using descriptive analysis and the Kolmogorov-Smirnov normality test; (5) determining the appropriate correlation technique; (6) conducting the hypothesis test; and (7) drawing conclusions. There were four classes in the population with distinct characteristics; thus, this research applied stratified random sampling. The researchers took a sample of 40 from the 102 subjects in the population, with 10 subjects randomly drawn from each class.
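The sampling step can be illustrated with a minimal sketch, assuming each class is treated as one stratum and 10 subjects are drawn from it without replacement. The function name, the subject IDs, and the sizes of classes A and B are hypothetical (the text reports only 25 students in class C, 24 in class D, and 102 in total); this is not the researchers' actual procedure.

```python
import random


def stratified_sample(population, per_stratum, seed=None):
    """Draw `per_stratum` subjects at random from each class (stratum).

    `population` maps a class label to the list of subject IDs in that class.
    """
    rng = random.Random(seed)
    sample = {}
    for label, subjects in population.items():
        if len(subjects) < per_stratum:
            raise ValueError(f"class {label} has fewer than {per_stratum} subjects")
        # rng.sample draws without replacement, so no subject appears twice.
        sample[label] = rng.sample(subjects, per_stratum)
    return sample


# Sizes for C and D follow the text; sizes for A and B are placeholders
# chosen only so that the total matches the 102 subjects in the population.
sizes = {"A": 26, "B": 27, "C": 25, "D": 24}
population = {
    label: [f"{label}{i:02d}" for i in range(1, size + 1)]
    for label, size in sizes.items()
}

sample = stratified_sample(population, per_stratum=10, seed=42)
total = sum(len(ids) for ids in sample.values())
print(total)  # 40 subjects, 10 per class
```

Drawing the same number from every class keeps each proficiency level equally represented in the sample, even though the classes differ slightly in size.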

FINDINGS
The Integrated Course is a course that students take at the beginning of their studies. The purpose of this course is to develop students' proficiency at the intermediate level of English. To make it work, students were classified into classes with different competency levels, so that students with the same proficiency level were placed in the same class. The purpose of this policy is to ease the job of teaching and to help students learn more easily, since students placed with peers of the same proficiency level are more comfortable in learning. The test used to classify students by proficiency level was the Paper Based TOEFL test.

Validity Evidence of Paper Based TOEFL Test
Validity in testing and assessment means that a test "measures accurately what it is intended to measure" (Hughes, 1989: 22). In this study, the two variables measured were the placement test score and the achievement test score. These two variables basically measured the same thing, namely students' proficiency. In this case, the PB TOEFL test was considered valid if it measured students' language proficiency. In language learning, students' proficiency is regarded as the degree of skill with which a person can use a language, such as how well a person can read, write, speak, or understand language (Richards & Schmidt, 2010).
The validity evidence in this study was criterion-related validity. In this case, the validity evidence is the strength of the predictive relationship between the test score and performance on the criterion (Fulcher & Davidson, 2007: 5). Although there are four language skills (listening, speaking, reading, and writing) and three language components (vocabulary, grammar, and pronunciation), only two language skills (listening and reading) and two language components (vocabulary and grammar) were used as the language performance criterion in this study.
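The strength of the relationship between the test score and the criterion can be expressed as a Pearson product-moment coefficient. The sketch below shows how such a coefficient is computed from paired scores. The function name and the score lists are made up for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores (placement test, final achievement test).
toefl = [310, 335, 350, 362, 371, 380, 395, 430]
achievement = [62, 75, 70, 78, 74, 83, 80, 88]

r = pearson_r(toefl, achievement)
print(round(r, 2))
```

A coefficient near 1 would mean the placement score predicts achievement almost perfectly, while a value near 0 would mean it carries little predictive information.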
The Paper Based TOEFL test used in this study consisted of three sections: listening, structure, and reading and vocabulary. Students had to complete 140 test items and were given 140 minutes to finish. As shown in Table 1, students were classified by Paper Based TOEFL score into four classes, from the lowest to the highest competence level, namely D, C, B, and A. Students with scores in the range 297-343 belonged to class D, the low-proficiency class; students with scores of 344-363 belonged to class C, considered the mediocre class; students with scores of 364-380 belonged to class B, the good class; and students with scores of 381-463 belonged to class A, the excellent class.
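The classification rule described above amounts to a lookup over score bands. The following sketch encodes the four reported ranges. The names `BANDS` and `assign_class` are illustrative, and the sketch assumes the reported ranges act as fixed cut-offs, which the text does not state explicitly.

```python
# Score bands reported in the text: (lowest score, highest score, class label).
BANDS = [
    (297, 343, "D"),  # low-proficiency class
    (344, 363, "C"),  # mediocre class
    (364, 380, "B"),  # good class
    (381, 463, "A"),  # excellent class
]


def assign_class(score):
    """Return the class label whose band contains the given TOEFL score."""
    for low, high, label in BANDS:
        if low <= score <= high:
            return label
    # Scores outside 297-463 were not observed in the study, so the sketch
    # refuses to guess a class for them.
    raise ValueError(f"score {score} is outside the reported bands")


print([assign_class(s) for s in (297, 350, 364, 463)])  # ['D', 'C', 'B', 'A']
```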
To check whether the Paper Based TOEFL score really represents students' proficiency, the students' score on the Paper Based TOEFL test was compared with students' achievement score after following the Integrated Course for 6 months. In the Integrated Course, students learned to develop their English skills, including listening, speaking, reading, and writing, as well as the language components, including vocabulary and grammar, in an integrated way. However, for the purpose of this study, the students' achievement score was derived from the scores for listening, reading, grammar, and vocabulary.
Class D consisted of 24 students; the lowest achievement score was 53 and the highest was 89. Most students in this class had low proficiency; they needed to learn basic English. Some students explained that they had not received proper English lessons in senior high school, so it was difficult for them to follow the lessons. Moreover, 4 students scored below 70. Based on the researcher's observation, students' characteristics influenced their achievement. The students who scored below 70 were not motivated, tended to be passive in learning, and were not willing to help themselves learn. However, 3 students scored above 80; they were highly motivated and tended to be active in learning.
Class C
Table. The Summary of Students' Score in Class C
Class C consisted of 25 students whose Paper Based TOEFL scores ranged from 347 to 363. This was the mediocre class; students' proficiency was considered slightly better than that of students in class D. However, unlike class D, class C showed a lower standard deviation, meaning that the variance in scores was low. It could be inferred that students in this class generally made the same progress in learning. Students who scored well on the placement test mostly also scored well on the final test of the Integrated Course. Most students in this class still needed to learn basic English, especially grammar and vocabulary. It was hard for them to understand reading texts and spoken monologues and dialogues.

CONCLUSION
To sum up, this study outlined two important findings. Firstly, there was a moderate positive correlation (0.417) between the placement test (PBT TOEFL test) score and students' achievement score in the Integrated Course. This was shown by the tendency for an increase in the placement test score to be followed by an increase in students' achievement score in the Integrated Course. It can be inferred that the PBT TOEFL test is a valid instrument for placement: the students' class reflected students' proficiency before attending the course.
Secondly, in some cases students from the excellent class did not perform better in learning, while students from the low class did. Another factor that contributed to students' achievement in learning was students' characteristics and background. Students from the lowest class could perform better if they were highly motivated, and conversely, students from the excellent class got low achievement scores if they were passive in learning. Moreover, students who had received good and appropriate English lessons in high school performed well on the test, whereas students from the lowest class generally performed poorly on the placement test because they had not received basic English lessons in senior high school. Therefore, future researchers need to address the effect of students' characteristics and background on the validity of the TOEFL test as a placement test.
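The significance of the correlation reported in the first finding can be checked with the standard t statistic for a Pearson coefficient, t = r * sqrt((n - 2) / (1 - r^2)). The sketch below assumes the coefficient of 0.417 was computed on the 40 sampled subjects, which the text does not state explicitly. The critical value is taken from a standard t table for a two-tailed test at the 0.05 level with 38 degrees of freedom.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Assumption: n = 40 is the sample size given in the Methods section.
t = t_statistic(0.417, 40)

# Two-tailed critical value at alpha = 0.05 with df = 38 (standard t table).
T_CRITICAL = 2.024

print(round(t, 2), t > T_CRITICAL)  # 2.83 True
```

Under this assumption the statistic exceeds the critical value, which is consistent with the correlation being described as significant. With a larger n the statistic would only increase.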