In the direct assessment of writing ability, the variability of human decision-making during scoring poses great challenges to the validity of assessment (Kane, 2006). The variables that cause differences in individual raters' scoring interpretations have been widely investigated (e.g., Eckes, 2012; Wolfe et al., 2016). However, the issue of how raters negotiate to resolve discrepancies has not received attention, although rater negotiation is a widely used score resolution method. As scholars interested in the argumentative behavior of raters have emphasized (e.g., Trace et al., 2017), a systematic analysis of score negotiations will make it possible to assess their dependability. The purpose of this study is twofold: to present a thorough analysis of the argumentative structure of rater discrepancy resolution discussions with a view to understanding their underlying dynamics, and to investigate whether the elements of the argumentative structure of negotiations differ between research settings and authentic score resolution practices. To this end, rater negotiations following a written test at the language school of an English-medium university were analyzed within the framework of Argumentation Theory by Toulmin (1958) and Walton (2005, 2016). The negotiation data were obtained from 99 recorded rater discussions among 30 EFL teachers and were transcribed, coded, and categorized into argumentative discussion moves. A Rater Negotiation Scheme (RNS) was developed through a recursive process of data analysis and categorization, and it was validated through field-testing in authentic settings. The findings have implications both for research on rater negotiations and for arguments about the reliability of the method.