#Better framing in the front end of why these measures make sense (likely using the points from RFP for the special issue)
I would need to address Reviewer 3's comments point-by-point, so here they are (re-ordered):
#Discuss when "standardized measurement" improves outcomes, when it does, and perhaps how.
#Discuss systematic bias due to gaming metrics incentivized by rewards attached to the measurement (old and new). Specifically, note that this behavioral change might have nothing to do with the underlying phenomenon of interest. (Think "Rewarding A While Hoping For B", etc., multitask, etc.)
#Justify the choices in the reduction of the measurement space. Measurement is often reductive: What is left out? Is it important? Discuss the difference between the conceptual phenomenon and how it is operationalized.
#Consider possible downsides of each individual metric
#Consider possible downsides in using the entire battery of measures
#Test alternatives to this measurement approach
#Test the effects of using a framework on various outcomes.
Of these, points 1, 2, and 3 are reasonable suggestions and could be addressed. Point 4 is more problematic but probably possible in some sense. Point 5 could be interpreted as the downside of the framework as a whole (see below), in which case it could also be addressed. Points 6 and 7 might be excused by explaining that the editors and I have agreed that this won't be a testing paper. Nevertheless, the paper will include examples throughout of not using the measurement framework.
Also for reference, here are the measures from the paper: