UCLA’s National Center for Research on Evaluation, Standards, and Student Testing (CRESST) has published this month a report called On the Road to Assessing Deeper Learning: The Status of Smarter Balanced and PARCC Assessment Consortia.

The report gives its general imprimatur to the consortia’s ongoing progress (what did you honestly expect?), cloaked in the usual hedged language:

Study results indicate that PARCC and Smarter Balanced summative assessments are likely to represent important goals for deeper learning, particularly those related to mastering and being able to apply core academic content and cognitive strategies related to complex thinking, communication, and problem solving.

Any challenges in implementation, the report foresees, will not be substantive but rather ``technical, fiscal, and political’’.

The CRESST report sounds one note of caution: ``Absent strong representation in the new assessments, students' deeper learning likely will be compromised.’’ Therein lies the rub: will the ``assessments call for deeper learning and reflect 21st century competencies’’? (CRESST report, p.5)

As we at ccssimath.blogspot.com are also interested in the assessments being developed by PARCC and SBAC, we thought we’d follow CRESST’s lead and release our own status report.

***

We suggest readers review the CRESST report either before or while following along with our analysis.

The CRESST report states ``the summative [as opposed to formative] components currently are receiving the lion’s share of attention’’ and, as we have only seen sample summative assessment tasks, we will focus on them as well.

PARCC and SBAC have each published several dozen sample tasks, which are tied to various criteria in addition to specific CCSSI standards. We attempt to briefly summarize those criteria below.

PARCC’s questions are divided into 3 types:

Type I are ``tasks assessing concepts, skills and procedures’’

Type II are ``tasks assessing expressing mathematical reasoning’’

Type III are ``tasks assessing modeling/applications’’

We will analyze a PARCC sample item in another blog post.

SBAC has, so far, been more detailed in its assessment criteria, and categorizes tasks by a combination of claims and targets. The claims are:

Mathematics Claim #1: Concepts and Procedures ― Students can explain and apply mathematical concepts and interpret and carry out mathematical procedures with precision and fluency.

Mathematics Claim #2: Problem Solving ― Students can solve a range of complex well-posed problems in pure and applied mathematics, making productive use of knowledge and problem solving strategies.

Mathematics Claim #3: Communicating Reasoning ― Students can clearly and precisely construct viable arguments to support their own reasoning and to critique the reasoning of others.

Mathematics Claim #4: Modeling and Data Analysis ― Students can analyze complex, real-world scenarios and can construct and use mathematical models to interpret and solve problems.

Each claim has a series of targets, which are too numerous to list here, but we direct interested readers to SBAC’s General Item and Task Specifications for three assessed grade ranges: 3–5, 6–8, and high school. Each target is assigned one or more of four depth of knowledge (DOK) levels, as described in the CRESST report.

Under the requirements of NCLB and the federal Elementary and Secondary Education Act, weighted scores for questions categorized under each claim will be aggregated to form a ``Total Mathematics Composite Score’’.

Common Core will be center stage by the 2014-15 school year, a year-and-a-half away, and the consortia need to have rolled out a finished product. As is our practice, we prefer to cut through the ubiquitous rhetoric and edu-speak to analyze questions that students may actually see. We’ll start with a specific example below.

Should the sample tasks provided by PARCC and SBAC be considered works-in-progress and given slack because they are drafts? Or because the consortia have decided to publish a limited number of ``official’’ sample tasks, should we consider the questions fully vetted and accurately representing assessments to come?

The consortia don’t stand to lose face either way: if we give our approval, perhaps they will strut; if we pan the tasks, maybe they’ll explain away the shortcomings. No one in either consortium responds to our emails anyway, so we’ll just proceed.

***

We chose for our first analysis a high school level question from SBAC that it designates 4E. That means the question will address Claim 4, whether ``[s]tudents can analyze complex, real-world scenarios and can construct and use mathematical models to interpret and solve problems.’’ Target E states the task should provide evidence that students can ``[a]nalyze the adequacy of and make improvements to an existing model or develop a mathematical model of a real phenomenon.’’ Target E in turn is assigned both DOK3 ``[a]pplications requiring abstract thinking,’’ and DOK4 ``[e]xtended analysis or investigation that requires synthesis and analysis across multiple contexts and non-routine applications.’’ These are the two highest DOK levels.

The question is aligned, according to SBAC, to two CCSSI standards:

F.BF.1a ``Determine an explicit expression, a recursive process, or steps for calculation from a context.’’

F.LE.1b ``Recognize situations in which one quantity changes at a constant rate per unit interval relative to another.’’

From the combination of Claim 4, Target E and DOK3&4, we can infer that such a task should be of nearly the highest possible caliber that high school students would be assigned, and therefore would come closest to evidencing ``21st century’’ ``college and career ready’’ skills.

Here is the SBAC sample task:

We first consider in context some basic pitfalls to be on the lookout for:

¶Can questions misstate facts? The ``rule’’ as stated in this problem is wrong. It should read ``A driver must count a minimum of two seconds from when the car in front of him or her passes a fixed point...’’

¶Can pictures and text be contradictory? In the diagram, a segment representing following distance is drawn from the back of the leading car to the front of the following car. But that’s not what’s written. An accurate diagram would show the distance front-to-front, middle-to-middle, or end-to-end, or in the alternative, the problem would accurately explain the picture.

¶Can questions contain confusing language that obfuscates the underlying mathematical tasks? This is possibly the biggest pitfall of all. Much ado is made of the scoring debacles that occur when exam questions contain errors in math, but less attention is given to problems in questions’ wording. With the movement away from strict calculation questions to ``real life’’ questions that ask for explanations and the like, ambiguities and other problems in wording will most certainly become a regular feature of exam critiques.

Simple example. Two sentences in the SBAC question are redundant and add needless wordiness and repetition: ``As the speed of the cars increases, the minimum following distance also increases.’’ Then: ``Explain how `the two-second rule’ leads to a greater minimum following distance as the speed of the cars increases.’’ Students may waste time poring over both sentences just to see what each actually says, only to discover they are the same.

Also, the last part of the task is poorly written, both for its bad word choice and for its faulty parallelism: ``As part of your explanation, include the minimum following distances, in feet, for cars traveling at 30 mph and 60 mph.’’ Why would students ``include’’ distances? The question should ask students to ``calculate’’ the distances. Faulty parallelism is a particularly insidious problem, as we’ve written before, even about the poor wording in CCSSI itself. We get it, but students may not, that the task requires answers for two separate scenarios: both cars traveling at one speed, and then both cars traveling at a different speed. The way it’s worded, students might think one car is going 30 mph and the other car 60 mph.

Finally, the task talks about an increase in speed. There’s no ``increase’’ connected to this problem, except in an abstract sense; in fact, there are just two distinct but simple scenarios. The word ``increase’’ is a red herring. A better wording might be: ``Calculate the minimum following distances, in feet, for two cars traveling at 30 mph and for two cars traveling at 60 mph.’’

Who will be the final arbiter when the wording of a question is at issue? One word in one question might swing the results for thousands of students between a pass and a fail. With so much at stake, we anticipate a whole new field of litigation that centers on such issues.

¶Can math tasks cause unforeseen problems when students know more than the question assumes? Hand-graded questions will be accompanied by rubrics, but graders need to be prepared for students that know more than they do.

We Googled ``safe following distance’’ and it seems some DMVs suggest a ``three-second rule’’. Some high school driving classes have yet a third ``rule’’: stay behind the car in front of you one car length for every 10 mph. But all of these safety ``rules’’ would make a physicist cringe.

The ``two-second’’ and other such rules exist for simplicity, but they're by no means accurate. As high school classes in physics have become an endangered species, and at the crossroads of Common Core, do we want to set a bad precedent by instilling in millions of students false ideas about motion, momentum and kinetic energy? We support bringing physics back to every high school, if only because anyone studying physics can tell you that, owing to kinetic energy, a car’s stopping distance (and therefore a safe following distance) increases in proportion to the square of the speed, not linearly.

Target E says students can ``make improvements to an existing model.'' This is certainly one of those situations.

Add to stopping distance the linearly proportional distance traveled during reaction time, and the combined mathematical model resulting from the sum of quadratic and linear equations is truly complex, yet an appropriate scenario for high school students to analyze and understand.
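To make the contrast concrete, here is a rough sketch in Python of the two models side by side. It is our own illustration, not anything from SBAC: the function names are ours, and the reaction time (1.5 seconds) and braking deceleration (15 feet per second squared) are assumed placeholder values, not measured ones.

```python
MPH_TO_FPS = 5280 / 3600  # feet per second in one mile per hour


def two_second_rule_ft(speed_mph, gap_s=2.0):
    """Following distance implied by a fixed time gap (linear in speed)."""
    return speed_mph * MPH_TO_FPS * gap_s


def stopping_distance_ft(speed_mph, reaction_s=1.5, decel_fps2=15.0):
    """Reaction distance (linear in speed) plus braking distance (quadratic).

    reaction_s and decel_fps2 are illustrative assumptions only.
    """
    v = speed_mph * MPH_TO_FPS            # speed in feet per second
    reaction = v * reaction_s             # distance covered before braking starts
    braking = v ** 2 / (2 * decel_fps2)   # from v^2 = 2 * a * d
    return reaction + braking


for mph in (30, 60):
    print(mph, round(two_second_rule_ft(mph), 1), round(stopping_distance_ft(mph), 1))
```

Whatever values are assumed, doubling the speed doubles the rule-of-thumb distance but quadruples the braking term, which is the distinction a student ``improving the model'' would be expected to notice.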

¶Can math questions be phony? The SBAC task states as fact ``Drivers use this rule to determine the minimum distance to follow a car traveling at the same speed.’’ This, of course, is nonsense. Under the rule, a driver need never calculate the actual distance, so how is this question ``real life’’? Distance and time are distinct (except in relativity), so let’s not mislead students or give them a phony, contrived task.

¶Can math questions be culturally biased? For students who ride the subway or bus to school, and have perhaps never left their neighborhood or been in a car, a driving scenario might cause a problem. Besides, in some states, 14-year-olds can drive anywhere; in New York City, there are many licensing restrictions, off-limits roads and bridges, and impediments to driving, not least the question of why anyone would drive a car when there’s no place to park. Also, ask drivers who commute on the Santa Monica freeway whether there is space to adhere to a rule of following 2 seconds behind another car. And as the poorly-named Gowanus Expressway experiences a perpetual bumper-to-bumper traffic jam, it’s also likely that some New York City drivers have never had occasion to apply such a ``rule’’.

***

These pitfalls aside, let’s now consider whether the task assesses ``deeper learning’’.

What mathematical skills are actually involved? We count four:

1. Understanding a linear relationship

2. Knowing and applying the formula D = R x T

3. Multiplying fractions

4. Conversion of units

Each of the skills is, at the latest, taught in middle school; therefore, the problem belongs in middle school assessments. Part of the blame, though, lies with CCSSI’s F.LE.1b: situations in which one quantity changes at a constant rate relative to another are linear functions, so the standard is completely misplaced in high school. Such tasks with cars moving at constant speeds belong in sixth or seventh grade, when students are learning to graph straight lines. To call this a ``complex’’ scenario is a debasement of American math education.

Here is SBAC’s rubric:

And here is the answer quality that SBAC expects:

Who, exactly, is going to write an essay like that? How many students will successfully shoehorn (and then solve and explain) a ``minimum following distance between two cars’’ task into a question that really asks, ``When a car is traveling at 30 mph, how far does it travel in 2 seconds?’’ That the two questions are mathematically one and the same is not at all obvious, even if the math is easy. Finally, who will write the entire calculation as a solitary equation or formula, without units, no less?
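For comparison, here is the same 30 mph calculation broken into steps, with the units carried along in comments. This is only our own sketch in Python (the variable names are ours), using exact fractions so that nothing is lost to rounding.

```python
from fractions import Fraction

speed = Fraction(30)                     # miles per hour
feet_per_hour = speed * 5280             # miles/hour * feet/mile = feet/hour
feet_per_second = feet_per_hour / 3600   # feet/hour / (seconds/hour) = feet/second
distance_ft = feet_per_second * 2        # feet/second * seconds = feet

print(f"{feet_per_hour} ft/h -> {feet_per_second} ft/s -> {distance_ft} ft")
# prints "158400 ft/h -> 44 ft/s -> 88 ft"
```

Written this way, each step is one of the middle school skills listed above, and a grader can see exactly where a conversion went wrong.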

The rubric reveals expectations that are unrealistic in terms of the level of discourse, and the notion of what comprises a fully-credited answer is too restrictive.

For instance, a succinct, correct (and intuitive) response would also be ``when the speed doubles, the following distance must double, too’’, but it seems that would be graded as wrong under the rubric because it’s not general enough.

Also, does Target E really allow for a student to stand up and correct SBAC? Not according to the rubric, it doesn't. It seems students will only be credited for correcting contrived mistakes, not real ones.

Finally, for completely understanding the problem and writing a detailed explanation, but for failure to do the correct conversion, students will lose half-credit because they got a wrong answer. Only a few students won’t get stuck on the calculation and dimensional analysis, which leads to, gasp, multiplying and simplifying fractions.

***

We understand the inherent challenges in creating readable, unbiased, and effective assessments, but this SBAC task is not even close to adequate. A well-formulated task builds on itself to lead to more complexity as a way to usefully differentiate between those that understand the basics and those that can make a more complex analysis. The SBAC question has no way to make such distinctions; it is all, half, or nothing.

If assessments are going to be posed usefully, they should go more like this:

Given the fact pattern and diagram.

1. If the two cars are both traveling at 30 mph, how many feet behind should the second car be in order to pass the same point 2 seconds later (or to comply with the ``two-second rule’’)?

2. If the two cars are both traveling at 60 mph, how many feet behind should the second car be in order to pass the same point 2 seconds later?

3. How does the speed of the cars in Part 2 compare to the speed of the cars in Part 1?

4. How does the following distance between the cars in Part 2 compare to the following distance between the cars in Part 1?

5. What kind of mathematical model (function) expresses the relationship between the following distance and the cars’ speeds?

6. Under the ``two-second rule’’, if D is the following distance between the cars (in feet) and R is the cars’ speed in mph, write the equation that expresses the relationship.

7. Graph the relationship between speed and following distance.

8. Explain how you can quickly determine the following distance at 50 mph.

The question posed this way is stepped, so that the assessment can more accurately gauge how far along the analytical process the student has progressed.
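As a quick check on where the steps lead, here is a short Python sketch of the model that Parts 5 through 8 are driving at. The constant and function names are ours, chosen only for illustration.

```python
from fractions import Fraction

# Feet of following distance per mph under a 2-second gap:
# (5280 feet/mile) * (2 seconds) / (3600 seconds/hour)
SLOPE_FT_PER_MPH = Fraction(5280 * 2, 3600)


def following_distance_ft(speed_mph):
    """Part 6: D = (44/15) * R, with D in feet and R in mph."""
    return SLOPE_FT_PER_MPH * speed_mph


print(SLOPE_FT_PER_MPH)                                        # 44/15
print(following_distance_ft(60) / following_distance_ft(30))   # 2
print(float(following_distance_ft(50)))                        # Part 8, about 146.7
```

The relationship is a line through the origin, so the distance at 50 mph can be read off the graph from Part 7 or scaled from the 30 mph answer (50/30 of 88 feet) without redoing the unit conversion.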

***

The CRESST report’s conclusion states ``How well [the consortia’s assessments] capture DOK4, which represents the heart of goals for deeper learning, will depend on the nature of the performance tasks that are included as part of each consortium’s system.’’ We agree with CRESST that the devil is in the details, but unlike CRESST, we’re looking at those details, and if the most challenging assessment questions are going to have multitudinous shortcomings like the SBAC example, Common Core’s goals of ``deeper learning'' will be an unreachable pot o’ gold at the end of the rainbow.

***

POSTSCRIPT

Linear functions are commonplace, important and useful, and students should be exposed long before high school to a wide variety of scenarios (applications) in which there is a linear relationship between two quantities. We’ve said before that exposure to multiple applications is the most effective way for students to abstract concepts.

Here is one such example where the linear relationship is not obvious at first glance, and that is what makes the question particularly intriguing:

Although the relationship initially is not obvious (when the height of a triangle remains constant, the change in area is directly proportional to the change in the base), we like straightforward tasks like this, unfettered by verbal irrelevancies, that cut to the chase: questions that require understanding of the underlying mathematical skills, but also a degree of insight in a non-obvious task as well as problem solving abilities, both of which we consider to be real ``college and career ready’’ skills.

A very well-done analysis, and one with which many educators would -- I hope! -- agree.

Obviously, this sort of work is of the utmost importance at the moment, when standards are being established that may last a long time.

How can we guarantee that work of this sort gets noticed where it needs to be noticed?

The physics of the 2-second rule is not so bad. Unless the car in front of you hits a tree, stopping distance is not what matters. If you see them slam on the brakes, and you manage to slam on the brakes within 2 seconds, you will not hit them, even if traveling at high speed on ice. After working through why this is so, you will see that measuring the interval as shown (not including the length of either car) is indeed the right way to measure it.

But generally I agree with all of your analysis and agree 100% with Doug1943's comments, especially the import of his last question.