Archive for the ‘Testing, Measurement, Assessment & Evaluation’ category

Tests and Their Value

August 15, 2011

 Tests and Their Value

Teacher in Kuwait

 May 2009

 

I think I have started to change my mind about the value of tests. I used to believe strongly that high-stakes tests were the best way to check a student's understanding and to make sure he was doing his work. But in this course we knew from the beginning that we would not have a test, and still all of us are doing our work efficiently. Moreover, we are genuinely attentive to each other's presentations. We discuss our topics extensively in class, and we research each other's topics because we are interested. Most important, we do our work efficiently but remain relaxed, because we know we will not have a test. I assure you that the amount of knowledge and learning that has happened in this class is far greater than in other classes in which tests were given.

Rubrics – Origin, Purpose, Characteristics and Bloom Based Applications

August 15, 2011

Rubrics

 Origin, Purpose, Characteristics and Bloom Based Applications

By

Lawrence P. Creedon

 

Rubrics have to do with assessment and evaluation. They are an alternative to the more traditional, subjective procedure of awarding a grade in response to teacher judgment, quotas or other competitive procedures. Rubrics are neither subjective nor competitive. In using rubrics teacher subjective judgment is kept to a minimum and quotas are not used in controlling the number or percent of grades that can be awarded in a given category. Individual competence and mastery are the determinants.

 

With rubrics teacher subjectivity is reduced. Grading systems that adhere to a predetermined formula are eliminated. With rubrics the quantity of the work performed and the mechanics of language usage and construction in reporting on what has been learned are not interwoven with the quality of the work performed. Each is evaluated separately.

 

Rubrics are criterion referenced. Competence is defined as the capacity to do what needs to be done. The definition of standards and benchmarks as applied to rubrics is not the same as that associated with standardized tests and No Child Left Behind, the United States federal education law. However, rubrics are standards- and benchmark-based. The standard is stated in terms of mastery, and benchmarks indicate the level of mastery.

 

Origin of Rubrics

The use of rubrics in education is of recent origin; however, as with many so-called initiatives in education, the concept is not new. Historically the term is derived from the Latin rubrica, meaning “red earth.” During the Middle Ages it was common for important passages within official documents to be highlighted in red ink. Red markings within liturgical documents indicated the rule or religious precept being promulgated. In legal documents, text in red often indicated a heading in a code of law, which led to rubric coming to mean any brief, authoritative rule.

www.music.miami.edu/assessment/rubricsDef.html

 

Characteristics of Rubrics

  1. The purpose of a rubric is to clearly and distinctly identify the terminal behavior expected from a learner as the result of a learning experience.
  2. Rubrics are criterion referenced. The criteria for performance are stated in the rubric.
  3. Rubrics focus on competence. Competence is defined as the capacity to do what needs to be done.
  4. Rubrics can be holistic, multiple or task specific. As holistic, a rubric focuses on a complete concept, topic or issue. As multiple, several rubrics subordinate to a holistic rubric are developed, with each subordinate rubric related to the same issue, topic or concept. As task specific, a rubric is limited in scope and breadth to one specific task related to a holistic rubric or to one component of a multiple rubric.
  5. Rubrics ought to be developed in conformity with a taxonomical framework such as found in Bloom’s six category cognitive domain.
  6. Rubrics ought to be developed so as to address a variety of intelligences in cognitive, affective and psychomotor domains.
  7. Rubrics ought to be developed at the outset of the learning experience, and shared before instruction begins with those for whom they are intended.
  8. Learners themselves ought to participate in the development of the rubrics consistent with their personal level of competence and maturity.
  9. Instruments used for evaluation [tests] ought to be an outgrowth of the rubrics developed for the learning experience.
  10. The terms used in a rubric ought to be clearly defined, specifically related to a level of cognition and be constant in application from rubric to rubric.
  11. Rubrics are applicable at all levels and in all curriculum areas.

 

While not recommended, rubrics can, if necessary, be converted to a letter or percentage grade in order to comply with traditional grading procedures.
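The conversion mentioned above can be sketched in a few lines of code. This is a minimal illustration only; the letter and percentage cutoffs below are hypothetical assumptions, not values prescribed in this paper.

```python
# Minimal sketch only: converting a 0-4 rubric score to a traditional
# letter and percentage grade. The cutoffs are hypothetical assumptions,
# not values prescribed in this paper.

def rubric_to_grade(points: int) -> tuple:
    """Map a 0-4 rubric score to a (letter, percent) pair."""
    if not 0 <= points <= 4:
        raise ValueError("rubric score must be between 0 and 4")
    letters = {4: "A", 3: "B", 2: "C", 1: "D", 0: "F"}
    percents = {4: 95, 3: 85, 2: 75, 1: 65, 0: 50}
    return letters[points], percents[points]

print(rubric_to_grade(3))  # ('B', 85)
```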

 

Limitations of Rubrics

Rubrics are not the “silver bullet” of assessment, and they are not universally embraced. For example, Alfie Kohn [www.alfiekohn.org], among others, has offered a thoughtful critique of rubrics [Alfie Kohn, English Journal, March 2006, Vol. 95, no.]. However, Kohn does not address the process of applying Bloom’s taxonomy to rubrics. It is this application that will be addressed in the remainder of this paper.

 

Bloom’s Cognitive Taxonomy Applied to Rubrics

The six-category cognitive domain taxonomy of Benjamin Bloom et al. [1956] provides an excellent vehicle for the development and application of rubrics. Bloom’s taxonomy was, and continues to be, an effort to minimize arbitrariness and subjectivity in grading. As stated by Bloom and his colleagues, the purpose of their taxonomy was to:

…develop a theoretical framework which could be used to facilitate communication among examiners….After considerable discussion, there was agreement that such a theoretical framework might best be obtained through a system of classifying the goals of the educational process, since educational objectives provide the basis for building curricula and tests and represent the starting point for much of our educational research.

The “framework” referred to by Bloom is now understood to mean rubrics.

 

Bloom and his associates had two intentions in offering the taxonomy. First, they intended to:

…provide taxonomy of educational objectives so as to provide for classification of the goals of our educational system. It is expected to be of general help to all teachers, administrators, professional specialists, and research workers who deal with curricular and evaluation problems… For example, some teachers believe their students should “really understand,” others desire their students to “internalize knowledge,” still others want their students to “grasp the core or essence” or “comprehend.” Do they all mean the same thing? Specifically, what does a student who “really understands” do which he does not do when he does not understand? Bloom, p. 1.

 

Second, they intended that the taxonomy should be of direct help to classroom teachers responsible for curriculum building. The original 1956 Bloom taxonomy handbook states:

Teachers building a curriculum should find here a range of possible educational goals or outcomes in the cognitive area (“cognitive” is used to include activities such as remembering and recalling knowledge, thinking, problem solving, creating).  Comparing the goals of their present curriculum with the range of possible outcomes may suggest additional goals they may wish to include…. Bloom, pp. 1 and 2. 

 

The relationship between the taxonomy of Bloom and his colleagues and the contemporary focus on rubrics is clear. Later on in this piece it will be shown that today Bloom’s taxonomy has application that extends far beyond that intended by its original framers.

 

While Bloom’s taxonomy is not the only structure suitable for creating rubrics, adherence to Bloom minimizes the subjective nature of rubrics. A commercial reality is that many published listings of rubrics are nothing more than a regurgitation of more traditional approaches to test building. In such situations all that has changed is the name; it is the same old wine in new bottles.

 

Subjectivity such as teacher judgment can be a major limitation of rubrics that do not follow a cognitive structure such as Bloom’s. It should be cautioned at the outset that teacher subjectivity is not always a limitation to be avoided. Often, when teacher judgment is offered in holistic fashion concerning a student, it can be more valid and reliable than other forms of evaluation.

 

However, when subjectivity characterizes a rubric, the whole process has been reduced to another form of traditional grading, such as awarding letter or percentage grades, imposing quotas, or manipulating grades to fit the bell curve. Such procedures are in sharp contrast to rubric-based evaluation, with its clearly and distinctly defined criteria for competence and mastery.

 

In this context, the purpose of schooling is learner mastery of that which is being taught and not competition among learners. It is self-fulfillment for all and not survival-of-the-fittest for a few. Rubrics do not foster an oligarchy.  In the words of Alfie Kohn the purpose of schooling is to maximize success and not to ensure that there will be failures. Rubrics properly understood and applied can contribute to the quest.

 

Bloom’s Six Category Cognitive Domain

The six categories of Bloom’s cognitive taxonomy are:

  1. Knowledge/ Information
  2. Comprehension
  3. Analysis: Compare and Contrast
  4. Synthesis
  5. Evaluation
  6. Application.


The six divide into low-order and high-order cognitive categories. Low order has to do with Knowledge/Information and Comprehension; high order relates to the other four. It is commonly recognized that, for the most part, schools deal in low-order activities, while their promotional literature, as found in mission statements and the like, proclaims a focus on high-order pursuits. For the most part, standardized tests are low-order cognitive. The best examples are the tests mandated by NCLB.

 

In their taxonomy, Bloom and his colleagues use the term knowledge, as opposed to information, when considering the data input to the learning experience. Bloom defined knowledge as: those behaviors and test situations which emphasize the remembering, either by recognition or recall, of ideas, material, or phenomena. Bloom, p. 62. However, in reality, when Bloom and his associates discuss knowledge they go far beyond information. They actually use the term knowledge in synoptic fashion to summarize the total impact of all six cognitive categories on the learner. To them, knowledge referred to the seamless whole. Bloom and associates postulated four categories of knowledge: specifics, terminology, facts, and universals and abstractions [Bloom, pp. 62-75].

 

In contrast, information is limited to data only, which relates most closely to what Bloom terms knowledge of specifics and terminology. At the outset it does appear as if he and his colleagues are equating information with knowledge. They are not.

 

I prefer to use the term information when referring to the first step in the cognitive process and reserve the word knowledge to indicate the sum total of the whole six category taxonomically based learning experience.

 

Bloom’s understanding of comprehension and of analysis (compare/contrast) is straightforward. No further clarification will be offered here.

 

When considering synthesis, Bloom goes beyond the common understanding of the term as referring to bringing varying and possibly contrasting views together into a new whole. To Bloom, synthesis is the only original aspect of the whole cognitive process. It is the creative component. Synthesizing is advancing what is to a new level. It relates to the ancient Greek concept of thesis, antithesis and synthesis. It is at the level of synthesis that new knowledge is created.

 

To Bloom and his associates, evaluation was not an activity or an event that took place solely at the end of a learning experience. Rather, evaluation was to be on-going throughout the entire learning experience. W. Edwards Deming, in his concept of Total Quality Management in industry, promoted the notion of evaluation as an on-going process rather than a single terminal exercise. While not colleagues, Deming and Bloom agreed on the place and purpose of evaluation. In the last years of his life Deming attempted, with little success, to bring the concepts of TQM to bear on the education establishment.

 

Evaluation as proposed by Bloom, Deming and a host of others is in sharp contrast to the high stakes standardized test approach of NCLB. Through NCLB the learner sinks [gets held back] or swims [gets promoted or graduates] based on the results of standardized tests.

 

Application, as understood by Bloom, implies that no real learning has taken place unless what has been learned can be put into practice. Bloom takes issue with the notion of knowledge for knowledge’s sake; application honors the notion that learning without application is not learning in any meaningful sense.

 

The six categories of Bloom’s taxonomy are not rigid. Beyond the first two of information and comprehension they do not necessarily unfold in any predetermined order. Teacher judgment can be decisive in determining the order of the six categories beyond the first two. The order will depend on the topic being considered, the purpose of the learning experience, the learning resources at hand and the learning style of the learner.

 

Clear and Distinct Rubrics

Bloom and his colleagues identified a list of action words associated with each of their six cognitive categories. In doing so, they anticipated the whole rubric-based evaluation process.

 

It is in using the most appropriate action word associated with a specific cognitive category that a rubric becomes clear and distinct. Action words, or verbs, that are clear and distinct are factors in defining mastery and competence. Qualitative words such as little, most and some are too subjective to be effective in rubrics and should not be used. To say that an assignment has been completed is not precise enough; it is important to specify at what level of mastery and under what conditions an exercise was completed. Lists of action terms associated with each of Bloom’s six categories are readily available commercially, or an enterprising faculty can develop its own as a professional development activity. Among those on the market is Quick Flip Questions for Critical Thinking, developed by Linda G. Barton and available through Edupress, Inc, POB 883, Dana Point, CA 92629. See also Curriculum and Project Planner for Integrating Learning Styles, Thinking Skills, and Authentic Instruction, Imogene Forte and Sandra Schurr, Incentive Publications, Nashville, TN, 1996.

 

A clear and distinct rubric ought to be limited to one specific topic. It should not combine several factors, such as the process for addressing the task, the format for reporting on what has been accomplished, the correctness of the grammar [including spelling], and the content being considered. Each of these ought to be addressed in a separate rubric. Also, when awarding a value to each level of content mastery within a rubric, components such as process, format and language mechanics ought not to be of equal value with content. For example, if a rubric has a mastery/competence range of zero to four points, then conforming to expectations in process, format and language mechanics is not equal to mastery and competence in the content of the rubric. If the learning exercise includes submitting a series of reports, the mere submission of the reports is not equal to the actual content of the reports. Submitting reports on time and as directed ought to receive no more than the minimum number of points possible; in this example, one point.

 

Submitting a report or a series of reports says little about the quality of the work accomplished. All it signifies is that a requirement for submitting an assignment has been met. However, if the purpose of the rubric is to address process, format or grammar mechanics, then awarding quality points beyond the minimum for each of these factors would be appropriate. The subject matter of the learning exercise stands alone, and through the rubric mastery and competence are identified. As stated earlier, competence is defined as the capacity to do what needs to be done.
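The separation of content from submission requirements described above can be sketched as follows. This is an illustrative sketch under the paper's example of a zero-to-four scale, with timely submission capped at the minimum one point; the function and field names are hypothetical.

```python
# Illustrative sketch only: scoring the content of a report separately from
# its submission, per the discussion above. Timely submission is capped at
# the minimum award of one point; names and structure are hypothetical.

def report_scores(content_points: int, submitted_on_time: bool) -> dict:
    """Return separate scores for content quality and for submission."""
    if not 0 <= content_points <= 4:
        raise ValueError("content score must be between 0 and 4")
    return {
        "content": content_points,                    # quality of what was learned
        "submission": 1 if submitted_on_time else 0,  # capped at the minimum
    }

print(report_scores(4, True))  # {'content': 4, 'submission': 1}
```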

 

Scoring Rubrics

A common approach to scoring rubrics is a number-based quality point system usually ranging from zero to four points. The four point system is not an absolute. Alternative approaches are appropriate as long as whatever is used is consistent among all applications and by all users.

 

Zero points indicate non-compliance with, or lack of understanding of, the assignment. At the other end, four quality points indicate exemplary performance as shown by the four Bloom categories identifying critical thinking, metacognition and application. The four are: analysis (compare and contrast), evaluation, synthesis and application.

 

One point is indicative of low-order cognition, as in information accumulation and comprehension. Such activities as recalling, describing, identifying and regurgitating are indicative of low-order cognition and thus in a rubric would receive one point. It is these two low-order categories that are most frequently associated with tests, whether teacher-made or standardized. Examples are true-false, multiple-choice and short-answer questions; some form of recall or regurgitation is being called for.

 

The awarding of two, three or four points on a four-point scale indicates that the learner has gone beyond recall and regurgitation to higher-order components of critical thinking such as analyzing, comparing and contrasting, evaluating, applying and synthesizing. Two points can indicate an ability to analyze and to compare and contrast. Three points can indicate competence in evaluation and application. Four points can be reserved for synthesis, or metacognition; it signals an ability to go beyond what is to a new thought, use, or application.
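The zero-to-four quality-point scale described above can be summarized in a short sketch. The mapping from Bloom category to points follows the text; the dictionary and function names are illustrative assumptions, not part of the paper.

```python
# Illustrative sketch of the zero-to-four quality-point scale described
# above. The mapping from Bloom category to points follows the text; the
# dictionary and function names are hypothetical.

QUALITY_POINTS = {
    0: "non-compliance or lack of understanding of the assignment",
    1: "low-order cognition: recalling, describing, identifying",
    2: "analysis: comparing and contrasting",
    3: "evaluation and application",
    4: "synthesis, or metacognition: going beyond what is",
}

BLOOM_TO_POINTS = {
    "knowledge/information": 1,
    "comprehension": 1,
    "analysis": 2,
    "evaluation": 3,
    "application": 3,
    "synthesis": 4,
}

def score(category: str) -> int:
    """Return the quality points for the highest Bloom category demonstrated."""
    return BLOOM_TO_POINTS[category.lower()]

print(score("Synthesis"))  # 4
```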

 

Synthesis, according to Bloom, is the ability to consolidate all that has gone on before in the other categories and create something new. Not necessarily a new discovery for humankind, but rather an epiphany for the learner.

 

Using Bloom’s taxonomy in this manner is consistent with a constructivist approach to learning. Among the characteristics of a constructivist approach is that learners construct their own knowledge from within the context of their own experience. This is in contrast to the attempted infusion of knowledge pre-determined by a source or authority external to the self and the learner’s experience. Obviously the concept of experience needs to be defined; however, to do so is beyond the purpose and scope of this paper. The work of John Dewey is a good source in this regard.

 

Rubrics and Old Wine

Publishers of educational materials in the “How-To” category have made available a plethora of paperbacks to assist teachers in responding to the learning needs of learners. Many of these publications have a great deal of merit; however, others are of the same old wine in new bottles variety.

Many, if not most, in this category do not fit the description of rubrics as presented here. Many are nothing more than traditional approaches to grading re-packaged in a new format, with new terms to describe old approaches. Many do nothing more than recast grading in response to teacher judgment and, regardless of the descriptors used, come out as A, B, C, etc. Educators are often accused of marketing old wine in new bottles, and if that is true, the shoe frequently fits for rubrics. It is these that Alfie Kohn and other critics of rubrics seem to address.

 

Applications Beyond Testing

Rubrics have many applications beyond being used in testing. Several are cited here.

  1. Standards and Benchmarks: Earlier in this piece reference was made to rubrics providing a vehicle for expressing standards and benchmarks.
  2. Individual Accountability: Learners, under teacher direction and guidance, can create rubrics for themselves indicating the road that must be traveled in order for each learner to accomplish his/her personal best.
  3. Peer Assessment: Two or three classmates or colleagues can establish a “critical friends” support team where they critique each other consistent with individually or peer developed rubrics.
  4. Formative Assessment or Supervision: Provide a technique that can be applied in a formative program of professional development or supervision.
  5. Critiquing a Professional Development Activity or a Faculty Meeting:  Provide an alternative to the traditional procedure for soliciting “feedback” at the conclusion of a professional development activity or faculty meeting.
  6. Parent Input: Provide a vehicle for soliciting parent input and feedback at either a private meeting with a student’s parents or in conjunction with a meeting with the parent community.

 

Questions to Consider

In an article related to the topic of grades and grade inflation published in the Chronicle of Higher Education, Alfie Kohn raised several questions that can be applied to rubrics and their application. Kohn posits that the current debate over grade inflation and standardized test scores is misdirected. He asserts that the focus ought to be on grading itself. In so doing Kohn has by inference focused the spotlight on rubrics and their application. Kohn has proposed that the debate on grading and grade inflation should focus on such questions as:  [Kohn, ibid.]

  1. What unexamined assumptions keep traditional grading in place?
  2. What forms of assessment might be less destructive?
  3. How can professors and teachers minimize the salience of grades in their classrooms so long as grades must still be given?
  4. If the artificial inducement of grades disappeared, what sort of teaching strategies might elicit authentic interest in learning?

 

Included in a response to questions 2-4 can be a consideration of rubrics.

 

Note: Examples of rubrics developed by former Framingham graduate students can be found on separate Creedon monograph.

 

Ipse dixit.

Lawrence P. Creedon

Federated States of Micronesia, 2006.

www.larrycreedon.wordpress.com

 

AN OUTCOME

December 19, 2009

AN OUTCOME

Lawrence P. Creedon

An outcome is a statement of the anticipated and/or actual result of a learning experience. It is results oriented. It is specific, clear and distinct. An outcome indicates a level of development, cognitive capability, affective response or motor skill proficiency. It is specific in that its purpose and domain are clearly identified. It is clear in that ambiguity and subjectivity are removed or consciously kept to the lowest possible level, considering that it is the creation of a human being. It is distinct in that it lays out the factors and conditions under which it can be accomplished and measured. It is specific, clear and distinct in that it relates directly to the learning experience at hand.

– An outcome is a specific, clear, and distinct statement of the anticipated and/or actual result of a learning experience in which the end is the beginning of a new experience – Framingham Krakow Group 02/08/2008

Domains of an outcome include:

  1. Cognitive development, as illustrated in Bloom’s six-category taxonomy.
  2. Affective development, as illustrated in Krathwohl’s affective taxonomy.
  3. Motor skill development.

An outcome relates to behavior and performance. It stipulates what it is that needs to be done.

An outcome stipulates a specific area of:

  1. Development
  2. Achievement
  3. Competence – the capacity to do what needs to be done
  4. Mastery

An outcome stipulates the factors impacting on the learning experience, such as:

  1. What is the domain being addressed?
  2. What are the conditions under which measurement [evaluation and/or assessment] will apply?
  3. Specifically, who are the learners involved in the learning experience?
  4. What prerequisite outcomes will have impacted the experience under consideration?

Closing Comment

Focusing on outcomes first has the appeal of common sense. An outcome – What do you want to be able to do, and how will you know when you can do it or that it has been done? – has an evolutionary development. It has a past. For example, it is an outgrowth of the work of Robert Mager in the 1960s, usually identified as the “objectives movement.” It is consistent with outcome- or mastery-based education. Most recently it is associated with “backward design.” Most basically, it is common sense.

Lawrence P. Creedon
Krakow, Poland
1-30-08

Intrinsic and Extrinsic Motivation: Incentives and Rewards

December 19, 2009

Intrinsic and Extrinsic Motivation
Incentives and Rewards
Lawrence P. Creedon
Human beings are purposeful creatures; they do things for a reason – an incentive. The aim of an incentive is to provide value for an activity.

Among the laments of educators is that some of their students lack motivation to learn. In response it is common for teachers to look for quick-fix solutions to the problem. The often-heard appeal is: “What can I do to motivate my kids to want to learn?”

One response seems to be that there must be a silver-bullet program out there somewhere that, when found and applied, will eliminate the problem, and teachers will find themselves working among motivated students. One such program in the marketplace today and readily available via the Internet is that of Craig Seganti [http://www.craigseganti@classroomdiscipline101.com].

The reality is that there is no such silver-bullet program. If there were, would it not be in universal practice, and thus the problem would have been eliminated?

The basic issue comes down to distinguishing between extrinsic and intrinsic motivation. While teachers talk of promoting intrinsic motivation, the reality is that most efforts at motivation are extrinsic. The assertion that schools have a destructive effect on motivation has been made time and time again for generations. For example, in 1938 John Dewey, writing in Experience and Education, bemoaned the focus on what we now refer to as low-order cognitive skill development [epitomized by NCLB] at the expense of students losing their “soul” [by soul Dewey meant desire, the motivation to learn]. In September 2008, in the journal Educational Leadership, Stephen Wolk asserted that “Joyful learning can flourish in school – if you give joy a chance.” Wolk went on to refer to John Goodlad, who, writing in A Place Called School [1984], decried the “boredom” prevalent in schools. These examples relate to motivation, either intrinsic or extrinsic.

There is a major difference between intrinsic and extrinsic motivation. Any first-year college student taking an introductory course in psychology can define the “good cop-bad cop” difference between the two. In intrinsic motivation, behavior and performance are rooted in an inner personal desire to engage in an activity. In extrinsic motivation, engagement is stimulated by anticipation of a reward or fear of the consequences. A case in point is the No Child Left Behind craze. School authorities shower all sorts of extrinsic rewards on learners for doing well on the required NCLB standardized tests. The rewards include pizza parties, outings, cash payments, etc. The consequences of poor performance can include non-promotion or denial of graduation. Ironically, while reinforcing extrinsic rewards, the same schools avoid facing the reality of what they are doing by promoting, in their talk and in their literature, forms of intrinsic motivation. The reality is that when push comes to shove, extrinsic tactics prevail. Dewey would say they have sold their “soul.”
Three Categories

Incentives can be cast in three categories: remunerative, moral and coercive.

Remunerative: Rewards in the form of money or other material rewards. A prime example is the current [2008] financial crisis facing the United States, where the allegation is that excessive financial rewards to the captains of commerce and finance have threatened the collapse of the United States economy. A controversial $700 billion rescue plan has been put in place by the U.S. Congress as this piece is being written. That trumps a pizza party!

Moral: Doing the right thing. Exemplars include St. Francis of Assisi, Mother Teresa and the United States Peace Corps. St. Francis [1181/82–1226] was born into a wealthy family. Ultimately he rejected it all and dedicated his life to assisting the destitute of his time. He totally surrendered all worldly goods, honors and privileges. Mother Teresa [1910–1997] followed a similar path and for her work received the Nobel Peace Prize in 1979. [Was that a reward?] The United States Peace Corps, founded in 1961 during the presidency of John F. Kennedy, has sent over 200,000 volunteers to more than 170 countries around the world to assist people in need of basic human services. Volunteers come from all walks of life and serve in the Peace Corps for two years without compensation beyond a subsistence allowance. The Peace Corps is associated with a proposal offered by the recognized father of American psychology, William James. In the final essay of his life – “The Moral Equivalent of War” [1910] – James proposed what is now known as the United States Peace Corps. [My grandson is currently serving in the Peace Corps in Tanzania, Africa.]

Coercive: Failure to act in a predetermined, prescribed way will result in unfavorable, unpleasant consequences for the offending person and, in some situations, for all those associated with the offender. Unfortunately, school is a place where coercion is frequently commonplace. In some situations, in an effort to coerce an offender, an entire group associated with the offending person will be “punished.” The thought here is that group pressure to conform will be leveled against the offender. Programs that include coercion and promote themselves as having a solution to “discipline” problems in school are readily available. One program, cited above, is that of Craig Seganti. Another is Classroom Management – A Guide to Success [1992] by Bonnie Williamson.
In Search of the Motivated Learner

Without question, those in practice today behind the classroom door have a difficult and challenging job. Bureaucratic and administrative demands are ubiquitous. Some less-than-kind critics of the situation have referred to such “leaders” as “cement managers.” Added to the stress of all of this is the presence in the classroom of some students who aggressively, as well as passively, resist directing their attention to the instructional program.

Frequently, lack of motivation and discipline issues go hand-in-hand. When asked to cite issues that are of concern to them, teachers include the non-motivated learner. In going directly to a solution for this problem, teachers will ask for insight into effective reward programs that will motivate students to learn and to behave. A few school systems have embarked on providing cash payments to students who attend school regularly, do not cause behavior issues and achieve as required by government-imposed standards. [Washington Post, August 27, 2008]

Alfie Kohn, in Punished by Rewards [1993], addressed the question of rewards as an answer to the question of motivation and to some behavior issues. What follows is a brief synopsis of a few of the points made by Kohn in this regard.

Kohn asserts that “If children’s enthusiasm is smothered it is a direct result of something that happens in our schools” [pp. 142-3]. The antidote frequently offered at school is to reward enthusiasm. Granted, many factors contribute to a lack of enthusiasm, but part of the conventional wisdom is that rewards will build enough of a fire under the non-enthused to heighten achievement and lessen discipline problems.
Kohn lists three factors in opposition to rewards.
1. Factor One: He points out that the youngest school clients – those in kindergarten and
early childhood education programs – do not need or even think of rewards for learning.
Their desire to learn is inherent. This is supported by a large body of research including
that of Mel Levine in his book The Myth of Laziness [2003]. The question is: What
happens in school after early childhood? Does the school experience contribute to
smothering an inherent desire to learn?
Support for the view espoused by Kohn and others is not universal. For example, James
Dobson, founder of the six-million-member group Focus on the Family, sees it otherwise.
According to Dobson [Tauber, Classroom Management, p. 49]:
To say that children have an innate love of learning is as muddle-headed as to
say that children have an innate love of baseball. Some do, some don’t.
Dobson believes that coercion and at times Biblically condoned corporal punishment is
appropriate.
What do you say?
2. Factor Two: According to Kohn, at any age rewards are less effective than intrinsic
motivation for promoting learning. Kohn states [p. 144]:
Children are more likely to be optimal learners if they are interested in what they
are learning … It cannot be assumed that motivation causes achievement to go up or
down.
There are few correlation studies available on this point, so it can be neither clearly
confirmed nor denied. However, there is a causal relationship: reduced intrinsic motivation
does produce achievement deficits. Furthermore, wanting to do well in order to gain some
perceived good may actually interfere with achievement.
Motivation is not a one-size-fits all concept. The source and the nature of motivation
including the role played by brain activity are important. Intrinsic motivation is personal
while extrinsic motivation can be collective.
Do you employ any of these tactics in your practice?
Related to this is the notion that school and school learning ought to be “fun.” In this
context “fun” is a poor choice of word. School as a place of fun — No. However, school
as a place of joy — Yes. In particular high schools are often depicted as sad places and as
“prisons for young spirits and minds.”
[www.edweek.org/ew/articles/1994/09/28/04]
Marcella Spruce in her Education Week article cited above depicted high school as a
sad place. She went on to comment:
My students used to say, “Can’t we do something fun today?” I don’t think
that’s what they meant, exactly, but it was the only vocabulary that they had.
What they meant, I think, was, can’t we set aside the aura of this sad place,
where failure is defined in numbers and the adults are constantly downcast,
and the temperature is always icy-cold or fiery-hot, and we are shuffled
wantonly from place to place; can’t we for a moment connect with someone
beyond here? With ourselves, for example… I have no idea when high schools
became sad… Certainly, the one I went to dripped with failed dreams and lost
causes.
Can you identify with Ms. Spruce in this regard?
3. Factor Three: Rewards for Learning Undermine Intrinsic Motivation
On this point Alfie Kohn has observed:
For all our talk about motivation I think we often fail to recognize a truth that
is staring us right in the face; if educators are able to create the conditions
under which children can become engaged with academic tasks, the
acquisition of intellectual skills will probably follow. We want students to
become rigorous thinkers, accomplished readers and writers and problem
solvers who can make connections and distinctions between ideas. But the
most reliable guide to a process that is promoting these things is not grades
or test scores; it is the student’s level of interest….
Now consider the converse: performing well, jumping through the hoops,
doing all the homework, studying for the tests, making the grades, grooming
the transcript, pleasing the adults —and hating every minute of it. This
profile fits millions of children.
In this very brief interpretation of three points related to rewards made by Kohn in
Punished by Rewards the bottom line is that “extrinsic motivation undermines
intrinsic motivation.”
What do you think? How does what you think influence your practice?
So What?
What does all this mean to you? Certainly no silver bullet was offered here. No “How
to…” insights were presented. However, has it caused you to think about the extent
to which you either identify with or reject the views of those cited here? If you
advocate the use of rewards as incentives in your practice has what has been shared
here caused you to reflect on what you do? Will what has been shared here influence
what you do? If so, how?
Ipse dixit!
Larry Creedon
For Saipan, October, 2008.
http://www.larrycreedon.wordpress.com

A Few Thoughts on Assessment (Testing)

December 19, 2009

A Few Thoughts on Assessment (Testing)
Lawrence P. Creedon – Helen Ross
Rationale:
In significant measure Benjamin Bloom’s cognitive domain taxonomy addresses the issue of
testing. To Bloom and his colleagues testing ought to focus on the creative aspects of
learning. The purpose of testing was not the recall and/or regurgitation of low order cognitive
information and comprehension. It had to do with higher order cognition such as analysis,
comparison and contrast, evaluation and application. The most creative aspect of testing
came in synthesis. Synthesis rests on the ability to take what has gone on before in low order
and higher order cognitive pursuits and then through synthesis of what was and is create
something new. Not necessarily new for humankind, but new for the learner. This is
laddering and weaving in a spiraling manner [Bruner]. It relates to Vygotsky’s Zone of
Proximal Development. To constructivists it is how human beings come to know. It is how
the learner “creates” new knowledge.
In addressing the issue of testing Bloom commented:
Perhaps the most important condition [of testing] is that of freedom. This should include
freedom from excessive tension and the pressures to adopt a particular viewpoint. The
student should be made to feel that the product of his efforts need not conform to the view of
the instructor, or the community or some other authority….[lpc note: this implies the
textbook.] If the effort is to be creative, the student should also have considerable freedom of
activity – freedom to determine his own purposes, freedom to determine the materials or
other elements that go into the final product, and freedom to determine the specifications
which the synthesis should meet. [Emphasis added] Creativity seems to be fostered by such
conditions. Too much control and too detailed instructions, on the other hand, seem to stifle
productivity. [Bloom, Taxonomy of Educational Objectives, paperback, 1956, p. 171]
Assumptions and Pre-requisites
1. For graduate students assessment or testing ought to be for the purpose of
synthesizing and applying new “knowledge” to individual practice. If what has been
“learned” cannot be synthesized into new “knowledge” for the learner then the
experience remains of limited value. It tends to perpetuate what is or has been rather
than addressing what ought to be.
2. An assessment experience ought to be rooted in student expectations for the learning
experience as well as in instructor “knowledge” of what it is that needs to be known
[Vygotsky, Zone of Proximal Development].
3. For assessment purposes learning outcomes ought to be stated as rubrics. Students
ought to participate in the development of rubrics.
4. Grades ought not to be given in response to the normal curve. The normal curve is not
a factor in coming to know or in becoming competent. What others know, or what “I”
know in relation to others, does not add to my knowledge base or competence.
Competence is the capacity to do what needs to be done. Individual goals do not
submit to the normal curve. Excellence and competence are the capacity to do what
needs to be done regardless of how many others can do it. If all are competent, then
all are worthy of the highest grade available [an “A”]. For example, the goal of the
hospital is to make all people well. Hospitals do not determine wellness [grades] on the
normal curve.
Alternatives for Assessment
In developing assessment exercises for a given learning situation the quest ought to begin by
asking the question: How will this experience affect what I do, my practice?
If that question cannot be answered then the quality and effectiveness of the experience has
been minimal.
Alternatives
1. Students can address the above question in a wide variety of authentic assessment
alternatives including essay, oral report, role playing, simulation, etc.
2. Students can form “Critical Friend” partnerships or groups. In this approach students
can, after the formal learning experience has ended, hold themselves accountable for
modifying their practice in response to the input of the learning experience. Critical
friends can work together on implementing specific “learnings” from the learning
experience. If the issue is that students cannot be “trusted” to follow up
after the formal learning experience is over, then that in itself is an indication that
competence and “best practice” are not a primary concern of the “learner.” The
message cannot be conveyed that learners cannot be trusted to learn and then to
practice what they have learned.
3. Students can develop individual or small group “Action Plans” addressing how the
learning experience will impact on what they do and how they practice. Using the
critical friends technique students can assess their effectiveness in implementing their
action plan.
4. Students can engage in a “Reflective Practitioner” exercise as individuals, pairs or in
small groups. In this context the “Reflective Practitioner” approach has four
components: 1. Describe what is currently being practiced, 2. Do research on what the
literature has to say about the topic or issue, 3. Synthesize a plan for what ought to be,
4. Carry out the plan.
5. Students can use their pre-course expectations and rubrics and do an exercise in
determining to what extent the expectations have been addressed and to what extent
the experiences and exercises in the rubrics have been met.
Essentially each of the above has to do with the same thing. Students ought to have the
freedom to determine their own purposes (expectations), freedom to determine the materials or
other elements that go into the final product (conditions for learning), and freedom to
determine the specifications which the product of the learning experience should meet
(rubrics). [Bloom, ibid., p. 173]
Conclusion
The purpose of a learning experience is not to accumulate and regurgitate information. Rather, it is to
take the sum total of what the learning experience has had to offer[both good and bad], SYNTHESIZE it
with what is and develop a PLAN of ACTION for application of what has been learned to individual
practice. Assessment ought to be approached in this context.
Lawrence P. Creedon, Helen L. Ross
June 2003

Six Types of Tests

December 19, 2009

Six Types of Tests
By
Lawrence P. Creedon and Helen L. Ross
A listing of all the various types of tests available for one purpose or another is all but
endless. However, here very brief attention will be given to six that are the most
frequently used in education.
Teacher Made Tests
Prior to the standardized testing movement encapsulated in the No Child Left Behind
federal legislation, these were the most frequently used tests in schools. As the name
states, teacher-made tests are created by teachers for use in their own classrooms. They do
not pretend to meet the requirements of test validity or reliability. They reflect what an
individual teacher feels is important. There is a strong possibility that they will be low
order cognitive, dealing in the recall of information and then indications of comprehension
(understanding). Some authorities assert that while they are the least valid from the point
of view of measurement, they are the most reliable in indicating what learners “know.”
For the most part, to “know” must be understood at the low order cognitive level.
Norm Referenced Standardized Achievement Tests
These are the commercially prepared standardized tests that almost all schools in the
United States used annually prior to the advent of NCLB. The term “Norm Referenced”
means that at some point in time in the recent past (up to about ten years) a commercial
company developed a test usually related to basic skills, tested it out against a sample
population of students, used the results of the testing sample to establish norms for the
grade level or subject area and then marketed the test to school districts. Scores were
reported against a normal curve. For example, fifteen percent of the students taking the
test would do well and score above the curve, fifteen percent would not do well and
score below the curve, and the middle seventy percent would hover around the middle, or
the norm. For decades school districts used these tests for internal purposes. Then they began
to use them to compare one school or school district to another. Since the beginning of
their use, developers have cautioned against using norm referenced tests as instruments
for assessing individual achievement. However, for the most part individual school
districts ignored that caution and misused the test results. It was common for school
districts to boast about how well they did on a given norm referenced standardized test.
While test developers asserted that this was misuse of the tests, marketing prevailed and
the “bottom line” of the company was to sell the test and make a profit. Examples of norm
referenced tests are: The Iowa Test of Basic Skills, The Stanford Test of Basic Skills, and
The California Test of Basic Skills. Educational Testing Service of New Jersey is a leader
in the field.
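The mechanics of norm referencing can be sketched in a few lines of code. In the sketch below, a hypothetical norming sample fixes the distribution once, and every later test taker is reported as a percentile rank against that frozen sample; all scores and sample values are invented purely for illustration.

```python
# Sketch of norm-referenced score reporting (illustrative data only).
# A norming sample gathered once fixes the distribution; every later
# score is reported as a percentile rank against that frozen sample.

def percentile_rank(score, norming_sample):
    """Percent of the norming sample scoring at or below `score`."""
    at_or_below = sum(1 for s in norming_sample if s <= score)
    return 100.0 * at_or_below / len(norming_sample)

# Hypothetical norming sample from a past sample population.
norming_sample = [52, 58, 61, 65, 68, 70, 72, 75, 79, 85]

print(percentile_rank(70, norming_sample))  # 60.0 -> "60th percentile"
print(percentile_rank(85, norming_sample))  # 100.0
```

Note how the report says nothing about competence: a score of 70 means only that six of the ten students in the norming sample scored at or below it, which is exactly why developers cautioned against reading such numbers as individual achievement.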
Criterion Referenced Tests
A CRT is not norm referenced. There is no sampling population that all scores are
measured against. In a CRT the criterion is established ahead of time, usually by “experts,”
and the purpose is for each learner to be able to meet the criteria or perform at the level
indicated by the criteria. In a CRT, the learner is being measured or assessed against a
criterion or standard, and not a norm. If everyone can perform consistent with the criteria
then everyone is recognized as having achieved. If achieving the criteria translates to an
“A” then everyone receives an “A.” A criterion referenced test is non-competitive among
the learners. It is related to achieving the criteria. Rubrics lead logically to criterion
referenced tests. A simple example of a criterion referenced test is a test for getting an
automobile driver’s license. If the candidate meets the criterion he/she is awarded the
license. It does not matter how many others have a license. It does not rank candidates
against a curve.
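The non-competitive character of a CRT can be shown the same way. In the sketch below the criterion is a cut score fixed in advance; each candidate is compared only to it, never to the other candidates, and every candidate who meets it passes. The names and scores are hypothetical.

```python
# Sketch of criterion-referenced grading (hypothetical data).
# Each candidate is measured against a fixed criterion, not against
# the other candidates; if all meet it, all pass.

CRITERION = 80  # cut score fixed in advance by the test designers

def crt_result(score, criterion=CRITERION):
    """Pass/fail against the criterion; no curve, no ranking."""
    return "pass" if score >= criterion else "needs more practice"

candidates = {"Ana": 92, "Ben": 85, "Carla": 80, "Dev": 74}
results = {name: crt_result(score) for name, score in candidates.items()}
print(results)  # three candidates meet the criterion; no quota limits that
```

If all four candidates had scored 80 or above, all four would pass, just as every driver who meets the licensing criterion is awarded a license.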
Domain Tests
Domain tests are related to a specific body of information or knowledge. A domain test
focuses in on a specific content area. It goes in depth into a specific body of knowledge.
It should be more than low order cognitive information gaining and regurgitation. An
example of a domain referenced test would be an advanced placement test in a specific
content area.
Diagnostic Tests
The purpose of a diagnostic test is frequently related to identifying learning needs of
individual learners. It is best administered to one learner at a time. It is not norm,
criterion, or domain referenced. Its purpose is to identify the learning style and needs of a
specific learner. A diagnostic test can also be used to identify strengths and weaknesses
with a curriculum or an instructional program. The results of a diagnostic test provide the
data upon which a prescriptive and corrective plan of action can be based. Many
diagnostic tests are best administered by especially trained professionals; however, there
are “general” use diagnostic tests available. Caution needs to be taken in making
judgments based on these tests.
Psychological Tests
These tests are usually reserved to be administered and assessed by professionals
especially trained in psychological testing. Psychologists would be an example. An IQ
test would be an example of a psychological test. Caution needs to be taken in making
judgments based on these tests.
Footnote: The high stakes standardized tests associated with NCLB are somewhere
between norm referenced and criterion referenced tests. Norm referenced in that they
reflect past and traditional norms for grade levels. Criterion referenced in that they are
developed by “experts” at the state department of education level. Under NCLB every
state must develop its own standards and tests. In reality they are cut from the same mold
and must meet federal guidelines.
Helen L. Ross / Lawrence P. Creedon
Fast Train
July 2005

RUBRICS: Characteristics, Categories, Features Applications, and Cautions

December 19, 2009

RUBRICS
Characteristics, Categories, Features Applications, and Cautions
by
Lawrence P. Creedon
To make a name for learning when other roads are barred,
take something very simple and make it very hard.
The above witticism may sum up what some think about the whole discussion related to
rubrics and their application in education. Rubrics can relate to assessment or to
achievement. Assessment can be value influenced and can be a subjective judgment as to
proficiency or accomplishment. Achievement is more objective and relates to mastery
and competence.
The Argument
The argument goes that teachers and professors have been marking, grading and ranking
students for centuries, using as achievement identifiers or grades symbols such as A, B, C,
etc., numerical scores, percentages, and percentiles. The argument is: It has worked, so
leave it alone. We all know what they mean. And, if it is not broken, don’t fix it.
Umbrage
I, along with a host of others, take umbrage with that long prevailing approach to
measuring and “rewarding” achievement in school. Achievement is not considered its
own reward. My concern begins with challenging the assertion that: We all know what
they mean and it works. I disagree! Convention, longevity and tradition no matter how
revered are not benchmarks of quality and of what ought to be. The traditional
convention of If not broken, don’t fix is no longer relevant to what measurement related
to achievement ought to be.
Focus of This Piece
This brief piece will do no more than introduce aspects of rubrics. It is not offered as an
apology for rubrics, a description of them, a comparative analysis with other approaches
to measurement, or an assessment as to the application and worth of rubrics. All that can
be found elsewhere. However, one caveat seems appropriate and that is that hopefully the
topic and application of rubrics in education would not qualify as an appropriate example
illustrating what Professor Harry G. Frankfurt [Emeritus, Princeton University] has
termed bullshit in his compact book: On Bullshit, Princeton University, 2005.
This piece is limited to chronicling characteristics, categories, features, applications, and
cautions related to rubrics.
Leg on a Platform
Every school ought to have an organic platform for education. Organic in the sense that
in a document the living, vibrant, developing, and responding community called the
school can express its identity in a written platform for education. Rubrics form one of
the four legs of that platform. The other three are learning expectations for one and all,
objectives to guide curriculum and instruction, and anticipated outcomes to indicate
ends-in-view.
Characteristics of Rubrics
1. Rubrics are an alternative way of addressing assessment and/or recognizing
achievement.
2. Rubrics are one of four components in the learning process. The other three are
expectations, objectives, and anticipated outcomes.
3. Rubrics are consistent with stated expectations and anticipated outcomes for the
learning experience.
4. Rubrics ought to be made available to all concerned and made available to those
affected at the beginning of the learning experience.
5. Rubric “scores” for individuals are not intended to be compared with those of
other learners.
6. Rubrics are not intended to be competitive or provide a vehicle for ranking among
individuals.
7. Rubrics intended to measure achievement ought to minimize subjective judgment.
8. Rubrics are intended to focus on achievement, mastery and competence.
9. Rubrics ought to be purpose specific focusing separately on content, cognitive
development, instructional strategies, affective responses, behavioral responses,
or syntactical concerns.
10. Rubrics can be structured so as to be prescriptive and/or diagnostic.
11. Rubrics can be developed by instructors, learners, peers, a combination of these or
an external source such as a learning program publisher or organization.
Categories
Rubrics are not of the one-size-fits-all variety. Each is task or outcome specific.
Before embarking on the development of a rubric there must be a clear and distinct
indication as to what is being measured: evaluated or assessed. Evaluation and
assessment are not synonyms. Rubrics are specific and each relates separately to
content or subject matter, cognitive development, instructional strategies, affective
responses, behavioral responses, or syntactical expectations. The structure of the
rubric will be reflective of what is being measured. It can be assumed that regardless
of the category of the rubric, the ultimate goal is achievement, mastery or competence
in the specific undertaking. However, in some instances an assessment rubric can be
indicative of a developmental process such as in cognitive or affective development,
behavioral responses or motor exercises. What is important is that the specific
category and thus purpose of the rubric be understood at the outset and shared with all
those affected.
Contrary to popular belief all rubrics do not have to follow the sliding scale model of
going from simple to complex or from introduction to mastery. In some instances the
rubric is a definitive Yes or No; mastery or not; competence or not. To force the
sliding scale or laddering model can simply turn the rubric into another way of stating
A,B, C, etc. When this happens the rubric is the same old wine in a new bottle.
Among the categories of rubrics are those that focus on:
1. Cognitive development consistent with Bloom’s taxonomy
2. Affective development consistent with Krathwohl’s taxonomy.
3. Motor skill development
4. Basic academic skill development in tool subjects such as reading, math, language
arts, social studies, and science.
5. Academic subject matter competence in subject areas included in the program of
studies.
6. Artistic appreciation and skill development in such areas as art and music.
7. Citizenship development in such areas as personal behavior, peer relationships
and social/societal responsibility.
8. Self fulfillment identification and development
9. Vocational and work competence development
10. Technological competence reflective of the state of the art.
11. Diagnosing factors interfering with achieving mastery or competence
12. Other areas consistent with the developmental and mastery needs of learners.
Specific rubrics within each of the categories cited above as well as others identified by
those involved and affected ought to be developed consistent with the learning needs of
the end user. The more the end user can participate in identifying and developing the
benchmarks toward mastery and competence as proposed in the rubric the better. As
indicated earlier rubrics ought to be developed and promulgated to those affected before
the learning experience begins. Coming to know what needs to be known and how you
will know it is known ought not be a secret. Learning is not a game of hide-and-seek.
Naturally as the learning experience unfolds rubrics are subject to revision including
modification and redirection.
Rubrics need not be developed anew for each successive cohort of learners. Also,
collaborative and shared development can provide a meaningful professional
development opportunity for practitioners to work together. Over a period of time a
local depository of rubrics can become part of a school based professional library. In that
manner one more plank has been added to the structure of the school as a learning
community characterized by the term system.
Features
In developing a rubric clarity in the anticipated outcome of the learning experience is
paramount. The following features of a rubric can serve as a checklist in developing
and applying rubrics.
1. A rubric is not simply an alternative way of measuring to the practices in
common usage for giving a grade. A rubric features mastery and competence, not
competition or comparison.
2. A rubric ought to clearly and distinctly state a behavior or an achievement that
indicates mastery or competence. It is anticipated that all learners exposed to the
rubric can achieve mastery or competence. If not, diagnosis and prescriptive
corrective action are called for.
3. A rubric ought not to posit several levels of mastery or competence. Competence
is the capacity to do what needs to be done, not almost done.
4. A rubric ought not to be structured on a sliding, spiraling, or laddered model of
mastery or competence unless the rubric is associated with a developmental
activity; then such a structure is inherently meaningful to accomplishing the
learning outcome and ought to be utilized.
5. Within a rubric qualitative words or terms [adverbs and adjectives] ought to be
defined in the context of their application to the task at hand. Definitions ought to
be promulgated at the outset of the learning experience.
Application
Among the many things asserted in conventional wisdom is that anything worth doing is
worth doing well. It is also asserted by some that everything can be measured. The first
assertion does not seem to have many opposed to it; however, the second is another story.
We will not join the debate in that regard here. Our focus is on the word well. That is the
term that needs to be defined. The definition ought to be one that is operational. The
question is: Will doing something well meet the non-selective, valid and reliable criteria
for what constitutes mastery and/or competence? That is the end-in-view.
The use of rubrics in education ought not to be limited to the classroom side of the
enterprise. Rubrics are appropriate whenever it is part of the equation to have an
end-in-view – a purpose. Certainly that begins with faculty considering the purpose and
significance of education. It continues with all those associated with the enterprise
holding themselves accountable via commonly developed and shared rubrics for
everything that goes on administratively, managerially, curriculum and instruction wise,
in recruitment and professional development, organizational structure, finance and
leadership. In short — everything. Rubrics are not limited to kids. Their application fits
for the entire enterprise. If it is being done, the end-in-view ought to be to do it well and
to be able to measure that through evaluation and assessment stated in rubric form.
Cautions
At times school people seem predisposed to taking an idea or innovation and grinding it
down to its most rudimentary form so as to be readily identified with an existing concept
or practice. As a result innovation stalls, all things pretty much remain as they always
have been, skeptics and nay-sayers declare the innovation nothing more than old wine
in new bottles, and the initiative is dead on arrival. In applying such a dismal
perspective to rubrics a few cautions are in order. Essentially these are a restatement of
points made above:
1. Rubrics are but one leg in the four legs of a platform for education. The other
three are: Expectations, Objectives, and Outcomes.
2. A rubric is not simply an alternative way for measurement to those practices in
common usage for “awarding” a grade.
3. Rubric “scores” are not intended to be compared with those of others.
4. Rubrics are not intended to be competitive or provide a vehicle for ranking among
individuals
In Summation
In order not to perpetuate the wisdom in the witticism:
To make a name for learning when other roads are barred,
take something very simple and make it very hard.
the point is that in classroom utilization content rubrics focus on achievement, mastery
and competence. Process rubrics are more akin to assessment.
Ipse dixit!
Lawrence P. Creedon
Krakow, Poland
Research and Evaluation
January-February 2008

Grading, Assessment and the Concept of Doubt

December 19, 2009

Grading, Assessment and the Concept of Doubt
A response by Creedon to students in Maracaibo, Venezuela, 2007
Lawrence P. Creedon
Each of you has received comments from me in response to your pre course assignments.
Possibly you expected to find a traditional grade assigned to each assignment such as: A,
B, C, etc. That has not been the case. Your assignments have not been graded by me in
the traditional sense. However, I do think it is safe to say that each assignment has been
reacted to by me and, in most cases, in a timely manner. While not graded, your
assignments have been assessed by me. There is a difference and we ought to explore that
difference in class.
A principle associated with a constructivist approach to learning is that “learning begins
in doubt.” It has been my experience that this expression frequently is taken literally and
associated with disbelief. However, that is not what is meant in a constructivist context.
Rather, the term has more to do with a need to know more, to inquire, to wonder and to
reflect. Among the characteristics of professionals is that they engage in such cognitive
exercises as they practice. That is what is meant by the use of the term “practice” so
common in medicine and law. Physicians and lawyers “practice,” while teachers teach.
The implication is that, in medicine and law, practitioners are learning while, in
education, teachers are “knowers” – channels of wisdom.
What I have attempted to do is to assess your work. Assess it by asking you questions
about what you have said in your paper. I have attempted to challenge you to think
beyond the obvious. And, I have asked you to relate your comments to your practice. If
you cannot relate the things we are considering and focusing on in our graduate
study together to your practice, then what is the point of what we are doing? You are supposed to be
involved in professional development and that means describing what is, determining its
merit, considering its impact on your practice, engaging in the pursuit of better
alternatives, relating the new to your practice and acting to improve your professional
development and practice. This is what is meant when constructivists speak of doubt. The
purpose of doubt is the restoration of belief at a deeper, broader and more comprehensive
level.
My comments are intended to encourage you to reflect on what you said and the
implications of what you have said for your practice. You were not graded against an
arbitrary standard. My comments were intended to encourage and motivate you to look
broader and deeper into the views you shared in your paper.
Larry Creedon
Maracaibo, Venezuela
October – November 2007.

Correlation implies causation

December 19, 2009

Correlation implies causation
From Wikipedia, the free encyclopedia.
Just about every introductory statistics book will tell you that correlation does not imply
causation. And indeed, assuming that correlation implies causation is a logical fallacy.
For example:
Teenage boys eat lots of chocolate.
Teenage boys have acne.
Therefore, chocolate causes acne.
This argument is an example of a false categorical syllogism. One account holds that the
fallacy lies in ignoring the possibility that the correlation is mere coincidence. But we can always pick
an example where the correlation is as robust as we please. If chocolate-eating and acne
were strongly correlated across cultures, and remained strongly correlated for decades or
centuries, it probably would not be a coincidence. The “fallacy” lies in ignoring something besides
coincidence.
The “fallacy” ignores the possibility that there is a common cause of eating chocolate and
having acne. Take another example: apparently it is true that ice-cream sales are strongly
(and robustly) correlated with crime rates. The explanation is that high temperatures
increase crime rates (presumably by making people irritable) as well as ice-cream sales.
The statement “correlation does not imply causation” is applied as a warning not to
deduce causation from a statistical correlation. But while often ignored, the advice is also
often overstated, as if to say there were no way to infer causal structure from statistical data.
Clearly we should not conclude that ice-cream causes criminal tendencies (or that
criminals prefer ice-cream to other refreshments!), but the previous story shows that we
expect the correlation to point us towards the real causal structure.
If you believe this, then you believe that robust correlations imply some sort of causal
story, whether common cause or something more complicated. Hans Reichenbach
formulated the Principle of the Common Cause, which asserts basically that robust
correlations have causal explanations, and if there is no causal path from A to B (or vice
versa), then there must be a common cause, though possibly a remote one.
Reichenbach’s principle is closely tied to the Causal Markov Condition used in Bayesian
networks. The theory behind Bayesian networks sets out conditions under which you can
infer causal structure, when you have not only correlations, but also partial correlations.
In that case, certain nice things happen. For example, once you consider the temperature,
the correlation between ice-cream sales and crime rates vanishes, which is consistent with
a common cause (though not, by itself, proof of one).
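The ice-cream example above can be made concrete with a small simulation. The model and all of its numbers are illustrative, not taken from any actual study: temperature drives both ice-cream sales and crime, which do not affect each other. The raw correlation between sales and crime is strong, but the partial correlation, controlling for temperature, all but vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical common-cause model: temperature drives both
# ice-cream sales and crime; they do not affect each other.
temperature = rng.normal(25, 5, n)
ice_cream = 2.0 * temperature + rng.normal(0, 5, n)
crime = 1.5 * temperature + rng.normal(0, 5, n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Raw correlation between sales and crime is strong...
raw = corr(ice_cream, crime)

# ...but the partial correlation, controlling for temperature, is
# near zero: regress each on temperature and correlate the residuals.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

partial = corr(residuals(ice_cream, temperature),
               residuals(crime, temperature))

print(f"raw correlation:     {raw:.2f}")
print(f"partial correlation: {partial:.2f}")
```

With these made-up coefficients the raw correlation comes out around 0.7 while the partial correlation sits near zero, which is exactly the pattern the text describes for a common cause.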
While “fallacy” has been used in quotes, it is still a logical fallacy. If you only have A
and B, a correlation between them does not let you infer A causes B, or vice versa, much
less deduce the connection. In fact, if you only have these two variables, even the most
powerful inference techniques built on Bayesian networks won't help much. But if there
were a common cause, and you had that data as well, then you can often establish what the
correct structure is. Likewise (and more usefully) if you have a common effect of two
independent causes.
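The common-effect case can also be sketched in a short simulation. Again the model is purely illustrative: two independent causes both feed into one effect. Overall the causes are uncorrelated, but once we condition on the effect (for instance, by looking only at cases where the effect is large), a spurious negative correlation appears between them, which is the signature of a common effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical common-effect model: two independent causes
# both contribute to a single effect.
cause_a = rng.normal(0, 1, n)
cause_b = rng.normal(0, 1, n)
effect = cause_a + cause_b + rng.normal(0, 0.5, n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# The two causes are (nearly) uncorrelated overall...
overall = corr(cause_a, cause_b)

# ...but become negatively correlated once we condition on the
# effect, e.g. by keeping only cases with a high effect value:
# if the effect is large and cause A was small, cause B was
# probably large, and vice versa.
high = effect > 1.0
conditioned = corr(cause_a[high], cause_b[high])

print(f"overall correlation:      {overall:.2f}")
print(f"conditioned on effect:    {conditioned:.2f}")
```

This induced dependence under conditioning is why a common effect is, as the text says, the more useful case for inferring structure: the pattern of correlations that appear and disappear as you condition on different variables distinguishes a common effect from a common cause.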
Another example illustrating this fallacy was a study which found that British arts
funding levels had an extremely close correlation with Antarctic penguin populations.

Testing: One, Two, Three

December 18, 2009

Testing: One, Two, Three
By Anna Quindlen –Newsweek – Updated: 3:59 p.m. ET June 5, 2005
© 2005 Newsweek, Inc, © 2005 MSNBC.com URL: http://www.msnbc.msn.com/id/8099819/site/newsweek/
June 13 issue – It’s that time of year again, when the sweaters come off, the annuals come out, and the students prepare.
For the test, for the test scores, for the test schedule for next year. The kids of America are drowning in multiple-choice
questions, No. 2 pencils and acronyms. Along with the ABCs, there are the GQEs (Graduation Qualifying Exam), the
SOLs (Standards of Learning), the TAKS (Texas Assessment of Knowledge and Skills) and of course the SAT. A group
called the National Center for Fair & Open Testing estimates that public schools give more than 100 million standardized
tests each year.
Full disclosure: in the interests of informed punditry I recently took a practice SAT test, the first standardized test I have
taken since 1969, and by the end I thought the top of my head would blow off. Perhaps it was the reading-comp section on
Keats. Perhaps it was the fact that I believe geometry is the Devil’s work. Perhaps it was simply that doing any task for
nearly five hours challenges what Mother Theodosia used to call the ants in my pants.
But more than anything I was enraged by the process, and by the forced march that seems to have replaced creative
thought, critical thinking and joyful learning for so many kids. In “High Stakes,” a look at the issue aired recently by
CNN’s documentary unit, one teacher in Florida reported third graders sobbing because they were so unhinged by the
prospect of yet another standardized test. “These kids are just tested out,” the teacher said. Third grade?!
Our education system is broken; accountability and standards will fix it. This is the mantra of government testing
programs, from local certifications to the federal No Child Left Behind program, which might as well be called No Child
Left Untested. That last grew out of something called the Texas Miracle, in which the use of standardized tests in that state
quickly led to marked increases in student scores in a way that seemed too good to be true. And it was. Whistle-blowers
reported that teachers helped some kids to cheat in elementary and middle schools, and that some ninth graders were being
repeatedly held back so their performance wouldn’t depress scores for tests administered in 10th grade. The CNN
documentary reported that Austin High School, for instance, had 1,160 ninth graders in 2000, yet fewer than 300 were
enrolled in 10th grade the next year. Figuring that one out would make an interesting SAT problem.
But even with testing free of that sort of fraud, the useful endpoint of all this remains unclear. If test results were
deconstructed to reveal that phonics, say, was a weak point in a classroom, there might be curricular value, but most of the
time the tests are merely scored up or down for the sake of the system—and the press conferences. Teachers are under so
much pressure to teach to the test that they are sometimes forced to move on hastily and concentrate on the narrow and
tedious, to skip over the interesting side issues or questions that make for dynamic learning.
And what does this metastasizing testing, for every subject, at every level, at every time of the year, do to kids? It has to
mean that students absorb the message that learning is a joyless succession of hoops through which they must jump, rather
than a way of understanding and mastering the world. Every question has one right answer; the measure of a person is a
number. Being insightful, or creative, or, heaven forfend, counterintuitive counts for nothing. This is: (a) benighted; (b)
ridiculous; (c) sad; (d) all of the above. You know the answer.
Of course it is important to know that all students have learned to read, that everyone can manage multiplication. But
constant testing will no more address the problems with our education system than constantly putting an overweight person
on the scale will cure obesity. Proponents trumpet the end to social promotion. They are less outspoken about what comes
next, about what provisions are to be made for a student who is held back twice and then drops out of school. The
bureaucrats who have built their programs on test results seem to have lost sight of any overarching point of education.
Who cares if the light comes on in their eyes if the numbers are good?
I wish more parents could find a way to protest this educational form of child abuse. Some states are beginning to do so;
Utah was willing to face the loss of $76 million in federal education funds because officials there decided not to follow
federal testing standards. The Bush administration insists that support for No Child Left Behind, which is largely a massive
testing program, is nevertheless widespread. Officials point to a national survey that offered respondents this choice: which
is the bigger problem, children passing through U.S. schools without learning to read, or children being forced to take too
many tests? Of course any smart kid would see that there’s something wrong with that draconian choice, and that the
inquiring mind looks for answers somewhere in the middle. The real question for the future is whether, after this barrage of
mindless and endless assessment, there will be any inquiring minds left.
~ ~ ~

