We've had the conversation several times about how strange it was for my friend to be doing a literature review on doing literature reviews, just as last summer I was working on designing instruction about designing instruction. My thoughts turned to whether it would be possible to do something at a level that could improve more of the higher education world, of which PhD studies are a fairly small part. One interesting idea I came up with is an analysis of the gap between the outcomes current students expect from higher education and the outcomes graduates actually recognize, with cross-sections by type of institution, level of degree, subject area, and so on. My guess is that we'd find students aren't fully realizing the benefits they expect to receive; that is, there is a misalignment somewhere. Think about what you were asked to do in your first job out of college and whether you used anything listed as an objective for any of the courses you took. While I still think that is an interesting topic to pursue at some future date, it's not quite dissertation material at this point.
At work, we follow a fairly well-structured design process to come up with what we call the competencies and learning objectives for our courses. Competencies are roughly equivalent to educational objectives in Bloom's Taxonomy, and the more focused learning objectives match up with Bloom's instructional objectives. Taking a wider-angle view, we also couch everything for a degree program within a conceptual framework, which Bloom would probably call global objectives.
While there is research on Bloom's Taxonomy, much of what I've seen has to do with validating the order, leading to the relatively recent swap of the top two levels after researchers identified that evaluation is less cognitively demanding than creation. There are other taxonomies out there that overlap quite a bit, but I have yet to come across any good research that really supports the most famous of the educational taxonomies. I believe it because it makes sense, but who is to say whether something is missing or whether all the current components are really necessary? One article talked about how the iconic triangle should be flipped on its point so it opens up as you reach higher cognitive levels. Whatever. I don't care about the job aid; I want to know whether the taxonomy is right. So the proposed research is to rate the quality of a course design produced using Bloom's Taxonomy and compare that rating to some measure of the actual quality of the finished course developed from that design.
I haven't pulled this together as a formal literature review yet (though I will be pulling out my Boote & Beile rubric when I do). For now it's just a quick compilation of features from several models out there, from which I've come up with an unpolished rubric that I hope may be useful in rating the quality of a course design, with design being the first D in ADDIE, not the kind where you're choosing background colors.
While I have not yet found anything substantive related to measuring the quality of the initial design of a course, Merrill does have his 5 Star and e3 rating forms, based on his First Principles of Instruction along with some other theories, which are intended to rate a developed course. At that point, however, it's a little late: the course is already created and perhaps already taught. It would be more helpful to rate the design of a course before it's developed, fixing the problems before building the instruction on a flawed design.
Again, this is not complete or valid in any way. The models I considered, based on our practice at work and general research I'm familiar with in the area, in no particular order, are as follows:
- Wiggins & McTighe - Backwards Design Model
- Merrill - First Principles of Instruction
- Gagne - Nine Events of Instruction
- Anderson & Krathwohl - Revised Bloom's Taxonomy
- Mager - Preparing Instructional Objectives
- Knowles - Principles of Andragogy
The principles or characteristics of effective design that align with the above theories, and that serve as a first draft of the rubric's principles, are:
- Problem-based or realistic (publishing work, problem solving, creation)
- Alignment between assessment and content (teach what you test and test what you teach)
- Appropriate path to complete objectives (flexibility for diverse learners, no orphans, properly scaffolded)
- Measurable behavior (specific, discrete, observable)
- Level of performance (alignment with content and audience)
- Clarity (one verb, understandable, meaningful, maybe combine with measurable)
- Metacognition
- Bridging prior knowledge and future practice
- Any other suggestions as to what's missing?
I'm picturing a four-point scale, 0-3 for each item. A 0 is obvious: the design simply doesn't meet the requirements. A 1 would indicate minimal coverage. A 2 would indicate significant alignment but with higher-level components missing, or perhaps most objectives written correctly and a few incorrectly. The goal is to score a 3, meaning the principle or characteristic is fully present and appropriately applied.
The rubric itself would take the form of a table, which I'll put together later: the principles or characteristics listed down the left, the points across the top, and a description of what is needed to receive each score in the intersecting boxes.
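To make the table concrete before I build it out, here is a minimal sketch, in Python purely for illustration, of how the criteria and the 0-3 scale could be represented and totaled. The names and descriptor wording are placeholders pulled from the draft list above; the real rubric cells would hold criterion-specific descriptions rather than this generic scale.

```python
# A rough sketch only: criteria come from the draft list above, and the
# generic 0-3 descriptors follow the scale just described. A real rubric
# would replace these with criterion-specific descriptions in each cell.

SCORE_DESCRIPTORS = {
    0: "Does not meet the requirements",
    1: "Minimal coverage",
    2: "Significant alignment, but higher-level components missing or a few objectives flawed",
    3: "Fully present and appropriately applied",
}

CRITERIA = [
    "Problem-based or realistic",
    "Alignment between assessment and content",
    "Appropriate path to complete objectives",
    "Measurable behavior",
    "Level of performance",
    "Clarity",
    "Metacognition",
    "Bridging prior knowledge and future practice",
]

def score_design(scores: dict[str, int]) -> tuple[int, int]:
    """Check that every criterion has a 0-3 score and return (total, maximum possible)."""
    for criterion in CRITERIA:
        value = scores.get(criterion)
        if value not in SCORE_DESCRIPTORS:
            raise ValueError(f"{criterion}: expected a score of 0-3, got {value!r}")
    return sum(scores[c] for c in CRITERIA), 3 * len(CRITERIA)
```

Whether the criteria should be weighted equally, or whether a single 0 on something like alignment should sink the whole design, is exactly the kind of question the rubric still needs to answer.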
A graduate-level course with all multiple-choice tests, for example, would likely score poorly on problem solving and realism but could score okay on alignment if you're planning on teaching at the scantron level. One could argue whether completing a multiple-choice test is measurable behavior; I'd maybe put that in the 1-2 range, since the correctness of the student's answers can be easily measured, but it's debatable whether it can be considered behavior. The level of performance is obviously wrong if we're talking about giving grad students multiple-choice tests, so we're talking maybe a 1 here if the questions get past a superficial level. There's not going to be much metacognition going on.
For an undergraduate-level course, on the other hand, while some of the principles such as realism and metacognition may still rate low, the level of performance may be appropriate. It may even be possible to score decently on bridging prior knowledge and future practice if the scantron content is considered foundational knowledge that will support more in-depth performance in later classes.
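Just to show the arithmetic, here is the graduate-course thought experiment run through the sketch above. These are my own illustrative numbers, not actual ratings, and the criteria I didn't discuss get arbitrary filler values to complete the example.

```python
# Illustrative numbers only: realism, alignment, measurable behavior,
# level of performance, and metacognition follow the rough ratings above;
# the remaining criteria get arbitrary filler values to round out the example.
grad_mc_course = {
    "Problem-based or realistic": 0,
    "Alignment between assessment and content": 2,
    "Appropriate path to complete objectives": 1,        # filler
    "Measurable behavior": 1,                            # low end of the 1-2 range
    "Level of performance": 1,
    "Clarity": 2,                                        # filler
    "Metacognition": 0,
    "Bridging prior knowledge and future practice": 1,   # filler
}

total, maximum = sum(grad_mc_course.values()), 3 * len(grad_mc_course)
print(f"{total}/{maximum}")  # 8/24 with these made-up numbers
```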
Good, bad, too much, not enough? Other principles, theories, or methods that should be included? Has anyone actually done something like this for instructional design? What measures other than inter-rater reliability could be used to validate the score from this type of rubric?