EDUCATION PRACTICE AND INNOVATION
Critical Reading of “Making Sense of Confusion” by Jason E. Dowd, Ives Araujo, and Eric Mazur [1]
Physics Department, Boston University, 590 Commonwealth Ave, Boston, Massachusetts, 02215, USA
This post has three parts:
1. An original paper
2. Appendix I: Why did I write this paper and how school teachers can use it? (including what distinguishes a science from a religion)
3. Appendix II: What did judges say about this paper and my response to Eric Mazur regarding his response to this paper.
The scientific method developed to study physical phenomena is a proven instrument for conducting research in any other field of science. Yet a vast amount of the literature on physics education research does not apply that method, even when the researchers are physicists. In this paper the author offers a critical reading of a recent paper published in Physical Review Special Topics - Physics Education Research. The goal of this work is to stimulate a conversation on how the scientific method developed to study physical phenomena can be applied to study phenomena in the realm of education.
Physics is a perfect example of how the scientific method should be applied to study, well, everything. At first a scientist observes, collects facts, develops vocabulary, classifies objects and processes, and tests some preliminary ideas; in the end the scientist formulates postulates (a.k.a. axioms, or laws). It is usually impossible to test the postulates by direct measurement; the consequences derived or predicted from the laws, however, can and should be tested. As long as the experiments agree with the predictions, we believe in the correctness of the postulates and keep using the theory. Of course, every theory has limits; hence, when experiments contradict the theory, a scientist starts asking: is something wrong with the experiments, or have the limits of the theory finally been reached? Newtonian mechanics, Maxwell’s theory of electromagnetic phenomena, Einstein’s Special Theory of Relativity, Einstein’s General Theory of Relativity, and Euclidean geometry are some of the best and clearest examples of this approach.
Can the same approach be applied outside of physics, say, to study learning and teaching phenomena? The answer to this question depends on a personal view. In 2002 Richard Hake wrote [2] (beginning of quote): “There has been a long-standing debate over whether education research can or should be “scientific” (e.g., pro: Dewey 1929, 1966, Anderson et al. 1998, Bunge 2000, Redish 1999, Mayer 2000, 2001, Phillips and Burbules 2000, Phillips 2000; con: Lincoln and Guba 1985, Schon 1995, Eisner 1997, Lagemann 2000). In my opinion, substantive education research must be “scientific” in the sense indicated below.
My biased prediction (Hake 2000a) is that, for physics education research, and possibly even education research generally: (a) the bloody “paradigm wars” (Gage 1989) of education research will have ceased by the year 2009 (italics by Valentin Voroshilov), with, in Gage’s words, a “productive rapprochement of the paradigms,” (b) some will follow paths of pragmatism or Popper’s “piecemeal social engineering” to this paradigm peace, as suggested by Gage, but (c) most will enter onto this “sunlit plain” from the path marked “scientific method” as practiced by most research scientists” (end of quote). Thirteen years later this prediction looks overly optimistic.
In many papers, even those written by scientists who use the scientific method in their own field, the authors do not seem to apply the same way of reasoning when writing about education. At the least, this indicates that the authors do not believe that the scientific method applied to study physics (chemistry, mathematics) should be applied to study education. At the most, it indicates that the authors do not believe that it can be applied to study education.
For example, let us read the latest publication by Eric Mazur and his colleagues [1]. The main statement I want to make after reading the article is that the methodology (which we call “the scientific method”) which has been developed and is being used to study physical phenomena can and should be used for conducting research like the one described in the paper, but the paper does not show the use of that methodology.
Below I will try to support this statement by analyzing the study described in the paper. Clearly, my analysis of the study is based on certain assumptions I made during the reading.
The first assumption is that one of the goals of the study was to find a correlation between (a) the fact that students are asked to answer questions designed to generate confusion and to assess how confused they are, and (b) learning outcomes. This assumption is based, in part, on the statement: “We ask the following question: To what extent are course performance, . . . related to confusion?”.
I argue that if one wants to study such a correlation, one can (and should) use the same methodology which has been developed and is being used to study physical phenomena. In that approach, one has to compare (at least) two study cases: “Case 1”, in which students do not have to answer questions designed to generate confusion and do not have to assess how confused they are; and “Case 2”, in which students have to answer such questions and have to assess how confused they are (the “confusion” element becomes a part of the learning experience). The scientific method also demands that the “Cases” differ from each other in nothing but the “confusion” element, which means: the student bodies in both “Cases” should be similar in number of students and in age, race, and background distribution (for large classes it is reasonable to assume that these conditions are satisfied); students’ course work should be very similar (except for the “confusion” element); faculty involvement should be similar; and learning outcomes should be measured by the same measuring tools. If these conditions do not hold, the learning outcomes of students might be affected by many uncontrolled parameters, and the examined correlation cannot be established.
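The two-case comparison described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the procedure used in the paper under discussion: the function names, the use of a pretest score as the single check of similarity between the cases, the 0.1 balance threshold, and all the scores are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(a, b):
    """Difference of the two means in units of the pooled standard deviation."""
    pooled = sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return (mean(b) - mean(a)) / pooled


def compare_cases(case1, case2, balance_threshold=0.1):
    """Compare a course without (case1) and with (case2) the "confusion" element.

    Each case is a dict with 'pretest' and 'final' score lists.
    The threshold is an assumed convention, not a value from the paper.
    """
    imbalance = standardized_difference(case1["pretest"], case2["pretest"])
    effect = standardized_difference(case1["final"], case2["final"])
    return {
        # The cases are treated as comparable only if they start out alike.
        "comparable": abs(imbalance) < balance_threshold,
        "pretest_imbalance": imbalance,
        "effect_size": effect,
    }


# Invented scores, for illustration only.
case1 = {"pretest": [52, 60, 48, 55, 63, 58], "final": [70, 74, 66, 71, 78, 73]}
case2 = {"pretest": [53, 59, 49, 56, 62, 57], "final": [75, 80, 70, 77, 83, 79]}

result = compare_cases(case1, case2)
print(result)
```

A real study would have to check every condition listed above (course work, faculty involvement, measuring tools), not just one pretest score; the sketch only shows that the difference in outcomes is interpreted after the similarity of the cases has been checked.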
While reading the paper, however, one cannot find any indication of how the introduction of the “confusion” element for some students influenced their learning outcomes, compared with students for whom the “confusion” element was not a part of the learning process. It is not clear whether the authors deliberately did not use the scientific method, or used it but did not clearly describe doing so in the paper (the further analysis indicates that the former is more probable than the latter). This ambiguity in the description of the study makes the study scientifically deficient (I consider an ambiguity in a scientific study to be a deficiency). Many similar studies suffer from a similar deficiency. It would help a reader navigate a paper if, at the beginning, the authors clearly stated whether they meant to use the scientific method (the one developed to study physics) or deliberately did not.
Another deficiency of the paper (as well as of many other similar publications) is that the use of the scientific method would have eliminated the need to spend time and effort on collecting data which, when scrutinized, do not really support or contradict the hypothesis of the study. Instead, the conclusions of the study could have been derived from a set of well-established facts, a.k.a. “postulates” or laws.
Below I provide several illustrations of the statement made above. Setting the terminology aside, the introduction of the paper tells us that: (a) sometimes students get confused (and we know about that because students express their confusion in words or in actions); (b) students often have their own opinion on how good or how bad they can be when doing physics in general or when solving a specific problem; (c) helping students to reflect on their own thoughts, actions, and feelings may help them to perform better. Up to this point we see complete agreement with everyone’s teaching experience.
(a) Every teacher knows that students ask questions; what to do about it and how to manage each question (or how to initiate questioning from students who never talk) is a different conversation.
(b) The fact that different students may have different thoughts about themselves (in the variety of contexts) is also an everyday experience of every teacher (and again, we will not discuss in this paper what the best strategy is for a teacher teaching a class with students who have different self-perceptions).
(c) The correlation between “help” and “performance” can be derived from a more general principle (which is used as one of the postulates of Teachology, a practical science of teaching and learning), i.e., for most people (who do not have extraordinary deviations from average abilities) learning outcomes are directly proportional to the volume and variety of learning experiences (below, the “Postulate”).
For example, a teacher teaches a standard course (lectures, labs, discussions, homework). That leads to certain learning outcomes. If we accept the Postulate, we have to conclude that if the teacher makes students do something else (reasonably related to the material), and do it on a regular basis over a long period of time, the teacher can expect learning outcomes to be better. In particular, making students (in addition to what they would have done before) watch movies, read additional texts, discuss qualitative questions, or reflect on what they read and how they felt will result in better learning outcomes.
One can compare any two teaching strategies by counting the amount of learning activities students will have to perform in each. If the material covered is similar by the topics and the volume, but the use of one strategy results in a visibly larger number of learning activities, that strategy will lead to better learning outcomes.
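The counting described above can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the function names, the activity labels, and the numbers of lectures, labs, and homework sets are invented (only the figure of 22 reading exercises echoes the paper). “Volume” is taken to be the total number of activities and “variety” the number of distinct kinds of activity.

```python
from collections import Counter


def activity_profile(activities):
    """Return (volume, variety) for a list of activity labels."""
    counts = Counter(activities)
    return sum(counts.values()), len(counts)


def more_demanding(strategy_a, strategy_b):
    """Name the strategy that is at least as large in both volume and variety.

    Returns "undetermined" when the profiles are equal or when one strategy
    wins on volume and the other on variety.
    """
    a = activity_profile(strategy_a)
    b = activity_profile(strategy_b)
    if a == b:
        return "undetermined"
    if a[0] >= b[0] and a[1] >= b[1]:
        return "A"
    if b[0] >= a[0] and b[1] >= a[1]:
        return "B"
    return "undetermined"


# Hypothetical course plans covering the same material.
standard = ["lecture"] * 26 + ["lab"] * 10 + ["homework"] * 12
with_reading = standard + ["reading exercise"] * 22

print(activity_profile(standard))
print(activity_profile(with_reading))
print(more_demanding(standard, with_reading))
```

Under the Postulate, the plan named by `more_demanding` is the one expected to lead to better learning outcomes; when the result is “undetermined”, counting alone does not settle the comparison.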
Ballet trainers, sport coaches, parents use this “rule” every day; people say: “practice makes perfect”, and that works every time as long as the practice provides a sufficient volume and variety of learning experiences.
A question like “Will it affect learning outcomes if, in addition to what students have done in the past, they are forced to do such and such?” does not always represent a research question. If “such and such” is related to the learning material, learning outcomes will be better. If learning outcomes did not improve, then either “such and such” was the wrong choice or it was not used for a long enough time. The question a teacher should ask is “How can I make students do “such and such” in addition to what they already do?” This question, however, is not a research question; it is a practical (i.e., social by its nature) question. Of course, the teacher assumes that the additional learning experience (“such and such”) will lead to better learning outcomes. But this assumption is an assertion (“I believe in the Postulate”) and not a scientific hypothesis, even if it looks like one (just as the assumption that “if I take this route I will arrive home faster” is not a scientific hypothesis).
Not every possible question should be called a hypothesis, and not every activity which leads to an answer should be called research (please refer to chapter 4 of my book “Becoming a STEM Teacher” for an extended discussion of this topic).
A research question could have been stated in the following form: “Will learning outcomes improve if we keep the amount of learning activities and the total time of learning practically the same as in the previous course, but rearrange some activities or replace them with different ones?” Unfortunately, as mentioned above, in many papers, including the one under discussion, there is no information that would allow readers to see the specific procedural (technical) differences between the new and the previous learning processes.
Reading the article, however, indicates that in the study described in the paper students, in addition to their regular learning process, had been doing something else: “students were assigned 22 ... and 21 reading exercises”, “in each assignment, the confusion question was posed before the two content-related questions, followed by a final opportunity to revise the response. . . ”. The statement that “at least two – and sometimes as many as three or four – researchers and instructors reviewed and discussed each content-related question . . . ” also shows that during this particular teaching process students were treated differently than students not participating in the study (the content-related questions were developed with a higher level of involvement from developers).
Based on what I read, I made a second assumption, namely, that the courses taught during the study described in the paper differed from the courses taught before the study by the use of the “confusion” element. Based on this assumption and on the Postulate stated in part (c), I concluded that the results of the study should be obvious (i.e., they should support the Postulate, or the design of the study should be reexamined). If we accept the Postulate, we should expect that the additional practice will be “positively related to a final grade”. In a sense, this study supports the effectiveness of the Postulate (like a working clock supports the effectiveness of Newton’s laws).
Next I would like to address briefly one specific statement from the introduction: “One cannot express confusion without engaging in metacognition, which involves knowledge and cognition about cognitive phenomena”. The purpose of this statement is to begin a discussion about metacognition. It is natural to expect, however, that a student who can explain the reasons for his or her actions “will be more strategic and effective in the educational setting” than a student who can just act without being able to explain why he or she acted that way. This conclusion is a straightforward consequence of the Postulate formulated above in part (c). An ability to explain the reasons for one’s actions is not innate; it requires a specific type of practice and, of course, dedicated time. Hence two students, one who can and one who cannot explain the reasons for his or her actions, must differ in the volume and variety of their learning experiences. It is no surprise that every study “consistently suggests that enhanced metacognition is positively related to learning outcomes”. In the end students’ results were positive, which agrees with the following quote from the paper: “Specifically within physics, researchers observe that adding metacognitive tasks to reading-comprehension exercises results in higher post-test scores when compared to a group of subjects who do not complete the metacognitive tasks”. This is an example of a statement which often sounds like: “We divided students into two groups; in one group students were instructed to learn “that”, in the other group students were not instructed to learn “that”; the result is that students in the first group learned “that”, and students in the second group did not”.
The statement itself, however, is wrong: one can, and very often does, express confusion without engaging in metacognition. Expressing confusion in many cases is just an emotional reaction to an inability to understand something which a person feels he or she is expected to understand. Every human being might experience many different states, like hunger, tiredness, angriness, confusion. Saying “I am confused” is no different from saying “I am tired”, “I am angry”, etc. It does not require any metacognition. One could redefine “metacognition” to include any statement people make about themselves, but that would water down the meaning of the term and make it useless.
The discussion regarding the effect confusion might have on students’ outcomes leads, basically, to the conclusion that sometimes confusion is good and sometimes it is not. Every experienced teacher, of course, will agree with this conclusion. However, the mere fact of expressing confusion should not lead to a large change in learning outcomes, because it does not involve any additional mental work. The outcome depends on what work has been done to reduce that confusion. An interesting research question is what type of work (step-by-step guidance, giving away an answer, initiating peer-to-peer conversation, etc.), and under what circumstances, would be the most efficient way to decrease or “eliminate” that particular confusion.
The technical realization of the study has been described very clearly and can be used by any instructor who would like to use qualitative indicators of confusion and confidence for his or her own purposes.
1. “Making sense of confusion: Relating performance, confidence, and self-efficacy to expressions of confusion in an introductory physics class”, Jason E. Dowd, Ives Araujo, and Eric Mazur, Phys. Rev. ST Phys. Educ. Res. 11, 010107 – Published 3 March 2015, http://journals.aps.org/prstper/abstract/10.1103/PhysRevSTPER.11.010107
2. Hake, R. 2002. Lessons from the physics education reform effort. Conservation Ecology 5(2): 28. [online] URL: http://www.consecol.org/vol5/iss2/art28/
3. “Teachology 99.9: Everything, people who care about education, should know about teaching”, Valentin Voroshilov, http://teachology.xyz/Teachology99.htm
Appendix I: Why did I write this paper and how school teachers can use it?
Reading what people think about teaching is an important part of being a professional educator. The main goal of such reading should be to solidify one’s personal views on teaching and learning by comparing them to what is written by other educators. A teacher needs to keep in mind that not every paper published in a science journal represents the results of a solid scientific study. Since Teachology is only in its infancy as a science, there are many papers which have internal inconsistencies or logical flaws.
For example, I had to write a review of a draft of a paper about how teachers perceive different diagrams. The question presented as the research question of the study was: “To see if science teachers and non-science teachers would describe diagrams differently”. Different diagrams (with no criteria given for why they had been selected and others not) were offered to different teachers. In the end the authors concluded that “all teachers could not describe diagrams at the same level as an expert physicist could”. The conclusion is clearly inconsistent with the study question (there is an easy fix, though: the authors could have studied “the differences between diagram descriptions provided by teachers and by an expert”).
Another example, which might be of interest to a physics teacher, is the paper “Some Consequences of Prompting Novice Physics Students to Construct Force Diagrams” by Andrew F. Heckler (International Journal of Science Education, 2010). After reading the 21-page paper we learn that 891 students had to solve some problems; some of the students were given the prompt “use a force diagram”, and others were not. For a teacher, the paper provides a very strong motivation to think about how diagrams may help students to solve problems, and it is also a very good source for further reading on the topic. However, this paper does not establish a logical “cause and effect” relation, as a science paper should. Students’ ways of solving problems are influenced the most by the instruction and the problem-solving examples provided during that instruction.
From the paper we can only learn that some students were taking a “typical” physics course, and others were taking an “honors-leveled” course.
The author assumes this description is clear enough, but in reality the name or type of a course has no correlation with the actual instructional techniques. In particular, we do not know whether some of the students had been exposed to a problem similar to the one offered during the experiment; we do not know how similar or how different (and for how many students) the offered problems were compared to the ones solved during the course. Hence, such a strong factor as “similarity” was not taken into account, and the study cannot be used for making any scientific or practical predictions.
It is very common for people (especially for teachers) to feel awe when meeting a university professor, a scientist. “This guy is so smart; the guy has a PhD for God’s sake” (BTW: an example of a dogmatic type of thinking; more on the difference between a science and a religion below).
Yes, it is true, but it does not mean one has to believe everything a scientist says, especially if one is a teacher and the scientist talks about teaching.
The short examples above are meant to demonstrate that the publication of a paper in a journal does not mean we should just accept everything said in it.
Things to keep in mind.
1. An experienced teacher might not sound as eloquent as a scientist but may know much more about teaching, especially if the scientist has no real teaching experience at the middle or high school level. When I listen to a speaker, my first question is what his or her teaching experience is (to me it shows whether or not the speaker knows what he or she is talking about).
2. We should admire science, but we should also keep in mind that doing science basically requires advanced reading and writing skills, and anyone (if healthy and with enough time) can do what 99.99% of scientists do (the remaining 0.01% are such geniuses as Newton, Einstein, and others). Ordinary people like you and me are capable of getting a PhD, as long as we put enough effort and time into the work (unfortunately, not everyone has such a luxury as time which can be spent on learning).
3. A science is like a religion: a finite number of words put in sentences, often supplemented by symbols, pictures, and (for a quantitative science) sets of numbers, graphs, and equations. From a descriptive point of view, there is no difference between a religion and a science: both have postulates (statements which cannot be proved and which people simply believe for some reason), and both have statements logically derived from other statements (this logic might also be of a mathematical nature). The difference is not between a science and a religion, but between a scientist and a religious person.
A scientist accepts the possibility that his or her beliefs (postulates) may be overturned (proved to be wrong, or limited), and a scientist is open to a discussion about those beliefs. A religious person cannot accept the possibility that his or her postulates may be limited, wrong, or overturned; a religious person will just deny any other postulates or statements if they contradict his or her beliefs. Hence, when listening to a scientist, a teacher should try to infer the scientist’s beliefs, and (a) compare them with the teacher’s own beliefs (nothing good can come out of a forced collaboration if people have very contradictory beliefs), and (b) confront some of the postulates the scientist uses as building blocks for his or her theory of teaching (a true scientist is never afraid of such confrontation, and a teacher should not spend time communicating with a “not-true” scientist).
FYI: of course, there is an important difference between a religion and a science: they have different goals. A religion is about morality, social norms, what is right and wrong to do (that is why there are many religions). A science is about truth, about a correct description of the world (that is why there is only one physics, chemistry, etc.).
4. A teacher should read at least a couple of scientific papers a year (the best would be to have a subscription to a magazine). However, when reading scientific papers, a teacher should critically analyze each premise, each conclusion, and, most of all, whether the work can be of any use to a teacher. Writing a short critical essay on a paper which has just been read is also a good experience and useful practice.
I would recommend that everyone read the original paper by Dr. Mazur and then my paper, and provide a critique of both (this helps to advance our critical thinking skills and also to strengthen a personal view on what research in education should be about).
BTW: everyone is welcome to leave feedback at my blog at http://gomarsnow.blogspot.com
Appendix II: What did judges say about this paper and my response to Eric Mazur regarding his response to this paper.
Below I offer the comments on my first draft from the reviewers of the magazine. The first one is fairly technical. But the second one made me feel very much at ease, because the reviewer expressed many sentiments similar to my own views (FYI: thanks to the reviewers, the final version of the paper differs significantly from the first draft).
• EXPERT 1
• Technical Points: 2
• Original Creativity: 3
• Words & Grammar: 2
• Relevant to Journal: 4
• Topic Novelty: 4
The article is interesting but needs to have some heavy editing. For example, it uses abbreviations (such as Phys. Rev. special topics – PER.) in the text and in the title (such as Mazur et al.) and goes into different directions. There is a lack of focus from one section to the other. Perhaps the authors can develop and outline so that the manuscript follows this outline.
Also an organizer needs to be included at the beginning of the manuscript. This will help both the authors and readers to understand the manuscript.
Subheadings would also be helpful in making transitions rather than having new ideas jump at the reader all of a sudden. it is highly recommended that the authors have a professional editor work on the manuscript
• EXPERT 2
• Technical Points: 5
• Original Creativity: 5
• Words & Grammar: 5
• Relevant to Journal: 5
• Topic Novelty: 5
This is a real educational research paper. I strongly recommend its publication. I hope this journal can become a forum to attract more papers like this one.
1. I quite agree with the author that most papers published in voice-leading science education journals are in fact a part of an academic game, which only result in negative effects on education. It is not exaggerated to say that every paper one comes across in such journals is rubbish. Even the policies of teaching content orientated journals are not quite right. There should be a forum to correct this problem and the task cannot be accomplished by the voice-leading science education journals, at least not in their current forms.
Most of the science educational papers sound scientific by using statistical methods but in fact they are nonsense. Most science educational specialists do not even understand the very basic fact that science education is in essence science teaching.
They only contented with the superficial understanding of science. The chemical educational specialists are not chemists at all; the physics educational specialists may not be real physics teachers; the mathematical educational specialists may not be mathematicians. All the science education specialists are educational specialists but they even might not be qualified to be science teachers.
2. By the way the sentence
“(b) The fact that different students may have different thoughts about themselves (in the variety of contexts) is also a part of an everyday experience of an every teacher (and again, we will not discuss in this paper what is the best strategy for a teacher teaching a class with students who have different self perceptions).”
Should be read as: (b) The fact that different students may have different thoughts about themselves (in the variety of contexts) is also an everyday experience of every teacher (and again, we will not discuss in this paper what the best strategy is for a teacher teaching a class with students who have different self-perceptions).
3. In the following sentence, “per-to-peer” should be peer-to-peer. “An interesting research question is what type of work (step by step guiding, giving away an answer, initiating per-to-peer conversation, etc.) and under what circumstances would be the most efficient why to decrease or “eliminate” that particular confusion.”
Soon after publishing my essay “Critical reading of “Making sense of confusion” by Eric Mazur et al.” I received a personal letter from the authors. The letter was very informative and helped me realize that some parts of my essay may need further clarification. Obviously, I cannot publish a personal letter without the authors’ permission, but I would like to provide my response to their response to my essay, which, hopefully, makes the main statements of my essay clearer.
Dear Jason E. Dowd, Ives Araujo, and Eric Mazur, thank you for reaching out to me.
The fact that the magazine published your paper and rejected mine, and yet you found it worthwhile to write a response, strengthens my view on the importance of an open discussion of the methodology used in the field of PER. I am glad that the goal of my paper, “to stimulate a conversation”, seems to have been achieved.
It seems to me that you have misunderstood the main statements I wanted to make in my essay. It might have been my fault as the author of the paper, which was not clear enough to avoid any misinterpretation.
That is why I would like to clarify some of the ideas of my paper.
The main statement I make is that the methodology (which we call “the scientific method”) which has been developed and is being used to study physical phenomena can and should be used to conduct research like the one described in your paper.
In my essay I try to support this statement by analyzing the study you described.
Of course, my analysis of your study is based on certain assumptions I made during the reading.
The first assumption is that one of the goals of the study is to find a correlation between (a) the fact that students are asked to answer questions designed to generate confusion and to assess how confused they are, and (b) learning outcomes. This assumption is based, in part, on your statement: “We ask the following question: To what extent are course performance, … related to confusion?”.
If my assumption is wrong my essay has no direct relation to your study.
I argue, however, that if one wants to study such a correlation, one can (and should) use the same methodology which has been developed and is being used to study physical phenomena. In that approach, one has to compare (at least) two study cases: “Case 1”, in which students do not have to answer questions designed to generate confusion and do not have to assess how confused they are; and “Case 2”, in which students have to answer such questions and have to assess how confused they are.
The scientific method also demands that the cases differ from each other in nothing but the “confusion” element, which means: the student bodies in both cases should be similar in number of students and in age, race, and background distribution (for large classes it is reasonable to assume that this condition is satisfied); students’ course work should be very similar (except for the “confusion” part); faculty involvement should be similar; and learning outcomes should be measured by the same measuring tools. Otherwise the study does not really help to deepen our understanding of the realm under study.
While reading your paper, it is not clear whether you deliberately did not use the methodology which has been developed and is being used to study physical phenomena (a.k.a. “the scientific method”), or used it but did not clearly describe doing so in your paper. Based on this ambiguity, I also made a statement that many other similar studies suffer from a similar deficiency (I consider an ambiguity in a scientific study to be a deficiency).
While reading your paper, I made a second assumption: that the courses described in your study differ from the courses taught before by the use of the “confusion” element. Based on this assumption and on the principle stated in my essay (“for most people (who do not have extraordinary deviations from average abilities) learning outcomes are directly proportional to the volume and variety of learning experiences”), I concluded that (again, if my second assumption is correct) the results of your study should be obvious (i.e., they should support the general principle, or the design of the study should be reexamined).
Sincerely, Valentin Voroshilov
P.P.S. Every human being might experience many different states, like hunger, tiredness, angriness, confusion. Saying “I am confused” is no different from saying “I am tired”, “I am angry”, etc. It does not require any metacognition. Although, if you redefine “metacognition” by including in it any statement people make about themselves, it waters down the meaning of this term and makes it useless.
Thank you for visiting,
Dr. Valentin Voroshilov
Education Advancement Professionals