The Wilson Quarterly

Recently, President Obama and others have questioned the effectiveness of teaching in American universities. At most universities today, undergraduate and graduate teaching is judged primarily or even exclusively on the basis of teaching evaluations written by a professor’s students. This system invites corruption, and results in it. A professor who receives many unfavorable student evaluations is probably doing something wrong, but a professor who receives many favorable evaluations may not be a good teacher at all.

Many candid student evaluations appear on the nationwide website RateMyProfessors.com, which includes ratings only from students who access it. Viewing the site, one can notice how many professors score high on “easiness,” which is always considered a virtue. Note also that the chili pepper symbol for “hotness” just means that the professor is physically attractive. A disillusioned former Harvard dean makes a refreshing comment about student evaluations:

Two Harvard psychologists showed that the numbers students assign lecturers after watching only thirty seconds of video with no sound correlate very highly with student evaluations of the entire course at the end of the term. Students watching the brief videos had no information about what the instructors were saying and ranked instructors on personality traits such as optimism and confidence, not on teaching quality. This experiment suggests that student course evaluations are simply consumer preference metrics of the shallowest sort.

This conclusion may be a little too definite — some students do make thoughtful comments — but only a little.

Once, I noticed that my ratings for a class in Greek history at a university in Florida were considerably lower than usual. In a discussion on Thucydides’ History of the Peloponnesian War, I had asked the class whether the Athenian empire, originally a confederacy of cities for mutual defense, became a tyranny by the time of the Peloponnesian War. All students who spoke agreed that the Athenian empire was a tyranny, because when the city of Mytilene tried to secede, the Athenians besieged and conquered it. This is a perfectly defensible opinion, but I wanted to show that another opinion was possible. I pointed out that most cities never tried to secede from the empire, and that Mytilene, unlike most member cities, was an aristocracy, which surrendered after its common people were given arms by the aristocrats and refused to fight the Athenians. The students insisted that this point was irrelevant: how could the cities possibly be free if they could not leave the empire freely? I pointed out that in 1861, our own state of Florida had seceded from the United States, which attacked it and forced it back into the Union. Did this mean that we were not free? The class fell silent. My remark, which I had meant to keep the discussion going, killed it completely. My evaluations for that course included several criticisms of my lack of respect for student opinions. Since then, I have tried not to make similarly devastating arguments in class discussions.

This case illustrates how evaluations conducted by colleges and universities help create an atmosphere in which students’ opinions are given more respect than is always good for their education. In college and graduate school, my own best teachers repeatedly challenged my opinions and sometimes ridiculed them, not necessarily to make me abandon them but to make me defend them more rigorously and to revise them if they were inadequately thought out. If now I deliberately avoid challenging my students’ opinions as much as my best teachers did — and I regret to say that I do — the reason is that I need to teach my students as I find them, and my ability to teach them is reduced if they are alienated, as my students in that class in Florida were. This problem is chiefly caused by the widespread attitude that students are “consumers” who need to be kept happy with the “product” that they are buying from their professors — which they believe is an enjoyable experience taking the course, not a rigorous education. Yet students should realize that the best professors are not always the nicest ones.

Unofficial student evaluations, like those on RateMyProfessors.com, do no such damage. There students feel free to say, as some do, “Sign up for this prof! I never went to class or cracked a book all semester, and got an A!” Students who write or read such an “evaluation” have no illusions that it really evaluates the course or the professor. They know that the course and the professor taught nothing to students who never went to class, and that such students know nothing about what happened in class. Students never write such things on official evaluations, because they know that they could harm the professor’s reputation and jeopardize future easy As for themselves and others. Why deans and departmental chairmen who are supposed to evaluate teaching seldom look for comments like this on RateMyProfessors.com, or at least never hold such comments against professors, is an interesting question.

One defender of student evaluations, Raoul Arreola, insists in his book Developing a Comprehensive Faculty Evaluation System that “there is no [emphasis in original] consistent correlation between the grades a faculty member gives and the ratings he or she receives from a well-designed student rating form.” Yet Arreola also acknowledges the clear evidence that lower-level courses receive lower student ratings than upper-level ones, required courses receive lower ratings than electives, and courses in the natural sciences and mathematics receive lower ratings than courses in the humanities and social sciences. This difference is unlikely to reflect worse teaching, since most professors teach a mixture of lower- and upper-level and required and elective courses, and Graduate Record Examination scores demonstrate that colleges are actually doing a better job of teaching students in mathematics and the natural sciences than in the social sciences and humanities. The reason is more likely to be that, regardless of the quality of teaching, most students are more interested in their upper-level and elective courses than in their lower-level and required courses, and find their courses in the humanities and social sciences easier than those in the natural sciences and mathematics.

The alleged absence of correlation between grades and ratings need not mean that professors cannot buy better student ratings with easy grading, but only that average student ratings, most of which are favorable anyway, combine ratings given for different reasons. A course may have favorable ratings because it really is good or because it is easy (or possibly both); a course may have unfavorable ratings because it really is bad or because it is hard (or possibly both). Other factors, such as the likeability of the professor and the intrinsic interest of the subject, also affect the ratings. But students learn nothing from a course if they never attend it or do the reading for it, and they cannot learn much from a course if they seldom attend it and do little of the reading for it. They also learn little or nothing from submitting a bad or plagiarized term paper, as most of the worst students do. The answer to people who say that a course can be both good and easy is that, for most students, an easy course is one in which they do almost no work and learn almost nothing. Professors who get favorable ratings from students to whom they give good grades for no attendance and no work are guilty of bribery, and should be penalized, not rewarded, as they now nearly always are.

Like most professors and administrators, I have little direct evidence for how my colleagues teach. But much of the evidence I do have from the six places I have taught is disturbing. To judge from the complaints I receive from students who never go to class and never learn the material but expect good grades, many professors give good grades for little or no work. Many of my students also express surprise that I never give multiple-choice tests or that I correct spelling, grammar, infelicities of expression, and minor factual errors in their papers. Evidently, many professors do give multiple-choice tests and neither correct papers nor in some cases even comment on them. My students are sometimes embarrassingly laudatory in their evaluations of what I think should be normal good teaching. I am also troubled by what I find written on blackboards, whiteboards, or sheets for overhead projectors from classes held before mine: simplistic outlines full of generalities on such subjects as the effects of racism or the causes of American imperialism, often containing misspellings, blatant bias, and factual errors. This sort of evidence, though not necessarily this sort of teaching, has become less common with increasing use of PowerPoint presentations.

Although PowerPoint is needed for some subjects (with the disappearance of slide projectors, my wife uses it for all her art history lectures), I have never used it, having learned long ago that when I showed slides many students stopped listening to what I said and paid attention only to the pictures. A recent and intelligent book by Andrew Hacker and Claudia Dreifus on the problems of higher education offers three suggestions for improving teaching. The first is “Stop PowerPointing,” because “Files of slides etch the day’s outline in stone; new ideas can’t be added, as they can on a chalkboard.” The second, “Preventing Plagiarism,” is obvious but sound, and not hard to implement. After my current university dropped its subscription to the effective plagiarism detector Turnitin because it cost money and students disliked it, I returned to my former practice of requiring submission of a rough draft of each paper (which I look at without grading it) before the final draft. The third suggestion is “Monitor Laptops,” because “In almost all the classes we attended, at least half the screens displayed games of solitaire, reruns of sporting events, messages to friends.” This problem is real, but I remain reluctant to adopt the suggested solution because a professor who walks around the classroom monitoring laptops cannot concentrate on teaching, and laptops are convenient for taking notes and letting students introduce new facts into discussions. If students never learn the material, I can find out on the examinations and give them the low grades they have earned.

The real reason that grade inflation is a serious problem, and that those who consider it unimportant are wrong, is that grading is almost the only way to make sure that students do the assigned work and learn something. The majority of students will do the least work they think will get them an acceptable grade. If a superficial knowledge of the material, or no knowledge at all, will get them the grade they want, that is all they will bother to acquire. (Graduate student grades are somewhat different, because graduate students need their professor’s recommendations more than grades, and need to worry that if they do poor work those recommendations might lack the customary enthusiasm.) Even some responsible students with busy schedules will spend little time on a class if they know they are assured of a good grade. Only a few students will do all the assigned work just because they are interested in it, especially because many courses include readings, problem sets, or laboratory assignments that are important to understanding the subject but are not particularly interesting in themselves.

The idea that a good professor can make any material seem interesting is almost as silly as the idea that a good professor can make calculus seem as interesting as human sexuality. In a lecture, one can sometimes make a subject sound more interesting than it really is by misrepresenting it, but misrepresenting anything is bad teaching. The truth is that some subjects are inherently less interesting than others for just about everyone. Samuel Johnson famously wrote of Milton’s Paradise Lost (which he admired), “None ever wished it longer than it is.” The same is true of the conjugations of Greek verbs and much other academic material, especially in mathematics and the natural sciences. I remain grateful that I read Paradise Lost and learned Greek grammar in classes, because I doubt I would have had the perseverance to do those things properly on my own. Most students, if they know the alternative is a poor grade, will do the assigned work, and in the process may well discover why the work is important and find it rewarding. Just lowering the grade for each missed class is pointless, because many students will show up and pay no attention. The only effective enforcement is to read papers and examinations carefully, giving the good ones high grades and the poor ones low grades.

Thus, grade inflation is bad for several serious reasons. It discourages learning, it allows professors to bribe students to give them better evaluations than they would otherwise receive, it unfairly penalizes the students of professors whose grades are less inflated, and it encourages students to choose classes not for educational value but for easy grading. The consensus among professors and administrators is that grade inflation is not good but that nothing much can be done about it. After all, students might be getting better, teaching might be getting better, we want students to concentrate on learning rather than grades, and we want to encourage struggling students by giving them good grades even when they have trouble. (Notice that this last argument, seldom voiced but probably the most widely believed, is inconsistent with the other three.) Besides, the best minds in American higher education have been unable to find any practical way to halt grade inflation. The disillusioned Harvard dean asserts, “The most effective way to combat rising grades would be to initiate serious conversations, at the departmental level, about what constitutes A, B, and C work — ‘therapy,’ as a colleague called it, rather than regulation.” Whenever an academic administrator talks about “initiating serious conversations” or “therapy,” you can be fairly sure nothing effective will be done.

The best minds in higher education should think a little harder about grade inflation. Since almost all grades are now entered online, the grades in a professor’s class could easily be compared to the cumulative grade point averages of the students in the class. If, for example, a professor gives grades that average 3.85 (A-) to a class of students with overall grade point averages that average 2.75 (B-), we can be pretty sure that the professor is an unusually easy grader. If administrators want to reduce grade inflation, they can first send every professor the number (positive or negative) representing the difference between the average grades he has given his undergraduates and the same undergraduates’ grade-point averages. The administration can then announce that next year, after the first tenth of a point that a professor’s grades exceed his students’ grade-point averages, for each additional tenth of a point the professor will have his annual raise reduced by 10 percent. Though such numbers may seem arbitrary, they are much more objective than students’ evaluation scores. Something like this procedure would exert strong pressure on grade inflation, make grades more reliable, improve students’ education, and reduce a source of corruption that benefits lazy professors and lazy students at the expense of conscientious professors and conscientious students. Yet I doubt this idea will be widely adopted, because such corruption is deeply ingrained in today’s colleges and universities.
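The arithmetic of this proposal is simple enough to sketch in a few lines of Python. What follows is only an illustration under stated assumptions, not a worked-out policy: the function names `grade_gap` and `raise_reduction_percent` are hypothetical, only whole tenths of a point are counted, and the reduction is capped at the full raise, none of which the proposal itself specifies.

```python
import math
from statistics import mean


def grade_gap(course_grades, student_gpas):
    """Average grade given in the class minus the average cumulative
    GPA of the same students (positive means easier-than-usual grading)."""
    return mean(course_grades) - mean(student_gpas)


def raise_reduction_percent(gap, free_tenths=1, percent_per_tenth=10):
    """Percentage of the annual raise withheld for a given gap.

    Assumptions (not stated in the essay): only whole tenths of a
    grade point count, and the reduction cannot exceed 100 percent.
    """
    # Express the gap in whole tenths; round first to absorb float noise.
    tenths = math.floor(round(gap * 10, 6))
    # The first tenth is tolerated; negative gaps carry no penalty.
    extra_tenths = max(0, tenths - free_tenths)
    return min(100, extra_tenths * percent_per_tenth)


if __name__ == "__main__":
    # The example above: grades averaging 3.85 for students whose
    # cumulative averages are 2.75.
    gap = grade_gap([3.85, 3.85], [2.75, 2.75])
    print(raise_reduction_percent(gap))
```

For the example above, the gap is 1.1 points, or eleven tenths. Ten of them fall beyond the first, tolerated tenth, so under these assumptions the sketch returns 100: the professor would forfeit the entire raise. A professor whose grades run three tenths above his students' averages would lose 20 percent of it.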

Many people think that online education (“distance learning”) can revolutionize higher education, allowing students to take courses from the best professors in the world at scarcely any cost. These predictions seem to me greatly to overestimate students’ self-discipline. In theory, a lecture you can watch and hear whenever you like, as often as you like, is splendid. Likewise, in theory you could turn students loose in a good library and let them read there for four years and emerge with an excellent education. In practice, almost all of us, let alone most students who attend universities today, tend to postpone things that we can do anytime, especially things we are not eager to do anyway. We also tend to listen much more carefully to people who are present and looking at us, even in a large classroom, than to images and voices from a screen. Not listening carefully to an online lecture means texting, talking, playing video games, and doing other things that make paying close attention impossible. Taking careful notes, which is usually a crucial part of paying full attention to a lecture, seems hardly worth doing when you can listen to the lecture again anytime. Discussions are also much more involving and helpful in person than online. Many students never even finish courses that they take online, and if they say they have finished, any rigorous test will show that most of them have learned much less than they would from a regular course. The temptation to give extremely easy tests to such students, or to allow them to cheat on the tests, is correspondingly great.

No doubt various methods can be used to mitigate these disadvantages of online courses. For example, students can be allowed to listen to a lecture only when it is actually being delivered by a professor — but then many expensively equipped studios are needed to allow many professors to lecture at the same time, and the students lose the advantage of being able to hear lectures over again. Similarly, technology can allow the professor to see the students’ faces in many small windows on a computer screen — but any professor who tries to pay attention to all the windows is constantly distracted from teaching. Though combining screens can help the professor and students in discussion sessions, most students will still have trouble paying attention to such sessions. Attentive professors may see enough of their students to make it hard for them to cheat on examinations, but what about crib sheets positioned above the computer screen, or a friend giving information behind it? Examinations can be proctored in a classroom by graduate students, but then why not have the graduate students teach the whole course in the classroom?

Such are the reasons that online education has failed to take the academic world by storm so far, and will probably fail to dominate it in the future, except as a low-quality substitute for low-quality lecture courses. Certainly, an online course from which students learn nothing can be cheaper than a regular course from which students learn nothing; but for students to learn nothing from no course at all is even cheaper, and equally valuable. The continuing interest in online education in universities today is in part a sign of indifference to whether students learn anything or not. Another factor may be that online education is promoted by computer scientists who design online courses in computer science, a subject that is uniquely suited to computers and attracts people who prefer to study without human contact. Yet even if online courses are an effective way of teaching computer science to some computer scientists, the same methods cannot be successfully adapted to other fields.

What, then, is the best way to improve college teaching? Some people, especially professors of education, think that much more time should be spent on training graduate students to teach. The same attitude leads education professors to think that what high-school teachers need most is training in teaching, not training in the subjects they will teach, with the result that many high-school teachers use the latest methods to teach their subjects poorly. Most graduate students receive some training in teaching now, usually from working as teaching assistants with senior professors and often from brief teaching seminars. A little such training should do them no harm, unless it convinces them to rely heavily on PowerPoint and simplistic outlines. A few basic skills are equally applicable to teaching almost all subjects, and can be taught, more or less: speak clearly and audibly, prepare your classes carefully, organize your material coherently without oversimplifying, avoid overestimating or underestimating your students’ knowledge or abilities, and tell relevant jokes if possible. Beyond such basics, however, very different skills are needed to be an effective teacher of different subjects.

The best basic training for teaching any subject is taking undergraduate and graduate courses in it, then emulating what you found worked best in those courses and avoiding what you found not to work. The best advanced training in teaching is to teach your own courses conscientiously and to use trial and error to refine your knowledge of what works. A certain degree of empathy is also helpful, and most prospective teachers should be able to master the social skills needed to be effective teachers if they try to develop them. Yet they can avoid trying if they rely on easy grading, which will give them acceptable evaluations without effective teaching — just as it gives their students acceptable grades without effective learning.

Although good teaching skills are useful for effective teaching, even brilliantly taught courses can be worthless. In a celebrated experiment done in 1972, an actor “who looked distinguished and sounded authoritative” was hired to impersonate the fictitious “Dr. Myron L. Fox, an authority on the application of mathematics to human behavior,” and to deliver a lecture on “Mathematical Game Theory as Applied to Physician Education” to three different audiences of psychiatrists, psychologists, social workers, teachers, and administrators. The lecturer was told to make his lecture and his answers to questions a farrago of “double talk, neologisms, non sequiturs and contradictory statements … interspersed with parenthetical humor and meaningless references to unrelated topics.” On a questionnaire given to the members of all three audiences, favorable responses heavily outnumbered negative responses, except that a significant minority thought that “Dr. Fox” tended to dwell on “the obvious” (though most of what he said was actually nonsense). “Dr. Fox” was praised for his interest in his subject, the examples he used to clarify his material, his well-organized presentation, and being stimulating and interesting. If these were the responses of people who had the professional expertise to evaluate a lecture that was carefully designed to be meaningless, how reliable can undergraduate teaching evaluations be?

Very few university professors are as ignorant of their subjects as “Dr. Fox,” but a large majority of professors at less prestigious institutions — and a large minority at more prestigious ones — do little or no research. Most are mediocre teachers, who spend little time on their courses and repeat the same lectures from the same notes for years. As the sociologists Christopher Jencks and David Riesman put it some time ago:

Teachers cannot remain stimulating unless they also continue to learn, and while this learning may not focus on small, manageable “research problems,” it is research by any reasonable definition. When a teacher stops doing it, he begins to repeat himself and eventually loses touch with both the young and the world around him. Research in this general sense does not, of course, necessarily lead to publication, but that is its most common result. Publication is the only way a man can communicate with a significant number of colleagues or other adults. Those who do not publish usually feel they have not learned anything worth communicating to adults. This means they have not learned much worth communicating to the young either.

In other words, being a good teacher requires knowing one’s subject well, having an active interest in it, and keeping involved in it. Except for a few professors who suffer from writer’s block, this means writing on their subject.

Studies on the relationship between teaching and research should be somewhat suspect, because they almost always measure teaching by using student evaluations, which we have seen are unreliable, while measuring research by using numbers of publications, regardless of quality. Yet we should still look at such studies as we have. According to one authority, “the most comprehensive treatment of this subject” is a 1987 article that summarized thirty studies of varied samples at a range of dates. It found that “research productivity is positively but very weakly correlated with overall teaching effectiveness (as assessed by students).” But this overall finding masks major differences in the qualities evaluated. Much the strongest correlations were positive ones between scholarly productivity and (in order) (1) “teacher’s knowledge of the subject,” (2) “preparation [and] organization of the course,” (3) “clarity of course objectives and requirements,” and (4) “teacher’s intellectual expansiveness (and intelligence).” By contrast, scholarly productivity showed (statistically insignificant) negative correlations with just three qualities: (1) “instructor’s fairness; impartiality of evaluation of students; quality of examinations,” (2) “teacher’s encouragement of questions and discussion, and openness to opinions of others,” and (3) “teacher’s availability and helpfulness.”

Note that all four of the strongest correlations were with the most important features of any good course: the professor’s knowledge of the subject, preparation and organization, clarity, and intellectual distinction. By contrast, all three of the marginally negative correlations corresponded to the complaints that poor students usually make when they earn bad grades: the professor was unfair or biased, gave an examination that was too hard, preferred his informed views to their uninformed ones, and refused to raise low grades. In fact, even the badly flawed instrument of student course evaluations appears to indicate that the best teachers are those who do research. This article found no basis for the idea that research distracts professors from good teaching.

Such studies apply largely to lectures, which even professors who know little about their subject can prepare from a few textbooks or reference works. Discussions, seminars, and tutorials are harder to prepare, since they require the professor to react spontaneously to students’ ideas, which in seminars and tutorials take the form of reports and papers drawn from sources that the professor may not know well. Scholars who know their field thoroughly from research and writing may scarcely need to prepare at all for seminars and tutorials, though they should still read the students’ reports and papers and comment on them with care. Professors who have done little research will either need to spend long hours learning whatever is likely to come up in class, or do a poor job of conducting the discussion, tutorial, or seminar. If well conducted, however, discussions, tutorials, and seminars are often better ways of teaching than lectures, above all because they involve the students more and force them to pay more attention to the subject. The more the professor knows about the material, the better the tutorial or seminar will be.

Professors who know their subject well and have published on it are much more likely to teach it well than professors with a weak grasp of the subject who have never published on it, regardless of their teaching skills. The latter sort of professor may still get good teaching evaluations for easy grading, receptiveness to student ideas, and being available to chat. But none of these things necessarily makes a good teacher. If you want to improve college teaching, you should hire good scholars.

* * *

Warren Treadgold is National Endowment for the Humanities professor of Byzantine Studies and Professor of History at Saint Louis University. He has also taught at UCLA, Stanford, Hillsdale College, UC Berkeley, and Florida International University. This essay was adapted from a chapter in his forthcoming book, The University We Need.
