By Kylie Jue
It’s 11 p.m. Your program is crashing, and the assignment is due at midnight. You know where your error is, but you don’t know how to fix it. A simple Google search pulls up tons of working code, but copying that directly into your program would violate the Honor Code.
Yet it seems like there are always some students who give in to the temptation.
Annual case reports from Stanford’s Office of Community Standards show a steady year-over-year increase in the number of Honor Code violations in computer science (CS).
However, the enrollment in CS classes has also risen significantly over the last five years. In the 2013-14 academic year, over 1,500 students took CS 106A: Programming Methodology, the first course of the introductory programming series in the CS department, and computer science has grown into the largest major on campus.
“Enrollments are on the rise,” said Mehran Sahami ’92 M.S. ’93 Ph.D. ’99, professor of computer science and associate chair of education for the CS department. “If you imagine the same percentage of Honor Code violations is taking place but now the enrollment in the class is twice what it used to be, you’re going to see twice as many violations.”
Moss: a measure of software similarity
Few students realize that each assignment they submit is run through a program called Measure of Software Similarity (Moss). Moss is used to detect similarities between code submissions and is considered a standard across most U.S. institutions. Used by computer science departments worldwide, Moss was developed by Stanford computer science professor Alex Aiken.
Developed 20 years ago during Aiken’s time as a professor at UC Berkeley, Moss is designed to cut down the manual labor of searching for plagiarism. However, Aiken emphasized that Moss is not a system for detecting plagiarism directly.
“What Moss detects is similarity,” Aiken said. “Why the code is similar is a human judgment that has to be made.”
Since code might be similar for various reasons, including shared libraries or starter code provided for the course, an instructor must check to see whether or not the similarity returned by Moss constitutes plagiarism. Moss simply helps professors determine which programs should be flagged with a higher priority for further review.
“It’s kind of smart about what ‘the same’ means. It knows, for example, that if you go through and change all the variable names, it doesn’t matter,” Aiken said. “It knows some things about programming that allow it to make some judgments about when two things are really the same and when they’re not.”
Aiken explained that Stanford uses Moss to compare submissions not only against other students’ work in the class but also against code available online and submissions from previous years and quarters. The program looks for aspects such as similar structure, identical bugs and other unusual commonalities.
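To illustrate why renaming variables does not hide a match, here is a minimal sketch of one way a similarity checker could work. It is not Moss's actual implementation. It assumes a simple approach: replace every identifier with the placeholder `ID`, break the token stream into overlapping runs of `k` tokens, and measure how many runs two programs share. The function names, the tokenizer and the choice of `k=5` are all illustrative assumptions.

```python
import keyword
import re

# Crude tokenizer (an assumption for illustration): words, integers,
# or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def normalize(source):
    """Tokenize source and replace each non-keyword identifier with 'ID'."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if IDENT_RE.fullmatch(tok) and not keyword.iskeyword(tok):
            tokens.append("ID")  # the variable's name no longer matters
        else:
            tokens.append(tok)
    return tokens


def fingerprints(source, k=5):
    """Return the set of all runs of k consecutive normalized tokens."""
    tokens = normalize(source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a, b, k=5):
    """Share of token runs the two programs have in common (0.0 to 1.0)."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

Under these assumptions, a program and a copy with every variable renamed produce identical token runs, so they score as fully similar. As Aiken notes, deciding why two programs score highly is still left to a person.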
According to Stanford computer science lecturer Julie Zelenski ’89 M.S. ’96, Moss also accounts for algorithms used by a significant number of students.
“It also knows how to recognize when something is just so idiomatic that everyone did it the same way,” Zelenski said.
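One hypothetical way to discount idiomatic or instructor-provided code is sketched below. It is an illustration, not a description of how Moss does it. It assumes each submission has already been reduced to a set of fingerprints, drops any fingerprint that appears in more than a chosen share of submissions or in the starter code, and then ranks the remaining pairs by how much they still share. The names `drop_common`, `rank_pairs`, `max_share` and `starter` are invented for this example.

```python
from collections import Counter
from itertools import combinations


def drop_common(fingerprints, max_share=0.5, starter=frozenset()):
    """Remove fingerprints found in starter code or in too many submissions.

    fingerprints maps a submission name to its set of fingerprints.
    max_share is an assumed cutoff: anything appearing in more than this
    fraction of submissions is treated as idiomatic and ignored.
    """
    counts = Counter(fp for fps in fingerprints.values() for fp in set(fps))
    limit = max_share * len(fingerprints)
    return {
        name: {fp for fp in fps if counts[fp] <= limit and fp not in starter}
        for name, fps in fingerprints.items()
    }


def rank_pairs(fingerprints):
    """List (shared_count, name_a, name_b) for overlapping pairs, highest first."""
    scored = []
    for (a, fa), (b, fb) in combinations(sorted(fingerprints.items()), 2):
        shared = len(fa & fb)
        if shared:
            scored.append((shared, a, b))
    return sorted(scored, reverse=True)
```

With a filter like this, a loop that nearly every student writes the same way contributes nothing to a match, while an unusual shared fragment, such as an identical bug, still pushes a pair toward the top of the list an instructor reviews.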
Zelenski also explained that since the introductory classes are so large, a graduate student has been hired to build tools to manage the logistics of finding high-priority matches in Moss for 106A and 106B before passing the data on to the course instructor.
Computer science lecturer Marty Stepp also spoke about the importance of having a teaching assistant (TA) to examine the Moss results first.
“He does all of that part for us, basically gets [the Moss results] all ready for us to look over, produces a nice executive summary for us of the data and places it in front of us in an easy way for us to examine,” Stepp said. “We look over the top matches and decide which ones are concerning to us, and that works really well because he can help us have a faster turnaround.”
Stepp also emphasized that the graduate student does not play a role in deciding whether or not a case should be pursued.
“The instructor should make that decision so it’s important to note that his role has nothing to do with innocence,” Stepp said. “[The TA] doesn’t make that call for us. We wouldn’t ever farm that out to a third party.”
Stepp also explained that although the section leading program plays a large role in grading assignments for students in 106A and 106B, section leaders are not responsible for looking for plagiarism. Instead, they are told to contact the instructor if misconduct is suspected, and according to Stepp, most of the time section leaders do not have the experience to accurately recognize an Honor Code violation.
Dealing with violations
If a violation is discovered, students usually have two options: choosing the Early Resolution Option (ERO) or allowing the case to go to a hearing.
When a case is brought to the Office of Community Standards (OCS), a student is assigned a judicial officer to advocate for the student. If the case moves to trial and eventually a hearing, a panel of five students and one faculty member determines whether or not the student’s actions constitute a violation of the Honor Code. If a student is found guilty by a five-to-one vote, the panel assigns a sanction to the student.
According to Stanford’s OCS, standard first-time sanctions are usually a quarter of suspension and 40 hours of community service unless the panel finds that another response is more appropriate.
First instituted in 2009, the ERO gives students the opportunity to immediately accept responsibility for their actions and skip the judicial hearing process. Zelenski explained that the ERO gives students an incentive to take responsibility for their actions rather than lying to avoid a sanction. In addition, while hearings can take two months or longer, the ERO can be resolved within a few weeks.
“The vast majority of the students who I’m sending [to OCS] are choosing to pursue an ERO alternative [rather] than go to a full hearing,” Zelenski said.
Even before sending a student to the OCS, different instructors handle potential Honor Code violations differently, Sahami said.
“Depending on the results they get back [from Moss], some [instructors] will just report them immediately without talking to the student. Some prefer to talk to the student,” Sahami said. “We try to be consistent with respect to how we handle the Honor Code – it’s just a matter of time constraints and personal preference.”
Sahami himself asks to meet with students and gives them the chance to take responsibility in advance before telling them about the Moss match.
“We have a conversation of what happened, and what I look at are things like how contrite is the student,” Sahami said. “Virtually every time I’ve talked to a student they’re very straightforward about what happened. In some cases, it sometimes takes them a little while to realize that honesty is the best policy, and they shouldn’t make up some story to get out of the situation.”
Sahami also explained that while he is required to report the case to the OCS, he is on a student’s side once he or she admits to violating the Honor Code.
“I think the issue of academic honesty should also be one that’s not just punitive but is really about learning,” Sahami said. “I’m in favor for leniency on the side of students who take responsibility.”
Stepp, a new instructor in the CS department this year, explained that he has not yet dealt with any Honor Code violations, although he has not finished looking through assignments from this quarter.
“I’m lucky enough to say that I have not forwarded any cases so far in my last two quarters here at Stanford,” Stepp said.
“I think that there are always programs that come up in our listings that I feel are suspicious,” he added. “But I feel I have to be very confident in a match in order to make an accusation, and I haven’t done so.”
In terms of repercussions, an Honor Code case does not show up on a student’s transcript, but a notation is kept on administrative record for reference in case of future violations.
According to Zelenski, common forms of punishment include a failing grade in the course, community service, suspension or probation. In repeated or unusual cases, expulsion is a possible, though rare, consequence.
Why it happens: instructor perspectives
Despite Stepp’s violation-free stint on the Farm, Zelenski explained that in her experience teaching CS 107: Computer Organization and Systems, she usually encounters one or two cases for each of the seven assignments.
Zelenski spoke about some of the main causes for Honor Code violations in CS, including the stress created by midnight deadlines and code that does not work. She explained that even if a program is well written, a small bug can cause big problems that feel like failure.
“When you have a program that doesn’t work, it’s usually really obvious and really hard to cope with,” Zelenski said. “The arbiter of correctness is obvious and apparent.”
“Having our deadlines at midnight probably doesn’t help because in some sense that’s maybe when you’re most weak – when you’re tired,” she added.
Furthermore, the ability to copy and paste working code from an online source makes cheating especially easy in CS compared with other subjects.
“It’s ridiculously easy,” Zelenski said. “It can all happen in just the space of a few minutes when [a student is] just starting to feel panicked and desperate.”
In general, the availability of code on the web has become an inevitable problem, especially since that form of plagiarism does not require the consent of two parties. Although Zelenski regularly searches the web for online versions of assignments, she explained that catching every instance is difficult. She spoke about a case from this quarter in which around 10 students in CS 107 used parts of code that had been uploaded to the software hosting site GitHub.
“At the beginning of the quarter and a couple times along the way I just do the obvious searches on all the places that code gets put,” Zelenski said. “If I ever find it and it’s from a student of mine, I send them a very nice note and say I really want this down.”
In February, Zelenski discovered that a former CS 107 student had uploaded the CrashReporter assignment online. She requested the code be taken down, and the student complied, but not before a non-Stanford student forked the repository and copied it to a completely different location. When she later discovered the second version of the code, Zelenski assumed students would not find it.
“You have to be looking for the code in ways [in which you] will find code so I [thought], no one’s going to find it,” Zelenski said. “Well, it turns out I would be wrong about that – that there were a number of students who found it and totally used it.”
When 10 Honor Code cases were revealed by the Moss results, Zelenski tried to ask the non-Stanford student to take down the program. After a lack of response, she eventually worked with the student who originally owned the code to use a Digital Millennium Copyright Act (DMCA) procedure to remove the code from GitHub.
Despite the hassle, Zelenski explained that getting assignments off the web was worth the time. She used the incident as proof that, given the opportunity, some students will plagiarize.
“If you get the code off the web, you just bring down the incident rate,” Zelenski said.
Stepp further emphasized the effects of having code available online, especially for introductory classes. He explained that the problem is compounded by the fact that assignments in CS classes are often kept constant for several years, and new ones are difficult to create.
“It takes more time to produce a high-quality computer science homework assignment than, say, a math assignment,” Stepp said. “On a math assignment, you can produce some new problems by changing some of the numbers and some of the variables, and you have a new equation to solve.”
Stepp also discussed the fact that students who took the class in the past are more likely to share their solutions than students currently taking it.
“I think students who took a class a long time ago are more cavalier about handing out their solution,” Stepp said. “But if they just wrote it and someone else didn’t write it and they want to take it, I think the friend is less likely to [share the code].”
Zelenski believes that another source of Honor Code violations is students’ belief that they can beat the system. With Moss mostly unadvertised in Honor Code discussions at the beginning of courses, many students do not understand how easy it is for instructors to catch plagiarism.
“I do not think that the computer science students at Stanford are somehow more vulnerable, less committed to integrity than our political science students or [other majors],” Zelenski said. “There’s a certain kind of confluence of motive and opportunity and desperation, and I think there’s also the fact that we can find it.”
Stepp also emphasized that, despite stereotypes in the discipline, collaboration is not the same as plagiarism.
“We do want people to work together and talk to each other – we just don’t want them to share their solutions with each other,” Stepp said. “There’s a line. I think most people know what that line is, and it’s unfortunate that some people cross the line, but I think it’s possible to have a decent amount of collaboration with other students without crossing that line.”
Why it happens: student perspectives
Unsurprisingly, students told a similar story about the effects of stress in contributing to Honor Code violations. Two students, Jeff Taylor and Walter Smith – both names have been changed to protect their identities – spoke about their experience with the ERO system in particular.
Taylor and Smith had been working as a pair on a group CS final project, but when Smith allowed a friend outside the team to look at their code, they received an unexpected email from their instructor after assignments were submitted.
Unknown to either student, the friend had copied part of Taylor and Smith’s code and submitted it as his own. The friend claimed that the submission had been an accident, and the case ended with an ERO. Although neither Taylor nor Smith was charged with an Honor Code violation, Smith’s friend reportedly received a failing grade on the assignment.
When asked about the case, Smith and Taylor shared their thoughts about why students might choose to plagiarize.
“There are very few times in which you break the Honor Code for anything but grades. It’s just pressure,” Smith said. “The deadline is set at 12 a.m., and you’ve been working on it for 10 hours, and you just have one bug left, but you’re helpless, and so you go to someone for help just to get this one bug.”
Taylor spoke about the fine line between getting help from a friend and cheating in writing code.
“There’s a very fuzzy line from getting help and cheating in CS,” Taylor said. “If you need to find a bug, instead of spending three hours doing it, you can go to someone and do it in five minutes. And if you get in the habit of doing that, you will basically have other people writing your code. A lot of it is pressure or procrastination.”
“In the case of CS specifically, the getting help expedites the process so much,” he added. “And it can snowball into cheating.”
Smith observed that international students in particular feel pressure to do well from family at home.
“International students seem to face this [pressure] a little bit more than people from [the United States],” Smith said. “There’s more pressure from back home to do well in academics, to do really well. If you get a B, it’s unacceptable at any level. While here, people realize that it’s Stanford and that getting a B here is not that bad. But back home no one knows what Stanford is or the grading scheme.”
Yet neither student felt that cheating was ever justified, and both believed that plagiarizing without learning the material would only hurt a student in the long run.
“The popular opinion is that it’s morally wrong to cheat because you should learn the material for yourself,” Taylor said. “In other countries, cheating is an option.”
Curbing cheating in the future
With the rising number of online sources for code, instructors have continued to investigate ways to discourage students from plagiarizing. Stepp attributed his lack of Honor Code cases to his slight alterations to the assignments as a new lecturer.
Stepp has also added popups to the class websites to caution students against plagiarism whenever they download or submit an assignment. According to Stepp, research has shown that reminders do in fact affect the probability of a student’s choosing to cheat, especially if the reminder comes before the student starts an assignment.
“It’s really hard to know what factors have an impact and which ones don’t, but there has been educational research that shows if you remind students about these things at key moments that it can make a difference,” Stepp said.
Jessie Duan ’15, a student on the Board of Judicial Affairs, spoke about how the Board has been discussing ways to raise awareness about Honor Code violations. One potential effort is talking to freshmen about plagiarism.
“One thing we have been focusing on is raising awareness,” Duan said. “We’ve been talking about reaching out to freshman dorms more, making it more obvious from the moment you get accepted into Stanford [that] the Fundamental Standard is this very important thing.”
Each computer science lecturer and professor emphasized that the department has gained a bad reputation for Honor Code violations simply because it has the tools to search for plagiarism.
“I think that it is true that computer science departments as a whole tend to do more checking than other departments,” Aiken said. “A lot of what you find depends on whether you’re looking.”
Stepp also wanted to dispel the misconception that students need to work alone on CS assignments.
“There is a stereotype that we don’t want anyone to talk to anyone and that computer science is isolated and that you’re not allowed to have any contact with other people, and you have to figure it out all by yourself with no help whatsoever,” Stepp said. “And I think that’s a little bit unfortunate because it’s not true.”
Contact Kylie Jue at kyliej ‘at’ stanford ‘dot’ edu.