A method of detecting software plagiarism that relies on spaces and tabs rather than code comparison has found that 36% of computing students at Dublin City University copied programs from fellow undergraduates.
The method, described in the BCS academic publication The Computer Journal, involves inserting into a program a 34-bit field that includes a student identity, the student's year of entry to the university, the assignment identity and other information.
The data is coded using a space to represent zero and a tab to represent one.
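As a rough illustration of this encoding, the following Python sketch packs a 34-bit field into spaces and tabs and appends it to a line as trailing whitespace. The article gives only the total width and part of the contents, so the field widths (17, 7 and 10 bits), the omission of the "other information", and the names encode_watermark, decode_watermark and stamp are all assumptions made for the example, not the researchers' implementation.

```python
# Hypothetical field layout: the article states only the 34-bit total and
# some of the contents, so these individual widths are assumptions.
FIELDS = [("student_id", 17), ("entry_year", 7), ("assignment_id", 10)]
TOTAL_BITS = sum(width for _, width in FIELDS)  # 17 + 7 + 10 = 34


def encode_watermark(values):
    """Pack the field values into bits, then map 0 -> space and 1 -> tab."""
    bits = ""
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        bits += format(value, f"0{width}b")
    return bits.replace("0", " ").replace("1", "\t")


def decode_watermark(whitespace):
    """Invert encode_watermark: read spaces and tabs back into field values."""
    if len(whitespace) != TOTAL_BITS or set(whitespace) - {" ", "\t"}:
        raise ValueError("not a valid watermark")
    bits = whitespace.replace(" ", "0").replace("\t", "1")
    values, pos = {}, 0
    for name, width in FIELDS:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values


def stamp(source_line, values):
    """Append the watermark as trailing whitespace on one line of a program."""
    return source_line.rstrip() + encode_watermark(values)
```

Under these assumptions, stamp("int main() {", {"student_id": 54321, "entry_year": 3, "assignment_id": 12}) returns the same visible text followed by 34 invisible characters, and decode_watermark applied to those 34 characters recovers the three values.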
"The data appears as blank space, and is unlikely to be changed if the program is modified," said researchers Charlie Daly and Jane Horgan of the School of Computing at Dublin City University.
"In addition, generally, text editors do not show excess white space at the end of a line. This fingerprint, or watermark, can be used to ascertain who originally submitted the program and who submitted a copy.
"Existing source code comparison methods are reasonably successful at grouping similar submissions but they have no way of distinguishing between the author and the copier."
The researchers added, "This method works even when the program has been extensively modified, as long as the watermark remains undisturbed.
"It can detect copying in very short programs. With such programs students may come up with similar solutions by chance. Source code comparison systems cannot distinguish chance similarities from cases of copying.
"The method requires no manual intervention. By contrast, code comparison techniques merely highlight suspicious cases, which then need to be examined to determine whether plagiarism has occurred. Naturally, this is subjective, as well as tedious and time consuming."
The method works only if the copier makes an electronic copy rather than keying in the program again, because retyping the code does not reproduce the invisible watermark.
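The check that separates author from copier might look like the following Python sketch. It assumes that each program carries the watermark of the student who first submitted it, that the watermark sits as exactly 34 trailing spaces and tabs on some line, and that the student identity occupies the first 17 bits; the layout and the names find_watermark and classify are hypothetical, chosen only for illustration.

```python
TOTAL_BITS = 34       # total width given in the article
STUDENT_ID_BITS = 17  # assumed: the student identity fills the first 17 bits


def find_watermark(source):
    """Return the first run of exactly 34 trailing spaces/tabs, or None."""
    for line in source.splitlines():
        tail = line[len(line.rstrip(" \t")):]
        if len(tail) == TOTAL_BITS:
            return tail
    return None


def classify(submitter_id, source):
    """Compare the watermarked student identity with the submitter's."""
    tail = find_watermark(source)
    if tail is None:
        # Retyped programs end up here: the watermark was never keyed in.
        return ("unmarked", None)
    bits = tail.replace(" ", "0").replace("\t", "1")
    owner = int(bits[:STUDENT_ID_BITS], 2)
    if owner == submitter_id:
        return ("original", owner)
    return ("copy", owner)
```

Because the check reads only the trailing whitespace, renaming variables or reordering code elsewhere in the file does not change the result, provided the watermarked line ending survives. A retyped program simply comes back as unmarked, which matches the limitation noted above.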
The method was tested on 46 programming exercises completed by 283 students in the first year of a computing degree course at Dublin City University.
It found that 101 students (36%) had copied at least one of the exercises. Most copied one or two, but some were more active, including one student who copied 19 of the 46 exercises. In addition, 48 students (17%) allowed others to copy their programs.
The copying increased during the year. Daly and Horgan said this might be down to students leaving the work to the last minute and therefore becoming more tempted to copy.
"The results are of additional interest in view of warnings about plagiarism given to students at the beginning of the course," they said.
The research found that crime does not pay: the copiers got an average mark of 37%. The people they copied from got an average of 58%, and those who were not involved in any way got 51%.
The difference in marks between the last two groups is not surprising, according to Daly and Horgan. "Copiers may be dishonest but they are not stupid, and will choose their suppliers carefully," they said.