A way of detecting software plagiarism which uses spaces
and tabs instead of code comparisons has found 36% of computing
students at Dublin City University copying programs from fellow
undergraduates.
The method, described in the BCS academic publication The Computer
Journal, involves inserting into a program a 34-bit field including
a student identity, the student's year of entry to the university,
the assignment identity and other information.
The data is coded using a space to represent zero and a tab to
represent one.
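
As a rough illustration of the encoding, the following Python sketch packs
a set of fields into 34 bits and writes them out as spaces and tabs. The
field names and widths (16, 6, 8 and 4 bits) are assumptions made for the
example; the article gives only the 34-bit total and the kinds of
information stored, not the layout the researchers used.

```python
# Hypothetical field layout: the article states only the 34-bit total.
FIELDS = (("student_id", 16), ("entry_year", 6), ("assignment", 8), ("other", 4))
TOTAL_BITS = sum(width for _, width in FIELDS)  # 34


def encode_watermark(values):
    """Pack the field values into a run of spaces (0) and tabs (1)."""
    bits = ""
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        bits += format(value, f"0{width}b")  # zero-padded binary string
    return bits.replace("0", " ").replace("1", "\t")


def decode_watermark(whitespace):
    """Recover the field values from a 34-character run of spaces and tabs."""
    if len(whitespace) != TOTAL_BITS or set(whitespace) - {" ", "\t"}:
        raise ValueError("not a valid watermark")
    bits = whitespace.replace(" ", "0").replace("\t", "1")
    values, pos = {}, 0
    for name, width in FIELDS:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values
```

Appending the string returned by encode_watermark to the end of a line of
source code leaves the program's behaviour unchanged in most languages,
since trailing white space is normally ignored.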
"The data appears as blank space, and is unlikely to be changed
if the program is modified," said researchers Charlie Daly and Jane
Horgan of the School of Computing at Dublin City University.
"In addition, generally, text editors do not show excess white
space at the end of a line. This fingerprint, or watermark, can be
used to ascertain who originally submitted the program and who
submitted a copy.
"Existing source code comparison methods are reasonably
successful at grouping similar submissions but they have no way of
distinguishing between the author and the copier."
The researchers added, "This method works even when the program
has been extensively modified, as long as the watermark remains
undisturbed.
"It can detect copying in very short programs. With such
programs students may come up with similar solutions by chance.
Source code comparison systems cannot distinguish chance
similarities from cases of copying.
"The method requires no manual intervention. By contrast, code
comparison techniques merely highlight suspicious cases, which then
need to be examined to determine whether plagiarism has occurred.
Naturally, this is subjective, as well as tedious and time
consuming."
The method only works if the copier makes an electronic copy
rather than keying in the program again, because a retyped program
will not carry the invisible watermark.
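
The check for who wrote a program and who copied it might look something
like the Python sketch below. It assumes that the watermark sits in the
trailing white space of a line, that the student identity occupies the
first 16 of the 34 bits, and that each student's own files carry their own
watermark. None of these details is confirmed by the article, and the
function names are invented for the example.

```python
import re

MARK_BITS = 34  # total width reported in the article
ID_BITS = 16    # assumption: the student ID occupies the leading bits

# A run of exactly MARK_BITS spaces/tabs at the very end of a line.
# Longer runs of trailing white space match on their last 34 characters.
TRAILING = re.compile(r"[ \t]{%d}$" % MARK_BITS)


def watermark_owner(source):
    """Return the student ID from the first watermark found, or None."""
    for line in source.splitlines():
        match = TRAILING.search(line)
        if match:
            bits = match.group().replace(" ", "0").replace("\t", "1")
            return int(bits[:ID_BITS], 2)
    return None


def classify(source, submitter_id):
    """Compare the watermark's owner with the student who submitted the file."""
    owner = watermark_owner(source)
    if owner is None:
        return "no watermark"  # e.g. the program was retyped by hand
    if owner == submitter_id:
        return "original"
    return f"copied from student {owner}"
```

A submission whose watermark names a different student is flagged as a
copy, and the watermark itself identifies the supplier. A program that was
typed in again comes back as "no watermark", which is the limitation
described above.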
The method was tested on 46 programming exercises completed by
283 students in the first year of a computing degree course at
Dublin City University.
It found that 101 students (36%) had copied at least one of the
exercises. Most copied one or two, but some were more active,
including one student who copied 19 of the 46 exercises. In
addition, 48 students (17%) allowed others to copy their
programs.
The copying increased during the year. Daly and Horgan said this
might be down to students leaving the work to the last minute and
therefore becoming more tempted to copy.
"The results are of additional interest in view of warnings
about plagiarism given to students at the beginning of the course,"
they said.
The research found that crime does not pay: the copiers got an
average mark of 37%. The people they copied from got an average of
58%, and those who were not involved in any way got 51%.
The difference in marks between the last two groups is not
surprising, according to Daly and Horgan. "Copiers may be dishonest
but they are not stupid, and will choose their suppliers
carefully," they said.