About Thomas Lancaster

I am an experienced Computer Science academic, best known for research work into academic integrity, plagiarism and contract cheating. I have held leadership positions in several universities, with specialty in student recruitment and keen interest in working in partnership with students. Please browse around the blog and the links, and feel free to leave your thoughts.
Website: http://thomaslancaster.co.uk
Thomas Lancaster has written 136 articles so far, you can find them below.


Emerging Issues In Plagiarism Prevention And Detection – My View From 2004

Writing About Plagiarism In 2004

My hard disk contains quite an archive of material I’ve prepared, but which has never seen the light of day. Some of it is good, some of it deserves to be formally completed, some of it I could never quite work into a shape that I was happy with at the time.

I want to share with you some extracts from a partial paper I wrote in early 2004. All the quoted text is presented, unedited, just as I left it in the draft over 13 years ago. Had this paper been completed in a form I was happy with, the choice of words would likely have gone through further fine tuning.

To put these extracts into their historical context as part of my research journey, I completed my PhD on plagiarism detection in 2003. Later that year I began working as a Lecturer in Computing at the University of Central England (now known as Birmingham City University).

My working title for the paper was “Fresh Issues in Plagiarism Prevention and Detection” and the paper was constructed to:

discuss the issues that will be relevant to plagiarism prevention and detection in the near future

as well as to:

inform the directions in which it is necessary for future research to proceed

The planned paper ended up taking a back seat to the pressures of adapting to a new teaching-intensive environment. My subsequent research efforts ended up going in a different direction.

In hindsight, perhaps this paper did deserve to have been completed. My experience is that this type of paper tends to be well-received.

An earlier paper of mine, Plagiarism Issues In Higher Education, which I wrote alongside my PhD, is one of my most cited papers. That is despite this being one of the first papers I wrote – and also one of the easiest to write. I presume that its introductory nature made the paper accessible to a wide audience.

Now, I tend to publish material of this type as blog posts. Perhaps not the best strategy if the results are also suitable for citation…

Eight Plagiarism Issues

The draft I wrote in 2004 included updated ideas from my PhD thesis together with observations I’d made during the intervening year.

Here’s what I said in the draft paper…

Eight main issues have been identified that are worthy of further investigation. These include both issues of academic and practical interest.

The issues are:
 
outsourced submissions – has work submitted by a student actually been produced by that student?
 
ownership of work – is it both legal and ethical to submit work from students to detection services.
 
tool usability – there are many technical solutions available to find out if work is similar to another source, but are those tools produced to ease tutor workload?
 
extent of cheating – conflicting evidence exists stating how common cheating is, can parity exist between different subjects and different researchers?
 
policy – how far is reducing the level of plagiarism and the methods to deal with plagiarisers related to appropriate from the upper echelon of an academic institute?
 
earlier exposure – are students plagiarising due to practices accepted in further education being condoned in higher education and, if so, can what are the solutions?
 
transparency – how far can students see that a due-process is being followed for plagiarism prevention and detection?
 
open source detection – does an institution committing themselves to commercial detection technology hinder them in long term planning?

 
Although my writing style has developed since and I would likely use more supportive language, many of these issues are still equally relevant today.

The issue listed above as outsourced submissions has, of course, been developed much further through research into contract cheating. The particular example given, verifying that the author of an assessment solution and the student submitting it for academic credit are the same person, has still not been solved for anything other than very specific cases.

User Experience

I’d like to pick up on one of the 2004 issues as worthy of more immediate attention.

Whether or not software tools for plagiarism detection are optimised for user experience continues to be questionable. The fact that similarity reports are often misinterpreted, and that users cannot always differentiate between similarity and plagiarism, suggests that they are not.

Much valuable progress has been made since 2004 on working with students as academic integrity partners. That includes supporting students in developing their academic writing by providing them with controlled access to appropriate software tools, such as those that show similarity. I have seen far too many tweets where students are boasting about getting their similarity (plagiarism) score down to an unrealistic level.

Improving the usability of support tools, for instance by making the results more readable and the practical steps to take more intuitive, is now important for students too.

The user interfaces for originality checking software tools do not seem to have evolved, in any real sense, since the first commercial providers came onto the market. There is an opportunity for thought leadership here.

One of the major challenges for academics investigating possible non-originality is taking the output from a tool and converting this into a format considered acceptable for a university academic misconduct panel. Often, panels still require information in a printed format and I know of academics who have had to spend many hours laboriously marking sources up by hand. This is an area which is ripe for improving the user experience.

There is certainly the opportunity for a PhD to look at redefining these user interfaces. If you would be interested in working on that area, under my supervision, please contact me and let me know.

I also believe that the potential exists for artificial intelligence techniques to be used to provide personalised help for a user accessing a similarity report. Such AI could be used to consider whether or not similarity is likely to represent plagiarism and where in the document a user should focus their priority (whether this is a student learning academic writing who has forgotten to cite their source, or a tutor investigating possible plagiarism).

Plagiarism Prevention And Detection Issues Of 2017 And 2018

What are the main issues that exist for individuals researching plagiarism prevention and plagiarism detection today? Is it appropriate to consider issues previously identified, such as my ideas from 2004, during the production of a more up to date list?

Which of the many issues then most deserve to be quickly addressed?

Do feel free to share your thoughts using the Comment box at the end of the post.

(and now that I have extracted some value from them, the rest of my outdated materials from “Fresh Issues in Plagiarism Prevention and Detection” can safely be moved to the Recycle Bin)

Beyond Contract Cheating – Towards Academic Integrity

I explored the issue of contract cheating and the growing movement towards academic integrity as part of a teaching and research seminar at the University of St. Andrews.

You can see the slides used in the presentation on my SlideShare account. They are also embedded below.

This was one of the most respectful audiences that I’ve ever presented for, with an extended period of questions and answers afterwards. I was also pleased to see students in the audience.

From the discussion, it was clear that many in the audience hadn’t really considered that contract cheating happened. I hope the discussion we had about assessment design techniques was useful. I don’t think that it’s ever possible to completely design out cheating from assessments, but it is possible to make this a strategy that students wouldn’t consider effective to use.

Contract Cheating – The Threat To Academic Integrity And Recommendations To Address Essay Mill Use – Video

Here is a short video introduction to why contract cheating is a problem (it only lasts 1 minute and 39 seconds).

The video uses some of the recommendations from the Quality Assurance Agency (QAA) report on contract cheating, released in October 2017. I was part of the team steering the report and have been speaking about it in media interviews. It’s great to see the national push asking universities to address this form of academic misconduct.

If you find the video useful, feel free to go ahead and share it. The direct link to the YouTube page is here.

What I don’t do in the video is define contract cheating or go into a lot of detail about it. I deliberately wanted to keep this one short and shareable.

The video looks at why contract cheating is an issue, gives some recent numbers about its extent (the source in the video says that 7% of students have contract cheated at least once) and looks at solutions, particularly the movement to work with students and promote academic integrity.

If you prefer to read, or want more information, a longer version of the same contract cheating story is on my LinkedIn blog.

Cutting The Costs Of Open Access Research

Is it feasible to run a high quality open access journal with operating costs of just $6 USD (£4.50 GBP) per paper?

Other open access journals often charge upwards of $500 USD to get a paper reviewed and published, but $6 USD per paper is the model that has been proposed by Kyle Niemeyer.

$6 USD Per Paper?

I came across an interesting presentation from Kyle given at SciPy 2017 and also documented in a more traditional paper where he discussed the design and development of the Journal of Open Source Software (JOSS). This particular journal is used to archive software packages and largely exists within the $6 USD per paper cost range, although there’s no reason that a similar technique wouldn’t work for more traditional papers.

Kyle calculates the $6 USD per paper figure, which depends on the journal publishing 100 articles per year, as follows:

  • Crossref membership (needed for DOIs and journal indexing) = $275 USD per year + $1 USD per paper = $375 per year
  • Web hosting using Heroku = $19 USD per month = $228 per year

Total = $603 USD per year

(or $6.03 USD per paper)
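
The arithmetic above can be checked with a short sketch. The function and parameter names are illustrative labels for the figures quoted above, not anything taken from JOSS itself, and the defaults assume the same 100 papers per year.

```python
def cost_per_paper(papers_per_year=100,
                   crossref_annual_fee=275.0,
                   crossref_fee_per_paper=1.0,
                   hosting_per_month=19.0):
    """Return (total annual cost, cost per paper) in USD.

    All default values are the figures quoted in the post; the names
    are hypothetical and chosen only for this example.
    """
    # Crossref: flat membership fee plus a per-paper DOI charge
    crossref = crossref_annual_fee + crossref_fee_per_paper * papers_per_year
    # Hosting: monthly fee over twelve months
    hosting = hosting_per_month * 12
    total = crossref + hosting
    return total, total / papers_per_year


total, per_paper = cost_per_paper()
print(f"Total = ${total:.0f} USD per year, ${per_paper:.2f} USD per paper")
```

Because most of the cost is fixed, the per-paper figure is sensitive to volume. Under the same assumptions, calling cost_per_paper(papers_per_year=50) gives roughly $11 USD per paper.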

The system appears to be largely dependent on GitHub.

As expected, many attendees at the open source conference where JOSS was discussed expressed positive views of the idea.

Subsequent discussion has, however, noted that some sacrifices are needed to reach the $6 USD per paper cost.

For instance, this relies heavily on volunteer labour, including from those people developing the software to “run” the journal in the first place. A lot of free work is put in by reviewers and editors, although that’s true of many open source journals. There may also be issues with creating redundancy in the system, which is something that’s important for the long-term archiving of academic papers.

At present, charges aren’t made directly to authors. The journal is relying on funding that has been put into it to cover the running costs. For this to be more sustainable in the longer-term, consideration would need to be given to funding, including all of the legal entity issues that come with handling money and the need to guarantee service.

Alternative Approaches

There may also be ways to cut the costs still further. Martin Paul Eve suggests that CrossRef membership with 50 DOIs included could be possible for €75 per year (£66 GBP, $90 USD). He also recommends the use of the CLOCKSS archival service at $200 USD per year, which may solve the issue of needing reliable long-term service and archiving. He also suggests the use of Open Journal Systems, which could remove some of the technical complexity.

One idea that I’d like to see explored further would be more use of peer-to-peer hosting to archive academic papers. (Legal) torrent style services could be used which would also introduce some further redundancy into the system.

There could well be an interesting student project looking at putting these different approaches together in a way that is both cost-effective and allows for a new open access journal to be set up with the minimum possible technical complexity.

Taking all of these issues into account, it would be challenging for a journal to maintain a $6 USD per paper price point. But it should be possible to substantially cut the costs of open access publishing from the figures that researchers are charged by many journals today.

Plagiarism and Assessment

I regularly discuss issues relating to the assessment of student work when I give presentations on plagiarism, contract cheating and academic misconduct. Since good assessment design is essential to engage students and reduce the potential for cheating, I would find it very difficult to talk about plagiarism and not incorporate assessment into the mix.

That approach does not seem to be universal. Some work on plagiarism does incorporate assessment. However, work on assessment does not seem to incorporate plagiarism as regularly.

 

Academic Papers Referring To Plagiarism And Assessment

The table below shows the number of matches on Google Scholar for the search terms assessment, plagiarism and assessment plagiarism. Patents and citations are excluded, so these searches generally map to academic publications.

Search term             all         since 2013   since 2016   since 2017
assessment              5,570,000   996,000      371,000      104,000
plagiarism              312,000     31,400       29,100       13,100
assessment plagiarism   61,600      17,700       15,200       5,510

 

The overall figures suggest that 19.7% of papers on plagiarism also talk about assessment. However, only 1.1% of papers on assessment also talk about plagiarism.

This is, however, something of a simplistic measure, as academic papers use the word assessment to refer to subjects other than work with students. Topics cover such areas as the assessment of fish stock data sets, clinical assessments and the assessment of global warming. Looking through the first few pages of results, I’d estimate that around 1 in 10 uses of assessment actually refer to academic assessments.

By the same token, the rough numbers listed for plagiarism and assessment plagiarism are rather crude. Plagiarism is also used in other contexts, for instance when talking about plagiarism in books, in popular culture and as part of research misconduct. Even so, the comparison seems reasonably fair. If only around 1 in 10 of the assessment matches relate to academic assessment, the 1.1% figure rises to around 11%. I believe that it is fair to say that papers relating to plagiarism refer to assessment around twice as often as papers relating to assessment refer to plagiarism (20% compared with 11%).

The good news is that assessment and plagiarism research does seem to have become more closely interlinked.

Making similar assumptions to those above:

  • since 2013, 56% of papers relating to plagiarism refer also to assessment, compared with 18% of papers on assessment referring to plagiarism
  • since 2016, 52% of papers relating to plagiarism refer also to assessment, compared with 41% of papers on assessment referring to plagiarism
  • since 2017, 42% of papers relating to plagiarism refer also to assessment, compared with 53% of papers on assessment referring to plagiarism

(the latter data set is relatively small, as 2017 is still in progress, so I would recommend treating that final result with caution)
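
The percentages above can be recomputed from the table with a short sketch. The counts are copied from the table, and the 1 in 10 share of assessment matches relating to education is the rough estimate described earlier, not a measured value. The dictionary layout and function name are illustrative only, and small differences from the quoted figures can arise from rounding.

```python
# Google Scholar match counts copied from the table above.
COUNTS = {
    "all":        {"assessment": 5_570_000, "plagiarism": 312_000, "both": 61_600},
    "since 2013": {"assessment": 996_000,   "plagiarism": 31_400,  "both": 17_700},
    "since 2016": {"assessment": 371_000,   "plagiarism": 29_100,  "both": 15_200},
    "since 2017": {"assessment": 104_000,   "plagiarism": 13_100,  "both": 5_510},
}

# Assumption from the text: roughly 1 in 10 "assessment" matches are about
# academic assessment. This is an eyeballed estimate, not a measured value.
EDUCATION_SHARE = 0.1


def overlap_percentages(counts, education_share=EDUCATION_SHARE):
    """Return two percentages for one time period:
    (plagiarism papers that mention assessment,
     educational assessment papers that mention plagiarism)."""
    plagiarism_side = 100 * counts["both"] / counts["plagiarism"]
    # Scale the assessment total down to the estimated educational share
    assessment_side = 100 * counts["both"] / (counts["assessment"] * education_share)
    return plagiarism_side, assessment_side


for period, counts in COUNTS.items():
    p, a = overlap_percentages(counts)
    print(f"{period}: {p:.1f}% of plagiarism papers, {a:.1f}% of assessment papers")
```

Changing EDUCATION_SHARE shows how strongly the assessment-side figures depend on that single estimate, which is the main weakness of the comparison.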

The trend to relate these two areas does seem to be one that is moving in the right direction.

 

Academic Paper Titles Referring To Plagiarism And Assessment

To get an alternative measure, I repeated the search on Google Scholar looking for the words plagiarism and assessment in the paper titles.

You can do this using the useful intitle: search term, as below:

Search term             all       since 2013   since 2016   since 2017
assessment              831,000   71,600       73,900       19,000
plagiarism              5,460     1,770        602          224
assessment plagiarism   60        30           8            2

(note that these figures suggest that papers on assessment were withdrawn between 2013 and 2016, but that is likely to be a glitch based on the way that Google estimates the size of large data sets like these – the overall trends still seem reasonable)
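
For anyone wanting to repeat the title searches, the query strings can be put together as in the sketch below. The helper names are hypothetical. The as_ylo URL parameter (earliest publication year) is an assumption about how Google Scholar encodes its "since" filter, and may change without notice.

```python
from urllib.parse import urlencode


def title_query(*words):
    """Prefix each word with intitle: so it must appear in the paper title."""
    return " ".join(f"intitle:{word}" for word in words)


def scholar_url(query, since_year=None):
    """Build a Google Scholar search URL for the query.

    The 'as_ylo' (earliest year) parameter is an assumption about
    Google Scholar's URL format, not a documented API.
    """
    params = {"q": query}
    if since_year is not None:
        params["as_ylo"] = since_year
    return "https://scholar.google.com/scholar?" + urlencode(params)


# The three searches behind each column of the table
for words in (("assessment",), ("plagiarism",), ("assessment", "plagiarism")):
    print(scholar_url(title_query(*words), since_year=2013))
```

The match counts themselves still need to be read from the results page by hand, so the sketch only covers building consistent queries.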

 

A quick verification of the matches suggests that the 10% figure for the proportion of the assessment results relating to education still holds.

The results here are interesting in that, although the indications are that assessment and plagiarism are becoming increasingly mentioned in the same papers, this is not a strong link (it is rare to see both terms mentioned in the paper titles).

Looking across all four time periods, the results are relatively similar:

  • between 0.9% and 1.7% of papers referring to plagiarism in the title also refer to assessment
  • between 0.07% and 0.4% of papers referring to assessment in the title also refer to plagiarism

There are few strong links between plagiarism and assessment in research papers. Where these strong links exist, they are almost always a paper on plagiarism that also incorporates assessment (not the other way around).

With that said, the relatively small number of papers demonstrating that they have closely considered plagiarism and assessment would be ideal material for a focused literature review.

 

Research Flaws and Opportunities

The numbers here are very rough and ready. The approximation of the percentage of assessment papers relating to educational assessment is exactly that (a rough estimate) and may change from year-to-year. But I feel that there is enough here to illustrate general trends.

(there may also be some simple fixes for this – for instance, I wonder what the results would show if the word education was added to every search?)

Google Scholar, by its nature, is not a perfect system. It doesn’t record every paper, nor does it record each one with the same level of detail. Sometimes non-papers slip in too (I noticed a small number of assessment briefs with accompanying plagiarism statements in there).

It would be interesting to look at a corpus of abstracts to more accurately investigate the research links between plagiarism and assessment.

It would be useful to collect the results on a year-by-year basis to investigate trends, rather than rely on the general groups of dates that Google Scholar offers by default.

It would also be useful to examine alternative wording. For instance, is the term academic integrity linked with assessment research?

And, of course, similar techniques could be used to analyse research links between any two terms, even those completely outside of education.

Maybe I shall try some of those areas out when I have more time. Or, if anyone is interested in working with me on some data mining based research, let me know. There is certainly potential here as well to identify good terminology to use in academic paper titles (think search engine optimisation for academic research).

 

Web Pages on Assessment and Plagiarism

Even outside of pure academic research, these are rare.

Google finds only 953 web pages with both assessment and plagiarism in the title.

They are an interesting set of pages, many relating to regulations. Maybe I’ll talk more about that in a future post. The suggested related searches are also telling in many ways.

This can be plagiarism and assessment web page result #954.

 
