Christopher D. Paice

Christopher D Paice was one of the pioneers of research into stemming. The Paice-Husk stemmer was published in 1990, and his method of evaluating stemmer performance by means of the Error Rate Relative to Truncation (ERRT) was the first direct method of comparing under-stemming and over-stemming errors. Apart from his pioneering work on stemming algorithms and evaluation methods, he made other research contributions in the areas of information retrieval, anaphora resolution and automatic abstracting.


Teaching career

Christopher D Paice was a member of the School of Computing and Communications (SCC) at Lancaster University, United Kingdom, for around forty years, initially joining the then Department of Computer Studies as a Research Associate in 1969-70, then moving on to a Lectureship. He was acting Head of Department in 1977-78 and Head of Department in 1979-82, and he retired in 2009.


The Paice-Husk Stemming Algorithm

The Paice-Husk stemmer was developed by Chris D Paice, with the assistance of Gareth Husk, in the Computing Department at Lancaster University in the late 1980s. It features an externally stored set of stemming rules, and this flexibility compared with the Porter stemmer made it of interest to several researchers. Originally implemented in the Pascal programming language, it has since been reimplemented in ANSI C and Java. A Perl version was implemented by Mary Taffet at the Center for Natural Language Processing at Syracuse University, USA.

The stemmer consists of a stemming algorithm and a separate set of stemming rules. The standard set of rules provides a 'strong' stemmer. Stemmer strength is advantageous for index compression; however, a strong stemmer produces a larger number of overstemming errors relative to the number of understemming errors. Users who need a lighter stemmer can easily develop their own set of rules. The stemmer is iterative (i.e. endings are removed piecemeal in an indefinite number of stages), and the rules may specify either the removal or the replacement of an ending. The replacement technique avoids the need for a separate stage in the process to recode or provide partial matching, which helps maintain the efficiency of the algorithm. The rules are indexed by the last letter of the ending to allow efficient searching.
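The general shape of such a stemmer (a rule table kept separate from the algorithm, rules indexed by the last letter of the ending, and repeated removal or replacement of endings) can be illustrated with a minimal Python sketch. The rule table below is a hypothetical toy set invented for illustration, not the published rule set, and the `Rule` fields, the `acceptable` guard and the function names are likewise assumptions rather than details taken from the original implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    ending: str   # ending that must match the end of the word
    remove: int   # number of trailing letters to delete
    append: str   # replacement text ("" means plain removal)
    repeat: bool  # True: keep stemming the result; False: stop


# Hypothetical toy rules for illustration only (NOT the published rule set).
RULES = [
    Rule("ies", 3, "y", False),   # replacement: "ponies" -> "pony"
    Rule("ness", 4, "", True),
    Rule("ful", 3, "", True),
    Rule("ing", 3, "", True),
    Rule("s", 1, "", True),
]

# Index the rules by the last letter of the ending, so that only the
# rules that could possibly match the current word are scanned.
RULE_INDEX = defaultdict(list)
for rule in RULES:
    RULE_INDEX[rule.ending[-1]].append(rule)


def acceptable(candidate: str) -> bool:
    """Assumed guard: keep at least three letters, one of them a vowel."""
    return len(candidate) >= 3 and any(c in "aeiouy" for c in candidate)


def stem(word: str) -> str:
    """Strip or replace endings iteratively until no rule applies."""
    word = word.lower()
    while True:
        for rule in RULE_INDEX.get(word[-1:], []):
            if not word.endswith(rule.ending):
                continue
            candidate = word[: len(word) - rule.remove] + rule.append
            if not acceptable(candidate):
                continue
            word = candidate
            if rule.repeat:
                break        # restart the rule search on the shorter word
            return word      # rule marked as final
        else:
            return word      # no rule applied in this pass


if __name__ == "__main__":
    for w in ["hopefulness", "ponies", "sings", "walking"]:
        print(w, "->", stem(w))
```

Because the rules live in a plain data table, a lighter stemmer in this sketch amounts to supplying a shorter or more conservative `RULES` list without touching `stem` itself.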


Stemmer Evaluation

Apart from the stemmer itself, Chris Paice developed a method for directly measuring the performance of stemmers. Grouped lists of words are applied to the stemmer, the numbers of overstemming and understemming errors are counted, and the results are compared with those that would have been obtained by using a set of truncation stemmers. The final measure is the Error Rate Relative to Truncation (ERRT). Paice, C.D. (1996) Method for Evaluation of Stemming Algorithms based on Error Counting, JASIS, 47(8): 632-649
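The error-counting step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published procedure: it treats an understemming error as a pair of words from the same group that receive different stems, and an overstemming error as a pair of words from different groups that receive the same stem. It does not reproduce the normalisation of these counts or the final ERRT calculation. The word groups and the names `count_errors` and `truncate` are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Tuple


def count_errors(groups: List[List[str]],
                 stem: Callable[[str], str]) -> Tuple[int, int]:
    """Return (understemming, overstemming) pair counts for a stemmer."""
    # Label every word with its stem and the index of its group.
    labelled = [(stem(word), gid)
                for gid, words in enumerate(groups)
                for word in words]
    under = over = 0
    for (s1, g1), (s2, g2) in combinations(labelled, 2):
        if g1 == g2 and s1 != s2:
            under += 1   # same group, but the stems differ
        elif g1 != g2 and s1 == s2:
            over += 1    # different groups, but the stems coincide
    return under, over


def truncate(n: int) -> Callable[[str], str]:
    """A truncation stemmer: keep only the first n letters of a word."""
    return lambda word: word[:n]


# Hypothetical grouped word list for illustration.
GROUPS = [
    ["connect", "connected", "connection"],
    ["conned", "conning"],
]

if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, count_errors(GROUPS, truncate(n)))
```

Running the truncation stemmers at several lengths shows the trade-off the method is built around: short truncation merges unrelated groups (overstemming), while long truncation leaves related words apart (understemming).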


Personal life

Christopher D Paice was born in 1941. He married Kathleen F Moss in 1965 in the Manchester registration district. In 2015 he was diagnosed with an aggressive brain tumour; around the same time, he and his wife moved away from Cumbria to Stratford. He died on 21 April 2016.



