List Analyser

 



List Analyser 1.1 is designed for use with Stemming Tester 1.4 to tune stemming rules. It works with plain-text word lists. If the words are arranged in concept groups separated by barriers, following the method described by Chris Paice of Lancaster University*, it also displays the Under Stemming Error count, Over Stemming Error count, Stemmer Weight (SW) and ERRT.

*Chris Paice. 'Method for Evaluation of Stemming Algorithms Based on Error Counting'.
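
To illustrate the kind of error counting involved, here is a minimal Python sketch of the understemming and overstemming indices and the Stemmer Weight as they are commonly stated for Paice's method. It is not List Analyser's own code. The function name paice_indices, the toy 3-character truncation stemmer and the in-memory list-of-lists input (one inner list per concept group, standing in for a barrier-separated file) are all assumptions made for illustration. ERRT is left out because it additionally needs a baseline built from truncation stemmers.

```python
from collections import Counter, defaultdict


def paice_indices(groups, stem):
    """Return (UI, OI, SW) for a list of concept groups and a stem function.

    Hypothetical helper for illustration; not part of List Analyser.
    """
    total_words = sum(len(g) for g in groups)
    gdmt = gumt = gdnt = 0.0
    # stem -> how many words from each concept group produced that stem
    stem_members = defaultdict(Counter)

    for gi, group in enumerate(groups):
        n = len(group)
        stems = Counter(stem(w) for w in group)
        # Desired merges: every pair of words inside the group.
        gdmt += 0.5 * n * (n - 1)
        # Unachieved merges: pairs inside the group that got different stems.
        gumt += 0.5 * sum(u * (n - u) for u in stems.values())
        # Desired non-merges: pairs of one group word and one outside word.
        gdnt += 0.5 * n * (total_words - n)
        for s, count in stems.items():
            stem_members[s][gi] += count

    # Wrong merges: pairs sharing a stem but coming from different groups.
    gwmt = 0.0
    for counts in stem_members.values():
        ns = sum(counts.values())
        gwmt += 0.5 * sum(v * (ns - v) for v in counts.values())

    ui = gumt / gdmt if gdmt else 0.0          # understemming index
    oi = gwmt / gdnt if gdnt else 0.0          # overstemming index
    sw = oi / ui if ui else float("inf")       # stemmer weight
    return ui, oi, sw


def toy_stem(word):
    # Assumed stand-in stemmer: keep the first three characters.
    return word[:3]


# Each inner list is one concept group (assumed sample data).
groups = [
    ["connect", "connected", "connection"],
    ["general", "generally"],
    ["generous", "generously"],
    ["run", "running", "ran"],
]

ui, oi, sw = paice_indices(groups, toy_stem)
print(f"UI={ui:.4f} OI={oi:.4f} SW={sw:.4f}")
```

In this sample the toy stemmer merges "general" with "generous" (an overstemming error) and fails to merge "ran" with "run" (an understemming error), so both indices are non-zero. A real stemmer such as Porter could be passed in place of toy_stem.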

While primarily intended for optimising the dtSearch stemmer in various languages, this product can also be used with other stemmers (e.g. Porter, Paice-Husk).

This software is now open source and available on GitHub:
https://github.com/electronart/WordListAnalyser

See also our blog article:
/the-blog/blog/2018/stemmer-testing.aspx