Stemming
Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form.
StarTeam search uses a Porter stemmer for default word analysis and indexing. The value of the current JVM language variable determines which stemmer is applied:
Language | Stemmer |
"en" | English Stemmer |
"fr" | French Stemmer |
"pt" | Portuguese Stemmer |
"de" | German Stemmer |
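The language-to-stemmer mapping above can be pictured as a simple lookup. The following is a minimal sketch, not StarTeam's actual implementation: the class name StemmerSelection and the method stemmerFor are hypothetical, and it assumes that the "JVM language variable" corresponds to the language of the JVM's default locale. The source does not say what happens for a language that is not listed, so the sketch returns an empty result in that case.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class StemmerSelection {
    // Language codes and stemmer names copied from the table above.
    private static final Map<String, String> STEMMERS = Map.of(
        "en", "English Stemmer",
        "fr", "French Stemmer",
        "pt", "Portuguese Stemmer",
        "de", "German Stemmer");

    // Hypothetical helper: returns the stemmer name for a language code,
    // or an empty Optional when the table has no entry for that code.
    public static Optional<String> stemmerFor(String language) {
        return Optional.ofNullable(STEMMERS.get(language));
    }

    public static void main(String[] args) {
        // Assumption: the JVM language variable is the default locale's language.
        String language = Locale.getDefault().getLanguage();
        System.out.println(language + " -> "
            + stemmerFor(language).orElse("(not listed in the table)"));
    }
}
```

Running the JVM with a different language setting (for example, java -Duser.language=fr StemmerSelection) changes which entry the lookup returns.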
Chinese and Japanese locales
For Chinese and Japanese locales, StarTeam search uses Lucene's CJKAnalyzer by default. To configure a different analyzer, edit starteam-search-configs.xml. For example, to use Lucene's SmartChineseAnalyzer, an analyzer for simplified Chinese or mixed Chinese-English text, make the following change:
<Analyzers>
  <Analyzer name="zh" value="org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer"/>
</Analyzers>
The string org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer is the fully qualified name of a class in the Lucene library.
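To make the structure of the Analyzers element concrete, the sketch below reads such a fragment with the JDK's built-in XML parser and collects each Analyzer entry as a pair of language name and analyzer class name. It is illustrative only: the class AnalyzerConfigReader and the method readAnalyzers are hypothetical names, and how StarTeam itself reads starteam-search-configs.xml is not described here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class AnalyzerConfigReader {
    // Hypothetical helper: parses an <Analyzers> fragment and returns a map
    // from each Analyzer's "name" attribute to its "value" attribute.
    public static Map<String, String> readAnalyzers(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("Analyzer");
        Map<String, String> analyzers = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element analyzer = (Element) nodes.item(i);
            analyzers.put(analyzer.getAttribute("name"), analyzer.getAttribute("value"));
        }
        return analyzers;
    }

    public static void main(String[] args) throws Exception {
        // The fragment shown in the example above.
        String xml = "<Analyzers><Analyzer name=\"zh\" "
            + "value=\"org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer\"/></Analyzers>";
        System.out.println(readAnalyzers(xml));
    }
}
```

The sketch only extracts the class name as text. Instantiating the analyzer would additionally require the Lucene library that provides the class to be on the classpath.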