public class WordNGrammer extends Filter
Filter that produces word N-grams. The N-gram length is given by the
Constants.WORDNGRAMS_LENGHT
constant.
The filter treats all types of input tokens equally.
ParagraphPunctFilter
shall be applied to the token sequence before this filter. If it is not, then
the whole document is treated as one sentence and N-grams are produced at document level.
This filter should be the last filter applied.
The filter has an inner context thus it cannot be shared in a filtering chain.

Constructor and Description |
---|
WordNGrammer(Sequence<Token> prev): Constructor for the WordNGrammer object |

Modifier and Type | Method and Description |
---|---|
Token | next(): Return the next token. |

Methods inherited from class Filter: action, getPrevTokenizer, setPrevTokenizer
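
The behaviour described for this class can be illustrated with a minimal standalone sketch. It does not use the Egothor API: the class NGramSketch, the SENTENCE_BREAK marker, the use of plain strings as tokens, and the convention that next() returns null at the end of input are all assumptions made for illustration. The sketch keeps a sliding window as its inner context, which is why one instance should serve only one token stream, and it shows how N-grams degrade to document level when no sentence boundaries arrive from upstream. Sentences shorter than N produce no output here; the real class may handle them differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a word N-gram filter; not the Egothor implementation. */
final class NGramSketch {
    /** Assumed marker for a sentence boundary emitted by an upstream filter. */
    static final String SENTENCE_BREAK = "<S>";

    private final Iterator<String> prev;                     // upstream token source
    private final int n;                                     // N-gram length
    private final Deque<String> window = new ArrayDeque<>(); // inner context

    NGramSketch(Iterator<String> prev, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.prev = prev;
        this.n = n;
    }

    /** Returns the next N-gram, or null once the input is exhausted (assumed convention). */
    String next() {
        while (prev.hasNext()) {
            String token = prev.next();
            if (SENTENCE_BREAK.equals(token)) {
                window.clear(); // N-grams never span a sentence boundary
                continue;
            }
            window.addLast(token);
            if (window.size() > n) {
                window.removeFirst(); // slide the window by one word
            }
            if (window.size() == n) {
                return String.join(" ", window);
            }
        }
        return null;
    }

    /** Convenience helper: drains a fresh filter instance into a list. */
    static List<String> ngrams(List<String> tokens, int n) {
        NGramSketch filter = new NGramSketch(tokens.iterator(), n);
        List<String> out = new ArrayList<>();
        for (String gram = filter.next(); gram != null; gram = filter.next()) {
            out.add(gram);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> withBreaks = Arrays.asList("a", "b", "c", SENTENCE_BREAK, "d", "e");
        List<String> noBreaks = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(ngrams(withBreaks, 2)); // [a b, b c, d e]
        System.out.println(ngrams(noBreaks, 2));   // [a b, b c, c d, d e]
    }
}
```

Without the boundary marker the second call also yields "c d", an N-gram that crosses what would have been a sentence break. Because the window is stored in the instance, each token stream needs its own filter object.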
Copyright © 2016 Egothor. All Rights Reserved.