    | HFST-TOKENIZE(1) | User Commands | HFST-TOKENIZE(1) | 
NAME
hfst-tokenize - perform matching/lookup on text streams
SYNOPSIS
hfst-tokenize [--segment | --xerox | --cg | --giella-cg] [OPTIONS...] RULESET
DESCRIPTION
Perform matching/lookup on text streams.
Common options:
 - -h, --help
 - Print help message
 - -V, --version
 - Print version info
 - -v, --verbose
 - Print verbosely while processing
 - -q, --quiet
 - Only print fatal errors and requested output
 - -s, --silent
 - Alias of --quiet
 - -n, --newline
 - Newline as input separator (default is blank line)
 - -a, --print-all
 - Print nonmatching text
 - -w, --print-weight
 - Print weights
 - -m, --tokenize-multichar
 - Tokenize multicharacter symbols (by default only one UTF-8 character is tokenized at a time, regardless of what is present in the alphabet)
 - -tS, --time-cutoff=S
 - Limit search after having used S seconds per input
 - -lN, --weight-classes=N
 - Output no more than N best weight classes (where analyses with equal weight constitute a class)
 - -u, --unique
 - Remove duplicate analyses
 - -z, --segment
 - Segmenting / tokenization mode (default)
 - -i, --space-separated
 - Tokenization with one sentence per line, space-separated tokens
 - -x, --xerox
 - Xerox output
 - -c, --cg
 - Constraint Grammar output
 - -S, --superblanks
 - Ignore contents of unescaped [] (cf. apertium-destxt); flush on NUL
 - -g, --giella-cg
 - CG format used in Giella infrastructure (implies -l2, treats @PMATCH_INPUT_MARK@ as subreading separator, expects tags to start or end with +, flushes on NUL)
 - -C, --conllu
 - CoNLL-U format
 - -f, --finnpos
 - FinnPos output
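
The -l option counts weight classes rather than individual analyses. The following shell sketch illustrates that idea on made-up data; it is not how hfst-tokenize is implemented. The input format (one analysis and its weight per line, tab-separated), the function name best_classes, and the sample analyses are all hypothetical, and the sketch assumes that a lower weight is better.

```shell
#!/bin/sh
# best_classes N: keep the analyses that fall into the N best weight classes.
# Input (hypothetical format): "analysis<TAB>weight", one per line.
# Assumption: lower weight is better; equal weights form one class.
best_classes() {
    n="$1"
    # Sort numerically on the weight column, then count distinct weights.
    sort -t "$(printf '\t')" -k2,2n | awk -F '\t' -v n="$n" '
        NR == 1 || $2 != prev { classes++; prev = $2 }  # a new weight opens a new class
        classes > n { exit }                            # stop after N classes
        { print }
    '
}

# Four made-up analyses in three weight classes (0.0, 2.5, 7.0).
printf 'dog+N\t0.0\ndog+V\t2.5\ndogs+N\t0.0\ndoge+N\t7.0\n' | best_classes 2
```

With N set to 2, both analyses of weight 0.0 are kept because they share one class, the analysis of weight 2.5 forms the second class, and the analysis of weight 7.0 is dropped.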
 
Input is read from standard input and output is written to standard output (for now).
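
Because the program works on standard streams, it fits naturally into a pipeline. The following is a minimal sketch of that usage, not an official example: the ruleset file name tokeniser.pmhfst, the function name tokenize_cg, and the sample sentence are hypothetical, and only flags listed above are used. If the binary or the ruleset is not available, the function prints the command it would have run instead.

```shell
#!/bin/sh
# Hypothetical ruleset path; replace with a real compiled ruleset.
RULESET="${RULESET:-tokeniser.pmhfst}"

# tokenize_cg: read text on stdin, write Constraint Grammar output on stdout.
tokenize_cg() {
    if command -v hfst-tokenize >/dev/null 2>&1 && [ -r "$RULESET" ]; then
        hfst-tokenize --cg "$RULESET"
    else
        # Dry run: discard the input and show the command line instead.
        cat >/dev/null
        echo "would run: hfst-tokenize --cg $RULESET"
    fi
}

printf 'This is a test.\n' | tokenize_cg
```

The same pattern applies to the other output modes in the synopsis (--segment, --xerox, --giella-cg) by swapping the flag.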
REPORTING BUGS
Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at: <https://github.com/hfst/hfst/issues>
hfst-tokenize home page:
    <https://kitwiki.csc.fi/twiki/bin/view/KitWiki//HfstTokenize>
  
  General help using HFST software:
    <https://kitwiki.csc.fi/twiki/bin/view/KitWiki//HfstHome>
COPYRIGHT
Copyright © 2017 University of Helsinki, License GPLv3: GNU GPL version 3 <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
| March 2017 | HFST |