userdict_ja.txt 1.3 KB

#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Note that entries do
# not have weights since they are always used when found. This is by design,
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Note that a single half-width space separates tokens and readings, and
# that the number of tokens and the number of readings must match exactly.
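#
# As an illustration of the format, the first sample entry further down this file,
#   日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
# breaks down as follows:
#   <text>                日本経済新聞
#   <token 1> ... <token 3>      日本 経済 新聞
#   <reading 1> ... <reading 3>  ニホン ケイザイ シンブン
#   <part-of-speech tag>  カスタム名詞
# Three tokens are paired with three readings, so the counts match.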
#
# Also note that the behavior of multiple entries with the same <text> is undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
#
# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞
# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞
# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名