Julius 4.2
Word dictionary structure definitions
Data Structures | |
struct | WORD_INFO |
Word dictionary structure to hold vocabulary. | |
Macro Definitions | |
#define | MAXWSTEP 4000 |
Memory allocation step in number of words when loading a word dictionary. | |
Functions | |
WORD_INFO * | word_info_new () |
Allocate a new word dictionary structure. | |
void | word_info_free (WORD_INFO *winfo) |
Free all information in the WORD_INFO. | |
void | winfo_init (WORD_INFO *winfo) |
Initialize a new word dictionary structure. | |
boolean | winfo_expand (WORD_INFO *winfo) |
Expand the word dictionary. | |
boolean | init_voca (WORD_INFO *winfo, char *filename, HTK_HMM_INFO *hmminfo, boolean, boolean) |
Load and initialize a word dictionary. | |
boolean | init_wordlist (WORD_INFO *winfo, char *filename, HTK_HMM_INFO *hmminfo, char *headphone, char *tailphone, char *contextphone, boolean force_dict) |
Load and initialize a word list for isolated word recognition. | |
void | voca_set_stats (WORD_INFO *winfo) |
Parse a word dictionary and set the maximum state length per word. | |
void | voca_load_start (WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv) |
Start loading a dictionary. | |
boolean | voca_load_line (char *buf, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo) |
Load a line from buffer and set parameters to the dictionary. | |
boolean | voca_load_end (WORD_INFO *winfo) |
End loading dictionary entries. | |
boolean | voca_load_htkdict (FILE *, WORD_INFO *, HTK_HMM_INFO *, boolean) |
Top function to read word dictionary via file pointer (gzip enabled) | |
boolean | voca_load_htkdict_fp (FILE *, WORD_INFO *, HTK_HMM_INFO *, boolean) |
Top function to read word dictionary via normal file pointer. | |
boolean | voca_append_htkdict (char *entry, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv) |
Append a single entry to the existing word dictionary. | |
boolean | voca_append (WORD_INFO *dstinfo, WORD_INFO *srcinfo, int coffset, int woffset) |
Append one word dictionary to another, for multiple grammar handling. | |
boolean | voca_load_htkdict_line (char *buf, WORD_ID *vnum, int linenum, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean do_conv, boolean *ok_flag) |
Sub-function to add a dictionary entry line to the word dictionary. | |
boolean | voca_load_word_line (char *buf, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, char *headphone, char *tailphone, char *contextphone) |
Load a line from buffer and set parameters to the dictionary. | |
boolean | voca_load_wordlist (FILE *fp, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, char *headphone, char *tailphone, char *contextphone) |
Top function to read word list via text. | |
boolean | voca_load_wordlist_fp (FILE *fp, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, char *headphone, char *tailphone, char *contextphone) |
Top function to read word list via file pointer. | |
boolean | voca_load_wordlist_line (char *buf, WORD_ID *vnum, int linenum, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean do_conv, boolean *ok_flag, char *headphone, char *tailphone, char *contextphone) |
Sub-function to add a dictionary entry line to the word dictionary. | |
boolean | voca_mono2tri (WORD_INFO *winfo, HTK_HMM_INFO *hmminfo) |
Convert whole words in word dictionary to word-internal triphone. | |
WORD_ID | voca_lookup_wid (char *, WORD_INFO *) |
Look up a word on dictionary by string. | |
WORD_ID * | new_str2wordseq (WORD_INFO *, char *, int *) |
Convert string of space-separated word strings to array of word ids. | |
char * | cycle_triphone (char *p) |
Return the triphone name string composed from the last 3 calls. | |
char * | cycle_triphone_flush () |
Flush the triphone buffer and return the last biphone. | |
void | print_voca_info (FILE *fp, WORD_INFO *) |
Output overall word dictionary information to stdout. | |
void | put_voca (FILE *fp, WORD_INFO *winfo, WORD_ID wid) |
Output information of a word in dictionary to stdout. | |
boolean | make_base_phone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo) |
Build basephone information. | |
void | print_phone_info (FILE *fp, HTK_HMM_INFO *hmminfo) |
Output general information concerning phone mapping in HMM definition. | |
void | print_all_basephone_detail (HMM_basephone *base) |
Output all basephone information to stdout. | |
void | print_all_basephone_name (HMM_basephone *base) |
Output all basephone names to stdout. | |
void | test_interword_triphone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo) |
Top function to check if all the possible triphones on given word dictionary actually exist in the logical HMM. |
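Word dictionary structure definitions
This file defines the word dictionary used for recognition. In addition to each word's reading, output string, and phoneme sequence, the word dictionary also holds the sentence-start and sentence-end words and transparent-word information.
Note that the vocabulary appearing in the N-gram is stored in NGRAM_INFO and is distinct from this recognition word dictionary. The mapping from the word dictionary to the N-gram vocabulary is done through wton[] in WORD_INFO. For DFA, wton holds the category number in DFA_INFO to which the word belongs.
Defined in file vocabulary.h.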
WORD_INFO* word_info_new | ( | ) |
Allocate a new word dictionary structure.
Definition at line 35 of file voca_malloc.c.
Referenced by initialize_dict(), multigram_read_file_and_add(), and multigram_update().
void word_info_free | ( | WORD_INFO * | winfo | ) |
Free all information in the WORD_INFO.
winfo | [i/o] word dictionary data to be freed. |
Definition at line 52 of file voca_malloc.c.
Referenced by initialize_dict(), j_process_lm_free(), multigram_exec_delete(), multigram_free_all(), multigram_read_file_and_add(), and multigram_update().
void winfo_init | ( | WORD_INFO * | winfo | ) |
Initialize a new word dictionary structure.
winfo | [i/o] word dictionary to be initialized. |
Definition at line 76 of file voca_malloc.c.
Referenced by multigram_update() and voca_load_start().
boolean winfo_expand | ( | WORD_INFO * | winfo | )
Expand the word dictionary.
winfo | [i/o] word dictionary to be expanded. |
Definition at line 106 of file voca_malloc.c.
Referenced by voca_append(), voca_load_line(), and voca_load_word_line().
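The step-wise growth implied by MAXWSTEP and winfo_expand() can be pictured as follows. This is a minimal sketch, not the Julius implementation: `ToyDict`, `toy_init`, `toy_expand`, `toy_add` and `toy_free` are hypothetical stand-ins for WORD_INFO and its helpers, and only the step size (4000 words) is taken from this page.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXWSTEP 4000  /* allocation step in words, as documented on this page */

/* Hypothetical stand-in for WORD_INFO: it holds word names only. */
typedef struct {
  char **name;   /* word name strings */
  int num;       /* number of words stored */
  int maxnum;    /* number of words allocated */
} ToyDict;

/* Allocate the first step; returns 1 on success, 0 on failure. */
int toy_init(ToyDict *d) {
  assert(d != NULL);
  d->num = 0;
  d->maxnum = MAXWSTEP;
  d->name = (char **)malloc(sizeof(char *) * d->maxnum);
  return d->name != NULL;
}

/* Grow the array by one more step; existing entries are preserved. */
int toy_expand(ToyDict *d) {
  char **p = (char **)realloc(d->name, sizeof(char *) * (d->maxnum + MAXWSTEP));
  if (p == NULL) return 0;
  d->name = p;
  d->maxnum += MAXWSTEP;
  return 1;
}

/* Append a copy of word w, expanding first when the array is full. */
int toy_add(ToyDict *d, const char *w) {
  char *copy;
  if (d->num >= d->maxnum && !toy_expand(d)) return 0;
  copy = (char *)malloc(strlen(w) + 1);
  if (copy == NULL) return 0;
  strcpy(copy, w);
  d->name[d->num++] = copy;
  return 1;
}

/* Release every stored word and the array itself. */
void toy_free(ToyDict *d) {
  int i;
  for (i = 0; i < d->num; i++) free(d->name[i]);
  free(d->name);
  d->name = NULL;
  d->num = d->maxnum = 0;
}
```

Growing in fixed steps keeps the number of reallocations small for large dictionaries while avoiding a hard upper limit on vocabulary size.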
boolean init_voca | ( | WORD_INFO * | winfo, |
char * | filename, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | not_conv_tri, | ||
boolean | force_dict | ||
) |
Load and initialize a word dictionary.
winfo | [out] pointer to a word dictionary data to store the read data |
filename | [in] file name of the word dictionary to read |
hmminfo | [in] HMM definition data, needed for triphone conversion. |
not_conv_tri | [in] TRUE if not converting monophone to triphone. |
force_dict | [in] TRUE to ignore erroneous words in the dictionary |
Definition at line 41 of file init_voca.c.
boolean init_wordlist | ( | WORD_INFO * | winfo, |
char * | filename, | ||
HTK_HMM_INFO * | hmminfo, | ||
char * | headphone, | ||
char * | tailphone, | ||
char * | contextphone, | ||
boolean | force_dict | ||
) |
Load and initialize a word list for isolated word recognition.
winfo | [out] pointer to a word dictionary data to store the read data |
filename | [in] file name of the word dictionary to read |
hmminfo | [in] HMM definition data, needed for triphone conversion. |
headphone | [in] word head silence phone name |
tailphone | [in] word tail silence phone name |
contextphone | [in] silence context name at head and tail phoneme |
force_dict | [in] TRUE to ignore erroneous words in the dictionary |
Definition at line 81 of file init_voca.c.
void voca_set_stats | ( | WORD_INFO * | winfo | ) |
Parse a word dictionary and set the maximum state length per word.
winfo | [i/o] word dictionary |
Definition at line 186 of file voca_load_htkdict.c.
Referenced by voca_append() and voca_load_end().
void voca_load_start | ( | WORD_INFO * | winfo, |
HTK_HMM_INFO * | hmminfo, | ||
boolean | ignore_tri_conv | ||
) |
Start loading a dictionary.
See voca_load_htkdict() for an example of using this function.
winfo | [i/o] dictionary data where the data will be loaded |
hmminfo | [in] phoneme HMM definition |
ignore_tri_conv | [in] if TRUE, skip triphone conversion while loading |
Definition at line 228 of file voca_load_htkdict.c.
Referenced by voca_load_htkdict(), voca_load_htkdict_fp(), voca_load_wordlist(), and voca_load_wordlist_fp().
boolean voca_load_line | ( | char * | buf, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo | ||
) |
Load a line from buffer and set parameters to the dictionary.
See voca_load_htkdict() for an example of using this function.
buf | [in] input buffer containing a word entry |
winfo | [i/o] word dictionary to append the entry |
hmminfo | [in] phoneme HMM definition |
Definition at line 255 of file voca_load_htkdict.c.
Referenced by initialize_dict(), voca_append_htkdict(), voca_load_htkdict(), and voca_load_htkdict_fp().
boolean voca_load_end | ( | WORD_INFO * | winfo | )
End loading dictionary entries.
It computes statistics for the entries read, reports any errors encountered during the last load, and returns a status indicating whether an error occurred while loading.
winfo | [i/o] word dictionary just read by voca_load_line() calls |
Definition at line 284 of file voca_load_htkdict.c.
Referenced by initialize_dict(), voca_append_htkdict(), voca_load_htkdict(), voca_load_htkdict_fp(), voca_load_wordlist(), and voca_load_wordlist_fp().
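voca_load_start(), voca_load_line() and voca_load_end() form a start / per-line / end calling sequence. The sketch below imitates that pattern with hypothetical stand-ins (`ToyLoader`, `toy_load_start`, `toy_load_line`, `toy_load_end`, `toy_load_all`). The entry check used here (a line needs a word name followed by at least one phone) and the return values are assumptions made for illustration, not the actual Julius dictionary parser.

```c
#include <stdio.h>

/* Hypothetical loader state; the real functions keep theirs in WORD_INFO. */
typedef struct {
  int num_ok;     /* entries accepted so far */
  int linenum;    /* lines seen so far */
  int has_error;  /* set once any line is rejected */
} ToyLoader;

/* Start: reset all counters before the first line is fed. */
void toy_load_start(ToyLoader *ld) {
  ld->num_ok = 0;
  ld->linenum = 0;
  ld->has_error = 0;
}

/* Per line: accept "word phone ..." and remember any failure.
   Returns 1 if the line was accepted, 0 otherwise (assumed convention). */
int toy_load_line(ToyLoader *ld, const char *buf) {
  char word[64], phone[64];
  ld->linenum++;
  if (sscanf(buf, "%63s %63s", word, phone) != 2) {
    ld->has_error = 1;
    return 0;
  }
  ld->num_ok++;
  return 1;
}

/* End: report whether the whole load went through without errors. */
int toy_load_end(const ToyLoader *ld) {
  return !ld->has_error;
}

/* Driver: feed n lines through the start / line / end sequence. */
int toy_load_all(ToyLoader *ld, const char *const *lines, int n) {
  int i;
  toy_load_start(ld);
  for (i = 0; i < n; i++) toy_load_line(ld, lines[i]);
  return toy_load_end(ld);
}
```

Deferring the error status to the end call lets a caller see every bad entry in one pass instead of stopping at the first one.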
boolean voca_load_htkdict | ( | FILE * | fp, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | ignore_tri_conv | ||
) |
Top function to read word dictionary via file pointer (gzip enabled)
fp | [in] file pointer |
winfo | [out] pointer to word dictionary to store the read data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
ignore_tri_conv | [in] TRUE if triphone conversion is ignored |
Definition at line 305 of file voca_load_htkdict.c.
Referenced by init_voca().
boolean voca_load_htkdict_fp | ( | FILE * | fp, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | ignore_tri_conv | ||
) |
Top function to read word dictionary via normal file pointer.
fp | [in] file pointer |
winfo | [out] pointer to word dictionary to store the read data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
ignore_tri_conv | [in] TRUE if triphone conversion is ignored |
Definition at line 330 of file voca_load_htkdict.c.
boolean voca_append_htkdict | ( | char * | entry, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | ignore_tri_conv | ||
) |
Append a single entry to the existing word dictionary.
entry | [in] dictionary entry string to be appended. |
winfo | [out] pointer to word dictionary to append the data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
ignore_tri_conv | [in] TRUE if triphone conversion is ignored |
Definition at line 354 of file voca_load_htkdict.c.
Referenced by initialize_dict().
boolean voca_append | ( | WORD_INFO * | dstinfo, |
WORD_INFO * | srcinfo, | ||
int | coffset, | ||
int | woffset | ||
) |
Append one word dictionary to another, for multiple grammar handling.
Assumes that the same HMM definition is used in both word dictionaries.
dstinfo | [i/o] word dictionary |
srcinfo | [in] word dictionary to be appended to dst |
coffset | [in] category id offset in dst where the new data should be stored |
woffset | [in] word id offset in dst where the new data should be stored |
Definition at line 644 of file voca_load_htkdict.c.
Referenced by multigram_add_words_to_grammar(), multigram_append_to_global(), and multigram_update().
boolean voca_load_htkdict_line | ( | char * | buf, |
WORD_ID * | vnum_p, | ||
int | linenum, | ||
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | do_conv, | ||
boolean * | ok_flag | ||
) |
Sub-function to add a dictionary entry line to the word dictionary.
buf | [i/o] buffer to hold the input string, will be modified in this function |
vnum_p | [in] current number of words in winfo |
linenum | [in] current line number of the input |
winfo | [out] pointer to word dictionary to append the data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
do_conv | [in] TRUE if performing triphone conversion |
ok_flag | [out] will be set to FALSE if an error occurred for this input. |
Definition at line 374 of file voca_load_htkdict.c.
Referenced by voca_load_line().
boolean voca_load_word_line | ( | char * | buf, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
char * | headphone, | ||
char * | tailphone, | ||
char * | contextphone | ||
) |
Load a line from buffer and set parameters to the dictionary.
buf | [in] input buffer containing a word entry |
winfo | [i/o] word dictionary to append the entry |
hmminfo | [in] phoneme HMM definition |
headphone | [in] word head silence model name |
tailphone | [in] word tail silence model name |
contextphone | [in] silence context name to be used at head and tail |
Definition at line 114 of file voca_load_wordlist.c.
boolean voca_load_wordlist | ( | FILE * | fp, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
char * | headphone, | ||
char * | tailphone, | ||
char * | contextphone | ||
) |
Top function to read word list via text.
fp | [in] file pointer |
winfo | [out] pointer to word dictionary to store the read data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
headphone | [in] word head silence model name |
tailphone | [in] word tail silence model name |
contextphone | [in] silence context name to be used at head and tail |
Definition at line 142 of file voca_load_wordlist.c.
Referenced by init_wordlist().
boolean voca_load_wordlist_fp | ( | FILE * | fp, |
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
char * | headphone, | ||
char * | tailphone, | ||
char * | contextphone | ||
) |
Top function to read word list via file pointer.
fp | [in] file pointer |
winfo | [out] pointer to word dictionary to store the read data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
headphone | [in] word head silence model name |
tailphone | [in] word tail silence model name |
contextphone | [in] silence context name to be used at head and tail |
Definition at line 169 of file voca_load_wordlist.c.
boolean voca_load_wordlist_line | ( | char * | buf, |
WORD_ID * | vnum_p, | ||
int | linenum, | ||
WORD_INFO * | winfo, | ||
HTK_HMM_INFO * | hmminfo, | ||
boolean | do_conv, | ||
boolean * | ok_flag, | ||
char * | headphone, | ||
char * | tailphone, | ||
char * | contextphone | ||
) |
Sub-function to add a dictionary entry line to the word dictionary.
buf | [i/o] buffer to hold the input string, will be modified in this function |
vnum_p | [in] current number of words in winfo |
linenum | [in] current line number of the input |
winfo | [out] pointer to word dictionary to append the data. |
hmminfo | [in] HTK HMM definition data. if NULL, phonemes are ignored. |
do_conv | [in] TRUE if performing triphone conversion |
ok_flag | [out] will be set to FALSE if an error occurred for this input. |
headphone | [in] word head silence model name |
tailphone | [in] word tail silence model name |
contextphone | [in] silence context name to be used at head and tail |
Definition at line 199 of file voca_load_wordlist.c.
boolean voca_mono2tri | ( | WORD_INFO * | winfo, |
HTK_HMM_INFO * | hmminfo | ||
) |
Convert whole words in word dictionary to word-internal triphone.
Normally, triphone conversion is performed directly while reading the dictionary file. This function is for post-conversion only.
winfo | [i/o] word dictionary information |
hmminfo | [in] HTK HMM definition |
Definition at line 603 of file voca_load_htkdict.c.
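WORD_ID voca_lookup_wid | ( | char * | keyword, |
WORD_INFO * | winfo | ||
) |
Look up a word in the dictionary by string.
keyword | [in] keyword to search |
winfo | [in] word dictionary |
Definition at line 43 of file voca_lookup.c.
Referenced by initialize_dict() and new_str2wordseq().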
WORD_ID* new_str2wordseq | ( | WORD_INFO * | winfo, |
char * | s, | ||
int * | len_return | ||
) |
Convert a string of space-separated words to an array of word ids.
winfo | [in] word dictionary |
s | [in] string of space-separated word strings |
len_return | [out] number of found words |
Definition at line 117 of file voca_lookup.c.
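The conversion from a space-separated string to an id array can be sketched as a tokenize-and-look-up loop. The code below is an illustration only: `toy_lookup` and `toy_str2wordseq` are hypothetical, the dictionary is reduced to a plain array of names searched linearly, and returning NULL when a word is unknown is an assumption rather than documented Julius behavior.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_WORD_INVALID (-1)  /* hypothetical "not found" marker */

/* Linear search for key in a name table; returns its index or TOY_WORD_INVALID. */
int toy_lookup(const char *key, const char *const *names, int n) {
  int i;
  for (i = 0; i < n; i++)
    if (strcmp(key, names[i]) == 0) return i;
  return TOY_WORD_INVALID;
}

/* Convert "w1 w2 ..." into a newly allocated array of ids.
   The count is stored in *len_return. Returns NULL (with *len_return == 0)
   if any word is unknown, on allocation failure, or for an empty string.
   The caller frees the returned array. */
int *toy_str2wordseq(const char *const *names, int n,
                     const char *s, int *len_return) {
  char *work, *tok;
  int *ids = NULL, *tmp;
  int len = 0;

  *len_return = 0;
  work = (char *)malloc(strlen(s) + 1);
  if (work == NULL) return NULL;
  strcpy(work, s);  /* strtok modifies its input, so work on a copy */

  for (tok = strtok(work, " "); tok != NULL; tok = strtok(NULL, " ")) {
    int id = toy_lookup(tok, names, n);
    if (id == TOY_WORD_INVALID) { free(ids); free(work); return NULL; }
    tmp = (int *)realloc(ids, sizeof(int) * (len + 1));
    if (tmp == NULL) { free(ids); free(work); return NULL; }
    ids = tmp;
    ids[len++] = id;
  }
  free(work);
  *len_return = len;
  return ids;
}
```

With a name table of `<s>`, `hello`, `world`, `</s>`, the string `<s> hello world </s>` maps to the ids 0, 1, 2, 3.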
char* cycle_triphone | ( | char * | p | ) |
Return the triphone name string composed from the last 3 calls.
p | [in] next phone string |
Definition at line 80 of file voca_load_htkdict.c.
Referenced by cycle_triphone_flush(), new_str2phseq(), voca_load_htkdict_line(), voca_load_wordlist_line(), and voca_mono2tri().
char* cycle_triphone_flush | ( | ) |
Flush the triphone buffer and return the last biphone.
Definition at line 126 of file voca_load_htkdict.c.
Referenced by new_str2phseq(), voca_load_htkdict_line(), voca_load_wordlist_line(), and voca_mono2tri().
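The idea behind cycle_triphone() and cycle_triphone_flush() is a sliding window over the phone sequence: each new phone supplies the right context of the previous one, and a final flush emits the last phone, which has no right context. The sketch below illustrates this with the HTK-style `left-center+right` naming. It is not the Julius code: `ToyCycle`, `toy_cycle_reset`, `toy_cycle` and `toy_cycle_flush` are hypothetical, the state is passed explicitly instead of being kept inside the function, and the handling of the first phone is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define TOY_PHONE_LEN 32

/* Hypothetical sliding-window state for building context-dependent names. */
typedef struct {
  char left[TOY_PHONE_LEN];         /* phone before the center, "" if none */
  char center[TOY_PHONE_LEN];       /* phone waiting for its right context */
  char out[TOY_PHONE_LEN * 3 + 3];  /* buffer holding the last returned name */
} ToyCycle;

/* Clear the window before starting a new word. */
void toy_cycle_reset(ToyCycle *c) {
  c->left[0] = c->center[0] = c->out[0] = '\0';
}

/* Feed the next phone p. Returns the name of the previous phone, now that
   its right context is known, or NULL on the first call after a reset. */
const char *toy_cycle(ToyCycle *c, const char *p) {
  const char *ret = NULL;
  if (c->center[0] != '\0') {
    if (c->left[0] != '\0')
      snprintf(c->out, sizeof(c->out), "%s-%s+%s", c->left, c->center, p);
    else
      snprintf(c->out, sizeof(c->out), "%s+%s", c->center, p);
    ret = c->out;
    strcpy(c->left, c->center);  /* shift the window by one phone */
  }
  strncpy(c->center, p, TOY_PHONE_LEN - 1);
  c->center[TOY_PHONE_LEN - 1] = '\0';
  return ret;
}

/* Emit the last phone as a left-context biphone (or the bare phone if only
   one was fed) and clear the window. Returns NULL if nothing is pending. */
const char *toy_cycle_flush(ToyCycle *c) {
  if (c->center[0] == '\0') return NULL;
  if (c->left[0] != '\0')
    snprintf(c->out, sizeof(c->out), "%s-%s", c->left, c->center);
  else
    snprintf(c->out, sizeof(c->out), "%s", c->center);
  c->left[0] = c->center[0] = '\0';
  return c->out;
}
```

Feeding the phones `a`, `r`, `i` yields NULL, `a+r`, `a-r+i`, and the flush then returns `r-i`.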
void print_voca_info | ( | FILE * | fp, |
WORD_INFO * | winfo | ||
) |
Output overall word dictionary information to stdout.
fp | [in] file pointer |
winfo | [in] word dictionary |
Definition at line 35 of file voca_util.c.
Referenced by print_engine_info().
void put_voca | ( | FILE * | fp, |
WORD_INFO * | winfo, | ||
WORD_ID | wid | ||
) |
Output information of a word in dictionary to stdout.
fp | [in] file pointer |
winfo | [in] word dictionary |
wid | [in] word id to be output |
Definition at line 67 of file voca_util.c.
Referenced by hmm_check(), make_dfa_voca_ref(), print_engine_info(), and wchmm_add_word().
boolean make_base_phone | ( | HTK_HMM_INFO * | hmminfo, |
WORD_INFO * | winfo | ||
) |
Build basephone information.
Extract base phones from HMM definition, mark them whether they appear on word head or word tail, and count the number.
hmminfo | [i/o] HMM definition information, basephone list will be added. |
winfo | [in] word dictionary information |
Definition at line 386 of file chkhmmlist.c.
Referenced by hmm_check().
void print_phone_info | ( | FILE * | fp, |
HTK_HMM_INFO * | hmminfo | ||
) |
Output general information concerning phone mapping in HMM definition.
fp | [in] file pointer |
hmminfo | [in] HMM definition data. |
Definition at line 404 of file chkhmmlist.c.
Referenced by hmm_check().
void print_all_basephone_detail | ( | HMM_basephone * | base | ) |
Output all basephone information to stdout.
base | [in] pointer to the top basephone data holder. |
Definition at line 106 of file chkhmmlist.c.
Referenced by hmm_check().
void print_all_basephone_name | ( | HMM_basephone * | base | ) |
Output all basephone names to stdout.
base | [in] pointer to the top basephone data holder. |
Definition at line 116 of file chkhmmlist.c.
Referenced by hmm_check().
void test_interword_triphone | ( | HTK_HMM_INFO * | hmminfo, |
WORD_INFO * | winfo | ||
) |
Top function to check if all the possible triphones on given word dictionary actually exist in the logical HMM.
hmminfo | [in] HMM definition information, with basephone list. |
winfo | [in] word dictionary information |
Definition at line 345 of file chkhmmlist.c.
Referenced by hmm_check().