Julius 4.2
libsent/src/ngram/init_ngram.c

Read an N-gram file into memory and map it to the word dictionary.

#include <sent/stddefs.h>
#include <sent/ngram2.h>
#include <sent/vocabulary.h>


Functions

boolean init_ngram_bin (NGRAM_INFO *ndata, char *bin_ngram_file)
 Read and set up N-gram data from a binary format file.
boolean init_ngram_arpa (NGRAM_INFO *ndata, char *ngram_file, int dir)
 Read and set up N-gram data from an ARPA format file.
boolean init_ngram_arpa_additional (NGRAM_INFO *ndata, char *bigram_file)
 Read additional LR 2-gram for 1st pass.
boolean make_voca_ref (NGRAM_INFO *ndata, WORD_INFO *winfo)
 Make correspondence between word dictionary and N-gram vocabulary.
void set_unknown_id (NGRAM_INFO *ndata, char *str)
 Set unknown word ID to the N-gram data.
void fix_uniprob_srilm (NGRAM_INFO *ndata, WORD_INFO *winfo)
 Fix unigram probability of BOS / EOS word.

Detailed Description

Read an N-gram file into memory and map it to the word dictionary.

Author:
Akinobu LEE
Date:
Wed Feb 16 07:40:53 2005
Revision:
1.7

Definition in file init_ngram.c.


Function Documentation

boolean init_ngram_bin ( NGRAM_INFO *  ndata,
char *  bin_ngram_file 
)

Read and set up N-gram data from a binary format file.

Parameters:
ndata [out] pointer to the N-gram data structure that receives the data
bin_ngram_file [in] file name of the binary N-gram

Definition at line 36 of file init_ngram.c.

Referenced by initialize_ngram().

boolean init_ngram_arpa ( NGRAM_INFO *  ndata,
char *  ngram_file,
int  dir 
)

Read and set up N-gram data from an ARPA format file.

Parameters:
ndata [out] pointer to the N-gram data structure that receives the data
ngram_file [in] file name of the ARPA (reverse) 3-gram file
dir [in] direction (DIR_LR | DIR_RL)

Definition at line 65 of file init_ngram.c.

Referenced by initialize_ngram().

boolean init_ngram_arpa_additional ( NGRAM_INFO *  ndata,
char *  bigram_file 
)

Read an additional LR 2-gram for the 1st pass.

Parameters:
ndata [out] pointer to the N-gram data structure that receives the data
bigram_file [in] file name of the ARPA 2-gram file

Definition at line 98 of file init_ngram.c.

Referenced by initialize_ngram().

boolean make_voca_ref ( NGRAM_INFO *  ndata,
WORD_INFO *  winfo 
)

Make correspondence between word dictionary and N-gram vocabulary.

Parameters:
ndata [i/o] word/class N-gram; the unknown word information will be set.
winfo [i/o] word dictionary; the word-to-N-gram-entry mapping is stored here.

Definition at line 127 of file init_ngram.c.

Referenced by initialize_ngram().
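
As a rough illustration of the dictionary-to-vocabulary mapping described for make_voca_ref(), the following self-contained sketch may help. It does not use the real NGRAM_INFO or WORD_INFO layouts: the types NgramVocabSketch and DictSketch, their fields, and both function names are hypothetical. The linear search is only a stand-in for whatever index the library uses, and falling back to the unknown-word ID for dictionary words missing from the N-gram vocabulary is an assumption.

```c
#include <string.h>

/* Hypothetical stand-in for the N-gram vocabulary (not the real NGRAM_INFO). */
typedef struct {
    const char **names;    /* N-gram vocabulary entries, indexed by N-gram ID */
    int          num;      /* number of entries */
    int          unk_id;   /* N-gram ID of the unknown-word entry */
} NgramVocabSketch;

/* Hypothetical stand-in for the word dictionary (not the real WORD_INFO). */
typedef struct {
    const char **names;    /* N-gram entry name of each dictionary word */
    int         *to_ngram; /* output: N-gram ID assigned to each dictionary word */
    int          num;      /* number of dictionary words */
} DictSketch;

/* Linear search for an entry name; returns its N-gram ID, or -1 if absent. */
static int lookup_ngram_id(const NgramVocabSketch *v, const char *name)
{
    for (int i = 0; i < v->num; i++) {
        if (strcmp(v->names[i], name) == 0) return i;
    }
    return -1;
}

/* Fill d->to_ngram for every dictionary word.
 * Words not found in the vocabulary are mapped to v->unk_id (assumption).
 * Returns the number of words that fell back to the unknown-word ID. */
static int make_voca_ref_sketch(const NgramVocabSketch *v, DictSketch *d)
{
    int n_unknown = 0;
    for (int w = 0; w < d->num; w++) {
        int id = lookup_ngram_id(v, d->names[w]);
        if (id < 0) {
            id = v->unk_id;
            n_unknown++;
        }
        d->to_ngram[w] = id;
    }
    return n_unknown;
}
```

With a vocabulary of { "<unk>", "<s>", "</s>", "hello" } and unk_id set to 0, a dictionary containing "hello", "world" and "<s>" would be mapped to the IDs 3, 0 and 1 under these assumptions, with one word counted as unknown.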

void set_unknown_id ( NGRAM_INFO *  ndata,
char *  str 
)

Set unknown word ID to the N-gram data.

Parameters:
ndata [out] N-gram data in which to set the unknown word ID
str [in] word name string of the unknown word

Definition at line 169 of file init_ngram.c.

Referenced by initialize_ngram().

void fix_uniprob_srilm ( NGRAM_INFO *  ndata,
WORD_INFO *  winfo 
)

Fix unigram probability of BOS / EOS word.

This function checks the unigram probabilities of the BOS and EOS words and, if one of them is set to "-99", gives it the same value as the other. This situation arises when the LM was trained with SRILM, which assigns a unigram probability of "-99" to the beginning-of-sentence word and thereby causes the search in the reverse direction to fail.

Parameters:
ndata [i/o] N-gram data
winfo [i/o] Vocabulary information

Definition at line 206 of file init_ngram.c.

Referenced by initialize_ngram().
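
The BOS / EOS correction described for fix_uniprob_srilm() can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the library's implementation: UnigramSketch, its fields, SRILM_UNSET_LOGPROB and fix_uniprob_sketch() are hypothetical names, and treating any value at or below -99 as "unset" is an assumed threshold.

```c
/* Hypothetical stand-in for the unigram table (not the real NGRAM_INFO). */
typedef struct {
    float *uniprob;  /* log10 unigram probabilities, indexed by N-gram word ID */
    int    bos_id;   /* N-gram ID of the beginning-of-sentence word */
    int    eos_id;   /* N-gram ID of the end-of-sentence word */
} UnigramSketch;

/* Assumed marker: values at or below this are treated as "unset". */
#define SRILM_UNSET_LOGPROB (-99.0f)

/* If exactly one of the BOS / EOS unigram probabilities is unset,
 * copy the other one into it.
 * Returns 1 if a value was patched, 0 if nothing was changed. */
static int fix_uniprob_sketch(UnigramSketch *u)
{
    float *bos = &u->uniprob[u->bos_id];
    float *eos = &u->uniprob[u->eos_id];

    if (*bos <= SRILM_UNSET_LOGPROB && *eos > SRILM_UNSET_LOGPROB) {
        *bos = *eos;   /* BOS was unset: borrow the EOS value */
        return 1;
    }
    if (*eos <= SRILM_UNSET_LOGPROB && *bos > SRILM_UNSET_LOGPROB) {
        *eos = *bos;   /* EOS was unset: borrow the BOS value */
        return 1;
    }
    return 0;          /* both set, or both unset: leave as is */
}
```

In this sketch the case where both values are unset is left untouched, since there is no valid value to copy; how the real function handles that case is not stated here.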