Julius 4.2
libsent/include/sent/dfa.h

Structure definitions for deterministic finite-state automaton (DFA) grammar and category-pair information.

#include <sent/stddefs.h>
#include <sent/vocabulary.h>
#include <sent/htk_hmm.h>


Data Structures

struct  DFA_ARC
 Transition arc of DFA.
struct  DFA_STATE
 State of DFA.
struct  TERM_INFO
 Information on each terminal symbol (= category).
struct  DFA_INFO
 Top structure of a DFA.

Macros

#define DFA_STATESTEP   1000
 Allocation step of DFA state.
#define DFA_CP_MINSTEP   20
 Minimum initial CP data size per category.
#define INITIAL_S   0x10000000
 Status flag mask specifying an initial state.
#define ACCEPT_S   0x00000001
 Status flag mask specifying an accept state.
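
The two status masks occupy different bits, so one status word can mark a state as initial, accepting, both, or neither. The following is a minimal sketch of that bitmask idiom, not code from the library: the mask values are the ones listed above, while `make_status`, `is_initial` and `is_accept` are hypothetical helper names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Mask values as listed in this header. */
#define INITIAL_S 0x10000000
#define ACCEPT_S  0x00000001

/* Hypothetical helper: build a status word from two booleans. */
unsigned int make_status(bool initial, bool accept)
{
    unsigned int status = 0;
    /* The masks must not share bits, or the flags would be ambiguous. */
    assert((INITIAL_S & ACCEPT_S) == 0);
    if (initial) status |= INITIAL_S;
    if (accept)  status |= ACCEPT_S;
    return status;
}

/* Hypothetical helpers: test one flag without disturbing the other. */
bool is_initial(unsigned int status) { return (status & INITIAL_S) != 0; }
bool is_accept(unsigned int status)  { return (status & ACCEPT_S) != 0; }
```

Testing with `(status & MASK) != 0` rather than `status == MASK` keeps each check valid when both flags are set on the same state.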

Functions

DFA_INFO * dfa_info_new ()
 Allocate a new grammar information data structure and initialize it.
void dfa_info_free (DFA_INFO *dfa)
 Free all information in the DFA_INFO.
void dfa_state_init (DFA_INFO *dinfo)
 Initialize and allocate DFA state information list in the grammar.
void dfa_state_expand (DFA_INFO *dinfo, int needed)
 Expand the state information list to the required length.
boolean rddfa (FILE *fp, DFA_INFO *dinfo)
 Top loop function to read DFA grammar via file pointer (gzip enabled)
boolean rddfa_fp (FILE *fp, DFA_INFO *dinfo)
 Top loop function to read DFA grammar via file descriptor.
boolean rddfa_line (char *line, DFA_INFO *dinfo, int *state_max, int *arc_num, int *terminal_max)
 Parse one input line and set the grammar information it describes.
void dfa_append (DFA_INFO *dst, DFA_INFO *src, int soffset, int coffset)
 Append the DFA state information to another DFA.
boolean init_dfa (DFA_INFO *dinfo, char *filename)
 Read in a grammar file and set to DFA grammar structure.
WORD_ID dfa_symbol_lookup (DFA_INFO *dinfo, char *terminalname)
 Return category id corresponding to the given terminal name.
boolean extract_cpair (DFA_INFO *dinfo)
 Extract the category-pair constraint from the DFA grammar and newly set the category-pair matrix of the given DFA.
boolean cpair_append (DFA_INFO *dst, DFA_INFO *src, int coffset)
 Append the category-pair matrix at the end.
void print_dfa_info (FILE *fp, DFA_INFO *dinfo)
 Output overall grammar information to the given file pointer.
void print_dfa_cp (FILE *fp, DFA_INFO *dinfo)
 Output the category-pair matrix in text format to the given file pointer.
boolean dfa_cp (DFA_INFO *dfa, int i, int j)
 Return whether the two given categories can be connected.
boolean dfa_cp_begin (DFA_INFO *dfa, int i)
 Return whether the category can appear at the beginning of a sentence.
boolean dfa_cp_end (DFA_INFO *dfa, int i)
 Return whether the category can appear at the end of a sentence.
void set_dfa_cp (DFA_INFO *dfa, int i, int j, boolean value)
 Set a category-pair matrix bit.
void set_dfa_cp_begin (DFA_INFO *dfa, int i, boolean value)
 Set a category-pair matrix bit for the beginning of sentence.
void set_dfa_cp_end (DFA_INFO *dfa, int i, boolean value)
 Set a category-pair matrix bit for the end of sentence.
void init_dfa_cp (DFA_INFO *dfa)
 Initialize category pair matrix in the grammar data.
void malloc_dfa_cp (DFA_INFO *dfa, int term_num, int size)
 Allocate memory for category pair matrix and initialize it.
void realloc_dfa_cp (DFA_INFO *dfa, int old_term_num, int new_term_num)
void free_dfa_cp (DFA_INFO *dfa)
 Free the category pair matrix from DFA grammar.
void dfa_cp_output_rawdata (FILE *fp, DFA_INFO *dfa)
void dfa_cp_count_size (DFA_INFO *dfa, unsigned long *size_ret, unsigned long *allocsize_ret)
boolean dfa_cp_append (DFA_INFO *dfa, DFA_INFO *src, int offset)
 Append a category-pair matrix to another.
boolean make_dfa_voca_ref (DFA_INFO *dinfo, WORD_INFO *winfo)
 Make correspondence between all words in dictionary and categories in grammar, both from a word to a category and from a category to words.
void make_terminfo (TERM_INFO *tinfo, DFA_INFO *dinfo, WORD_INFO *winfo)
 Make a word list for each category.
void free_terminfo (TERM_INFO *tinfo)
 Free word list for each category.
void terminfo_append (TERM_INFO *dst, TERM_INFO *src, int coffset, int woffset)
 Append the terminal (category) word list.
void dfa_find_pause_word (DFA_INFO *dfa, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo)
 Find pause word and pause category information, and set to the grammar data.
boolean dfa_pause_word_append (DFA_INFO *dst, DFA_INFO *src, int coffset)
 Append the pause word/category information at the end.

Detailed Description

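Structure definitions for deterministic finite-state automaton (DFA) grammar and category-pair information

This file defines the structures for the finite-state grammar referred to as DFA.

A DFA is a deterministic automaton that takes word category numbers as input and expresses syntactic constraints. It also holds the list of words belonging to each category.

For first-pass recognition, it additionally holds word-pair information consisting only of the connection relations between DFA categories. This information is extracted internally from the DFA after the grammar has been read.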
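
To make the idea of an automaton over category numbers concrete, here is a small self-contained sketch. It does not use the library's DFA_INFO, DFA_STATE or DFA_ARC structures, whose fields are not shown on this page; `ToyArc`, `ToyDfa`, `toy_step` and `toy_accepts` are hypothetical names, and the sketch assumes a single initial state, a single accept state, and arcs that run in sentence order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical arc: from state `from`, reading category `cat`, go to `to`. */
typedef struct { int from; int cat; int to; } ToyArc;

/* Hypothetical grammar: arcs plus one initial and one accept state. */
typedef struct {
    const ToyArc *arcs;
    size_t arc_num;
    int initial;
    int accept;
} ToyDfa;

/* Follow the matching arc; return -1 if the grammar has none.
   Determinism means at most one arc matches (state, cat). */
int toy_step(const ToyDfa *g, int state, int cat)
{
    for (size_t i = 0; i < g->arc_num; i++) {
        if (g->arcs[i].from == state && g->arcs[i].cat == cat)
            return g->arcs[i].to;
    }
    return -1;
}

/* True if the category sequence leads from the initial to the accept state. */
bool toy_accepts(const ToyDfa *g, const int *cats, size_t n)
{
    assert(g != NULL && (cats != NULL || n == 0));
    int state = g->initial;
    for (size_t i = 0; i < n; i++) {
        state = toy_step(g, state, cats[i]);
        if (state < 0) return false;   /* no arc: sequence violates grammar */
    }
    return state == g->accept;
}
```

With arcs `{0,0,1}`, `{1,1,2}` and `{1,2,2}`, initial state 0 and accept state 2, the category sequence `0 1` is accepted, while `1 0` is rejected at the first step.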

Author:
Akinobu LEE
Date:
Thu Feb 10 18:21:27 2005
Revision:
1.6

Definition in file dfa.h.


Function Documentation

DFA_INFO* dfa_info_new ( )

Allocate a new grammar information data structure and initialize it.

Returns:
pointer to the newly allocated DFA_INFO.

Definition at line 34 of file dfa_malloc.c.

Referenced by multigram_read_file_and_add() and multigram_update().

void dfa_info_free ( DFA_INFO * dfa )

Free all information in the DFA_INFO.

Parameters:
dfa [i/o] grammar information data to be freed.

Definition at line 55 of file dfa_malloc.c.

Referenced by j_process_lm_free(), multigram_exec_delete(), multigram_free_all(), multigram_read_file_and_add(), and multigram_update().

void dfa_state_init ( DFA_INFO * dinfo )

Initialize and allocate DFA state information list in the grammar.

Parameters:
dinfo [i/o] DFA grammar

Definition at line 36 of file rddfa.c.

Referenced by multigram_update(), rddfa(), and rddfa_fp().

void dfa_state_expand ( DFA_INFO *  dinfo,
int  needed
)

Expand the state information list to the required length.

Parameters:
dinfo [i/o] DFA grammar
needed [in] required new length

Definition at line 57 of file rddfa.c.

Referenced by dfa_append() and rddfa_line().
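
The sketch below illustrates one plausible way an allocation step such as DFA_STATESTEP can be combined with a "grow to at least `needed`" operation. It is not the body of dfa_state_expand(): whether the library rounds the length up is an assumption, and `ToyStateList`, `toy_round_up` and `toy_expand` are hypothetical names with the per-state data reduced to a single `int`.

```c
#include <assert.h>
#include <stdlib.h>

#define DFA_STATESTEP 1000   /* allocation step, as listed in this header */

/* Hypothetical state list: `maxnum` slots allocated, one int per state. */
typedef struct { int *status; int maxnum; } ToyStateList;

/* Round `needed` up to the next multiple of the allocation step. */
int toy_round_up(int needed)
{
    assert(needed >= 0);
    return ((needed + DFA_STATESTEP - 1) / DFA_STATESTEP) * DFA_STATESTEP;
}

/* Grow the list to hold at least `needed` slots; new slots are zeroed.
   Returns 0 on success, -1 if allocation fails (list left unchanged). */
int toy_expand(ToyStateList *l, int needed)
{
    if (needed <= l->maxnum) return 0;          /* already large enough */
    int newnum = toy_round_up(needed);
    int *p = realloc(l->status, (size_t)newnum * sizeof *p);
    if (p == NULL) return -1;
    for (int i = l->maxnum; i < newnum; i++) p[i] = 0;
    l->status = p;
    l->maxnum = newnum;
    return 0;
}
```

Growing in fixed steps keeps the number of reallocations low when state ids arrive one line at a time, at the cost of some unused slots at the end of the list.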

boolean rddfa ( FILE *  fp,
DFA_INFO *  dinfo
)

Top loop function to read DFA grammar via file pointer (gzip enabled)

Parameters:
fp [in] file pointer that points to the DFA grammar data
dinfo [out] the read data will be stored in this DFA grammar structure
Returns:
TRUE on success, FALSE on failure.

Definition at line 80 of file rddfa.c.

Referenced by init_dfa().

boolean rddfa_fp ( FILE *  fp,
DFA_INFO *  dinfo
)

Top loop function to read DFA grammar via file descriptor.

Parameters:
fp [in] file pointer that points to the DFA grammar data
dinfo [out] the read data will be stored in this DFA grammar structure
Returns:
TRUE on success, FALSE on failure.

Definition at line 110 of file rddfa.c.

boolean rddfa_line ( char *  line,
DFA_INFO *  dinfo,
int *  state_max,
int *  arc_num,
int *  terminal_max
)

Parse one input line and set the grammar information it describes.

Parameters:
line [in] text buffer that holds a line of DFA file
dinfo [i/o] the read data will be appended to this DFA data
state_max [i/o] maximum state id that has appeared, will be updated
arc_num [i/o] number of read arcs, will be updated
terminal_max [i/o] maximum terminal (category) id that has appeared, will be updated
Returns:
TRUE if the line was successfully parsed, FALSE if failed.

Definition at line 143 of file rddfa.c.

Referenced by rddfa() and rddfa_fp().

void dfa_append ( DFA_INFO *  dst,
DFA_INFO *  src,
int  soffset,
int  coffset
)

Append the DFA state information to another DFA.

Parameters:
dst [i/o] DFA grammar
src [i/o] DFA grammar to be appended to dst
soffset [in] offset state number in dst where the new state should be stored
coffset [in] category id offset in dst where the new data should be stored

Definition at line 218 of file rddfa.c.

Referenced by multigram_append_to_global().
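
The role of the two offsets can be sketched as follows: when one grammar is appended to another, its state ids and category ids must be shifted so they do not collide with those already in the destination. This is an illustrative sketch on a flat arc array, not the library's implementation; `ToyArc` and `toy_append` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arc: state `from` --category `cat`--> state `to`. */
typedef struct { int from; int cat; int to; } ToyArc;

/* Copy `src_num` arcs after the first `dst_num` arcs of `dst`, shifting
   state ids by `soffset` and category ids by `coffset`.
   Returns the new number of arcs in `dst`. */
size_t toy_append(ToyArc *dst, size_t dst_num, size_t dst_cap,
                  const ToyArc *src, size_t src_num,
                  int soffset, int coffset)
{
    assert(dst_num + src_num <= dst_cap);   /* caller provides the room */
    for (size_t i = 0; i < src_num; i++) {
        dst[dst_num + i].from = src[i].from + soffset;
        dst[dst_num + i].cat  = src[i].cat + coffset;
        dst[dst_num + i].to   = src[i].to + soffset;
    }
    return dst_num + src_num;
}
```

If the destination already uses states 0 to 2 and categories 0 to 1, calling the helper with `soffset = 3` and `coffset = 2` places the appended grammar in states 3 and up and categories 2 and up.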

boolean init_dfa ( DFA_INFO *  dinfo,
char *  filename
)

Read in a grammar file and set to DFA grammar structure.

Parameters:
dinfo [i/o] a blank DFA data
filename [in] DFA grammar file name

Definition at line 46 of file init_dfa.c.

Referenced by multigram_read_file_and_add().

WORD_ID dfa_symbol_lookup ( DFA_INFO *  dinfo,
char *  terminalname
)

Return category id corresponding to the given terminal name.

In practice, a terminal name is simply the category id written as a string.

Parameters:
dinfo [in] DFA grammar information
terminalname [in] name string
Returns:
the category id.

Definition at line 45 of file dfa_lookup.c.

Referenced by make_dfa_voca_ref().

boolean extract_cpair ( DFA_INFO * dinfo )

Extract the category-pair constraint from the DFA grammar and newly set the category-pair matrix of the given DFA.

Parameters:
dinfo [i/o] DFA grammar, in which the category-pair matrix will be created.

Definition at line 61 of file mkcpair.c.

Referenced by multigram_update().
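
As a rough illustration of what extracting a category-pair constraint means: two categories may be adjacent if some state is entered by an arc carrying the first and left by an arc carrying the second. The sketch below applies that rule to a flat arc array. It is not the code in mkcpair.c; `ToyArc`, `ToyPairs` and `toy_extract` are hypothetical, and it assumes one initial state, one accept state, and arcs that run in sentence order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAXCAT 8  /* hypothetical category limit for the sketch */

/* Hypothetical arc: state `from` --category `cat`--> state `to`. */
typedef struct { int from; int cat; int to; } ToyArc;

/* Hypothetical result tables; the caller zeroes them before extraction. */
typedef struct {
    bool pair[TOY_MAXCAT][TOY_MAXCAT]; /* pair[a][b]: b may directly follow a */
    bool begin[TOY_MAXCAT];            /* category may start a sentence */
    bool end[TOY_MAXCAT];              /* category may end a sentence */
} ToyPairs;

/* Derive the pair tables from the arcs of the automaton. */
void toy_extract(const ToyArc *arcs, size_t n, int initial, int accept,
                 ToyPairs *out)
{
    for (size_t i = 0; i < n; i++) {
        int a = arcs[i].cat;
        assert(a >= 0 && a < TOY_MAXCAT);
        if (arcs[i].from == initial) out->begin[a] = true;
        if (arcs[i].to == accept)    out->end[a] = true;
        /* Any arc leaving the state this arc enters yields a legal pair. */
        for (size_t j = 0; j < n; j++) {
            if (arcs[j].from == arcs[i].to) {
                assert(arcs[j].cat >= 0 && arcs[j].cat < TOY_MAXCAT);
                out->pair[a][arcs[j].cat] = true;
            }
        }
    }
}
```

The resulting tables are weaker than the automaton itself, since they only constrain adjacent categories, which is why the description above reserves them for first-pass recognition.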

boolean cpair_append ( DFA_INFO *  dst,
DFA_INFO *  src,
int  coffset
)

Append the category-pair matrix at the end.

Parameters:
dst [i/o] DFA grammar
src [in] DFA grammar to be appended to dst
coffset [in] category id in dst where the new data should be stored

Definition at line 123 of file mkcpair.c.

Referenced by multigram_append_to_global().

void print_dfa_info ( FILE *  fp,
DFA_INFO *  dinfo
)

Output overall grammar information to the given file pointer.

Parameters:
fp [in] file pointer
dinfo [in] DFA grammar

Definition at line 35 of file dfa_util.c.

Referenced by print_engine_info().

void print_dfa_cp ( FILE *  fp,
DFA_INFO *  dinfo
)

Output the category-pair matrix in text format to the given file pointer.

Parameters:
fp [in] file pointer
dinfo [in] DFA grammar that holds category pair matrix

Definition at line 54 of file dfa_util.c.

Referenced by print_engine_info().

boolean dfa_cp ( DFA_INFO *  dfa,
int  i,
int  j
)

Return whether the two given categories can be connected.

Parameters:
dfa [in] DFA grammar holding category pair matrix
i [in] category id of left word
j [in] category id of right word
Returns:
TRUE if connection is allowed by the grammar, FALSE if prohibited.

Definition at line 91 of file cpair.c.

Referenced by beam_inter_word() and can_succeed().

boolean dfa_cp_begin ( DFA_INFO *  dfa,
int  i
)

Return whether the category can appear at the beginning of a sentence.

Parameters:
dfa [in] DFA grammar holding category pair matrix
i [in] category id of the word
Returns:
TRUE if it can appear at the beginning of a sentence, FALSE if not.

Definition at line 109 of file cpair.c.

Referenced by can_succeed() and init_nodescore().

boolean dfa_cp_end ( DFA_INFO *  dfa,
int  i
)

Return whether the category can appear at the end of a sentence.

Parameters:
dfa [in] DFA grammar holding category pair matrix
i [in] category id of the word
Returns:
TRUE if it can appear at the end of a sentence, FALSE if not.

Definition at line 126 of file cpair.c.
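
The three queries above amount to lookups in a table indexed by category id. The sketch below shows the idea with a dense boolean table. This is a simplification chosen for clarity: the library's own storage layout is not described on this page and may well differ, and `ToyCp`, `toy_cp_init`, `toy_cp_set` and `toy_cp_get` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAXCAT 32  /* hypothetical fixed category limit for the sketch */

/* Hypothetical dense table; the library's own layout may differ. */
typedef struct {
    int term_num;                       /* number of categories in use */
    bool cp[TOY_MAXCAT][TOY_MAXCAT];    /* cp[i][j]: j may follow i */
    bool cp_begin[TOY_MAXCAT];          /* i may start a sentence */
    bool cp_end[TOY_MAXCAT];            /* i may end a sentence */
} ToyCp;

/* Clear every entry so that all connections start out prohibited. */
void toy_cp_init(ToyCp *t, int term_num)
{
    assert(term_num >= 0 && term_num <= TOY_MAXCAT);
    memset(t, 0, sizeof *t);
    t->term_num = term_num;
}

/* Allow or prohibit the connection "category i followed by category j". */
void toy_cp_set(ToyCp *t, int i, int j, bool value)
{
    assert(i >= 0 && i < t->term_num && j >= 0 && j < t->term_num);
    t->cp[i][j] = value;
}

/* Query the connection "category i followed by category j". */
bool toy_cp_get(const ToyCp *t, int i, int j)
{
    assert(i >= 0 && i < t->term_num && j >= 0 && j < t->term_num);
    return t->cp[i][j];
}
```

Note that the relation is directional: allowing `(0, 1)` says nothing about `(1, 0)`, which matches the left word / right word wording of the parameters.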

void set_dfa_cp ( DFA_INFO *  dfa,
int  i,
int  j,
boolean  value
)

Set a category-pair matrix bit.

Parameters:
dfa [out] DFA grammar holding category pair matrix
i [in] category id of left word
j [in] category id of right word
value TRUE if connection allowed, FALSE if connection prohibited.

Definition at line 200 of file cpair.c.

Referenced by extract_cpair().

void set_dfa_cp_begin ( DFA_INFO *  dfa,
int  i,
boolean  value
)

Set a category-pair matrix bit for the beginning of a sentence.

Parameters:
dfa [out] DFA grammar holding category pair matrix
i [in] category id of the word
value TRUE if the category can appear at the beginning of a sentence, FALSE if not.

Definition at line 225 of file cpair.c.

Referenced by extract_cpair().

void set_dfa_cp_end ( DFA_INFO *  dfa,
int  i,
boolean  value
)

Set a category-pair matrix bit for the end of a sentence.

Parameters:
dfa [out] DFA grammar holding category pair matrix
i [in] category id of the word
value TRUE if the category can appear at the end of a sentence, FALSE if not.

Definition at line 251 of file cpair.c.

Referenced by extract_cpair().

void init_dfa_cp ( DFA_INFO * dfa )

Initialize category pair matrix in the grammar data.

Parameters:
dfa [out] DFA grammar to hold category pair matrix

Definition at line 274 of file cpair.c.

Referenced by dfa_info_new().

void malloc_dfa_cp ( DFA_INFO *  dfa,
int  term_num,
int  size
)

Allocate memory for category pair matrix and initialize it.

Parameters:
dfa [out] DFA grammar to hold category pair matrix
term_num [in] number of categories in the grammar
size [in] initial allocation length of each cp list

Definition at line 287 of file cpair.c.

Referenced by extract_cpair().

void free_dfa_cp ( DFA_INFO * dfa )

Free the category pair matrix from DFA grammar.

Parameters:
dfa [i/o] DFA grammar holding category pair matrix

Definition at line 396 of file cpair.c.

Referenced by dfa_info_free().

boolean dfa_cp_append ( DFA_INFO *  dfa,
DFA_INFO *  src,
int  offset
)

Append a category-pair matrix to another.

This function assumes that the other grammar information has already been appended and that dfa->term_num contains the new size.

Parameters:
dfa [i/o] DFA grammar to which the new category pair will be appended
src [in] source DFA
offset [in] appending point at dfa
Returns:
TRUE on success, FALSE on error.

Definition at line 319 of file cpair.c.

Referenced by cpair_append().

boolean make_dfa_voca_ref ( DFA_INFO *  dinfo,
WORD_INFO *  winfo
)

Build the correspondence between all words in the dictionary and the categories in the grammar, in both directions: from a word to its category and from a category to its words.

Parameters:
dinfo [i/o] DFA grammar, category information will be built here.
winfo [i/o] Word dictionary, word-to-category information will be built here.

Definition at line 74 of file init_dfa.c.

Referenced by multigram_add_words_to_grammar() and multigram_update().

void make_terminfo ( TERM_INFO *  tinfo,
DFA_INFO *  dinfo,
WORD_INFO *  winfo
)

Make a word list for each category.

Parameters:
tinfo [i/o] terminal data structure to hold the result
dinfo [in] DFA grammar to supply the number of categories in the grammar
winfo [in] word dictionary.

Definition at line 39 of file mkterminfo.c.

Referenced by make_dfa_voca_ref().
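
Building a word list per category from a word-to-category mapping is a two-pass grouping: count the words in each category, then fill the lists. The sketch below shows that pattern on plain arrays. It does not use the real TERM_INFO or WORD_INFO structures, whose fields are not listed on this page; `ToyTermInfo`, `toy_make_terminfo`, `toy_free_terminfo` and the `word_cat` array are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-category word lists. */
typedef struct {
    int term_num;   /* number of categories */
    int *wnum;      /* wnum[c]: number of words in category c */
    int **words;    /* words[c][k]: k-th word id in category c */
} ToyTermInfo;

/* Release everything toy_make_terminfo() allocated. */
void toy_free_terminfo(ToyTermInfo *t)
{
    if (t->words != NULL) {
        for (int c = 0; c < t->term_num; c++) free(t->words[c]);
    }
    free(t->words);
    free(t->wnum);
    t->words = NULL;
    t->wnum = NULL;
}

/* Build the lists from word_cat[w], the category id of word w.
   Returns 0 on success, -1 on allocation failure. */
int toy_make_terminfo(ToyTermInfo *t, int term_num,
                      const int *word_cat, int word_num)
{
    assert(term_num > 0 && word_num >= 0);
    t->term_num = term_num;
    t->wnum = calloc((size_t)term_num, sizeof *t->wnum);
    t->words = calloc((size_t)term_num, sizeof *t->words);
    if (t->wnum == NULL || t->words == NULL) { toy_free_terminfo(t); return -1; }
    /* Pass 1: count the words in each category. */
    for (int w = 0; w < word_num; w++) {
        assert(word_cat[w] >= 0 && word_cat[w] < term_num);
        t->wnum[word_cat[w]]++;
    }
    /* Allocate each list, then reset the counts to reuse them as cursors. */
    for (int c = 0; c < term_num; c++) {
        size_t n = t->wnum[c] > 0 ? (size_t)t->wnum[c] : 1;
        t->words[c] = malloc(n * sizeof **t->words);
        if (t->words[c] == NULL) { toy_free_terminfo(t); return -1; }
        t->wnum[c] = 0;
    }
    /* Pass 2: append each word id to the list of its category. */
    for (int w = 0; w < word_num; w++) {
        int c = word_cat[w];
        t->words[c][t->wnum[c]++] = w;
    }
    return 0;
}
```

Counting first lets each list be allocated once at its final size, and word ids end up in ascending order within each category.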

void free_terminfo ( TERM_INFO * tinfo )

Free word list for each category.

Parameters:
tinfo [in] terminal data structure holding the content.

Definition at line 75 of file mkterminfo.c.

Referenced by dfa_info_free() and multigram_add_words_to_grammar().

void terminfo_append ( TERM_INFO *  dst,
TERM_INFO *  src,
int  coffset,
int  woffset
)

Append the terminal (category) word list.

Parameters:
dst [i/o] category data
src [i/o] category data to be appended to dst
coffset [in] category id offset in dst where the new data should be stored
woffset [in] word id offset where the new data should be stored

Definition at line 97 of file mkterminfo.c.

Referenced by multigram_append_to_global().

void dfa_find_pause_word ( DFA_INFO *  dfa,
WORD_INFO *  winfo,
HTK_HMM_INFO *  hmminfo
)

Find pause word and pause category information, and set them in the grammar data.

Parameters:
dfa [i/o] DFA grammar, sp_id and is_sp will be built here.
winfo [in] Word dictionary
hmminfo [in] HTK HMM definitions, used to identify the short-pause HMM

Definition at line 107 of file init_dfa.c.

Referenced by multigram_update().

boolean dfa_pause_word_append ( DFA_INFO *  dst,
DFA_INFO *  src,
int  coffset
)

Append the pause word/category information at the end.

Parameters:
dst [i/o] DFA grammar
src [in] DFA grammar to be appended to dst
coffset appending category point in dst

Definition at line 138 of file init_dfa.c.

Referenced by multigram_append_to_global().