GF GrammarHelper
GF GrammarHelper is a library designed to facilitate the generation of Grammatical Framework (GF) grammar files tailored for various NLP tasks.
Before diving into the library, we highly recommend reading the Grammatical Framework Tutorial to get a basic understanding of GF.
To reproduce the grammars used in the paper, please refer to the Reproduce the grammars needed for the paper section.
Overview
As described in this paper, a formal grammar can be used to represent a task-specific language. With a runtime that supports partial parsing, the grammar can parse a sentence incrementally and, at each step, yield the set of tokens that are permitted to come next.
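To make the idea of "next permissible tokens" concrete, here is a minimal sketch in plain Python. It does not use GF at all: `PrefixGrammar` and `constrained_greedy` are hypothetical names invented for this illustration, the "grammar" is just a finite set of token sequences stored as a trie, and the `scores` dictionary stands in for a language model's preferences.

```python
from typing import Dict, Iterable, List, Set

END = "<eos>"  # marker meaning "the sentence may stop here"


class PrefixGrammar:
    """Toy stand-in for a grammar runtime that supports partial parsing."""

    def __init__(self, sentences: Iterable[List[str]]) -> None:
        # Store every allowed sentence in a trie of nested dicts.
        self.root: Dict[str, dict] = {}
        for tokens in sentences:
            node = self.root
            for tok in list(tokens) + [END]:
                node = node.setdefault(tok, {})

    def next_tokens(self, prefix: List[str]) -> Set[str]:
        """Return the tokens that may follow `prefix` (empty set if invalid)."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return set()
            node = node[tok]
        return set(node)


def constrained_greedy(grammar: PrefixGrammar,
                       scores: Dict[str, float],
                       max_len: int = 10) -> List[str]:
    """Greedy decoding that only ever picks tokens the grammar allows."""
    prefix: List[str] = []
    for _ in range(max_len):
        allowed = grammar.next_tokens(prefix)
        if not allowed:
            break
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(allowed), key=lambda t: scores.get(t, 0.0))
        if best == END:
            break
        prefix.append(best)
    return prefix


grammar = PrefixGrammar([
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "dog", "barks"],
])
# "meows" has the highest score but is never allowed by the grammar.
scores = {"dog": 0.9, "cat": 0.1, "barks": 0.7, "sleeps": 0.2, "meows": 5.0}
print(constrained_greedy(grammar, scores))
```

In the real setting, the role of `next_tokens` is played by the partial parser of the grammar runtime, and the scores come from the language model at each decoding step.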
With the GF GrammarHelper library, using a formal grammar for constrained language generation takes four steps:
- Grammar Definition: Define the task-specific formal grammar by constructing a `Grammar` object (see `src/grammar.py`).
- Grammar File Creation: Save the `Grammar` object to a GF grammar file (`.gf`) using the `Grammar.save()` method.
- Compilation: Use the GF runtime to compile the GF grammar file into a `.pgf` file.
- Constrained Text Generation: Use the Python GF wrapper to load the `.pgf` file and run constrained text generation. For a practical demonstration, see the xxx repository.
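The first three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual API: `ToyGrammar`, its fields, and `compile_command` are hypothetical stand-ins written for this example (the real classes live in `src/grammar.py` and may look quite different). The only external piece assumed to exist is the `gf` command-line tool, whose `-make` flag compiles a concrete syntax into a `.pgf` file.

```python
import shutil
import subprocess
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class ToyGrammar:
    """Hypothetical stand-in for the library's Grammar class (not its real API)."""
    name: str
    start_cat: str = "S"
    # Maps a GF function name to the string it linearizes to.
    # Assumes the strings contain no double quotes.
    rules: Dict[str, str] = field(default_factory=dict)

    def abstract_source(self) -> str:
        funs = "".join(f"  fun {fn} : {self.start_cat} ;\n" for fn in self.rules)
        return (
            f"abstract {self.name} = {{\n"
            f"  flags startcat = {self.start_cat} ;\n"
            f"  cat {self.start_cat} ;\n"
            f"{funs}}}\n"
        )

    def concrete_source(self) -> str:
        lins = "".join(
            f'  lin {fn} = {{s = "{text}"}} ;\n' for fn, text in self.rules.items()
        )
        return (
            f"concrete {self.name}Eng of {self.name} = {{\n"
            f"  lincat {self.start_cat} = {{s : Str}} ;\n"
            f"{lins}}}\n"
        )

    def save(self, out_dir: Path) -> Path:
        """Step 2: write the abstract and concrete .gf files; return the concrete path."""
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / f"{self.name}.gf").write_text(self.abstract_source())
        concrete = out_dir / f"{self.name}Eng.gf"
        concrete.write_text(self.concrete_source())
        return concrete


def compile_command(concrete_path: Path) -> List[str]:
    """Step 3: command that compiles a concrete syntax into a .pgf file."""
    return ["gf", "-make", concrete_path.name]


# Step 1: define a tiny grammar whose language is {"yes", "no"}.
grammar = ToyGrammar(name="Answer", rules={"Yes": "yes", "No": "no"})

with tempfile.TemporaryDirectory() as tmp:
    concrete = grammar.save(Path(tmp))
    print(concrete.read_text())
    # Only try to compile when the GF compiler is actually installed.
    if shutil.which("gf"):
        subprocess.run(compile_command(concrete), cwd=tmp, check=False)
    # Step 4 (not run here): load the resulting .pgf with the Python GF wrapper.
```

The sketch splits the grammar into an abstract and a concrete module because that is how GF source files are organized; whether `Grammar.save()` writes one file or several is not something this example claims to know.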
Features:
- Offers `Grammar` and `ProductionRule` abstractions to encapsulate a GF grammar.
- Introduces three specialized modules: `IE_grammar`, `ED_grammar`, and `CP_grammar`. These generate GF grammar files for Information Extraction, Entity Disambiguation, and Constituency Parsing tasks, respectively.
- Provides a `GrammarFactory` class to facilitate creating `Grammar` objects in batch.
Requirements
pip install -r requirements.txt
Add the root folder of this repository to the `PYTHONPATH` environment variable.
# in the root folder of this repository
export PYTHONPATH="$(pwd):$PYTHONPATH" # this is required to run scripts in `example/` folder
Reproduce the grammars needed for the paper
bash generate_gcd_grammars.sh
The generated grammars are saved in the `output/grammars/gf/` and `output/grammars/pgf/` folders.
The `gf` files are the source files of the grammars, and the `pgf` files are their compiled form.
Only the `pgf` files are used directly by the GF runtime for constrained text generation.