zensols.propbankdb package¶
Submodules¶
zensols.propbankdb.app module¶
An API to access the frameset database and generate embeddings from them.
- class zensols.propbankdb.app.Application(config_factory, db)[source]¶
Bases:
object
Access the Frameset database and generate embeddings from them.
- __init__(config_factory, db)¶
-
config_factory:
ConfigFactory¶ Used to get the metadata configuration from the install.
- predicate(lemma, format=Format.text)[source]¶
Dump a role set.
- Parameters:
lemma – the lemma of the predicate (e.g. see)
format (Format) – the format of the output
zensols.propbankdb.cli module¶
Command line entry point to the application.
- class zensols.propbankdb.cli.ApplicationFactory(*args, **kwargs)[source]¶
Bases:
ApplicationFactory
zensols.propbankdb.dapp module¶
A separate CLI entry point for creating the distribution file. The distribution file contains the SQLite database file with the frameset structured data and the embeddings for various facets of the contained role sets.
- class zensols.propbankdb.dapp.LoadApplication(loader, embedding_generator, packager, cleaner)[source]¶
Bases:
object
Create the deployment distribution file.
- __init__(loader, embedding_generator, packager, cleaner)¶
-
embedding_generator:
EmbeddingGenerator¶ Creates sentence embeddings for PropBank objects.
-
loader:
DatabaseLoader¶ Loads the parsed frameset XML files in to an SQLite database.
zensols.propbankdb.db module¶
Loads parsed XML files to an SQLite database.
- class zensols.propbankdb.db.AliasPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.BankObjectPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BeanDbPersister
Utility methods to de-hydrate frame set objects from the database.
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- get_by_id(uid)[source]¶
Return an object using its unique ID, which could be the row ID in SQLite.
- Return type:
- populators: List[BankObjectPopulator]¶
The list of other populators to invoke on
_populate().
- class zensols.propbankdb.db.Database(conn_manager, frameset_persister, predicate_persister, roleset_persister, alias_persister, example_persister, role_persister, role_link_persister, function_persister, relation_persister, roleset_stash, predicate_stash, relation_stash)[source]¶
Bases:
object
A data access object for all frame set data. This provides access to the loader, which parses the XML and loads it in to the database. It also provides methods to re-hydrate object instances from the database.
Important implementation note: Stash references need to be obtained from this instance rather than directly from the ConfigFactory, otherwise they will not be correctly initialized.
- __init__(conn_manager, frameset_persister, predicate_persister, roleset_persister, alias_persister, example_persister, role_persister, role_link_persister, function_persister, relation_persister, roleset_stash, predicate_stash, relation_stash)¶
-
alias_persister:
AliasPersister¶ Persists instances of Alias.
-
conn_manager:
ConnectionManager¶ The relational database (SQLite only for now) connection manager.
- See:
installer
-
example_persister:
ExamplePersister¶ Persists instances of Example.
-
frameset_persister:
FramesetPersister¶ Persists instances of Frameset.
-
function_persister:
FunctionPersister¶ Persists instances of Function.
-
predicate_persister:
PredicatePersister¶ Persists instances of Predicate.
-
predicate_stash:
Stash¶ A stash adaptation of
predicate_persister.
-
relation_persister:
RelationPersister¶ Persists instances of Relation.
-
relation_stash:
Stash¶ A stash adaptation of
relation_persister.
-
role_link_persister:
RoleLinkPersister¶ Persists instances of RoleLink.
-
role_persister:
RolePersister¶ Persists instances of Role.
-
roleset_persister:
RolesetPersister¶ Persists instances of Roleset.
-
roleset_stash:
Stash¶ A stash adaptation of
roleset_persister.
- class zensols.propbankdb.db.ExamplePersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.FramesetPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.FunctionPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None)[source]¶
Bases:
BeanDbPersister
Utility persister to access Function instances by label and ID. This is somewhat like a GoF flyweight pattern in that it attempts to minimize the memory footprint.
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None)¶
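The flyweight idea mentioned for FunctionPersister can be illustrated with a minimal, self-contained sketch. It does not use this package: the Function dataclass and the FunctionCache class below are hypothetical stand-ins, and the loader callable is an assumed substitute for a database lookup.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Function:
    """Hypothetical stand-in for the documented Function domain class."""
    label: str
    description: str = 'none'


class FunctionCache:
    """Hypothetical flyweight cache: one shared instance per label."""

    def __init__(self, loader: Callable[[str], Function]):
        # ``loader`` is an assumed callable that would hit the database
        self._loader = loader
        self._by_label: Dict[str, Function] = {}

    def get(self, label: str) -> Function:
        func = self._by_label.get(label)
        if func is None:
            # load once, then reuse the same object for every caller
            func = self._loader(label)
            self._by_label[label] = func
        return func


loads = []


def load_function(label: str) -> Function:
    # record each simulated database access
    loads.append(label)
    return Function(label)


cache = FunctionCache(load_function)
first = cache.get('PAG')
second = cache.get('PAG')
```

Both lookups return the same object, so many roles that share a function label hold one instance rather than many copies.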
- class zensols.propbankdb.db.InstallerConnectionManager(db_file, create_db=True, installer=None)[source]¶
Bases:
SqliteConnectionManager
A connection manager that first downloads the distribution SQLite Propbankdb file.
- __init__(db_file, create_db=True, installer=None)¶
- class zensols.propbankdb.db.PredicatePersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.RelationPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
Utility persister to access Relation instances by label and ID. This is somewhat like a GoF flyweight pattern in that it attempts to minimize the memory footprint.
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.RoleLinkPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.RolePersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
- class zensols.propbankdb.db.RolesetPersister(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)[source]¶
Bases:
BankObjectPersister
- __init__(conn_manager, sql_file=None, row_factory='tuple', select_name=None, select_by_id_name=None, select_exists_name=None, insert_name=None, update_name=None, delete_name=None, keys_name=None, count_name=None, _db=None, populators=<factory>)¶
zensols.propbankdb.domain module¶
Bank domain classes.
- class zensols.propbankdb.domain.Alias(uid=None, part_of_speech=None, word=None)[source]¶
Bases:
BankObject
Surface forms of the Roleset.lemma and their part of speech.
- __init__(uid=None, part_of_speech=None, word=None)¶
- uid: int = None¶
A unique identifier of the function.
- word: str = None¶
Surface forms of the
Roleset.lemma().
- exception zensols.propbankdb.domain.BankError[source]¶
Bases:
ApplicationError
Raised for this package’s application errors meant for the command line. It will result in a command line error and usage message.
- __module__ = 'zensols.propbankdb.domain'¶
- class zensols.propbankdb.domain.BankObject[source]¶
Bases:
PersistableContainer, Dictable
A base class for all *bank domain classes.
- __init__()¶
- exception zensols.propbankdb.domain.BankParseError[source]¶
Bases:
APIError
Raised for this package’s programmatic errors.
- __annotations__ = {}¶
- __module__ = 'zensols.propbankdb.domain'¶
- class zensols.propbankdb.domain.Example(uid=None, name=None, source=None, text=None, propbank=None)[source]¶
Bases:
BankObject
Examples of the usage of the Roleset.
- __init__(uid=None, name=None, source=None, text=None, propbank=None)¶
- name: str = None¶
The name, such as (
see-v: ARG0 and ARG1).
- propbank: Optional[PropBank] = None¶
The PropBank annotations for the example, which include token spans of the use of arguments.
- source: str = None¶
The source of the example, such as (
ontonotes mz/sinorama/10/ectb_1057).
- text: str = None¶
The text of the example, such as (
But recently many people...).
- uid: int = None¶
The database unique identifier.
- class zensols.propbankdb.domain.Frameset(uid=None, path=None, predicates=None)[source]¶
Bases:
BankObject
Contains all the Predicate definitions from a file.
- __init__(uid=None, path=None, predicates=None)¶
- path: Path = None¶
The file from which the definition was parsed.
- predicates: Tuple[Predicate] = None¶
The role sets for a lemmatized word.
- uid: int = None¶
The database unique identifier.
- class zensols.propbankdb.domain.Function(uid=None, label=None, description='none', group='unknown')[source]¶
Bases:
BankObject
The role function of the role, such as PAG for prototypical agent. These are taken from the frameset DTD from the AMR 3.0 corpus.
- NO_DESCRIPTION: ClassVar[str] = 'none'¶
A constant used when the function has no description.
- __init__(uid=None, label=None, description='none', group='unknown')¶
- description: str = 'none'¶
The human readable description of the function.
- group: str = 'unknown'¶
The group the function belongs to (i.e.
spacial).
- label: str = None¶
The label (i.e.
PAG).
- uid: int = None¶
A unique identifier of the function.
- class zensols.propbankdb.domain.PartOfSpeech(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
Enum
The part of speech identifier in aliases.
- adjective = 'j'¶
- adverb = 'r'¶
- noun = 'n'¶
- preposition = 'p'¶
- unknown = '-'¶
- verb = 'v'¶
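The one-character values above can be looked up with the standard Enum machinery. The sketch below redeclares the enum locally so that it runs without the package installed; parse_pos is a hypothetical helper, not part of the documented API.

```python
from enum import Enum


class PartOfSpeech(Enum):
    """Local mirror of the documented enum (redeclared for this sketch)."""
    adjective = 'j'
    adverb = 'r'
    noun = 'n'
    preposition = 'p'
    unknown = '-'
    verb = 'v'


def parse_pos(code: str) -> PartOfSpeech:
    """Hypothetical helper: map a one-character code to a member, falling
    back to ``unknown`` for codes that are not defined."""
    try:
        # Enum lookup by value
        return PartOfSpeech(code)
    except ValueError:
        return PartOfSpeech.unknown


verb = parse_pos('v')
fallback = parse_pos('x')
```

Falling back to unknown is a design choice of this sketch; raising on an unrecognized code may be preferable when the input is expected to be clean.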
- class zensols.propbankdb.domain.Predicate(uid=None, lemma=None, rolesets=None)[source]¶
Bases:
BankObject
Contains the role sets for a lemmatized word.
- __init__(uid=None, lemma=None, rolesets=None)¶
- lemma: str = None¶
The lemmatized version of the word this role set describes, such as
see.
- rolesets: Tuple[Roleset] = None¶
The associated role sets for this predicate.
- uid: int = None¶
The database unique identifier.
- class zensols.propbankdb.domain.PropBank(relative_indicies, relative_tokens, argument_spans)[source]¶
Bases:
BankObject
The PropBank annotations for the Example, which include token spans of the use of arguments.
- UNKONWN_INDEX: ClassVar[int] = -1¶
- __init__(relative_indicies, relative_tokens, argument_spans)¶
- argument_spans: Tuple[PropBankArgumentSpan]¶
The spans of the arguments used in the example.
- relative_indicies: Tuple[int]¶
The 0-based index of the relative token in the example.
- relative_tokens: Tuple[str]¶
The relative token in example (for example,
see).
- class zensols.propbankdb.domain.PropBankArgumentSpan(type, span, token)[source]¶
Bases:
BankObject
An argument span used in a PropBank.
- __init__(type, span, token)¶
- span: Tuple[int, int]¶
The 0-based inclusive token span in the form
(start, end).
- type: str¶
The type (index) of argument (for example,
ARG0).
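Because the span is inclusive on both ends, it does not map directly onto a Python slice, which excludes its end index. A minimal sketch of the conversion follows; span_tokens and the token list are hypothetical and only illustrate the indexing convention.

```python
from typing import Sequence, Tuple


def span_tokens(tokens: Sequence[str], span: Tuple[int, int]) -> Tuple[str, ...]:
    """Return the tokens covered by a 0-based inclusive ``(start, end)`` span."""
    start, end = span
    # add one to the end index since Python slices exclude it
    return tuple(tokens[start:end + 1])


# made-up tokens, not taken from the database
tokens = ('She', 'saw', 'the', 'bright', 'star')
arg1 = span_tokens(tokens, (2, 4))
single = span_tokens(tokens, (1, 1))
```

A span whose start equals its end, such as (1, 1), covers exactly one token.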
- class zensols.propbankdb.domain.Reification(concept, source_argument, target_argument)[source]¶
Bases:
BankObject
Reifications are a particular kind of transformation that replaces an edge relation with a new node and two outgoing edge relations, with one inverted.
- __init__(concept, source_argument, target_argument)¶
- concept: RolesetId¶
The concept to add.
- source_argument: int¶
The source argument used to index.
- target_argument: int¶
The target argument to create.
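The transformation described for Reification can be sketched on plain triples. This is an illustration under stated assumptions rather than the package's implementation: the reify function, the triple layout, the `_r` node name, and the reading of source_argument and target_argument as ARG numbers are all hypothetical, and the :location to be-located-at-91 pairing is used only as an example.

```python
from typing import List, Tuple

# (source node, relation, target node)
Triple = Tuple[str, str, str]


def reify(edge: Triple, concept: str, source_argument: int,
          target_argument: int, node: str = '_r') -> List[Triple]:
    """Replace one edge with a new concept node and two edges (hypothetical)."""
    source, _relation, target = edge
    return [
        # the new node carrying the reifying concept
        (node, ':instance', concept),
        # inverted edge: written from the original source to the new node
        (source, f':ARG{source_argument}-of', node),
        # regular outgoing edge from the new node to the original target
        (node, f':ARG{target_argument}', target),
    ]


triples = reify(('b', ':location', 'c'), 'be-located-at-91', 1, 2)
```

The single `:location` edge becomes three triples: an instance triple for the new node, an inverted `:ARG1-of` edge from `b`, and an `:ARG2` edge to `c`.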
- class zensols.propbankdb.domain.Relation(uid=None, label=None, type=None, description=None, regex=None, reification=None)[source]¶
Bases:
BankObject
Represents an AMR relation, which is a label on an edge of an AMR graph such as :ARG0-of. Note that a relation is often referred to as a role in the context of Penman notation. However, you can think of a role instance, which is an edge in an AMR graph, as an instance of this class.
- REGEX: ClassVar[re.Pattern] = re.compile('^:([^0-9-]+)(\\d+)?(?:-(of))?$')¶
The regular expression used to parse AMR roles.
- __init__(uid=None, label=None, type=None, description=None, regex=None, reification=None)¶
- description: str = None¶
A somewhat human readable string describing the relation. This is used to create the relation embeddings.
- label: str = None¶
The surface name of the relation (i.e.
ARG from :ARG0-of).
- regex: re.Pattern = None¶
A regular expression used to match role instances.
- reification: Optional[Reification] = None¶
The reification of the relation if any exist.
- type: str = None¶
The type of relation (i.e. general for
:ARG or date for time).
- uid: int = None¶
The database unique identifier.
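The REGEX pattern documented for Relation can be exercised with the standard re module alone. The pattern below is copied from the class attribute shown above; parse_role and the shape of its return value are hypothetical and are not part of the package.

```python
import re
from typing import Optional, Tuple

# pattern copied from the documented Relation.REGEX attribute
REGEX = re.compile('^:([^0-9-]+)(\\d+)?(?:-(of))?$')


def parse_role(role: str) -> Optional[Tuple[str, Optional[int], bool]]:
    """Split an AMR role into ``(label, index, inverted)`` (hypothetical)."""
    match = REGEX.match(role)
    if match is None:
        return None
    label, index, inverted = match.groups()
    # the numeric index and the ``-of`` suffix are both optional groups
    return label, (int(index) if index is not None else None), inverted is not None


arg0_of = parse_role(':ARG0-of')
time = parse_role(':time')
missing_colon = parse_role('ARG0')
```

Here `:ARG0-of` yields the label ARG, index 0, and an inverted flag, which matches the label example given for the label attribute. A string without the leading colon does not match.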
- class zensols.propbankdb.domain.Role(uid=None, description=None, function=None, index=None, role_links=None)[source]¶
Bases:
BankObject
Defines an argument of the propbank role, which in AMR has the syntax :ARG1 for the second (0-index) argument.
- __init__(uid=None, description=None, function=None, index=None, role_links=None)¶
- description: str = None¶
The human readable description of the role, such as (
Cause of hardening).
- function: Function = None¶
The function of the role, such as
PAG for prototypical agent.
- index: str = None¶
The index of the role’s argument, which is a number, or an
M for common adjuncts that don’t qualify for numbered argument status.
- role_links: Tuple[RoleLink] = None¶
Links to the source banks for this role.
- uid: int = None¶
The database unique identifier.
- class zensols.propbankdb.domain.RoleLink(uid=None, cls=None, resource=None, version=None, name=None)[source]¶
Bases:
BankObject
Contains links in to other source banks.
- __init__(uid=None, cls=None, resource=None, version=None, name=None)¶
- cls: str = None¶
The roleset’s Levin class, which is the ID in to the bank such as
other_cos-45.4.
- name: str = None¶
The name of the role, such as
agent or cause.
- resource: RoleResource = None¶
The name of the source bank, such as
VerbNet.
- uid: int = None¶
The database unique identifier.
- version: str = None¶
The version of the source bank, such as
verbnet3.3.
- write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]¶
Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.
If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.
Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.
- Parameters:
depth (int) – the starting indentation depth
writer (TextIOBase) – the writer to dump the content of this writable
- class zensols.propbankdb.domain.RoleResource(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
Enum
The source bank of the role. This has the XML attribute resource.
- framenet = 2¶
- verbnet = 1¶
- class zensols.propbankdb.domain.Roleset(uid=None, id=None, name=None, aliases=None, roles=None, examples=None)[source]¶
Bases:
BankObject
A bank role set entry that contains a grouping of Role instances.
- __init__(uid=None, id=None, name=None, aliases=None, roles=None, examples=None)¶
- aliases: Tuple[Alias] = None¶
The surface forms of the role set.
- examples: Tuple[Example] = None¶
The examples for the roleset.
- id: RolesetId = None¶
The
*bank identifier of the role set, such as see.01.
- name: str = None¶
The human readable short description of the role set, such as
view.
- roles: Tuple[Role] = None¶
The roles that define this set.
- uid: int = None¶
The database unique identifier.
- class zensols.propbankdb.domain.RolesetId(label=None, lemma=None, index=None, normalize=True)[source]¶
Bases:
BankObject
A role set identifier such as see.01 or see-01. Note the latter example is to support AMR formatted nodes.
- See:
- See:
- __init__(label=None, lemma=None, index=None, normalize=True)¶
- property is_valid: bool¶
Whether this is a valid formatted role set ID, which means it has both a
lemma and index.
- label: str = None¶
The surface label of the ID (i.e.
see-01).
- normalize: InitVar[bool] = True¶
Whether to normalize the label.
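The two label styles mentioned for RolesetId (see.01 and see-01) can both be split into a lemma and an index with one regular expression. The sketch below is hypothetical and standalone: split_roleset_id is not the package's parser, and it makes no claim about which form the normalize option produces.

```python
import re
from typing import Optional, Tuple

# a lemma, then a ``.`` or ``-`` separator, then a numeric index
_ROLESET_ID = re.compile(r'^(.+)[.-](\d+)$')


def split_roleset_id(label: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a label into ``(lemma, index)``, or ``(None, None)`` when the
    label lacks either part (hypothetical helper)."""
    match = _ROLESET_ID.match(label)
    if match is None:
        return None, None
    # the index is kept as a string to preserve zero padding
    return match.group(1), match.group(2)


dotted = split_roleset_id('see.01')
dashed = split_roleset_id('see-01')
bare = split_roleset_id('see')
```

A label with no index, such as a bare lemma, yields no parts here, which mirrors the is_valid requirement that both a lemma and an index be present.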
zensols.propbankdb.embedgen module¶
Creates embeddings for PropBank objects. This module is used to create the
distribution file’s embeddings with EmbeddingGenerator. The
distribution file’s contents include the Framenet object graphs, embeddings and
a metadata file used to configure EmbeddingPopulator.embed_model, which
is used by the package to populate the embeddings.
- class zensols.propbankdb.embedgen.EmbeddingGenerator(config, doc_parser, word_piece_doc_factory, populator, db, output_dir, output_decimals, output_limit=9223372036854775807)[source]¶
Bases:
object
Creates sentence embeddings for PropBank objects (see module docs).
- __init__(config, doc_parser, word_piece_doc_factory, populator, db, output_dir, output_decimals, output_limit=9223372036854775807)¶
-
config:
Configurable¶ Used to copy application context in to the metadata file included in the distribution file.
-
doc_parser:
FeatureDocumentParser¶ The parser used for parsing the English in the Framenets.
-
populator:
EmbeddingPopulator¶ The embedding populator used to create the embedding keys.
-
word_piece_doc_factory:
WordPieceFeatureDocumentFactory¶ Used to create the embeddings from the language in the Framenets.
zensols.propbankdb.embedpop module¶
Populate embeddings generated from EmbeddingGenerator.
- see:
embedgen
- class zensols.propbankdb.embedpop.EmbeddingBankObjectPopulator(embed_populator)[source]¶
Bases:
BankObjectPopulator
- __init__(embed_populator)¶
-
embed_populator:
EmbeddingPopulator¶
- class zensols.propbankdb.embedpop.EmbeddingPopulator(config, embed_model, function_persister, torch_config, roleset_key_pattern='s~{rs.id.label}', role_key_pattern='r~{rs.id}~{r.index}', function_key_pattern='f-{f.label}', relation_key_pattern='e-{r.label}')[source]¶
Bases:
object
Adds embeddings to certain BankObject instances (see module docs).
-
DEFAULT_SECTION:
ClassVar[str] = 'propbankdb_default'¶ The default application context section that has the distribution file version.
-
EMBEDDING_SECTION:
ClassVar[str] = 'embedding'¶ The embedding information section in the distribution metadata config file.
-
FILE_NAME_OPTION:
ClassVar[str] = 'file_name'¶ The option key in the distribution metadata config file.
-
SENT_TEXT_FILE:
ClassVar[str] = 'sentence.csv'¶ The name of the CSV file that has the sentence output with keys.
- __init__(config, embed_model, function_persister, torch_config, roleset_key_pattern='s~{rs.id.label}', role_key_pattern='r~{rs.id}~{r.index}', function_key_pattern='f-{f.label}', relation_key_pattern='e-{r.label}')¶
-
config:
Configurable¶ Used to check API with model version.
-
embed_model:
WordEmbedModel¶ The embedding model that was created from
EmbeddingGenerator and used to populate data in BankObject instances.
-
function_persister:
FunctionPersister¶ The persister used to populate embeddings for Function instances.
- get_sentence(*objs)[source]¶
Get the sentence used to produce the embedding in
objs bank objects.
- See:
- Return type:
- object_to_key(*objs)[source]¶
Generate a key from
objs by calling one of the *_to_key methods.
- Return type:
- populate_roleset(roleset)[source]¶
Populate embeddings for all of a role set’s object graph.
- Parameters:
roleset (Roleset) – the role set with respective BaseObject.embedding fields to be populated
-
torch_config:
TorchConfig¶ Used to copy the embedding matrix to the GPU.
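The default key patterns in the EmbeddingPopulator signature look like str.format templates with attribute access. Assuming they are applied that way, the sketch below shows the keys they would produce. The stand-in classes are hypothetical, and the assumption that a roleset ID prints as its label is this sketch's own.

```python
from types import SimpleNamespace

# default patterns copied from the documented constructor signature
roleset_key_pattern = 's~{rs.id.label}'
role_key_pattern = 'r~{rs.id}~{r.index}'
function_key_pattern = 'f-{f.label}'
relation_key_pattern = 'e-{r.label}'


class FakeRolesetId:
    """Hypothetical stand-in; assumes the ID formats as its label."""

    def __init__(self, label: str):
        self.label = label

    def __str__(self) -> str:
        return self.label


# stand-in objects exposing only the attributes the patterns reference
roleset = SimpleNamespace(id=FakeRolesetId('see.01'))
role = SimpleNamespace(index='0')
function = SimpleNamespace(label='PAG')
relation = SimpleNamespace(label='ARG')

roleset_key = roleset_key_pattern.format(rs=roleset)
role_key = role_key_pattern.format(rs=roleset, r=role)
function_key = function_key_pattern.format(f=function)
relation_key = relation_key_pattern.format(r=relation)
```

Under these assumptions the keys are `s~see.01`, `r~see.01~0`, `f-PAG` and `e-ARG`. The differing prefixes keep keys for role sets, roles, functions, and relations from colliding in a single embedding vocabulary.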
zensols.propbankdb.load module¶
Parses frameset XML files from the file system.
- class zensols.propbankdb.load.DatabaseLoader(parser, relation_loader, db, frameset_limit=9223372036854775807)[source]¶
Bases:
object
Loads the parsed frameset XML files in to an SQLite database.
- __init__(parser, relation_loader, db, frameset_limit=9223372036854775807)¶
-
parser:
FramesetParser¶ Parses all frameset XML files on the file system, which is then used to load the database.
-
relation_loader:
RelationLoader¶ Loads AMR role relations.
- class zensols.propbankdb.load.FramesetParser(installer, function_path, frames_dir)[source]¶
Bases:
object
Parses all frameset XML files on the file system.
- __init__(installer, function_path, frames_dir)¶
-
frames_dir:
Path¶ The relative path from where the GitHub propbank files are downloaded and uncompressed.
- property function_df: Dict[str, Function]¶
Keys are function labels/names, values are the function instances.
zensols.propbankdb.pack module¶
Package resources for distribution.