zensols.amr.wlk package#
Submodules#
zensols.amr.wlk.amr_similarity#
- class zensols.amr.wlk.amr_similarity.AmrWasserPreProcessor(w2v_uri='glove-wiki-gigaword-100', relation_type='scalar', init='random_uniform', is_resettable=True)[source]#
Bases:
Preprocessor
- __init__(w2v_uri='glove-wiki-gigaword-100', relation_type='scalar', init='random_uniform', is_resettable=True)[source]#
Initialize the Preprocessor object.
- Parameters:
w2v_uri (string) –
URI of the desired word embedding, e.g., ‘word2vec-google-news-100’, ‘glove-twitter-100’, ‘fasttext-wiki-news-subwords-300’, etc. If None, only random embeddings are used.
Alternatively, a dict-type object with pretrained vectors.
relation_type (string) – edge label representation type; either ‘scalar’ or ‘vector’
init (string) – how to initialize edge weights?
is_resettable (bool) – can the parameters be reset?
- embeds(gs1, gs2)[source]#
Embeds all graphs, i.e., assigns embeddings to node labels and edge labels.
- Parameters:
gs1 (list with nx medi graphs) – a list of graphs
gs2 (list with nx medi graphs) – a list of graphs
- Returns:
None
- get_edge_labels(G)[source]#
Retrieve all edge labels from a graph
- Parameters:
G (nx medi graph) – nx multi edge dir. graph
- Returns:
list with edge labels
- sample_edge_label_param(n=1)[source]#
Initialize edge parameters.
The idea behind min entropy is to make edges easier to distinguish. This helps with label discrimination in ARG tasks (but slightly reduces performance in other tasks, where similar or learnt edge weights may be better).
- Parameters:
n (int) – how many parameters are needed?
- Returns:
array with parameters
- class zensols.amr.wlk.amr_similarity.EmSimilarity[source]#
Bases:
object
- class zensols.amr.wlk.amr_similarity.GraphSimilarityPredictor[source]#
Bases:
object
(Interface): predicts similarities for paired inputs that are multi edge networkx Di-Graphs.
- class zensols.amr.wlk.amr_similarity.GraphSimilarityPredictorAligner[source]#
Bases:
GraphSimilarityPredictor
(Interface): predicts similarities for paired AMRs. Inputs are multi edge networkx Di-Graphs.
- class zensols.amr.wlk.amr_similarity.NodeDistanceMatrixGenerator(params=None, param_keys=None, iters=2, communication_direction='both')[source]#
Bases:
object
Given a list of graph tuples, generates node embeddings and produces a distance matrix.
- __init__(params=None, param_keys=None, iters=2, communication_direction='both')[source]#
Initializes the node embedding generator.
- collect_graph_embed(nx_latent)[source]#
collect the node embeddings from a graph
- Parameters:
nx_latent (nx medi graph) – a graph that has node embeddings as attributes
- Returns:
the node embeddings
labels of nodes
- class zensols.amr.wlk.amr_similarity.Preprocessor[source]#
Bases:
object
(Interface): preprocesses data of paired multi edge networkx Di-Graphs
- class zensols.amr.wlk.amr_similarity.WLK(simfun='cosine', iters=2, communication_direction='both')[source]#
Bases:
GraphSimilarityPredictor
- get_stats(g1, g2, stattype='nodecount')[source]#
Get the feature vectors for a statistic type.
- Parameters:
g1 (nx medi graph) – graph A
g2 (nx medi graph) – graph B
stattype (string) – statistics type, default: node count
- Returns:
vector for A
vector for B
vocab
- nc(g1, g2)[source]#
feature vector constructor for node BOW of two graphs
- Parameters:
g1 (nx medi graph) – graph A
g2 (nx medi graph) – graph B
- Returns:
feature vector for graph A, feature vector for graph B, vocab
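A minimal sketch of how a node bag-of-words feature construction could look, assuming the node labels of each graph are already available as plain lists of strings. The helper name `node_bow` and the sorted vocabulary order are hypothetical and are not taken from the library.

```python
from collections import Counter


def node_bow(labels_a, labels_b):
    """Build count vectors over a shared vocabulary of node labels (illustrative only)."""
    # A shared, sorted vocabulary gives both vectors the same index layout.
    vocab = sorted(set(labels_a) | set(labels_b))
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    vec_a = [counts_a[word] for word in vocab]
    vec_b = [counts_b[word] for word in vocab]
    return vec_a, vec_b, vocab


vec_a, vec_b, vocab = node_bow(["cat", "like", "dog"], ["cat", "like", "cat"])
```

Labels that occur in only one graph get a zero count in the other vector, so the two vectors always have equal length.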
- tc(g1, g2)[source]#
feature vector constructor for triple BOW of two graphs
- Parameters:
g1 (nx medi graph) – graph A
g2 (nx medi graph) – graph B
- Returns:
feature vector for graph A, feature vector for graph B, vocab
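The triple variant can be sketched in the same spirit, counting whole (source label, edge label, target label) triples instead of single node labels. This is an assumption-laden illustration: the helper `triple_bow` and the tuple layout are hypothetical, and the dot product at the end is only there to show how shared triples contribute to a similarity.

```python
from collections import Counter


def triple_bow(triples_a, triples_b):
    """Build count vectors over a shared vocabulary of labeled triples (illustrative only)."""
    vocab = sorted(set(triples_a) | set(triples_b))
    counts_a, counts_b = Counter(triples_a), Counter(triples_b)
    return ([counts_a[t] for t in vocab], [counts_b[t] for t in vocab], vocab)


triples_a = [("like", ":arg0", "cat"), ("like", ":arg1", "dog")]
triples_b = [("like", ":arg0", "dog"), ("like", ":arg1", "dog")]
vec_a, vec_b, vocab = triple_bow(triples_a, triples_b)

# Only triples present in both graphs contribute to the dot product.
overlap = sum(a * b for a, b in zip(vec_a, vec_b))
```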
- wl(nx_g1, nx_g2, iters=2, stattype='nodecount')[source]#
collect vectors over WL iterations
- Parameters:
nx_g1 (nx medi graph) – graph A
nx_g2 (nx medi graph) – graph B
- Returns:
a list for every graph that contains vectors
- wl_gather_node(node, G)[source]#
gather edges+labels for a node from the neighborhood
- Parameters:
node (hashable object) – a node of the graph
G (nx medi graph) – the graph
- Returns:
a list with edge+label from neighbors
- wl_gather_nodes(G)[source]#
apply gathering (wl_gather_node) for all nodes
- Parameters:
G (nx medi graph) – the graph
- Returns:
a dictionary node -> neighborhood
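The gathering step can be illustrated without networkx. The sketch below assumes a graph given as a node-to-label dictionary plus a list of (source, edge label, target) edges; the function name `gather_neighborhoods` and the `-of` suffix used to mark incoming edges are hypothetical choices, not the library's behavior.

```python
def gather_neighborhoods(node_labels, edges):
    """Map every node to the sorted (edge label, neighbor label) pairs around it."""
    hood = {node: [] for node in node_labels}
    for src, label, tgt in edges:
        # Outgoing edge as seen from the source node.
        hood[src].append((label, node_labels[tgt]))
        # Incoming edge as seen from the target node; "-of" marks the reversed direction.
        hood[tgt].append((label + "-of", node_labels[src]))
    # Sorting makes the result independent of the order in which edges are listed.
    return {node: sorted(pairs) for node, pairs in hood.items()}


node_labels = {"x": "like", "y": "cat", "z": "dog"}
edges = [("x", ":arg0", "y"), ("x", ":arg1", "z")]
hoods = gather_neighborhoods(node_labels, edges)
```

Collecting both directions here roughly corresponds to passing messages both ways; a one-directional variant would keep only one of the two `append` calls.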
- wl_iter(nx_g1, nx_g2, stattype='nodecount')[source]#
collect vectors over one WL iteration
- Parameters:
nx_g1 (nx medi graph) – graph A
nx_g2 (nx medi graph) – graph B
- Returns:
a list for every graph that contains vectors
new aggregate graphs
- wlk(nx_g1, nx_g2, iters=2, weighting='linear', kt='dot', stattype='nodecount', init_vecs=(None, None))[source]#
compute WL kernel similarity of graph A and B
- Parameters:
nx_g1 (nx medi graph) – graph A
nx_g2 (nx medi graph) – graph B
iters (int) – iterations
weighting (string) – decrease weight of iteration stats
kt (string) – kernel type, default dot
stattype (string) – which features? default: nodecount
init_vecs (tuple) – perhaps there are already some features for A and B?
- Returns:
kernel similarity
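To make the overall procedure concrete, here is a self-contained sketch of a Weisfeiler-Leman style similarity on toy graphs. It is not the library's implementation: graphs are plain (node labels, edges) pairs, all function names are hypothetical, iteration weighting is omitted, and cosine similarity is used in place of a configurable kernel type.

```python
import math
from collections import Counter


def wl_relabel(node_labels, edges):
    """One WL iteration: new label = old label plus sorted neighbor descriptions."""
    hood = {node: [] for node in node_labels}
    for src, label, tgt in edges:
        hood[src].append(f"{label}>{node_labels[tgt]}")  # outgoing edge
        hood[tgt].append(f"{label}<{node_labels[src]}")  # incoming edge
    return {
        node: node_labels[node] + "|" + ",".join(sorted(hood[node]))
        for node in node_labels
    }


def wl_features(node_labels, edges, iters=2):
    """Count node labels at iteration 0 and after each of `iters` relabelings."""
    feats = Counter()
    labels = dict(node_labels)
    for k in range(iters + 1):
        # Tag labels with the iteration so counts from different depths stay separate.
        feats.update((k, lab) for lab in labels.values())
        if k < iters:
            labels = wl_relabel(labels, edges)
    return feats


def cosine(c1, c2):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(c1[key] * c2[key] for key in c1.keys() & c2.keys())
    norm = math.sqrt(sum(v * v for v in c1.values())) * math.sqrt(
        sum(v * v for v in c2.values())
    )
    return dot / norm if norm else 0.0


def wl_similarity(g1, g2, iters=2):
    return cosine(wl_features(*g1, iters=iters), wl_features(*g2, iters=iters))


edges = [("x", ":arg0", "y"), ("x", ":arg1", "z")]
g_a = ({"x": "like", "y": "cat", "z": "dog"}, edges)
g_b = ({"x": "like", "y": "dog", "z": "cat"}, edges)  # arguments swapped
same = wl_similarity(g_a, g_a)
diff = wl_similarity(g_a, g_b)
```

The two toy graphs share all node labels at iteration 0 but none after relabeling, which is why swapping the arguments lowers the score even though the bag of node labels is identical.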
- class zensols.amr.wlk.amr_similarity.WasserWLK(preprocessor, iters=2, stability=0, communication_direction='both')[source]#
Bases:
GraphSimilarityPredictorAligner
- __init__(preprocessor, iters=2, stability=0, communication_direction='both')[source]#
Initializes Wasserstein Weisfeiler Leman Kernel
- Parameters:
preprocessor (Preprocessor) – an object that assigns embeddings to graph nodes and labels
iters (int) – K, the number of iterations
stability (int) – in case there is randomness in pre-processing (e.g., random embeddings for node labels not found in word2vec), an expected distance matrix is computed by repeated sampling
communication_direction (string) – direction in which messages are passed
- Returns:
None
zensols.amr.wlk.graph_helpers#
- class zensols.amr.wlk.graph_helpers.GraphParser(input_format='penman', edge_to_node_transform=False)[source]#
Bases:
object
- zensols.amr.wlk.graph_helpers.amrtriples2nxmedigraph(triples, edge_to_node_transform=False)[source]#
builds nx medi graph from amr triples.
- Parameters:
- Returns:
nx multi edge di graph where nodes are ids and nodes and edges carry the attribute “label”.
- zensols.amr.wlk.graph_helpers.get_var_concept_map(triples)[source]#
Creates a dictionary that maps variables to their concepts,
e.g., [(x, :instance, y),…] -> {x:y,…}
- Parameters:
triples (list) – triples
- Returns:
dictionary
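The mapping described above can be sketched in a few lines, assuming triples are (source, relation, target) tuples and that concepts are attached through the `:instance` relation. The function name `var_concept_map` is a stand-in for illustration and is not the library function itself.

```python
def var_concept_map(triples):
    """Map each variable to its concept using only the ':instance' triples."""
    return {src: tgt for src, rel, tgt in triples if rel == ":instance"}


triples = [
    ("x", ":instance", "like"),
    ("y", ":instance", "cat"),
    ("x", ":arg0", "y"),  # non-instance triples are ignored
]
mapping = var_concept_map(triples)
```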
- zensols.amr.wlk.graph_helpers.get_var_index_map(triples)[source]#
Creates a dictionary that maps variables to indices,
e.g., [(x, :instance, y),…] -> {x:y,…}
- Parameters:
triples (list) – triples
- Returns:
dictionary
- zensols.amr.wlk.graph_helpers.nx_digraph_to_triples(G)[source]#
convert nx graph to triples. Attention: there may be info loss
- zensols.amr.wlk.graph_helpers.stringamr2graph(string)[source]#
uses penman to convert serialized AMR to penman graph
- Parameters:
string (str) – serialized AMR ‘(n / concept :arg1 ()…)’
- Returns:
penman graph object
- zensols.amr.wlk.graph_helpers.tsv2triples(string)[source]#
Parses tsv graph to triples
- Parameters:
string (str) –
tsv graph e.g.
x y :edge_1
y z :edge_2
z x :edge_3
x dog :instance
y cat :instance
z like :instance
defines a graph between source and target nodes with edge labels. Node labels are indicated with special :instance edge
- Returns:
triples
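A rough sketch of parsing such input, assuming one edge per line with the fields in the order source, target, label. The name `parse_tsv_graph` and the (source, label, target) ordering of the output triples are assumptions made for illustration; the library's own output layout may differ.

```python
def parse_tsv_graph(text):
    """Turn 'source target :label' lines into (source, label, target) triples."""
    triples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        src, tgt, label = fields
        triples.append((src, label, tgt))
    return triples


tsv = "x\ty\t:edge_1\nx\tdog\t:instance\ny\tcat\t:instance\n"
triples = parse_tsv_graph(tsv)
```

Splitting on any whitespace keeps the sketch short, but it would break on labels that contain spaces; splitting on the tab character alone would be stricter.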
zensols.amr.wlk.score#
Weisfeiler-Leman Graph Kernels for AMR Graph Similarity
Citation:
Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the
Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph
Similarity. Transactions of the Association for Computational Linguistics,
9:1425–1441.
- see:
- class zensols.amr.wlk.score.WeisfeilerLemanKernelScoreCalculator(reverse_sents=False, params=<factory>)[source]#
Bases:
ScoreMethod
Computes the Weisfeiler-Leman Kernel (WLK) scores of AMR sentences (see module docs).
Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity. Transactions of the Association for Computational Linguistics, 9:1425–1441.
- See:
- __init__(reverse_sents=False, params=<factory>)#
Module contents#
Weisfeiler-Leman Graph Kernels for AMR Graph Similarity
Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity. Transactions of the Association for Computational Linguistics, 9:1425–1441.
- see: