zensols.amr.wlk package#

Submodules#

zensols.amr.wlk.amr_similarity#

Inheritance diagram of zensols.amr.wlk.amr_similarity
class zensols.amr.wlk.amr_similarity.AmrWasserPreProcessor(w2v_uri='glove-wiki-gigaword-100', relation_type='scalar', init='random_uniform', is_resettable=True)[source]#

Bases: Preprocessor

__init__(w2v_uri='glove-wiki-gigaword-100', relation_type='scalar', init='random_uniform', is_resettable=True)[source]#

Initialize the Preprocessor object

Parameters:
  • w2v_uri (string) –

    URI of the desired word embedding, e.g., ‘word2vec-google-news-100’, ‘glove-twitter-100’, ‘fasttext-wiki-news-subwords-300’, etc. If None, only random embeddings are used.

    Alternatively, a dict-type object with pretrained vectors.

  • relation_type (string) – edge label representation type, either ‘scalar’ or ‘vector’

  • init (string) – how to initialize edge weights?

  • is_resettable (bool) – can the parameters be reset?

embed(G)[source]#
embeds(gs1, gs2)[source]#

embeds all graphs, i.e., assigns embeddings to node labels and edge labels

Parameters:
  • gs1 (list with nx medi graphs) – a list of graphs

  • gs2 (list with nx medi graphs) – a list of graphs

Returns:

None

get_edge_labels(G)[source]#

Retrieve all edge labels from a graph

Parameters:

G (nx medi graph) – nx multi edge dir. graph

Returns:

list with edge labels

sample_edge_label_param(n=1)[source]#

initialize edge parameters.

The idea behind minimum entropy is to make edges easier to distinguish. This helps with label discrimination in ARG tasks, but slightly reduces performance in other tasks, where similar or learned edge weights may be better.

Parameters:

n (int) – how many parameters are needed?

Returns:

array with parameters

class zensols.amr.wlk.amr_similarity.EmSimilarity[source]#

Bases: object

__init__()[source]#
ems_multi(distss, v1s, v2s, parallel=False)[source]#

Predict WWLK similarities for two (parallel) data sets

Parameters:
  • amrs1 (list) – list with nx medi graphs a_1,…,a_n

  • amrs2 (list) – list with nx medi graphs b_1,…,b_n

  • parallel (boolean) – parallelize computation? default=no

Returns:

similarities for AMR graphs

class zensols.amr.wlk.amr_similarity.GraphSimilarityPredictor[source]#

Bases: object

(Interface): predicts similarities for paired inputs that are multi-edge networkx directed graphs

predict(graphs_1, graphs_2)[source]#
validate(graphs)[source]#
class zensols.amr.wlk.amr_similarity.GraphSimilarityPredictorAligner[source]#

Bases: GraphSimilarityPredictor

(Interface): predicts similarities for paired AMRs; inputs are multi-edge networkx directed graphs

predict_and_align(graphs_1, graphs_2, node_map1, node_map2)[source]#
validate(graphs, nodemaps)[source]#
class zensols.amr.wlk.amr_similarity.NodeDistanceMatrixGenerator(params=None, param_keys=None, iters=2, communication_direction='both')[source]#

Bases: object

Given a list of graph tuples, generates node embeddings and produces a distance matrix

__init__(params=None, param_keys=None, iters=2, communication_direction='both')[source]#

Initializes the node embedding generator

Parameters:
  • params (array) – edge parameters

  • param_keys (dict) – maps from edge labels to parameter index

  • iters (int) – contextualization iterations

  • communication_direction – either “both”, “fromout”, or “fromin”; specifies the message passing direction (see arguments of main_wlk_wasser.py)

collect_graph_embed(nx_latent)[source]#

collect the node embeddings from a graph

Parameters:

nx_latent (nx medi graph) – a graph that has node embeddings as attributes

Returns:

  • the node embeddings

  • labels of nodes

generate(amrs1, amrs2, parallel=False)[source]#

for two (parallel) data sets, calls _wl_embed_single on each graph pair

Parameters:
  • amrs1 (list) – list with nx medi graphs a_1,…,a_n

  • amrs2 (list) – list with nx medi graphs b_1,…,b_n

  • parallel (boolean) – parallelize computation? default=no

Returns:

output of _wl_embed_single for each graph pair

get_params()[source]#

get edge params

maybe_has_param(label)[source]#

safe retrieval of an edge parameter

norm(x)[source]#

scale vector to length 1
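
As an illustration, scaling a vector to length 1 usually means dividing it by its Euclidean norm. The sketch below assumes exactly that; the name `unit_norm` and the handling of the zero vector are hypothetical and not taken from the library.

```python
import math
from typing import List, Sequence


def unit_norm(x: Sequence[float]) -> List[float]:
    """Return x scaled to Euclidean length 1 (hypothetical helper)."""
    length = math.sqrt(sum(v * v for v in x))
    if length == 0.0:
        # Assumption: a zero vector is returned unchanged to avoid division by zero.
        return list(x)
    return [v / length for v in x]


print(unit_norm([3.0, 4.0]))
```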

set_params(params, idx=None)[source]#

set edge params

class zensols.amr.wlk.amr_similarity.Preprocessor[source]#

Bases: object

(Interface): preprocesses data of paired multi edge networkx Di-Graphs

prepare(graphs_1, graphs_2)[source]#
reset()[source]#
transform(graphs_1, graphs_2)[source]#

preprocesses amr graphs

class zensols.amr.wlk.amr_similarity.WLK(simfun='cosine', iters=2, communication_direction='both')[source]#

Bases: GraphSimilarityPredictor

__init__(simfun='cosine', iters=2, communication_direction='both')[source]#
create_fea_vec(items, vocab)[source]#

create feature vector from BOW list and vocab

Parameters:
  • items (list) – list with items e.g. [x, y, z]

  • vocab (dict) – dict with item-> id eg. {x:2, y:4, z:5}

Returns:

feature vector, e.g., [0, 0, 1, 0, 1, 1]
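
The example above can be reproduced with a small standalone sketch. It assumes that the vector has one slot per id up to the largest id in the vocabulary and that items missing from the vocabulary are skipped; `bow_vector` is a hypothetical name, not the library method.

```python
from typing import Dict, Hashable, List, Sequence


def bow_vector(items: Sequence[Hashable], vocab: Dict[Hashable, int]) -> List[int]:
    """Count how often each vocabulary item occurs in items (hypothetical helper)."""
    # Assumption: one slot per id, from 0 up to the largest id in vocab.
    vec = [0] * (max(vocab.values()) + 1) if vocab else []
    for item in items:
        if item in vocab:  # assumption: unknown items are ignored
            vec[vocab[item]] += 1
    return vec


print(bow_vector(["x", "y", "z"], {"x": 2, "y": 4, "z": 5}))  # [0, 0, 1, 0, 1, 1]
```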

get_stats(g1, g2, stattype='nodecount')[source]#

get feature vec for a statistic type

Parameters:
  • g1 (nx medi graph) – graph A

  • g2 (nx medi graph) – graph B

  • stattype (string) – statistics type, default: node count

Returns:

  • vector for A

  • vector for B

  • vocab

nc(g1, g2)[source]#

feature vector constructor for node BOW of two graphs

Parameters:
  • g1 (nx medi graph) – graph A

  • g2 (nx medi graph) – graph B

Returns:

feature vector for graph A, feature vector for graph B, vocab
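
A rough sketch of how two bag-of-words vectors over one shared vocabulary can be built. It works on plain lists of node labels rather than on graphs (an assumption made to keep the example self-contained), and `node_bow` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def node_bow(labels_a: Sequence[str], labels_b: Sequence[str]
             ) -> Tuple[List[int], List[int], Dict[str, int]]:
    """Build count vectors for two label lists over one shared vocabulary."""
    # Every label seen in either graph gets one index (sorted for determinism).
    vocab = {label: i for i, label in enumerate(sorted(set(labels_a) | set(labels_b)))}
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # vocab iterates in index order, so position i holds the count of label i.
    vec_a = [count_a[label] for label in vocab]
    vec_b = [count_b[label] for label in vocab]
    return vec_a, vec_b, vocab


vec_a, vec_b, vocab = node_bow(["boy", "want", "go"], ["girl", "want", "go"])
print(vec_a, vec_b, vocab)
```

Because both vectors share the same index space, they can be compared directly, e.g. with a dot product.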

sort_relabel(dic1, dic2)[source]#

form aggregate labels via sorting

Parameters:
  • dic1 (dict) – node-neighborhood dict of graph A

  • dic2 (dict) – node-neighborhood dict of graph B

Returns:

two dicts where keys are same and values are strings
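
One way to picture aggregation via sorting: the collected neighborhood items of a node are sorted and joined into a single string, so the result does not depend on the order in which neighbors were gathered. The sketch assumes each neighborhood is a list of strings; the name `aggregate_labels` and the `|` separator are hypothetical.

```python
from typing import Dict, Hashable, List


def aggregate_labels(neighborhoods: Dict[Hashable, List[str]]) -> Dict[Hashable, str]:
    """Join each node's neighborhood items into one order-independent string."""
    # Sorting removes any dependence on the order of collection.
    return {node: "|".join(sorted(items)) for node, items in neighborhoods.items()}


graph_a = {0: [":arg1 go", ":arg0 boy"], 1: []}
graph_b = {0: [":arg0 boy", ":arg1 go"], 1: []}
print(aggregate_labels(graph_a))
print(aggregate_labels(graph_a) == aggregate_labels(graph_b))
```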

tc(g1, g2)[source]#

feature vector constructor for triple BOW of two graphs

Parameters:
  • g1 (nx medi graph) – graph A

  • g2 (nx medi graph) – graph B

Returns:

feature vector for graph A, feature vector for graph B, vocab

update_node_labels(G, dic)[source]#
wl(nx_g1, nx_g2, iters=2, stattype='nodecount')[source]#

collect vectors over WL iterations

Parameters:
  • nx_g1 (nx medi graph) – graph A

  • nx_g2 (nx medi graph) – graph B

Returns:

a list for every graph that contains vectors

wl_gather_node(node, G)[source]#

gather edges+labels for a node from the neighborhood

Parameters:
  • node (hashable object) – a node of the graph

  • G (nx medi graph) – the graph

Returns:

a list with edge+label from neighbors
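
A minimal sketch of the gathering step on a toy graph stored as a label dict and an edge list instead of a networkx graph. It follows outgoing edges only, which is an assumption; the names `gather_node` and `gather_nodes` are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, str, Hashable]  # (source, edge label, target)


def gather_node(node: Hashable, edges: List[Edge], labels: Dict[Hashable, str]) -> List[str]:
    """Collect 'edge label + neighbor label' strings for one node."""
    # Assumption: only outgoing edges of the node are followed.
    return [f"{rel} {labels[tgt]}" for src, rel, tgt in edges if src == node]


def gather_nodes(edges: List[Edge], labels: Dict[Hashable, str]) -> Dict[Hashable, List[str]]:
    """Apply gather_node to every node of the graph."""
    return {node: gather_node(node, edges, labels) for node in labels}


labels = {0: "want", 1: "boy", 2: "go"}
edges = [(0, ":arg0", 1), (0, ":arg1", 2), (2, ":arg0", 1)]
print(gather_nodes(edges, labels))
```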

wl_gather_nodes(G)[source]#

apply gathering (wl_gather_node) for all nodes

Parameters:

G (nx medi graph) – the graph

Returns:

a dictionary node -> neighborhood

wl_iter(nx_g1, nx_g2, stattype='nodecount')[source]#

collect vectors over one WL iteration

Parameters:
  • nx_g1 (nx medi graph) – graph A

  • nx_g2 (nx medi graph) – graph B

Returns:

  • a list for every graph that contains vectors

  • new aggregate graphs

wlk(nx_g1, nx_g2, iters=2, weighting='linear', kt='dot', stattype='nodecount', init_vecs=(None, None))[source]#

compute WL kernel similarity of graph A and B

Parameters:
  • nx_g1 (nx medi graph) – graph A

  • nx_g2 (nx medi graph) – graph B

  • iters (int) – iterations

  • weighting (string) – decrease weight of iteration stats

  • kt (string) – kernel type, default dot

  • stattype (string) – which features? default: nodecount

  • init_vecs (tuple) – perhaps there are already some features for A and B?

Returns:

kernel similarity
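
For orientation, here is a textbook-style Weisfeiler-Leman kernel on toy graphs. It is a sketch under stated assumptions, not the library's implementation: graphs are (label dict, edge list) pairs, only outgoing edges are used, label counts from all iterations are pooled with equal weight (the `weighting` and `kt` options are not modeled), and cosine similarity is used for the comparison. All names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, str, Hashable]  # (source, edge label, target)
Graph = Tuple[Dict[Hashable, str], List[Edge]]


def wl_features(labels: Dict[Hashable, str], edges: List[Edge], iters: int = 2) -> Counter:
    """Count node labels before and after each WL relabeling round."""
    feats = Counter(labels.values())
    current = dict(labels)
    for _ in range(iters):
        updated = {}
        for node, label in current.items():
            # Assumption: a node sees its outgoing edge labels and their targets.
            hood = sorted(f"{rel}>{current[tgt]}" for src, rel, tgt in edges if src == node)
            updated[node] = label + "(" + ",".join(hood) + ")"
        current = updated
        feats.update(current.values())
    return feats


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def wl_similarity(g1: Graph, g2: Graph, iters: int = 2) -> float:
    return cosine(wl_features(*g1, iters), wl_features(*g2, iters))


edges = [(0, ":arg0", 1), (0, ":arg1", 2)]
g_a: Graph = ({0: "want", 1: "boy", 2: "go"}, edges)
g_b: Graph = ({0: "want", 1: "girl", 2: "go"}, edges)
print(wl_similarity(g_a, g_a), wl_similarity(g_a, g_b))
```

Changing a single node label lowers the similarity in every later iteration as well, because the changed label propagates into the aggregate labels of its neighbors.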

class zensols.amr.wlk.amr_similarity.WasserWLK(preprocessor, iters=2, stability=0, communication_direction='both')[source]#

Bases: GraphSimilarityPredictorAligner

__init__(preprocessor, iters=2, stability=0, communication_direction='both')[source]#

Initializes Wasserstein Weisfeiler Leman Kernel

Parameters:
  • preprocessor (Preprocessor) – an object that assigns embeddings to graph nodes and labels

  • iters (int) – K, the number of iterations

  • stability (int) – in case there is randomness in pre-processing (e.g., random embeddings for node labels not found in word2vec), an expected distance matrix is computed by repeated sampling

  • communication_direction (string) – communication direction in which messages are passed

Returns:

None
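
The repeated sampling mentioned for `stability` can be pictured as averaging several randomly perturbed distance matrices element by element. The sketch below only illustrates that idea; how `stability` maps to a number of samples is not stated here, and `expected_matrix` and the toy sampler are hypothetical.

```python
import random
from typing import Callable, List

Matrix = List[List[float]]


def expected_matrix(sample: Callable[[], Matrix], n_samples: int) -> Matrix:
    """Average n_samples matrices drawn from sample() element by element."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    total = sample()
    for _ in range(n_samples - 1):
        nxt = sample()
        total = [[t + v for t, v in zip(t_row, v_row)] for t_row, v_row in zip(total, nxt)]
    return [[t / n_samples for t in row] for row in total]


# Toy sampler: a 2x2 distance matrix whose off-diagonal entry is noisy,
# standing in for distances computed from random embeddings.
rng = random.Random(0)


def noisy_distances() -> Matrix:
    d = 1.0 + rng.uniform(-0.1, 0.1)
    return [[0.0, d], [d, 0.0]]


print(expected_matrix(noisy_distances, 100))
```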

zensols.amr.wlk.graph_helpers#

Inheritance diagram of zensols.amr.wlk.graph_helpers
class zensols.amr.wlk.graph_helpers.GraphParser(input_format='penman', edge_to_node_transform=False)[source]#

Bases: object

__init__(input_format='penman', edge_to_node_transform=False)[source]#
graphs_to_triples(string_graphs)[source]#
parse(string_graphs)[source]#
zensols.amr.wlk.graph_helpers.add_edges(G, triples, src_tgt_index_map)[source]#

add edges to graph.

Parameters:
  • G (nx medi graph) – a graph

  • triples (list) – list with (s, rel, t) tuples

  • src_tgt_index_map (dict) – a map from amr variables to node ids

Returns:

None

zensols.amr.wlk.graph_helpers.add_nodes(G, nodelist, label_map)[source]#

add nodes to a graph

Parameters:
  • G (nx medi graph) – input graph

  • nodelist (list) – a list with node ids to be inserted into G

  • label_map (dict) – a map node id –> label (e.g., {0:”boy”, …})

Returns:

None

zensols.amr.wlk.graph_helpers.amrtriples2nxmedigraph(triples, edge_to_node_transform=False)[source]#

builds nx medi graph from amr triples.

Parameters:
  • triples (list) – list with AMR triples, e.g. [(“a”, “:instance”, “boy”), (“r”, “:arg0”, “b”), …]

  • add_coref_to_labels (bool) – if true then add (redundant) coref info to node labels (default: False)

Returns:

nx multi edge di graph where nodes are ids, and nodes and edges carry the attribute “label”.

zensols.amr.wlk.graph_helpers.do_edge_node_transform(triples)[source]#
zensols.amr.wlk.graph_helpers.get_var_concept_map(triples)[source]#

creates a dictionary that maps variables to their concepts

e.g., [(x, :instance, y),…] —> {x:y,…}

Parameters:

triples (list) – triples

Returns:

dictionary
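
The mapping described above can be sketched as a dictionary comprehension over the `:instance` triples. The name `var_concept_map` is hypothetical, and the sketch assumes triples are (source, relation, target) tuples of strings.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def var_concept_map(triples: Iterable[Triple]) -> Dict[str, str]:
    """Map each variable to its concept using the :instance triples."""
    return {src: tgt for src, rel, tgt in triples if rel == ":instance"}


triples = [
    ("w", ":instance", "want-01"),
    ("b", ":instance", "boy"),
    ("w", ":arg0", "b"),
]
print(var_concept_map(triples))  # {'w': 'want-01', 'b': 'boy'}
```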

zensols.amr.wlk.graph_helpers.get_var_index_map(triples)[source]#

creates a dictionary that maps variables to indices

e.g., [(x, :instance, y),…] —> {x:y,…}

Parameters:

triples (list) – triples

Returns:

dictionary

zensols.amr.wlk.graph_helpers.maybe_fix_if_concept_node_same_as_var_node(triples)[source]#
zensols.amr.wlk.graph_helpers.nx_digraph_to_triples(G)[source]#

convert nx graph to triples. Note: information may be lost in the conversion

zensols.amr.wlk.graph_helpers.penmangraph2triples(G)[source]#
zensols.amr.wlk.graph_helpers.reify_nodes(triples)[source]#
zensols.amr.wlk.graph_helpers.stringamr2graph(string)[source]#

uses penman to convert serialized AMR to penman graph

Parameters:

string (str) – serialized AMR ‘(n / concept :arg1 ()…)’

Returns:

penman graph object

zensols.amr.wlk.graph_helpers.triples2penmangraph(triples)[source]#
zensols.amr.wlk.graph_helpers.tsv2triples(string)[source]#

Parses tsv graph to triples

Parameters:

string (str) –

tsv graph, e.g.

    x y :edge_1
    y z :edge_2
    z x :edge_3
    x dog :instance
    y cat :instance
    z like :instance

defines a graph between source and target nodes with edge labels. Node labels are indicated with the special :instance edge

Returns:

triples
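
A small parsing sketch for the format described above. It assumes one `source target :label` entry per line, whitespace-separated columns, and output triples ordered as (source, relation, target); `parse_tsv_graph` is a hypothetical name, not the library function.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def parse_tsv_graph(text: str) -> List[Triple]:
    """Parse 'source target :label' lines into (source, label, target) triples."""
    triples = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 3:
            raise ValueError(f"expected 3 columns, got {len(parts)}: {line!r}")
        src, tgt, rel = parts
        # Assumption: triples are ordered (source, relation, target).
        triples.append((src, rel, tgt))
    return triples


tsv = "x\ty\t:edge_1\ny\tz\t:edge_2\nz\tx\t:edge_3\nx\tdog\t:instance\ny\tcat\t:instance\nz\tlike\t:instance\n"
for triple in parse_tsv_graph(tsv):
    print(triple)
```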

zensols.amr.wlk.score#

Inheritance diagram of zensols.amr.wlk.score

Weisfeiler-Leman Graph Kernels for AMR Graph Similarity

Citation:

  Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the
  Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph
  Similarity. Transactions of the Association for Computational Linguistics,
  9:1425–1441.
see:

WLK

class zensols.amr.wlk.score.WeisfeilerLemanKernelScoreCalculator(reverse_sents=False, params=<factory>)[source]#

Bases: ScoreMethod

Computes the Weisfeiler-Leman Kernel (WLK) scores of AMR sentences (see module docs).

Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity. Transactions of the Association for Computational Linguistics, 9:1425–1441.

See:

WLK

__init__(reverse_sents=False, params=<factory>)#
params: Dict[str, Any]#

Parameters given to the implementation of the Weisfeiler-Leman Kernel scoring method WLK class.

Module contents#

Weisfeiler-Leman Graph Kernels for AMR Graph Similarity

Juri Opitz, Angel Daza, and Anette Frank. 2021. Weisfeiler-Leman in the Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity. Transactions of the Association for Computational Linguistics, 9:1425–1441.

see:

WLK