zensols.clinicamr package

Submodules

zensols.clinicamr.adm module

Classes used to parse the clinical corpus into an annotated AMR corpus.

class zensols.clinicamr.adm.AdmissionAmrFactoryStash(corpus, amr_annotator, keep_notes, keep_summary_sections)[source]

Bases: ReadOnlyStash

A stash that CRUDs instances of AdmissionAmrFeatureDocument.

__init__(corpus, amr_annotator, keep_notes, keep_summary_sections)
amr_annotator: AnnotationFeatureDocumentParser

Parses, populates and caches AMR graphs in feature documents.

corpus: Corpus

The MIMIC-III corpus.

exists(name)[source]

Return True if data with key name exists.

Implementation note: This Stash.exists() method is very inefficient and should be overridden.

Return type:

bool

keep_notes: Union[List[str], Set[str]]

The notes (by category) to keep for each admission. The rest are filtered.

keep_summary_sections: Union[List[str], Set[str]]

The sections to keep in each clinical note. The rest are filtered.

keys()[source]

Return an iterable of keys in the collection.

Return type:

Iterable[str]

load(name)[source]

Load an admission from the MIMIC-III package and parse it for language and AMRs.

Parameters:

name (str) – the MIMIC-III admission ID

Return type:

AdmissionAmrFeatureDocument

Returns:

the parsed admission
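The implementation note on exists() can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written in plain Python, not the zensols stash classes; they assume only that a default exists() falls back to a full load, which is why a key-based override is cheaper. The admission ID '100' is made up.

```python
from typing import Any, Dict, Iterable, Optional


class SketchReadOnlyStash:
    """Hypothetical stand-in for a read-only stash (not the zensols class)."""

    def __init__(self, backing: Dict[str, Any]):
        self._backing = backing
        self.load_count = 0  # counts how often the expensive load runs

    def load(self, name: str) -> Optional[Any]:
        self.load_count += 1
        return self._backing.get(name)

    def keys(self) -> Iterable[str]:
        return self._backing.keys()

    def exists(self, name: str) -> bool:
        # naive default: pay for a full load just to test membership
        return self.load(name) is not None


class KeyedReadOnlyStash(SketchReadOnlyStash):
    """Same stash, but exists() is overridden to avoid loading."""

    def exists(self, name: str) -> bool:
        # override: consult the keys only, never load
        return name in self._backing


naive = SketchReadOnlyStash({'100': 'admission 100'})
keyed = KeyedReadOnlyStash({'100': 'admission 100'})
```

When load() parses an entire admission into AMR graphs, the difference between the two exists() variants is the cost of one full parse per membership test.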

zensols.clinicamr.app module

Clinical Domain Abstract Meaning Representation Graphs

class zensols.clinicamr.app.Application(config_factory, doc_parser, adm_amr_stash, dumper)[source]

Bases: object

Clinical Domain Abstract Meaning Representation Graphs.

__init__(config_factory, doc_parser, adm_amr_stash, dumper)
adm_amr_stash: Stash

A stash that CRUDs instances of AdmissionAmrFeatureDocument.

config_factory: ConfigFactory

For creating app config resources.

doc_parser: FeatureDocumentParser

The document parser used for the parse() action.

dumper: Dumper

Plots and writes AMR content in human-readable formats.

generate(ids, output_dir=None)[source]

Creates samples of generated AMR text by first parsing clinical sentences into graphs.

Parameters:
  • ids (str) – a comma-separated list of admission IDs to generate

  • output_dir (Path) – the output directory
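The ids parameter is a single string rather than a list. How generate() splits it is not stated here; the helper below is a hypothetical sketch of one tolerant way to do so, ignoring blank entries and surrounding whitespace.

```python
from typing import List


def split_ids(ids: str) -> List[str]:
    """Split a comma-separated ID string, dropping blanks and whitespace.

    Hypothetical helper; not part of zensols.clinicamr.
    """
    return [tok.strip() for tok in ids.split(',') if tok.strip()]


admission_ids = split_ids('100, 200,,300 ')
```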

show_admission(hadm_id)[source]

Print an admission by ID.

Parameters:

hadm_id (str) – the admission ID

zensols.clinicamr.cli module

Command line entry point to the application.

class zensols.clinicamr.cli.ApplicationFactory(*args, **kwargs)[source]

Bases: ApplicationFactory

__init__(*args, **kwargs)[source]
classmethod get_admission_amr_stash()[source]
Return type:

AdmissionAmrFeatureDocument

classmethod get_corpus()[source]
Return type:

Corpus

classmethod get_doc_parser()[source]
Return type:

FeatureDocumentParser

zensols.clinicamr.cli.main(args=sys.argv, **kwargs)[source]
Return type:

ActionResult

zensols.clinicamr.decorator module

Adds concept unique identifiers to the graph.

class zensols.clinicamr.decorator.ClinicTokenAnnotationFeatureDocumentDecorator(name, feature_id, indexed=False, add_none=False, use_sent_index=True, method='attribute', feature_format='[{cui_}]: {pref_name_} ({tui_descs_})')[source]

Bases: TokenAnnotationFeatureDocumentDecorator

Override token feature annotation by adding CUI data.

__init__(name, feature_id, indexed=False, add_none=False, use_sent_index=True, method='attribute', feature_format='[{cui_}]: {pref_name_} ({tui_descs_})')
feature_format: str = '[{cui_}]: {pref_name_} ({tui_descs_})'

The format used for CUI annotated tokens.
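As a rough illustration of what the default feature_format produces, the sketch below fills it in with Python's str.format. That the decorator expands the format this way is an assumption, and the field values are illustrative only, not taken from the corpus.

```python
# default value copied from the signature above
feature_format = '[{cui_}]: {pref_name_} ({tui_descs_})'

# illustrative values for the three named fields (assumed, not from the corpus)
rendered = feature_format.format(
    cui_='C0018787',
    pref_name_='Heart',
    tui_descs_='Body Part, Organ, or Organ Component',
)
```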

zensols.clinicamr.domain module

Object graph classes for EHR notes.

class zensols.clinicamr.domain.AdmissionAmrFeatureDocument(sents, text=None, spacy_doc=None, amr=None, coreference_relations=None, hadm_id=None, _ds_ix=None, _ant_ixs=None, parse_fails=None)[source]

Bases: AmrFeatureDocument

An AMR feature document whose sents consist of all parsed sentences of all notes of an admission.

__init__(sents, text=None, spacy_doc=None, amr=None, coreference_relations=None, hadm_id=None, _ds_ix=None, _ant_ixs=None, parse_fails=None)
clone(cls=None, **kwargs)[source]
Parameters:

kwargs – if copy_spacy is True, the spacy document is copied to the clone in addition to the parameters passed to the new clone's initializer

Return type:

TokenContainer

create_discharge_summary()[source]

Return the discharge summary note.

Return type:

NoteDocument

create_note_antecedents()[source]

Return the clinical notes of the admission.

Return type:

Iterable[NoteDocument]

hadm_id: str = None

The MIMIC-III admission ID.

parse_fails: Tuple[ParseFailure, ...] = None

Sentences that have parsed features but whose AMR parse failed.

write(depth=0, writer=sys.stdout)[source]

Write the document and optionally sentence features.

Parameters:
  • n_sents – the number of sentences to write

  • n_tokens – the number of tokens to print across all sentences

  • include_original – whether to include the original text

  • include_normalized – whether to include the normalized text

exception zensols.clinicamr.domain.ClinicAmrError[source]

Bases: APIError

Raised for this package’s API errors.

class zensols.clinicamr.domain.NoteDocument(sents, note_ix)[source]

Bases: _IndexedDocument

An index container class that creates AMR clinical note documents.

__init__(sents, note_ix)[source]
property category: str

The category of the note (e.g. discharge-summary).

create_document()[source]

Create an AMR feature document.

Return type:

AmrFeatureDocument

create_sections()[source]

Return the clinical section documents of this note.

Return type:

Iterable[SectionDocument]

property row_id: int

The MIMIC-III unique row ID of the clinical note.

write(depth=0, writer=sys.stdout)[source]

Write the contents of this instance to writer using indentation depth.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable
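The write(depth, writer) convention shared by these classes can be sketched as follows. SketchWritable is a hypothetical stand-in, not the zensols Writable base class, and the four-space indent per depth level is an assumption.

```python
import io
import sys
from typing import TextIO


class SketchWritable:
    """Hypothetical stand-in showing the write(depth, writer) convention."""

    INDENT = 4  # assumed number of spaces per depth level

    def __init__(self, category: str, row_id: int):
        self.category = category
        self.row_id = row_id

    def write(self, depth: int = 0, writer: TextIO = sys.stdout) -> None:
        # every line is prefixed with padding proportional to the depth
        pad = ' ' * (depth * self.INDENT)
        writer.write(f'{pad}category: {self.category}\n')
        writer.write(f'{pad}row_id: {self.row_id}\n')


# capture the output in memory instead of printing to standard out
buf = io.StringIO()
SketchWritable('discharge-summary', 42).write(depth=1, writer=buf)
```

Passing an in-memory writer, as above, is a convenient way to capture the formatted text for logging or tests.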

class zensols.clinicamr.domain.ParseFailure(row_id, sec_id, sec_name, para_idx, sent)[source]

Bases: Writable

A container class for sentences that have parsed features but whose AMR parse failed.

__init__(row_id, sec_id, sec_name, para_idx, sent)
para_idx: int

The index of the paragraph.

row_id: int

The MIMIC-III unique row ID of the clinical note.

sec_id: int

The section ID.

sec_name: str

The section name.

sent: AmrFeatureSentence

The AMR sentence.

See:

is_failure

write(depth=0, writer=sys.stdout)[source]

Write the contents of this instance to writer using indentation depth.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.clinicamr.domain.SectionDocument(sents, sec_ix)[source]

Bases: _IndexedDocument

An index container class that creates AMR paragraph documents.

__init__(sents, sec_ix)[source]
create_document()[source]

Create an AMR feature document.

Return type:

AmrFeatureDocument

create_paragraphs()[source]

Return the paragraph documents of this section.

Return type:

Iterable[AmrFeatureDocument]

property id: int

The section ID.

property name: str

The section name.

write(depth=0, writer=sys.stdout)[source]

Write the contents of this instance to writer using indentation depth.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable
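Taken together, the domain classes form a hierarchy of admission, notes, sections and paragraphs. The sketch below walks it using only the documented method names create_note_antecedents(), create_sections() and create_paragraphs(). The Fake classes are hypothetical stand-ins so the traversal can run without the corpus, and whether the discharge summary is among the returned notes is not assumed.

```python
from typing import Any, Iterator, List


def iter_paragraphs(adm: Any) -> Iterator[Any]:
    """Yield every paragraph document of an admission, note by note."""
    for note in adm.create_note_antecedents():
        for sec in note.create_sections():
            yield from sec.create_paragraphs()


# hypothetical stand-ins used only to exercise the traversal
class FakeSection:
    def __init__(self, paras: List[str]):
        self._paras = paras

    def create_paragraphs(self) -> Iterator[str]:
        return iter(self._paras)


class FakeNote:
    def __init__(self, secs: List[FakeSection]):
        self._secs = secs

    def create_sections(self) -> Iterator[FakeSection]:
        return iter(self._secs)


class FakeAdmission:
    def __init__(self, notes: List[FakeNote]):
        self._notes = notes

    def create_note_antecedents(self) -> Iterator[FakeNote]:
        return iter(self._notes)


adm = FakeAdmission([
    FakeNote([FakeSection(['p1', 'p2'])]),
    FakeNote([FakeSection([]), FakeSection(['p3'])]),
])
paragraphs = list(iter_paragraphs(adm))
```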

zensols.clinicamr.parafac module

Parse clinical medical note paragraph AMR graphs and cache using a Stash.

class zensols.clinicamr.parafac.ClinicAmrParagraphFactory(delegate, amr_annotator, stash, document_decorators=(), sentence_decorators=(), id_format='MIMIC3_{note_id}_{sec_id}_{para_id}.{sent_id}', add_is_header=True, remove_empty_sentences=True)[source]

Bases: ParagraphFactory

Parse paragraph AMR graphs by using the delegate paragraph factory. Then each document is given an AMR graph using an AmrDocument at the document level and an AmrSentence at the sentence level, which are cached using a Stash.

A list of AmrFeatureDocument instances is returned.

__init__(delegate, amr_annotator, stash, document_decorators=(), sentence_decorators=(), id_format='MIMIC3_{note_id}_{sec_id}_{para_id}.{sent_id}', add_is_header=True, remove_empty_sentences=True)
add_is_header: bool = True

Whether or not to add the is_header AMR metadata indicating if the sentence is part of one of the section headers.

amr_annotator: AnnotationFeatureDocumentParser

Parses, populates and caches AMR graphs in feature documents.

clear()[source]
create(sec)[source]
Return type:

Iterable[FeatureDocument]

delegate: ParagraphFactory

The paragraph factory that chunks the paragraphs.

document_decorators: Sequence[FeatureDocumentDecorator] = ()

A list of decorators that can add, remove or modify features on a document.

id_format: str = 'MIMIC3_{note_id}_{sec_id}_{para_id}.{sent_id}'

The format of the id AMR metadata field, which is added if it does not already exist.
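The default id_format names four fields. Assuming it is expanded with str.format-style substitution (the field names come from the default value; the numeric values below are made up), a sentence ID would look like this:

```python
# default value copied from the signature above
id_format = 'MIMIC3_{note_id}_{sec_id}_{para_id}.{sent_id}'

# made-up indexes for a note, section, paragraph and sentence
sent_amr_id = id_format.format(note_id=1234, sec_id=5, para_id=0, sent_id=2)
```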

remove_empty_sentences: bool = True

Whether to remove empty sentences from paragraphs. If True, empty paragraphs are skipped.

sentence_decorators: Sequence[FeatureSentenceDecorator] = ()

A list of decorators that can add, remove or modify features on a sentence.

stash: Stash

Caches full paragraph AmrFeatureDocument instances.

zensols.clinicamr.proto module

Prototyping module.

class zensols.clinicamr.proto.PrototypeApplication(config_factory, app)[source]

Bases: object

CLI_META = {'is_usage_visible': False}
__init__(config_factory, app)
app: Application
config_factory: ConfigFactory
proto(run=0)[source]

Used for rapid prototyping.

zensols.clinicamr.spring module

Use the paper implementation of the SPRING parser.

class zensols.clinicamr.spring.SpringAmrParser(model='noop', add_missing_metadata=True, client=None, doc_parser=None)[source]

Bases: AmrParser

Adapt the zensols.amrspring client to a zensols.amr.model.AmrParser. This allows use of the THYME parser trained on clinical notes.

Citation:

Bevilacqua et al. (2021) One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 12564–12573, Virtual, May.

__init__(model='noop', add_missing_metadata=True, client=None, doc_parser=None)
client: AmrParseClient = None
doc_parser: FeatureDocumentParser = None

Module contents