zensols.dsprov package#

Submodules#

zensols.dsprov.app#

Inheritance diagram of zensols.dsprov.app

This library integrates MIMIC-III with discharge summary provenance of data annotations and provides Pythonic classes for accessing them.

class zensols.dsprov.app.Application(stash)[source]#

Bases: object

This library integrates MIMIC-III with discharge summary provenance of data annotations and provides Pythonic classes for accessing them.

__init__(stash)#
admissions(limit=None)[source]#

Print the annotated admission IDs.

Parameters:

limit (int) – the limit on items to print

show(limit=None, format=Format.text, ids=None, indent=None)[source]#

Print annotated matches.

Parameters:
  • limit (int) – the limit on items to print

  • format (Format) – the output format

  • ids (str) – a comma-separated list of hospital admission IDs (hadm_id)

  • indent (int) – the indentation (if any)
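As a rough illustration of the ids parameter, the sketch below splits a comma-separated string of hadm_id values into integers. The helper parse_ids is hypothetical and not part of zensols.dsprov; how show() parses the string internally is not documented on this page.

```python
from typing import List


def parse_ids(ids: str) -> List[int]:
    """Hypothetical helper: split a comma-separated ``hadm_id`` string."""
    # Skip empty fragments so trailing commas and stray spaces are tolerated.
    return [int(tok) for tok in (t.strip() for t in ids.split(',')) if tok]


print(parse_ids('100001, 100002,100003'))
```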

stash: Stash#

A stash that creates AdmissionMatch instances.

class zensols.dsprov.app.Format(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

json = 2#
text = 1#
yaml = 3#
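The members above behave like any standard library Enum. The following sketch uses a stand-in Format class that mirrors the documented names and values (it is not imported from zensols.dsprov) to show lookups by name and by value, as a command line parser might do with user input.

```python
from enum import Enum


class Format(Enum):
    """Stand-in mirroring the documented members, not the library class."""
    text = 1
    json = 2
    yaml = 3


# Look up a member by name, e.g. from a command line option string.
fmt = Format['json']
print(fmt.name, fmt.value)

# Look up a member by its integer value.
by_value = Format(3)
print(by_value.name)
```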

zensols.dsprov.cli#

Inheritance diagram of zensols.dsprov.cli

Command line entry point to the application.

class zensols.dsprov.cli.ApplicationFactory(*args, **kwargs)[source]#

Bases: ApplicationFactory

__init__(*args, **kwargs)[source]#
classmethod get_stash()[source]#
Return type:

Stash

zensols.dsprov.cli.main(args=sys.argv, **kwargs)[source]#
Return type:

ActionResult

zensols.dsprov.domain#

Inheritance diagram of zensols.dsprov.domain

Container classes for discharge summary to note antecedent match data.

class zensols.dsprov.domain.AdmissionMatch(note_matches)[source]#

Bases: MatchBase

Contains match data for an admission.

__init__(note_matches)#
property hadm_id: int#

The admission unique identifier.

note_matches: Tuple[NoteMatch]#

Contains match data for notes.

write(depth=0, writer=sys.stdout)[source]#

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable
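The depth and writer parameters can be illustrated with a small stand-in class. The Item class below is hypothetical, and the four-space indentation per depth level is an assumption made for this sketch rather than documented behavior. Passing an in-memory buffer as the writer captures the output instead of printing it.

```python
import sys
from io import StringIO, TextIOBase


class Item:
    """Hypothetical stand-in with a ``write(depth, writer)`` style method."""

    def __init__(self, name: str, value: int):
        self.name = name
        self.value = value

    def write(self, depth: int = 0, writer: TextIOBase = sys.stdout):
        # Assumption: each depth level indents by four spaces.
        indent = ' ' * (4 * depth)
        writer.write(f'{indent}{self.name}: {self.value}\n')


# Capture the output in memory instead of printing to standard out.
buf = StringIO()
Item('hadm_id', 100001).write(depth=1, writer=buf)
print(buf.getvalue(), end='')
```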

class zensols.dsprov.domain.MatchBase[source]#

Bases: Dictable

A base class for match data containers that enforces no pickling/serialization of note spans. This is not supported as subclasses contain complex object graphs.
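One common way to block pickling in Python is to raise from __getstate__. The sketch below shows that general technique with hypothetical class names. It is an assumption about how such a restriction could be enforced, not a description of how MatchBase implements it.

```python
import pickle


class NoPickleBase:
    """Hypothetical stand-in for a base class that refuses serialization."""

    def __getstate__(self):
        # Assumption: fail loudly rather than pickle a large object graph.
        raise TypeError(f'Pickling not supported: {type(self).__name__}')


class Span(NoPickleBase):
    def __init__(self, begin: int, end: int):
        self.begin = begin
        self.end = end


try:
    pickle.dumps(Span(0, 5))
    pickled = True
except TypeError:
    # The error raised in __getstate__ propagates out of pickle.dumps.
    pickled = False
print(pickled)
```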

__init__()#
repr()[source]#
Return type:

str

class zensols.dsprov.domain.NoteMatch(hadm_id, discharge_summary, antecedents, source)[source]#

Bases: MatchBase

A match between a text span in the discharge summary and the semantically similar or copy/pasted text in the note antecedents. This is the analog to the MatchedAnnotation in the reproducibility repo.

STR_SPAN_WIDTH: ClassVar[int] = 30#
__init__(hadm_id, discharge_summary, antecedents, source)#
antecedents: Tuple[NoteSpan]#

The antecedent notes/spans.

desc(width)[source]#

A short description string of the match.

Return type:

str

discharge_summary: NoteSpan#

The discharge summary note and span.

hadm_id: int#

The admission unique identifier.

source: Dict[str, Any]#

The source annotation JSON that was used to construct this instance.

write(depth=0, writer=sys.stdout, include_sections=False)[source]#

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.dsprov.domain.NoteSpan(lexspan, note)[source]#

Bases: MatchBase

A tie between two spans of semantically similar or copied text segments between a note antecedent and a discharge summary. This is the analog to MatchedNote in the reproducibility repo, but uses paper terminology.

__init__(lexspan, note)#
lexspan: LexicalSpan#

The 0-indexed start and end offsets in note that demarcate the span lexically.

property norm_text: str#

The normalized span text, separated by spaces and without newlines.

note: Note#

The note that matches.

property sections: Dict[int, Section]#

The sections covered by the span.

property span: TokenContainer#

The span as features demarcated by the span.

property text: str#

The span as text demarcated by the span.
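The relationship between lexspan, text and norm_text can be sketched with plain strings. The example assumes that the end offset is exclusive (as with Python slices) and that normalization collapses every run of whitespace into a single space. Both are assumptions, and the sample note text is invented for illustration.

```python
note_text = 'Patient admitted with\nchest pain.\nDischarged home.'

# Assumption: offsets are 0-indexed with an exclusive end, like Python slices.
begin, end = 8, 33
span_text = note_text[begin:end]

# Assumption: normalization collapses all whitespace runs into single spaces.
norm_text = ' '.join(span_text.split())

print(repr(span_text))
print(repr(norm_text))
```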

write(depth=0, writer=sys.stdout, include_sections=False)[source]#

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

zensols.dsprov.stash#

Inheritance diagram of zensols.dsprov.stash

A stash class used for accessing the provenance of data annotations.

class zensols.dsprov.stash.AnnotationStash(installer, corpus)[source]#

Bases: ReadOnlyStash

A stash that creates instances of AdmissionMatch.

__init__(installer, corpus)#
corpus: Corpus#
exists(hadm_id)[source]#

Return True if data with key name exists.

Implementation note: This Stash.exists() method is very inefficient and should be overridden.

Return type:

bool

installer: Installer#
keys()[source]#

Return an iterable of keys in the collection.

Return type:

Iterable[str]

load(hadm_id)[source]#

Load a data value from the pickled data with key name. Semantically, this method loads the data using the stash’s implementation. For example, DirectoryStash loads the data from a file if it exists, but factory type stashes will always re-generate the data.

See:

get()

Return type:

AdmissionMatch
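The keys(), exists() and load() contract described above can be sketched with a dictionary-backed stand-in. DictBackedStash and its sample data are hypothetical and do not come from zensols.dsprov. The sketch only shows why overriding exists() with a direct membership test is cheaper than scanning all keys, and how a limit can be applied while iterating.

```python
from itertools import islice
from typing import Dict, Iterable, Optional


class DictBackedStash:
    """Hypothetical read-only stash keyed by ``hadm_id`` strings."""

    def __init__(self, data: Dict[str, dict]):
        self._data = data

    def keys(self) -> Iterable[str]:
        return self._data.keys()

    def exists(self, hadm_id: str) -> bool:
        # A direct membership test avoids scanning every key.
        return hadm_id in self._data

    def load(self, hadm_id: str) -> Optional[dict]:
        # Return None for a missing key rather than raising.
        return self._data.get(hadm_id)


stash = DictBackedStash({'100001': {'matches': 2}, '100002': {'matches': 5}})

# Mimic a ``limit`` argument by taking only the first key.
for hadm_id in islice(stash.keys(), 1):
    print(hadm_id, stash.load(hadm_id))
```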

Module contents#