Sha256: eecc000369737983ed792886c32a4145072495ca9816741071d9759b71656ab0

Contents?: true

Size: 1.14 KB

Versions: 24

Compression:

Stored size: 1.14 KB

Contents

sentence key

    collection:  10 random PubMed documents with ASCII text split into
                 sentences by the MedPost sentence splitter
                 
                 Original source ascii.xml

    source:  PubMed

    date:  yyyymmdd. Date the documents were downloaded from PubMed

    document:  Title and possibly abstract from a PubMed reference

    id:  PubMed id

    passage:  Either title or abstract

    type:  "title" or "abstract"

    offset: The original Unicode byte offsets were not updated after
            the ASCII conversion.

            PubMed text is extracted from an XML file, so literal
            offsets would not be useful. The title has an offset of
            zero, while the abstract is assumed to begin after the
            title and one space. These offsets at least sequence the
            abstract after the title.

    sentence:  One sentence of the passage as determined by the
               MedPost sentence splitter

    offset: The document offset at which the sentence begins: the
            sum of the passage offset and the sentence's local
            offset within the passage.

    text: ASCII text of the sentence.
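
The two offset rules in the key (the abstract starts after the title plus one space; a sentence offset is the passage offset plus the local offset) can be sketched in a few lines of Python. This is only an illustration of the arithmetic under stated assumptions, not code from simple_bioc or from the MedPost sentence splitter: the function names and the sample title and abstract are hypothetical, and the sentences are assumed to be given already split. Because the text is ASCII, character counts and byte counts coincide here.

```python
def passage_offsets(title, abstract=None):
    """Return (type, offset, text) tuples following the key's convention.

    Hypothetical helper: the title is placed at offset 0 and the abstract
    is assumed to begin after the title and one space.
    """
    passages = [("title", 0, title)]
    if abstract is not None:
        passages.append(("abstract", len(title) + 1, abstract))
    return passages


def sentence_offsets(passage_offset, passage_text, sentences):
    """Return (document_offset, sentence) pairs for one passage.

    Document offset = passage offset + local offset within the passage.
    The sentences are assumed to occur in order inside passage_text.
    """
    result = []
    cursor = 0  # search from the end of the previous sentence
    for sentence in sentences:
        local = passage_text.index(sentence, cursor)
        result.append((passage_offset + local, sentence))
        cursor = local + len(sentence)
    return result


if __name__ == "__main__":
    # Made-up sample text, not taken from the collection.
    title = "Effects of aspirin."
    abstract = "We studied aspirin. It helped."
    for kind, offset, text in passage_offsets(title, abstract):
        print(kind, offset)
    # The abstract passage starts at offset 20 (title length 19 + 1 space).
    print(sentence_offsets(20, abstract, ["We studied aspirin.", "It helped."]))
```

With the sample text, the abstract passage gets offset 20 and its two sentences get document offsets 20 and 40.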

Version data entries

24 entries across 24 versions & 1 rubygems

Version Path
simple_bioc-0.0.24 xml/sentence.key
simple_bioc-0.0.23 xml/sentence.key
simple_bioc-0.0.22 xml/sentence.key
simple_bioc-0.0.21 xml/sentence.key
simple_bioc-0.0.20 xml/sentence.key
simple_bioc-0.0.19 xml/sentence.key
simple_bioc-0.0.18 xml/sentence.key
simple_bioc-0.0.17 xml/sentence.key
simple_bioc-0.0.16 xml/sentence.key
simple_bioc-0.0.15 xml/sentence.key
simple_bioc-0.0.14 xml/sentence.key
simple_bioc-0.0.13 xml/sentence.key
simple_bioc-0.0.12 xml/sentence.key
simple_bioc-0.0.11 xml/sentence.key
simple_bioc-0.0.10 xml/sentence.key
simple_bioc-0.0.9 xml/sentence.key
simple_bioc-0.0.8 xml/sentence.key
simple_bioc-0.0.7 xml/sentence.key
simple_bioc-0.0.6 xml/sentence.key
simple_bioc-0.0.5 xml/sentence.key