[![Identifier](https://img.shields.io/badge/doi-10.5438%2Fn138--z3mk-fca709.svg)](https://doi.org/10.5438/n138-z3mk)
[![Gem Version](https://badge.fury.io/rb/bolognese.svg)](https://badge.fury.io/rb/bolognese)
[![Build Status](https://travis-ci.org/datacite/bolognese.svg?branch=master)](https://travis-ci.org/datacite/bolognese)
[![Code Climate](https://codeclimate.com/github/datacite/bolognese/badges/gpa.svg)](https://codeclimate.com/github/datacite/bolognese)
[![Test Coverage](https://codeclimate.com/github/datacite/bolognese/badges/coverage.svg)](https://codeclimate.com/github/datacite/bolognese/coverage)
# Bolognese: a Ruby library for conversion of DOI Metadata
Ruby gem and command-line utility for conversion of DOI metadata from and to different metadata formats, including [schema.org](https://schema.org).
## Features
Bolognese reads and/or writes these metadata formats:
| Format | Name | Content Type | Read | Write |
| --- | --- | --- | --- | --- |
| CrossRef Unixref XML | crossref | application/vnd.crossref.unixref+xml | Yes | No |
| DataCite XML | datacite | application/vnd.datacite.datacite+xml | Yes | Yes |
| DataCite JSON | datacite_json | application/vnd.datacite.datacite+json | Yes | Yes |
| Schema.org in JSON-LD | schema_org | application/vnd.schemaorg.ld+json | Yes | Yes |
| RDF XML | rdf_xml | application/rdf+xml | No | Yes |
| RDF Turtle | turtle | text/turtle | No | Yes |
| Citeproc JSON | citeproc | application/vnd.citationstyles.csl+json | Yes | Yes |
| Codemeta | codemeta | application/vnd.codemeta.ld+json | Yes | Yes |
| BibTeX | bibtex | application/x-bibtex | Yes | Yes |
| RIS | ris | application/x-research-info-systems | Yes | Yes |
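If you need the format table in a script, for example to check whether a format can be written before calling the command, it can be expressed as a plain Ruby hash. This is only a sketch that restates the table; the constant `FORMATS` and the helper methods are hypothetical and are not part of the bolognese API.

```ruby
# Format table from this README, keyed by the name used with --from / --to.
# FORMATS and the helpers below are illustrative, not part of bolognese.
FORMATS = {
  "crossref"      => { content_type: "application/vnd.crossref.unixref+xml",    read: true,  write: false },
  "datacite"      => { content_type: "application/vnd.datacite.datacite+xml",   read: true,  write: true  },
  "datacite_json" => { content_type: "application/vnd.datacite.datacite+json",  read: true,  write: true  },
  "schema_org"    => { content_type: "application/vnd.schemaorg.ld+json",       read: true,  write: true  },
  "rdf_xml"       => { content_type: "application/rdf+xml",                     read: false, write: true  },
  "turtle"        => { content_type: "text/turtle",                             read: false, write: true  },
  "citeproc"      => { content_type: "application/vnd.citationstyles.csl+json", read: true,  write: true  },
  "codemeta"      => { content_type: "application/vnd.codemeta.ld+json",        read: true,  write: true  },
  "bibtex"        => { content_type: "application/x-bibtex",                    read: true,  write: true  },
  "ris"           => { content_type: "application/x-research-info-systems",     read: true,  write: true  }
}.freeze

# Names of the formats marked "Yes" in the Read column.
def readable_formats
  FORMATS.select { |_name, info| info[:read] }.keys
end

# Names of the formats marked "Yes" in the Write column.
def writable_formats
  FORMATS.select { |_name, info| info[:write] }.keys
end

# Look up a format name by its content type; returns nil if unknown.
def format_for_content_type(content_type)
  pair = FORMATS.find { |_name, info| info[:content_type] == content_type }
  pair && pair.first
end

puts "Readable: #{readable_formats.join(', ')}"
puts "Writable: #{writable_formats.join(', ')}"
```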
## Installation
Requires Ruby 2.2 or later. Add the following to your `Gemfile` to install the
latest version:
```ruby
gem 'bolognese'
```
Then run `bundle install` to install into your environment.
You can also install the gem system-wide in the usual way:
```bash
gem install bolognese
```
## Commands
Run the `bolognese` command with either an identifier (DOI or URL) or filename:
```
bolognese https://doi.org/10.7554/elife.01567
```
```
bolognese example.xml
```
Bolognese can read BibTeX files (file extension `.bib`), RIS files (file extension `.ris`), Crossref or DataCite XML files (file extension `.xml`), DataCite JSON files, and Citeproc JSON files.
The input format (e.g. Crossref XML or BibTeX) is automatically detected, but
you can also provide the format with the `--from` or `-f` flag. The supported
input formats are listed in the table above.
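To illustrate why the `--from` flag can still be useful, here is a rough sketch of extension-based guessing. It is not how bolognese detects formats internally; the constant and method names are hypothetical, and the mapping covers only the extensions named above. A `.xml` file may be either Crossref or DataCite XML, so the extension alone cannot settle it.

```ruby
# Hypothetical mapping from file extension to candidate input formats,
# based only on the extensions named in this README.
CANDIDATES_BY_EXTENSION = {
  ".bib" => ["bibtex"],
  ".ris" => ["ris"],
  ".xml" => ["crossref", "datacite"] # ambiguous: the content must be inspected
}.freeze

# All formats a file could be, judging by its extension only.
def candidate_formats(filename)
  CANDIDATES_BY_EXTENSION.fetch(File.extname(filename).downcase, [])
end

# Returns a format name only when the extension is unambiguous, else nil.
def format_from_extension(filename)
  candidates = candidate_formats(filename)
  candidates.length == 1 ? candidates.first : nil
end

%w[refs.bib export.RIS example.xml].each do |name|
  guess = format_from_extension(name) || "ambiguous or unknown, pass --from"
  puts "#{name}: #{guess}"
end
```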
The output format is determined by the `--to` or `-t` flag, and defaults to `schema_org`.
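When calling the command from a Ruby script, the flags can be assembled as an argument list. The helper below is a hypothetical sketch, not part of the gem. It only builds the command and does not run it, and it uses just the `--from` and `--to` flags described above.

```ruby
require "shellwords"

# Hypothetical helper: build the argument list for a bolognese call.
# Omitting `to` leaves the output format at its default (schema_org).
def bolognese_command(input, from: nil, to: nil)
  args = ["bolognese", input]
  args += ["--from", from] if from
  args += ["--to", to] if to
  args
end

command = bolognese_command("example.xml", from: "datacite", to: "bibtex")

# Print a shell-safe version of the command for logging or copy and paste.
puts Shellwords.join(command)
```

Passing the array form to `system(*command)` avoids shell quoting problems with unusual filenames.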
Show all commands with `bolognese help`:
```
Commands:
bolognese # convert metadata
bolognese --version, -v # print the version
bolognese help [COMMAND] # Describe available commands or one specific command
```
## Errors
Errors are returned to STDOUT.
All DataCite XML input is validated against the corresponding schema version (kernel 2.1, 2.2, 3, or 4).
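As a rough illustration of the schema versions involved, the sketch below extracts the kernel version from a schema URI such as `http://datacite.org/schema/kernel-4` and checks it against the versions listed above. The method names are hypothetical and the URI pattern is an assumption drawn from the examples in this README; bolognese's own validation works against the XML schema itself.

```ruby
# Kernel versions named in this README.
SUPPORTED_KERNELS = ["2.1", "2.2", "3", "4"].freeze

# Hypothetical helper: pull the version out of a DataCite schema URI.
# Assumes URIs of the form http://datacite.org/schema/kernel-<version>.
def kernel_version(schema_uri)
  match = schema_uri.match(%r{\Ahttps?://datacite\.org/schema/kernel-(\d+(?:\.\d+)?)/?\z})
  match && match[1]
end

# True when the URI names one of the kernel versions listed above.
def supported_kernel?(schema_uri)
  SUPPORTED_KERNELS.include?(kernel_version(schema_uri))
end

puts kernel_version("http://datacite.org/schema/kernel-4")
puts supported_kernel?("http://datacite.org/schema/kernel-3")
```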
## Examples
Read Crossref XML:
```
bolognese https://doi.org/10.7554/elife.01567 -t crossref
eLife
2050-084X
02
11
2014
3
Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth
Martial
Sankar
Kaisa
Nieminen
Laura
Ragni
Ioannis
Xenarios
Christian S
Hardtke
02
11
2014
10.7554/eLife.01567
1
eLifesciences
www.elifesciences.org
false
2013-09-20
2013-12-24
2014-02-11
SystemsX
EMBO
http://dx.doi.org/10.13039/501100003043
Swiss National Science Foundation
http://dx.doi.org/10.13039/501100001711
University of Lausanne
http://dx.doi.org/10.13039/501100006390
http://creativecommons.org/licenses/by/3.0/
http://creativecommons.org/licenses/by/3.0/
http://creativecommons.org/licenses/by/3.0/
10.7554/eLife.01567
http://elifesciences.org/lookup/doi/10.7554/eLife.01567
...
Sankar
2014
10.5061/dryad.b835k
...
...
```
Convert Crossref XML to schema.org/JSON-LD:
```
bolognese https://doi.org/10.7554/elife.01567
{
"@context": "http://schema.org",
"@type": "ScholarlyArticle",
"@id": "https://doi.org/10.7554/elife.01567",
"url": "http://elifesciences.org/lookup/doi/10.7554/eLife.01567",
"additionalType": "JournalArticle",
"name": "Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth",
"author": [{
"@type": "Person",
"givenName": "Martial",
"familyName": "Sankar"
}, {
"@type": "Person",
"givenName": "Kaisa",
"familyName": "Nieminen"
}, {
"@type": "Person",
"givenName": "Laura",
"familyName": "Ragni"
}, {
"@type": "Person",
"givenName": "Ioannis",
"familyName": "Xenarios"
}, {
"@type": "Person",
"givenName": "Christian S",
"familyName": "Hardtke"
}],
"license": "http://creativecommons.org/licenses/by/3.0/",
"datePublished": "2014-02-11",
"dateModified": "2015-08-11T05:35:02Z",
"isPartOf": {
"@type": "Periodical",
"name": "eLife",
"issn": "2050-084X"
},
"citation": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/nature02100",
"position": "1",
"datePublished": "2003"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1534/genetics.109.104976",
"position": "2",
"datePublished": "2009"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1034/j.1399-3054.2002.1140413.x",
"position": "3",
"datePublished": "2002"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1162/089976601750399335",
"position": "4",
"datePublished": "2001"
}, {
"@type": "CreativeWork",
"position": "5",
"datePublished": "1995"
}, {
"@type": "CreativeWork",
"position": "6",
"datePublished": "1993"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.semcdb.2009.09.009",
"position": "7",
"datePublished": "2009"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1242/dev.091314",
"position": "8",
"datePublished": "2013"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1371/journal.pgen.1002997",
"position": "9",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/msb.2010.25",
"position": "10",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.biosystems.2012.07.004",
"position": "11",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.pbi.2005.11.013",
"position": "12",
"datePublished": "2006"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1105/tpc.110.076083",
"position": "13",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1073/pnas.0808444105",
"position": "14",
"datePublished": "2008"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/0092-8674(89)90900-8",
"position": "15",
"datePublished": "1989"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1126/science.1066609",
"position": "16",
"datePublished": "2002"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1104/pp.104.040212",
"position": "17",
"datePublished": "2004"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/nbt1206-1565",
"position": "18",
"datePublished": "2006"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1073/pnas.77.3.1516",
"position": "19",
"datePublished": "1980"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1093/bioinformatics/btq046",
"position": "20",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1105/tpc.111.084020",
"position": "21",
"datePublished": "2011"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.b835k",
"position": "22",
"datePublished": "2014"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.cub.2008.02.070",
"position": "23",
"datePublished": "2008"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1111/j.1469-8137.2010.03236.x",
"position": "24",
"datePublished": "2010"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1007/s00138-011-0345-9",
"position": "25",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1016/j.cell.2012.02.048",
"position": "26",
"datePublished": "2012"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.1038/ncb2764",
"position": "27",
"datePublished": "2013"
}],
"funder": [{
"@type": "Organization",
"name": "SystemsX"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100003043",
"name": "EMBO"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100001711",
"name": "Swiss National Science Foundation"
}, {
"@type": "Organization",
"@id": "https://doi.org/10.13039/501100006390",
"name": "University of Lausanne"
}],
"provider": {
"@type": "Organization",
"name": "Crossref"
}
}
```
Convert Crossref XML to DataCite XML:
```
bolognese https://doi.org/10.7554/elife.01567 -t datacite
10.7554/eLife.01567
Sankar, Martial
Martial
Sankar
Nieminen, Kaisa
Kaisa
Nieminen
Ragni, Laura
Laura
Ragni
Xenarios, Ioannis
Ioannis
Xenarios
Hardtke, Christian S
Christian S
Hardtke
Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth
eLife
2014
JournalArticle
SystemsX
EMBO
https://doi.org/10.13039/501100003043
Swiss National Science Foundation
https://doi.org/10.13039/501100001711
University of Lausanne
https://doi.org/10.13039/501100006390
2014-02-11
2015-08-11T05:35:02Z
https://doi.org/10.1038/nature02100
https://doi.org/10.1534/genetics.109.104976
https://doi.org/10.1034/j.1399-3054.2002.1140413.x
https://doi.org/10.1162/089976601750399335
https://doi.org/10.1016/j.semcdb.2009.09.009
https://doi.org/10.1242/dev.091314
https://doi.org/10.1371/journal.pgen.1002997
https://doi.org/10.1038/msb.2010.25
https://doi.org/10.1016/j.biosystems.2012.07.004
https://doi.org/10.1016/j.pbi.2005.11.013
https://doi.org/10.1105/tpc.110.076083
https://doi.org/10.1073/pnas.0808444105
https://doi.org/10.1016/0092-8674(89)90900-8
https://doi.org/10.1126/science.1066609
https://doi.org/10.1104/pp.104.040212
https://doi.org/10.1038/nbt1206-1565
https://doi.org/10.1073/pnas.77.3.1516
https://doi.org/10.1093/bioinformatics/btq046
https://doi.org/10.1105/tpc.111.084020
https://doi.org/10.5061/dryad.b835k
https://doi.org/10.1016/j.cub.2008.02.070
https://doi.org/10.1111/j.1469-8137.2010.03236.x
https://doi.org/10.1007/s00138-011-0345-9
https://doi.org/10.1016/j.cell.2012.02.048
https://doi.org/10.1038/ncb2764
Creative Commons Attribution 3.0 (CC-BY 3.0)
```
Convert Crossref XML to BibTeX:
```
bolognese https://doi.org/10.7554/elife.01567 -t bibtex
@article{https://doi.org/10.7554/elife.01567,
doi = {10.7554/eLife.01567},
url = {http://elifesciences.org/lookup/doi/10.7554/eLife.01567},
author = {Sankar, Martial and Nieminen, Kaisa and Ragni, Laura and Xenarios, Ioannis and Hardtke, Christian S},
title = {Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth},
journal = {eLife},
year = {2014}
}
```
Read DataCite XML:
```
bolognese 10.5061/DRYAD.8515 -t datacite
10.5061/DRYAD.8515
1
Ollomo, Benjamin
Durand, Patrick
Prugnolle, Franck
Douzery, Emmanuel J. P.
Arnathau, Céline
Nkoghe, Dieudonné
Leroy, Eric
Renaud, François
Data from: A new malaria agent in African hominids.
Dryad Digital Repository
2011
Phylogeny
Malaria
Parasites
Taxonomy
Mitochondrial genome
Africa
Plasmodium
DataPackage
Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.
10.5061/DRYAD.8515/1
10.5061/DRYAD.8515/2
10.1371/JOURNAL.PPAT.1000446
19478877
```
Convert DataCite XML to schema.org/JSON-LD:
```sh
bolognese 10.5061/DRYAD.8515
{
"@context": "http://schema.org",
"@type": "Dataset",
"@id": "https://doi.org/10.5061/dryad.8515",
"additionalType": "DataPackage",
"name": "Data from: A new malaria agent in African hominids.",
"alternateName": "Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.",
"author": [{
"@type": "Person",
"givenName": "Benjamin",
"familyName": "Ollomo"
}, {
"@type": "Person",
"givenName": "Patrick",
"familyName": "Durand"
}, {
"@type": "Person",
"givenName": "Franck",
"familyName": "Prugnolle"
}, {
"@type": "Person",
"givenName": "Emmanuel J. P.",
"familyName": "Douzery"
}, {
"@type": "Person",
"givenName": "Céline",
"familyName": "Arnathau"
}, {
"@type": "Person",
"givenName": "Dieudonné",
"familyName": "Nkoghe"
}, {
"@type": "Person",
"givenName": "Eric",
"familyName": "Leroy"
}, {
"@type": "Person",
"givenName": "François",
"familyName": "Renaud"
}],
"license": "http://creativecommons.org/publicdomain/zero/1.0/",
"version": "1",
"keywords": "Phylogeny, Malaria, Parasites, Taxonomy, Mitochondrial genome, Africa, Plasmodium",
"datePublished": "2011",
"hasPart": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.8515/1"
}, {
"@type": "CreativeWork",
"@id": "https://doi.org/10.5061/dryad.8515/2"
}],
"citation": [{
"@type": "CreativeWork",
"@id": "https://doi.org/10.1371/journal.ppat.1000446"
}],
"schemaVersion": "http://datacite.org/schema/kernel-3",
"publisher": {
"@type": "Organization",
"name": "Dryad Digital Repository"
},
"provider": {
"@type": "Organization",
"name": "DataCite"
}
}
```
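Note that the bare DOI `10.5061/DRYAD.8515` given on the command line appears in the output as the lowercase URL `https://doi.org/10.5061/dryad.8515`. A minimal sketch of that kind of normalization is shown below. The method `doi_as_url` and its regular expressions are assumptions for illustration and are not taken from the bolognese source.

```ruby
# Hypothetical helper: turn a DOI (bare, doi:-prefixed, or as a doi.org URL)
# into a lowercase https://doi.org/ URL. Returns nil for non-DOI input.
def doi_as_url(input)
  # Strip a leading resolver URL or "doi:" prefix, if present.
  doi = input.strip.sub(%r{\A(?:https?://(?:dx\.)?doi\.org/|doi:)}i, "")

  # A DOI starts with "10.", a numeric registrant code, a slash, and a suffix.
  return nil unless doi =~ %r{\A10\.\d{4,9}/\S+\z}

  "https://doi.org/#{doi.downcase}"
end

puts doi_as_url("10.5061/DRYAD.8515")
puts doi_as_url("http://dx.doi.org/10.13039/501100003043")
```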
Convert DataCite XML to schema version 4.0:
```
bolognese 10.5061/DRYAD.8515 -t datacite --schema_version http://datacite.org/schema/kernel-4
10.5061/DRYAD.8515
Ollomo, Benjamin
Benjamin
Ollomo
Durand, Patrick
Patrick
Durand
Prugnolle, Franck
Franck
Prugnolle
Douzery, Emmanuel J. P.
Emmanuel J. P.
Douzery
Arnathau, Céline
Céline
Arnathau
Nkoghe, Dieudonné
Dieudonné
Nkoghe
Leroy, Eric
Eric
Leroy
Renaud, François
François
Renaud
Data from: A new malaria agent in African hominids.
Dryad Digital Repository
2011
DataPackage
Ollomo B, Durand P, Prugnolle F, Douzery EJP, Arnathau C, Nkoghe D, Leroy E, Renaud F (2009) A new malaria agent in African hominids. PLoS Pathogens 5(5): e1000446.
Phylogeny
Malaria
Parasites
Taxonomy
Mitochondrial genome
Africa
Plasmodium
2011
https://doi.org/10.5061/dryad.8515/1
https://doi.org/10.5061/dryad.8515/2
https://doi.org/10.1371/journal.ppat.1000446
1
Public Domain (CC0 1.0)
```
Convert DataCite XML to Codemeta:
```
bolognese https://doi.org/10.5063/f1m61h5x -t codemeta
{
"@context":"https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
"@type":"SoftwareSourceCode",
"@id":"https://doi.org/10.5063/f1m61h5x",
"identifier":"https://doi.org/10.5063/f1m61h5x",
"title":"dataone: R interface to the DataONE network of data repositories",
"agents":{
"@type":"Person",
"givenName":"Matthew B.",
"familyName":"Jones"
},
"datePublished":"2016",
"publisher":{
"@type":"Organization",
"name":"KNB Data Repository"
}
}
```
Convert DataCite XML to BibTeX:
```
bolognese 10.5061/DRYAD.8515 -t bibtex
@misc{https://doi.org/10.5061/dryad.8515,
doi = {10.5061/DRYAD.8515},
author = {Ollomo, Benjamin and Durand, Patrick and Prugnolle, Franck and Douzery, Emmanuel J. P. and Arnathau, Céline and Nkoghe, Dieudonné and Leroy, Eric and Renaud, François},
keywords = {Phylogeny, Malaria, Parasites, Taxonomy, Mitochondrial genome, Africa, Plasmodium},
title = {Data from: A new malaria agent in African hominids.},
publisher = {Dryad Digital Repository},
year = {2011}
}
```
Convert schema.org/JSON-LD to DataCite XML:
```
bolognese https://blog.datacite.org/eating-your-own-dog-food -t datacite
10.5438/4k3m-nyvg
Fenner, Martin
Martin
Fenner
http://orcid.org/0000-0003-1419-2405
Eating your own Dog Food
DataCite
2016
BlogPosting
MS-49-3632-5083
datacite
doi
metadata
featured
2016-12-20
2016-12-20
2016-12-20
https://doi.org/10.5438/0000-00ss
https://doi.org/10.5438/0012
https://doi.org/10.5438/55e5-t5c0
1.0
Eating your own dog food is a slang term to describe that an organization should itself use the products and services it provides. For DataCite this means that we should use DOIs with appropriate metadata and strategies for long-term preservation for...
```
Convert schema.org/JSON-LD to BibTeX:
```
bolognese https://blog.datacite.org/eating-your-own-dog-food -t bibtex
@article{https://doi.org/10.5438/4k3m-nyvg,
doi = {10.5438/4k3m-nyvg},
url = {https://blog.datacite.org/eating-your-own-dog-food},
author = {Fenner, Martin},
keywords = {datacite, doi, metadata, featured},
title = {Eating your own Dog Food},
publisher = {DataCite},
year = {2016}
}
```
Convert Codemeta to schema.org/JSON-LD:
```
bolognese https://github.com/datacite/maremma
{
"@context":"http://schema.org",
"@type":"SoftwareSourceCode",
"@id":"https://doi.org/10.5438/qeg0-3gm3",
"url":"https://github.com/datacite/maremma",
"name":"Maremma: a Ruby library for simplified network calls",
"author":{
"@type":"person",
"@id":"http://orcid.org/0000-0003-0077-4738",
"name":"Martin Fenner"
},
"description":"Simplifies network calls, including json/xml parsing and error handling. Based on Faraday.",
"keywords":"faraday, excon, net/http",
"dateCreated":"2015-11-28",
"datePublished":"2017-02-24",
"dateModified":"2017-02-24",
"publisher":{
"@type":"Organization",
"name":"DataCite"
}
}
```
Convert Codemeta to DataCite XML:
```
bolognese https://github.com/datacite/maremma -t datacite
10.5438/qeg0-3gm3
Martin Fenner
http://orcid.org/0000-0003-0077-4738
Maremma: a Ruby library for simplified network calls
DataCite
2017
SoftwareSourceCode
faraday
excon
net/http
2015-11-28
2017-02-24
2017-02-24
Simplifies network calls, including json/xml parsing and error handling. Based on Faraday.
```
## Development
We use rspec for unit testing:
```
bundle exec rspec
```
Follow along via [GitHub Issues](https://github.com/datacite/bolognese/issues).
Please open an issue if conversion fails or metadata are not properly supported.
### Note on Patches/Pull Requests
* Fork the project
* Write tests for your new feature or a test that reproduces a bug
* Implement your feature or make a bug fix
* Do not mess with Rakefile, version or history
* Commit, push and make a pull request. Bonus points for topical branches.
## License
**bolognese** is released under the [MIT License](https://github.com/datacite/bolognese/blob/master/LICENSE.md).