README.md in feedjira-podcast-0.9.8 vs README.md in feedjira-podcast-0.9.9
- old
+ new
@@ -40,13 +40,13 @@
In cases where an element can repeat in a given context, the accessor will be pluralized (e.g. `<category>` becomes `categories`) and the value is an array, even if only a single element exists in a given feed.
#### Typecasting
-In nearly all cases the value types of podcast feed data are predictable. As such, the parser will cast values appropriately. In cases where a feed is malformed, the result of the type cast may be somewhat unpredictable, but no more so than the original data was. A bad value should never prevent parsing of the rest of the document, but it bust a specific value (e.g. a date that is human-readable but not parseable will end up as `nil` rather than fall back to an ambiguous `String` value).
+In nearly all cases the value types of podcast feed data are predictable. As such, the parser will cast values appropriately. In cases where a feed is malformed, the result of the type cast may be somewhat unpredictable, but no more so than the original data was. A bad value should never prevent parsing of the rest of the document, but it could bust a specific value (e.g. a date that is human-readable but not parseable will end up as `nil` rather than fall back to an ambiguous `String` value).
-Date values are parsed as `Time` objects, number values become `Float` objects, hrefs, URIs and URLs are parsed using [Addressable](https://github.com/sporkmonger/addressable), and boolean values will return `true` or `false.`
+Date values are parsed as `Time` objects, number values become `Float` objects, hrefs, URIs and URLs are parsed using [Addressable](https://github.com/sporkmonger/addressable), and boolean values will return `true` or `false`.
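The casting rules above can be sketched in plain Ruby. This is only an illustration, not the gem's implementation: the helper names `cast_time`, `cast_float`, and `cast_boolean` are hypothetical, and treating `"yes"` and `"true"` as the truthy strings is an assumption.

```ruby
require "time"

# Hypothetical helpers, not part of feedjira-podcast.

# Returns a Time, or nil when the string cannot be parsed as a date.
def cast_time(value)
  Time.parse(value.to_s)
rescue ArgumentError
  nil
end

# Returns a Float, or nil when the string is not numeric.
def cast_float(value)
  Float(value.to_s.strip)
rescue ArgumentError
  nil
end

# Assumption: "yes" and "true" (case-insensitive) count as true.
def cast_boolean(value)
  %w[yes true].include?(value.to_s.strip.downcase)
end
```

The point of the sketch is the failure mode: an unparseable value becomes `nil` instead of raising or leaking the raw string.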
In the case of the `<itunes:explicit>` tag, there are three mutually exclusive options, `"yes"`, `"clean"`, and any other value (representing `"no"`). This element gets expanded to two properties through parsing, `explicit?` and `clean?`, allowing both to be simple boolean values.
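As a rough sketch of that expansion, a single raw value can be mapped onto two booleans. `ExplicitFlags` and `explicit_flags` are hypothetical names, and the trimming and case-insensitive comparison are assumptions rather than documented parser behavior.

```ruby
# Hypothetical value object holding the two derived booleans.
ExplicitFlags = Struct.new(:explicit, :clean) do
  def explicit?
    explicit
  end

  def clean?
    clean
  end
end

# Expands one <itunes:explicit> value into two mutually exclusive flags.
# Any value other than "yes" or "clean" leaves both flags false ("no").
def explicit_flags(raw)
  value = raw.to_s.strip.downcase
  ExplicitFlags.new(value == "yes", value == "clean")
end
```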
The spec for the item `<guid>` indicates that the default value if no value is given for the `isPermaLink` attribute is `true`. It also states that when `isPermaLink` is true, the value of `<guid>` represents a URI (and that the client can choose to treat it that way or not). As such, whenever a `<guid>` is acting as a `permaLink`, whether implicitly or explicitly, the `guid` value will return an `Addressable::URI`. Only if `isPermaLink` has a value of `"false"` will `guid` return a string.
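The `isPermaLink` rule can be illustrated with a small helper. To keep the sketch dependency-free it uses Ruby's standard-library `URI` in place of `Addressable::URI`, so the return type differs from what the parser produces; `cast_guid` is a hypothetical name, and handling of invalid URIs is deliberately left out.

```ruby
require "uri"

# Hypothetical helper; stdlib URI stands in for Addressable::URI here.
# A missing isPermaLink attribute (nil) is treated as "true", per the RSS spec.
def cast_guid(text, is_perma_link = nil)
  return text if is_perma_link.to_s.strip.downcase == "false"

  URI.parse(text) # no error handling: an invalid URI raises in this sketch
end
```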
@@ -54,14 +54,18 @@
The parser will try its best to parse whatever data it has available, with a strong preference for data that meets the specs commonly used for podcast feeds. The library will provide additional information that you can use to decide how to handle the data produced by the parser, but the parser itself will never refuse a document based on deficiencies or invalid data.
Besides standard typecasting, the parser won't try to clean up any data. For example, many elements (such as `<description>`) explicitly allow only plaintext, but it is common for them to include markup in the wild. The parser will assume that markup to be plaintext, according to the spec. If you want to sanitize those parts of feeds, you should handle that on your end.
+#### Parsing
+
+During parsing, some aspects of the original feed are not preserved, so a functionally identical feed cannot always be regenerated from the result. For example, `CDATA` declarations in XML elements are lost, as the parser converts them (correctly) to strings. If the resulting data is being used to generate feeds, wrapping values in CDATA is the responsibility of whatever is constructing the feed, based on the values of the strings. Similarly, Apple expects iTunes category tag values to include encoded ampersands, e.g. `TV &amp; Film`. Parsing the feed will decode those values (to `TV & Film`), so they would have to be re-encoded before being used somewhere that iTunes would be reading from.
+
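If parsed values are fed back into a generated feed, the re-encoding described above might look like the following. Both helpers are hypothetical and live outside the gem; `xml_text` relies on `String#encode` from Ruby core, and `cdata` splits any literal `]]>` so it cannot close the section early.

```ruby
# Hypothetical helpers for code that regenerates a feed from parsed values.

# Re-encodes &, < and > as XML entities.
def xml_text(value)
  value.encode(xml: :text)
end

# Wraps a string in a CDATA section, splitting any "]]>" inside it.
def cdata(value)
  safe = value.gsub("]]>", "]]]]><![CDATA[>")
  "<![CDATA[#{safe}]]>"
end

xml_text("TV & Film")      # => "TV &amp; Film"
cdata("<p>Show notes</p>") # => "<![CDATA[<p>Show notes</p>]]>"
```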
### More Information
For more detailed information about specific aspects of feeds, how they are spec'd, and how they are handled by the parser, see the [wiki](https://github.com/scour/feedjira-podcast/wiki).
## In Progress
-Coverage of RSS, iTunes, and the other common constituents of podcast feeds is very high, but there are some bits that need to be addressed. Several rarely-used RSS elements (`<cloud>`, `<rating>`, etc) are not supported. Due to how they can be nested `<itunes:category>` is also a work in progress. Currently only top-level categories are available. More esoteric elements, such as host-specific tags, or various parts of Dublin Core, are added based on their prevalence in real world feeds.
+Coverage of RSS, iTunes, and the other common constituents of podcast feeds is very high, but there are some bits that need to be addressed. Several rarely-used RSS elements (`<rating>`, etc) are not supported. More esoteric elements, such as host-specific tags, or various parts of Dublin Core, are added based on their prevalence in real world feeds.
Experimental feed elements may be added over time, but they should be used with caution until they reach a critical mass or become standardized.