Changes in version 3.0
Universal Feed Parser 3.0 was released on June 21, 2004.
- don’t try iso-8859-1 (can’t distinguish between iso-8859-1 and windows-1252 anyway, and most incorrectly marked feeds are windows-1252)
- fixed regression that could cause the same encoding to be tried twice (even if it failed the first time)
Universal Feed Parser 3.0fc3 was released on June 18, 2004.
- fixed bug in _changeEncodingDeclaration that failed to parse UTF-16 encoded feeds
- made source into a FeedParserDict
- duplicate admin:generatorAgent/@rdf:resource in
- added support for image
- refactored parse() fallback logic to try other encodings if SAX parsing fails (previously it would only try other encodings if re-encoding failed)
- removed unichr madness in normalize_attrs now that we’re properly tracking encoding in and out of BaseHTMLProcessor
- set feed.language from root-level xml:lang
Universal Feed Parser 3.0fc2 was released on May 10, 2004.
- added and passed Sam’s amp tests
- added and passed my blink tag tests
Universal Feed Parser 3.0fc1 was released on April 23, 2004.
- fixed typo that could cause the same encoding to be tried twice (even if it failed the first time)
- fixed DOCTYPE stripping when DOCTYPE contained entity declarations
- better textinput and image tracking in ill-formed RSS 1.0 feeds
Universal Feed Parser 3.0b23 was released on April 21, 2004.
- fixed UnicodeDecodeError for feeds that contain high-bit characters in attributes in embedded HTML in description (thanks Thijs van de Vossen)
- date_parsed to mapped keys in FeedParserDict
- tweaked FeedParserDict.has_key to return True if asking about a mapped key
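
The mapped-key behavior mentioned above can be pictured with a small dict subclass. This is only a sketch under stated assumptions: the class name MappedDict, the alias table, and the target key modified_parsed are hypothetical and chosen for illustration; they are not FeedParserDict's actual implementation.

```python
class MappedDict(dict):
    """Hypothetical stand-in for a dict with mapped (aliased) keys."""

    # Illustrative alias table (assumption): alias -> key the value is stored under.
    keymap = {"date_parsed": "modified_parsed"}

    def __getitem__(self, key):
        # Resolve an alias to its real key before the lookup.
        return dict.__getitem__(self, self.keymap.get(key, key))

    def __contains__(self, key):
        return dict.__contains__(self, self.keymap.get(key, key))

    def has_key(self, key):
        # True for a mapped key whenever the key it maps to is present.
        return key in self


entry = MappedDict(modified_parsed=(2004, 4, 21))
print(entry.has_key("date_parsed"))  # asks about the alias, not the stored key
```

Routing both lookup and membership through the same alias table keeps the two answers consistent with each other.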
Universal Feed Parser 3.0b22 was released on April 19, 2004.
- changed results dict to allow getting values with results.key as well as results[key]
- work around embedded ill-formed HTML with half a DOCTYPE
- work around malformed
- if character encoding is wrong, try several common ones before falling back to regexes (if this works, bozo_exception is set to
- fixed character encoding issues in BaseHTMLProcessor by tracking encoding and converting from Unicode to raw strings before feeding data to sgmllib.SGMLParser
- convert each value in results to Unicode (if possible), even if using regex-based parsing
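
The "try several common encodings" idea in this release can be sketched roughly as follows. The function name decode_with_fallback, the candidate list, and the return shape are all assumptions made for illustration; the parser's real fallback logic is not shown in this changelog. The sketch also skips any encoding it has already tried, which is the kind of repeat the later fixes in this series guard against.

```python
def decode_with_fallback(data, declared, candidates=("utf-8", "windows-1252")):
    """Decode bytes using the declared encoding, then each candidate in turn.

    Returns (text, encoding_used, overridden), or (None, None, False)
    if nothing worked. Hypothetical helper, not the library's code.
    """
    declared_name = (declared or "").lower()
    tried = set()
    for encoding in (declared_name,) + tuple(candidates):
        name = encoding.lower()
        if not name or name in tried:
            continue  # never try the same encoding twice
        tried.add(name)
        try:
            text = data.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown encoding: move on to the next one
        return text, name, name != declared_name
    return None, None, False


raw = "café".encode("windows-1252")
text, used, overridden = decode_with_fallback(raw, "utf-8")
print(text, used, overridden)
```

The third return value plays the role of a flag telling the caller that the declared encoding was wrong even though decoding ultimately succeeded.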
Universal Feed Parser 3.0b21 was released on April 14, 2004.
- added Hot RSS support
Universal Feed Parser 3.0b20 was released on April 7, 2004.
- added CDF support
Universal Feed Parser 3.0b19 was released on March 15, 2004.
- fixed bug exploding author information when author name was in parentheses
- removed ultra-problematic
- patch to work around crash in PyXML/expat when encountering invalid entities (MarkMoraes)
- support for textinput/textInput
Universal Feed Parser 3.0b18 was released on February 17, 2004.
- always map description to
Universal Feed Parser 3.0b17 was released on February 13, 2004.
- determine character encoding as per RFC 3023
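
As a rough illustration of the RFC 3023 rules, the sketch below picks an encoding from a Content-Type header value and the start of the document. It is simplified and makes several assumptions: the function name rfc3023_encoding is hypothetical, the media type is assumed to be an XML type, and byte order marks are ignored. It is not the parser's actual code.

```python
import re


def rfc3023_encoding(content_type, xml_head):
    """Choose an encoding from an HTTP Content-Type value and leading bytes."""
    parts = [part.strip() for part in content_type.split(";")]
    media_type = parts[0].lower()
    charset = None
    for param in parts[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            charset = value.strip().strip("\"'").lower()
    if charset:
        # An explicit charset parameter takes precedence.
        return charset
    if media_type.startswith("text/"):
        # text/xml without a charset defaults to us-ascii;
        # the XML declaration is not consulted.
        return "us-ascii"
    # application/xml and similar: fall back to the XML declaration.
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', xml_head)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


head = b'<?xml version="1.0" encoding="iso-8859-1"?>'
print(rfc3023_encoding("text/xml", head))
print(rfc3023_encoding("application/xml", head))
```

The two calls at the end show why the media type matters: the same document bytes can yield different encodings depending on how the server labels them.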
Universal Feed Parser 3.0b16 was released on February 12, 2004.
- fixed support for RSS 0.90 (broken in b15)
Universal Feed Parser 3.0b15 was released on February 11, 2004.
- fixed bug resolving relative links in wfw:commentRSS
- fixed bug capturing author and contributor URI
- fixed bug resolving relative links in author and contributor URI
- fixed bug resolving relative links in generator URI
- added support for recognizing RSS 1.0
- passed Simon Fell’s namespace tests, and included them permanently in the test suite with his permission
- fixed namespace handling under Python 2.1
Universal Feed Parser 3.0b14 was released on February 8, 2004.
- fixed CDATA handling in non-wellformed feeds under Python 2.1
Universal Feed Parser 3.0b13 was released on February 8, 2004.
- better handling of empty HTML tags (br, hr, img, etc.) in embedded markup, in either HTML or XHTML form (<br>, <br/>, <br />)
Universal Feed Parser 3.0b12 was released on February 6, 2004.
- fiddled with decodeEntities (still not right)
- added support for Atom 0.2 subtitle
- added support for Atom content model in copyright
- better sanitizing of dangerous HTML elements with end tags (script, frameset)
Universal Feed Parser 3.0b11 was released on February 2, 2004.
- added rights to list of elements that can contain dangerous markup
- fiddled with
- liberalized date parsing even further
Universal Feed Parser 3.0b10 was released on January 31, 2004.
- incorporated ISO-8601 date parsing routines from
Universal Feed Parser 3.0b9 was released on January 29, 2004.
- fixed check for presence of
- added support for summary
Universal Feed Parser 3.0b8 was released on January 28, 2004.
- added support for contributor
Universal Feed Parser 3.0b7 was released on January 28, 2004.
- support Atom-style author element in
author contains name + email address
Universal Feed Parser 3.0b6 was released on January 27, 2004.
- added feed type and version detection, result['version'] will be one of SUPPORTED_VERSIONS.keys() or empty string if unrecognized
- added support for creativeCommons:license and cc:license
- added support for full Atom content model in title, tagline, info, copyright, summary
- fixed bug with gzip encoding (not always telling server we support it when we do)
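
The version-detection contract described above (a known key, or an empty string when the feed type is unrecognized) might look something like this in miniature. The table contents, the key naming scheme, and the function detect_version are assumptions for illustration only; the real SUPPORTED_VERSIONS table and detection logic are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Illustrative subset (assumption); the real table is defined by the library.
SUPPORTED_VERSIONS = {"rss20": "RSS 2.0", "rss091": "RSS 0.91"}


def detect_version(xml_text):
    """Return a key of SUPPORTED_VERSIONS, or "" if the feed is unrecognized."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # Build a key such as "rss20" from the version attribute "2.0".
        key = "rss" + root.get("version", "").replace(".", "")
        if key in SUPPORTED_VERSIONS:
            return key
    return ""


print(detect_version('<rss version="2.0"><channel/></rss>'))
print(repr(detect_version("<unknown/>")))
```

Returning an empty string instead of raising lets callers test the result with a plain truthiness check.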
Universal Feed Parser 3.0b5 was released on January 26, 2004.
- fixed bug parsing multiple links at feed level
Universal Feed Parser 3.0b4 was released on January 26, 2004.
- fixed xml:lang inheritance
- fixed multiple bugs tracking xml:base URI, one for documents that don’t define one explicitly and one for documents that define an outer and an inner xml:base that goes out of scope before the end of the document
Universal Feed Parser 3.0b3 was released on January 23, 2004.
- parse entire feed with real XML parser (if available)
- added several new supported namespaces
- fixed bug tracking naked markup in description
- added support for enclosure
- added support for source
- re-added support for cloud which got dropped somehow
- added support for expirationDate
Universal Feed Parser 3.0b2 and 3.0b1 have been lost in the mists of time.