
ParseError: Undefined Entity While Parsing XML File in Python

I have a big XML file with several article nodes. I have included only the one that causes the problem. When I try to parse it in Python to filter some data, I get the error File '

Solution 1:

The declaration of the Ouml entity is presumably in the DTD (dblp.dtd), but ElementTree does not support external DTDs. ElementTree only recognizes entities declared directly in the XML file (in the "internal subset"). This is a working example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp [
<!ENTITY Ouml 'Ö'>
]>
<dblp>
  <article mdate="2019-10-25" key="tr/gte/TR-0146-06-91-165" publtype="informal">
    <author>Alejandro P. Buchmann</author>
    <author>M. Tamer &Ouml;zsu</author>
    <author>Dimitrios Georgakopoulos</author>
    <title>Towards a Transaction Management System for DOM.</title>
    <journal>GTE Laboratories Incorporated</journal>
    <volume>TR-0146-06-91-165</volume>
    <month>June</month>
    <year>1991</year>
    <url>db/journals/gtelab/index.html#TR-0146-06-91-165</url>
  </article>
</dblp>
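To check that an entity declared in the internal subset is enough for ElementTree, here is a minimal sketch that parses a shortened copy of the example above from an in-memory byte string. The variable names (XML_TEXT, root, authors) are only illustrative, and the document is trimmed to a few child elements to keep it short.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document. The Ouml entity is declared in the
# internal subset, so the parser can expand it without reading any DTD file.
XML_TEXT = """<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp [
<!ENTITY Ouml 'Ö'>
]>
<dblp>
<article mdate="2019-10-25" key="tr/gte/TR-0146-06-91-165" publtype="informal">
<author>Alejandro P. Buchmann</author>
<author>M. Tamer &Ouml;zsu</author>
<title>Towards a Transaction Management System for DOM.</title>
<year>1991</year>
</article>
</dblp>
"""

# Pass bytes in the declared encoding so the XML declaration matches the data.
root = ET.fromstring(XML_TEXT.encode("iso-8859-1"))

# Collect the author names; the entity reference is expanded during parsing.
authors = [author.text for author in root.iter("author")]
print(authors)
```

If the declaration of Ouml is removed from the internal subset, the same call fails with the undefined entity ParseError described in the question.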

To parse the XML file in the question without errors, you need a more powerful XML library that supports external DTDs. lxml is a good choice for that.
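A minimal sketch of the lxml approach follows. It assumes that lxml is installed and that the XML file refers to its DTD with a declaration such as <!DOCTYPE dblp SYSTEM "dblp.dtd">, which is a guess because the question's file header is not shown. To stay self-contained, the sketch writes a tiny stand-in DTD and a one-article XML file to a temporary directory; the file contents and the helper name parse_with_external_dtd are hypothetical.

```python
import tempfile
from pathlib import Path

from lxml import etree

# Stand-in for dblp.dtd (assumption): it declares only the entity needed here.
DTD_TEXT = '<!ENTITY Ouml "&#214;">\n'

# Stand-in for the large XML file; it points at the external DTD by file name.
XML_TEXT = """<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<article key="tr/gte/TR-0146-06-91-165">
<author>M. Tamer &Ouml;zsu</author>
</article>
</dblp>
"""


def parse_with_external_dtd(xml_path):
    """Parse xml_path, letting lxml read the external DTD it references."""
    # load_dtd=True makes the parser load the DTD, so entities declared
    # there are known when the document body is parsed.
    parser = etree.XMLParser(load_dtd=True)
    return etree.parse(str(xml_path), parser)


with tempfile.TemporaryDirectory() as tmp:
    # The DTD must sit where the SYSTEM identifier points: next to the XML file.
    (Path(tmp) / "dblp.dtd").write_text(DTD_TEXT, encoding="ascii")
    xml_path = Path(tmp) / "dblp.xml"
    xml_path.write_bytes(XML_TEXT.encode("iso-8859-1"))

    tree = parse_with_external_dtd(xml_path)
    authors = [author.text for author in tree.iter("author")]

print(authors)
```

With the real data, only the parser setup and the etree.parse call would be needed, provided dblp.dtd is stored where the DOCTYPE declaration expects it. Loading DTDs is best limited to files from a trusted source.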
