How To Fetch Content Of Xml Root Element In Python?
I have an XML file, e.g.: First line. Second line. As an output I want to get: '\nFirst lin
Solution 1:
The first thing I came up with:
from xml.etree.ElementTree import fromstring, tostring
source = '''<?xml version="1.0" encoding="UTF-8"?>
<root>
First line.<br/>Second line.
</root>
'''
xml = fromstring(source)
result = tostring(xml, encoding='unicode').lstrip('<%s>' % xml.tag).rstrip('</%s>' % xml.tag)
print(result)
# output:
#
# First line.<br />Second line.
#
But this is not a truly general-purpose approach, since it fails if the opening root tag (<root>) contains any attributes.
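To make the attribute failure concrete, here is a minimal sketch that runs the same stripping trick on a root element with one attribute. The attribute name `attr1` and the Python 3 call `tostring(..., encoding='unicode')` are my assumptions for illustration.

```python
from xml.etree.ElementTree import fromstring, tostring

source = '<root attr1="val1">First line.<br/>Second line.</root>'
xml = fromstring(source)

# Serialize back to text (Python 3: encoding='unicode' yields str, not bytes).
serialized = tostring(xml, encoding='unicode')

# Same stripping trick as above: the leading characters '<root' are removed,
# but stripping stops at the space before the attribute.
result = serialized.lstrip('<%s>' % xml.tag).rstrip('</%s>' % xml.tag)
print(result)
```

The attribute and the closing `>` of the opening tag are left at the front of the result, so the output is no longer just the element content.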
UPDATE: This approach has another issue. Since lstrip and rstrip remove any combination of the given characters rather than a literal prefix or suffix, you can run into a problem like this:
# input:
<?xml version="1.0" encoding="UTF-8"?><root><p>First line</p></root>
# result:
p>First line</p
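The behaviour above can be reproduced in isolation. This sketch contrasts the character-set stripping with a literal prefix/suffix removal done by slicing. The variable names are hypothetical, and the literal variant still assumes an opening tag without attributes.

```python
serialized = '<root><p>First line</p></root>'
tag = 'root'

# lstrip/rstrip treat their argument as a set of characters, so the '<' of
# '<p>' and the '>' of '</p>' are eaten along with the root tags.
stripped = serialized.lstrip('<%s>' % tag).rstrip('</%s>' % tag)
print(stripped)  # p>First line</p

# Literal removal: cut exactly the opening and closing tag, nothing more.
opening, closing = '<%s>' % tag, '</%s>' % tag
if serialized.startswith(opening) and serialized.endswith(closing):
    inner = serialized[len(opening):len(serialized) - len(closing)]
else:
    inner = serialized  # tags not in the expected literal form; leave as-is
print(inner)  # <p>First line</p>
```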
If you really need only the literal string between the opening and closing tags (as you mentioned in the comment), you can use this:
from xml.etree.ElementTree import fromstring, tostring
source = '''<?xml version="1.0" encoding="UTF-8"?>
<root attr1="val1">
First line.<br/>Second line.
</root>
'''
# The following two lines are only needed to cut off
# the declaration, doctypes, etc.
xml = fromstring(source)
xml_str = tostring(xml, encoding='unicode')
start = xml_str.index('>')   # end of the opening root tag
end = xml_str.rindex('<')    # start of the closing root tag
result = xml_str[start + 1:end]
Not the most elegant approach, but unlike the previous one it works correctly with attributes in the opening tag, as well as with any valid XML document.
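As an alternative to searching the serialized string, the inner content can be assembled from the element tree itself. This is a sketch under my own assumptions, not part of the approach above: `inner_xml` is a hypothetical helper, and it relies on `tostring` including each child's tail text. Documents that use namespaces may get extra `xmlns` declarations on the serialized children.

```python
from xml.etree.ElementTree import fromstring, tostring

def inner_xml(root):
    """Return the content of root as a string, without root's own tags."""
    # Text that appears before the first child element (may be None).
    parts = [root.text or '']
    # Each child is serialized together with the tail text that follows it.
    for child in root:
        parts.append(tostring(child, encoding='unicode'))
    return ''.join(parts)

source = '''<?xml version="1.0" encoding="UTF-8"?>
<root attr1="val1">
First line.<br/>Second line.
</root>
'''
content = inner_xml(fromstring(source))
print(content)
```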
Solution 2:
Parse from file:
from xml.etree.ElementTree import parse
tree = parse('yourxmlfile.xml')
print(tree.getroot().text)
Parse from string:
from xml.etree.ElementTree import fromstring
print(fromstring(yourxmlstr).text)
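One thing to keep in mind with this solution: `.text` holds only the text that precedes the first child element. A small sketch, using an inline string that I made up to match the question's example, shows where the remaining text ends up:

```python
from xml.etree.ElementTree import fromstring

root = fromstring('<root>First line.<br/>Second line.</root>')

# Text before the first child element.
head = root.text
# Text after the <br/> child is stored as that child's tail.
tail = root[0].tail
# itertext() walks all text and tail fragments in document order (tags dropped).
all_text = ''.join(root.itertext())

print(repr(head))      # 'First line.'
print(repr(tail))      # 'Second line.'
print(repr(all_text))  # 'First line.Second line.'
```

So if the root contains child elements, `.text` alone returns only part of the content, and the markup itself (`<br/>`) is not included.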