Merchako Merchako - 2 years ago 94
HTML Question

What's the difference between `<seg>` and `<span>`

What's the difference between a

in XML and
in HTML? Here are two passages from Bibles, one from the English Bible in Christodouloupoulos' and Steedman's massively parallel Bible corpus,

<?xml version="1.0" ?>
<cesDoc version="4">

<body id="Bible" lang="en">
<div id="b.GEN" type="book">
<div id="b.GEN.1" type="chapter">
<seg id="b.GEN.1.1" type="verse">
In the beginning God created the heaven and the earth.
<seg id="b.GEN.1.2" type="verse">
And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

and the other from the NIV English Bible at Bible Gateway, which is where they got most of their texts from:

<p class="chapter-1">
<span id="en-NIV-27932" class="text Rom-1-1">
<span class="chapternum">1&nbsp;</span>
Paul, a servant of Christ Jesus, called to be an apostle and set apart for the gospel of God—
<span id="en-NIV-27933" class="text Rom-1-2">
<sup class="versenum">2&nbsp;</sup>the gospel he promised beforehand through his prophets in the Holy Scriptures

In the HTML, a it seems a
can replace a
, except that the HTML has added verse numbers in
. Oh, and the chapters are in
. So it's not one-to-one.

Of course, I realize that HTML and XML are different, and this is only one juxtaposition; I'm sure there are others out there. But I'm going to need to be able to display XML as HTML, and I don't want to anger the doctype gods. So, conceptually, how does
different from
in purpose, meaning and usage?

Update: @jim-garrison, says I'm going to need to read the schema to understand the XML, but I'm a neophyte at that, too. In particular, I did find some official-looking documentation for
by TEI that makes me think it's use is a little more than arbitrary, but I have no idea how to interpret this documentation. Should it give us a more specific answer than what Jim has already written?

Answer Source

The difference between XML and HTML generally is that the list of tags that can be present in XML is defined by a DTD or XML Schema, and tags represent document semantics and not presentation. So tags can be named anything. In HTML the set of tags is generally predefined, as if there was a pre-existing HTML DTD or schema, but HTML is not XML and doesn't follow all the rules of XML. While HTML was in some sense derived from the same parent as XML (SGML), and the two are superficially very similar, they are most definitely NOT the same thing.

The answer to your specific question is that the writers of the XML chose to use a tag named <seg> ("segment"?) to represent generalized strings of text, with attributes providing additional semantic information. For more details you'll need to find the DTD or XML schema that governs the content of the XML and read the documentation that goes with it.

But I'm going to need to be able to display XML as HTML, and I don't want to anger the doctype gods. So, conceptually, how does different from in purpose, meaning and usage?

This is where you will use XSLT to transform the input XML into valid HTML. To figure out how to do that transformation you will need to know the full semantics of all the tags that can appear (again, go to the documentation for the DTD/Schema) and decide on a visual representation for the data. There's no one answer to "how should a <seg>" be transformed. That's up to your requirements regarding presentation. One possible transformation converts <seg> tags to <span>, but that may depend on the value of certain attributes (type="verse" vs some other type). It might even differ depending on output medium (desktop vs tablet vs phone vs watch vs ...?)

Once you convert from XML to HTML you have left the realm of the Doctype gods and they have no interest in what you do :-) There's a whole different set of deities such as CSS-Cthulhu, Javascript-Janai'ngo (look it up), et al who will take great pleasure making your life miserable.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download