COMP60370
A Tale of Two Formats

Bijan Parsia

{bparsia@cs.man.ac.uk}

Things "on the Web"

Making such things

The most basic picture

For example, a web browser might interact with two web servers (one for the HTML page and one for an embedded PNG image).

A less basic picture

The browser has to use DNS servers in order to find the Web host serving the HTML and PNGs.

The data

Today: Authoring Web Stuff

Case to Study

A Weblog Workflow

Weblog entries are pushed to a server (through a CMS or straight to the file system) and then published as an HTML page or an XML feed, which aggregators in turn collect.

Weblog Data Formats

A Brief History of (X)HTML

HTML as SSD

HTML as SSD

A simple HTML weblog (1)

Authentic Voice of a Person.  Reverse Chronological Order.  On the web.  These are essential characteristics of a online Journal or weblog.

Given the statements above, a well formed log entry would contain at a minimum an author, a creationDate, and a permaLink.  And, of course, content. -- Sam Ruby
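
The minimum in the quote above can be sketched as a small record type. This is only an illustration: the class name `LogEntry`, its field names, the `is_well_formed` check and the example permalink are all assumptions made here, not part of any weblog format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical record holding the four things the quote asks for."""
    author: str
    creation_date: datetime
    permalink: str
    content: str

    def is_well_formed(self) -> bool:
        # "Well formed" here only means that no required text field is blank.
        return all(s.strip() for s in (self.author, self.permalink, self.content))


entry = LogEntry(
    author="Bijan Parsia",
    creation_date=datetime(2008, 2, 11, tzinfo=timezone.utc),
    permalink="http://example.org/weblog/what-i-did-today",  # made-up URL
    content="Taught a class and it went very well.",
)
print(entry.is_well_formed())
```

Neither HTML markup that follows has a slot for any of these fields; a reader (or program) has to guess them from headings and punctuation.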

<h1>My Weblog</h1>
<h2>What I Did Today</h2>
<h3>Feb. 11, 2008; Bijan Parsia</h3>
<p>Taught a class and it went <i>very</i> well.</p>

A simple HTML weblog (2)

We can radically change the markup.

<h1>My Weblog</h1>
<ul>
<li>
<b>What I Did Today</b><br/>
<i>Feb. 11, 2008; Bijan Parsia</i><br/>
<p>Taught a class and it went <em>very</em> well.</p>
</li>
</ul>

A simple Atom entry

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>My Weblog</title>
<updated>2008-02-13T18:30:02Z</updated>
<id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
<entry>
<author>
<name>Bijan Parsia</name>
</author>
<title>What I Did Today</title>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2008-02-13T18:30:02Z</updated>
<content type="xhtml" xml:lang="en"
xmlns="http://www.w3.org/1999/xhtml">
<p>Taught a class and it went <em>very</em> well.</p>
</content>
</entry>
</feed>
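
Because the Atom elements are named and namespaced, a program can pull out the author, title and date without guessing. A minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `summarize` and the returned dictionary keys are assumptions made for this example, and the feed text is a copy of the one above.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}

FEED = b"""<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>My Weblog</title>
<updated>2008-02-13T18:30:02Z</updated>
<id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
<entry>
<author><name>Bijan Parsia</name></author>
<title>What I Did Today</title>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2008-02-13T18:30:02Z</updated>
<content type="xhtml" xml:lang="en"
xmlns="http://www.w3.org/1999/xhtml">
<p>Taught a class and it went <em>very</em> well.</p>
</content>
</entry>
</feed>
"""


def summarize(feed_bytes):
    """Return the feed title and one small dict per atom:entry."""
    root = ET.fromstring(feed_bytes)
    entries = []
    for entry in root.findall("atom:entry", NS):
        entries.append({
            "title": entry.findtext("atom:title", namespaces=NS),
            "author": entry.findtext("atom:author/atom:name", namespaces=NS),
            "updated": entry.findtext("atom:updated", namespaces=NS),
            # Namespace-qualified names of the entry's direct children.
            "child_tags": [child.tag for child in entry],
        })
    return root.findtext("atom:title", namespaces=NS), entries


feed_title, entries = summarize(FEED)
print(feed_title, entries[0]["author"], entries[0]["updated"])
print(entries[0]["child_tags"])
```

One thing the `child_tags` list makes visible: the `xmlns` declaration on `content` applies to that element itself, so as written it is an XHTML-namespace `content` element rather than an `atom:content`.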

Validation in the Wild

Seeking Validation

Lesson #1

Error

Schematron

From HTML5: Exclusions

Exclusions Examples

<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="h" uri="http://www.w3.org/1999/xhtml"/>
<pattern name='dfn cannot nest'>
<rule context="h:dfn">
<report test="ancestor::h:dfn">
The "dfn" element cannot contain any nested
"dfn" elements.</report>
</rule>
</pattern>
<pattern name='noscript cannot nest'>
<rule context="h:noscript">
<report test="ancestor::h:noscript">
The "noscript" element cannot contain any nested
"noscript" elements.</report>
</rule>
</pattern>
</schema>
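
To see what the two `report` rules compute, here is a rough hand-written equivalent in Python. It is a sketch, not a Schematron processor: the function name `nesting_violations` and the `NO_NESTING` tuple are made up for this example, and only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
NO_NESTING = ("dfn", "noscript")  # elements that must not occur inside themselves


def nesting_violations(root):
    """Return one message per dfn/noscript that has an ancestor of the same name."""
    reports = []

    def walk(elem, open_names):
        local = None
        if isinstance(elem.tag, str) and elem.tag.startswith("{" + XHTML + "}"):
            local = elem.tag.split("}", 1)[1]
        # Same condition as test="ancestor::h:dfn" in the context of h:dfn.
        if local in NO_NESTING and local in open_names:
            reports.append(
                f'The "{local}" element cannot contain any nested "{local}" elements.'
            )
        below = open_names | {local} if local in NO_NESTING else open_names
        for child in elem:
            walk(child, below)

    walk(root, frozenset())
    return reports


ok = ET.fromstring(
    '<p xmlns="http://www.w3.org/1999/xhtml"><dfn>a</dfn> and <dfn>b</dfn></p>'
)
bad = ET.fromstring(
    '<p xmlns="http://www.w3.org/1999/xhtml"><dfn>a <em><dfn>b</dfn></em></dfn></p>'
)
print(nesting_violations(ok))
print(nesting_violations(bad))
```

The second document is flagged even though the inner `dfn` is wrapped in an `em`: the rule looks at all ancestors, not just the parent.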

Dfn Defined

From common.rnc:

common.elem.embedded = ( notAllowed )
common.elem.phrase = ( common.elem.embedded )
common.inner.phrase =( text & common.elem.phrase* )

From phrase.rnc:

dfn.elem = element dfn { dfn.inner & dfn.attrs }
dfn.attrs =
( common.attrs )
dfn.inner =
( common.inner.phrase )

common.elem.phrase |= dfn.elem

An Atom Example 

<ns uri="http://www.w3.org/2005/Atom" prefix="atom"/> 
<rule context="atom:feed">
<assert test="atom:author or not(atom:entry[not(atom:author)])">
An atom:feed must have an atom:author unless all
of its atom:entry children have an atom:author.
</assert>
</rule>
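
The same assertion can be written out by hand to check one's reading of the XPath. A minimal sketch with Python's standard `xml.etree.ElementTree`; the function name `feed_author_ok` and the two sample feeds are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}


def feed_author_ok(feed):
    """Mirror of: atom:author or not(atom:entry[not(atom:author)])."""
    if feed.find("atom:author", NS) is not None:
        return True  # a feed-level author covers every entry
    # Otherwise every entry must carry its own author.
    return all(
        entry.find("atom:author", NS) is not None
        for entry in feed.findall("atom:entry", NS)
    )


with_entry_authors = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><author><name>A</name></author></entry>"
    "<entry><author><name>B</name></author></entry>"
    "</feed>"
)
missing_author = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><author><name>A</name></author></entry>"
    "<entry><title>No author here</title></entry>"
    "</feed>"
)
print(feed_author_ok(with_entry_authors))
print(feed_author_ok(missing_author))
```

Note that a feed with no entries and no author passes, exactly as it does under the XPath test: there is no entry lacking an author.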

Schematron Presumes... 

Structure and Style 

Why separate them? 

CSS vs. XSL

CSS Basics

CSS with <div>, <span>

<style type="text/css">
.title {font-weight: bold}
div.title {text-align:center; font-size: 24px; }
div.entry div.title {text-align: left; font-variant: normal}
span.date {font-style: italic}
span.date:after{content:" by"}
div.content {font-style: italic}
div.content i {font-style: normal; font-weight: bold}
#one {color: red}
</style>
<div class="title">My Weblog</div>
<div class="entry">
<div class="title">What I Did Today</div>
<div class="byline">
<span class="date">Feb. 11, 2008</span> <span class="author">Bijan Parsia</span>
</div>
<div class="content" id="one">
<p>Taught a class and it went <i>very</i> well.</p>
</div>
</div>

Reading & References (1)

Reading & References (2)